Xen-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [Xen-devel] [RFC 0/6] XEN scheduling hardening
@ 2019-07-26 10:37 Andrii Anisov
  2019-07-26 10:37 ` [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path Andrii Anisov
                   ` (7 more replies)
  0 siblings, 8 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrii Anisov

From: Andrii Anisov <andrii_anisov@epam.com>

This is the very RFC series, which is aimed to address some of VCPU time
accounting problems which affect scheduling fairness and accuracy. Please
note that this is done for ARM64 yet.

One of the scheduling problems is a misleading CPU idle time concept. Now
for the CPU idle time, it is taken an idle vcpu run time. But idle vcpu run
time includes IRQ processing, softirqs processing, tasklets processing, etc.
Those tasks are not actual idle and they accounting may mislead CPU freq
governors who rely on the CPU idle time. In this series, it is suggested to
take the time of the actual CPU low power mode as the idle time. 

The other problem is that pure hypervisor tasks execution time is charged from
the guest vcpu budget. For example, IRQ and softirq processing time are charged
from the current vcpu budget, which is likely the guest vcpu. This is quite
unfair and may break scheduling reliability. It is proposed to charge guest
vcpus for the guest actual run time and time to serve guest's hypercalls and
access to emulated iomem. All the rest is calculated as the hypervisor run time
(IRQ and softirq processing, branch prediction hardening, etc.)

While the series is the early RFC, several points are still untouched:
 - Now the time elapsed from the last rescheduling is not fully charged from
   the current vcpu budget. Are there any changes needed in the existing
   scheduling algorithms?
 - How to avoid the absolute top priority of tasklets (what is obeyed by all
   schedulers so far). Should idle vcpu be scheduled as the normal guest vcpus
   (through queues, priorities, etc)?
 - Idle vcpu naming is quite misleading. It is a kind of system (hypervisor)
   task which is responsible for some hypervisor work. Should it be
   renamed/reconsidered?

Andrii Anisov (5):
  schedule: account true system idle time
  sysctl: extend XEN_SYSCTL_getcpuinfo interface
  xentop: show CPU load information
  arm64: call enter_hypervisor_head only when it is needed
  schedule: account all the hypervisor time to the idle vcpu

Julien Grall (1):
  xen/arm: Re-enable interrupt later in the trap path

 tools/xenstat/libxenstat/src/xenstat.c      |  38 +++++++++
 tools/xenstat/libxenstat/src/xenstat.h      |   9 ++
 tools/xenstat/libxenstat/src/xenstat_priv.h |   3 +
 tools/xenstat/xentop/xentop.c               |  30 +++++++
 xen/arch/arm/arm64/entry.S                  |  17 ++--
 xen/arch/arm/domain.c                       |  24 ++++++
 xen/arch/arm/traps.c                        | 128 +++++++++++++++++++---------
 xen/common/sched_credit.c                   |   2 +-
 xen/common/sched_credit2.c                  |   4 +-
 xen/common/sched_rt.c                       |   2 +-
 xen/common/schedule.c                       |  98 ++++++++++++++++++---
 xen/common/sysctl.c                         |   2 +
 xen/include/public/sysctl.h                 |   2 +
 xen/include/xen/sched.h                     |   7 ++
 14 files changed, 303 insertions(+), 63 deletions(-)

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
@ 2019-07-26 10:37 ` Andrii Anisov
  2019-07-26 10:48   ` Julien Grall
  2019-07-26 10:37 ` [Xen-devel] [RFC 2/6] schedule: account true system idle time Andrii Anisov
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Volodymyr Babchuk, Julien Grall, Stefano Stabellini, Andrii Anisov

From: Julien Grall <julien.grall@arm.com>

This makes function enter_hypervisor_head() being executed with
irqs locked.

Signed-off-by: Julien Grall <julien.grall@arm.com>
[Andrii: add a justification commit message]
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/arch/arm/arm64/entry.S | 11 +++++------
 xen/arch/arm/traps.c       |  6 ++++++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 97b05f5..8f28789 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -195,7 +195,6 @@ hyp_error_invalid:
 
 hyp_error:
         entry   hyp=1
-        msr     daifclr, #2
         mov     x0, sp
         bl      do_trap_hyp_serror
         exit    hyp=1
@@ -203,7 +202,7 @@ hyp_error:
 /* Traps taken in Current EL with SP_ELx */
 hyp_sync:
         entry   hyp=1
-        msr     daifclr, #6
+        msr     daifclr, #4
         mov     x0, sp
         bl      do_trap_hyp_sync
         exit    hyp=1
@@ -304,7 +303,7 @@ guest_sync_slowpath:
         ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
                     "nop; nop",
                     SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
-        msr     daifclr, #6
+        msr     daifclr, #4
         mov     x0, sp
         bl      do_trap_guest_sync
 1:
@@ -332,7 +331,7 @@ guest_fiq_invalid:
 
 guest_error:
         entry   hyp=0, compat=0
-        msr     daifclr, #6
+        msr     daifclr, #4
         mov     x0, sp
         bl      do_trap_guest_serror
         exit    hyp=0, compat=0
@@ -347,7 +346,7 @@ guest_sync_compat:
         ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f",
                     "nop; nop",
                     SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
-        msr     daifclr, #6
+        msr     daifclr, #4
         mov     x0, sp
         bl      do_trap_guest_sync
 1:
@@ -375,7 +374,7 @@ guest_fiq_invalid_compat:
 
 guest_error_compat:
         entry   hyp=0, compat=1
-        msr     daifclr, #6
+        msr     daifclr, #4
         mov     x0, sp
         bl      do_trap_guest_serror
         exit    hyp=0, compat=1
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 3103620..5a9dc66 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2017,6 +2017,8 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs)
     {
         struct vcpu *v = current;
 
+        ASSERT(!local_irq_is_enabled());
+
         /* If the guest has disabled the workaround, bring it back on. */
         if ( needs_ssbd_flip(v) )
             arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
@@ -2051,6 +2053,7 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
     const union hsr hsr = { .bits = regs->hsr };
 
     enter_hypervisor_head(regs);
+    local_irq_enable();
 
     switch ( hsr.ec )
     {
@@ -2186,6 +2189,7 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
     const union hsr hsr = { .bits = regs->hsr };
 
     enter_hypervisor_head(regs);
+    local_irq_enable();
 
     switch ( hsr.ec )
     {
@@ -2224,6 +2228,7 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
 void do_trap_hyp_serror(struct cpu_user_regs *regs)
 {
     enter_hypervisor_head(regs);
+    local_irq_enable();
 
     __do_trap_serror(regs, VABORT_GEN_BY_GUEST(regs));
 }
@@ -2231,6 +2236,7 @@ void do_trap_hyp_serror(struct cpu_user_regs *regs)
 void do_trap_guest_serror(struct cpu_user_regs *regs)
 {
     enter_hypervisor_head(regs);
+    local_irq_enable();
 
     __do_trap_serror(regs, true);
 }
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 2/6] schedule: account true system idle time
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
  2019-07-26 10:37 ` [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path Andrii Anisov
@ 2019-07-26 10:37 ` Andrii Anisov
  2019-07-26 12:00   ` Dario Faggioli
  2019-07-26 10:37 ` [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface Andrii Anisov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, George Dunlap, Dario Faggioli,
	Julien Grall, Volodymyr Babchuk

From: Andrii Anisov <andrii_anisov@epam.com>

Currently the idle time is being accounted as a idle vcpu runtime.
This is not entirely correct, because the entity named idle vcpu is
in fact a hypervisor tasks worker. E.g. some softirqs are processed
by the idle vcpu.
So lets change idle vcpu time accounting and specify system idle time
as a idle vcpu blocked time. For this we should appropriately change
idle vcpu runstates around the real processor idle entry.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/arch/arm/domain.c | 24 ++++++++++++++++++++++++
 xen/common/schedule.c |  4 +++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 941bbff..a4e0fd7 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -19,6 +19,7 @@
 #include <xen/sched.h>
 #include <xen/softirq.h>
 #include <xen/wait.h>
+#include <xen/sched-if.h>
 
 #include <asm/alternative.h>
 #include <asm/cpuerrata.h>
@@ -42,6 +43,27 @@
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
 
+static inline void idle_vcpu_runstate_change(
+    unsigned int cpu, int new_state, s_time_t new_entry_time)
+{
+    s_time_t delta;
+    struct vcpu *v = idle_vcpu[cpu];
+    spinlock_t *lock = vcpu_schedule_lock(v);
+
+    ASSERT(v == current);
+    ASSERT(v->runstate.state != new_state);
+
+    delta = new_entry_time - v->runstate.state_entry_time;
+    if ( delta > 0 )
+    {
+        v->runstate.time[v->runstate.state] += delta;
+        v->runstate.state_entry_time = new_entry_time;
+    }
+
+    v->runstate.state = new_state;
+    vcpu_schedule_unlock(lock, v);
+}
+
 static void do_idle(void)
 {
     unsigned int cpu = smp_processor_id();
@@ -51,11 +73,13 @@ static void do_idle(void)
     process_pending_softirqs();
 
     local_irq_disable();
+    idle_vcpu_runstate_change(cpu, RUNSTATE_blocked, NOW());
     if ( cpu_is_haltable(cpu) )
     {
         dsb(sy);
         wfi();
     }
+    idle_vcpu_runstate_change(cpu, RUNSTATE_running, NOW());
     local_irq_enable();
 
     sched_tick_resume();
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 349f962..0a38d4a 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -214,7 +214,7 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
     if ( cpu_online(cpu) && v )
         vcpu_runstate_get(v, &state);
 
-    return state.time[RUNSTATE_running];
+    return state.time[RUNSTATE_blocked];
 }
 
 /*
@@ -922,6 +922,8 @@ void vcpu_block(void)
 {
     struct vcpu *v = current;
 
+    ASSERT(!is_idle_vcpu(v));
+
     set_bit(_VPF_blocked, &v->pause_flags);
 
     arch_vcpu_block(v);
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
  2019-07-26 10:37 ` [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path Andrii Anisov
  2019-07-26 10:37 ` [Xen-devel] [RFC 2/6] schedule: account true system idle time Andrii Anisov
@ 2019-07-26 10:37 ` Andrii Anisov
  2019-07-26 12:15   ` Dario Faggioli
  2019-07-26 10:37 ` [Xen-devel] [RFC 4/6] xentop: show CPU load information Andrii Anisov
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Andrii Anisov, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Julien Grall, Jan Beulich

From: Andrii Anisov <andrii_anisov@epam.com>

Extend XEN_SYSCTL_getcpuinfo interface with guest and hypervisor
time information.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/common/schedule.c       | 22 ++++++++++++++++++++++
 xen/common/sysctl.c         |  2 ++
 xen/include/public/sysctl.h |  2 ++
 xen/include/xen/sched.h     |  2 ++
 4 files changed, 28 insertions(+)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 0a38d4a..9e8805d 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -217,6 +217,28 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
     return state.time[RUNSTATE_blocked];
 }
 
+uint64_t get_cpu_guest_time(unsigned int cpu)
+{
+    struct vcpu_runstate_info state = { 0 };
+    struct vcpu *v = idle_vcpu[cpu];
+
+    if ( cpu_online(cpu) && v )
+        vcpu_runstate_get(v, &state);
+
+    return state.time[RUNSTATE_runnable];
+}
+
+uint64_t get_cpu_hyp_time(unsigned int cpu)
+{
+    struct vcpu_runstate_info state = { 0 };
+    struct vcpu *v = idle_vcpu[cpu];
+
+    if ( cpu_online(cpu) && v )
+        vcpu_runstate_get(v, &state);
+
+    return state.time[RUNSTATE_running];
+}
+
 /*
  * If locks are different, take the one with the lower address first.
  * This avoids dead- or live-locks when this code is running on both
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 765effd..c4abb11 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -152,6 +152,8 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         for ( i = 0; i < nr_cpus; i++ )
         {
             cpuinfo.idletime = get_cpu_idle_time(i);
+            cpuinfo.guesttime = get_cpu_guest_time(i);
+            cpuinfo.hyptime = get_cpu_hyp_time(i);
 
             if ( copy_to_guest_offset(op->u.getcpuinfo.info, i, &cpuinfo, 1) )
                 goto out;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 91c48dc..1a4e4de 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -164,6 +164,8 @@ struct xen_sysctl_debug_keys {
 /* XEN_SYSCTL_getcpuinfo */
 struct xen_sysctl_cpuinfo {
     uint64_aligned_t idletime;
+    uint64_aligned_t hyptime;
+    uint64_aligned_t guesttime;
 };
 typedef struct xen_sysctl_cpuinfo xen_sysctl_cpuinfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpuinfo_t);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b40c8fd..5e28797 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -882,6 +882,8 @@ int vcpu_pin_override(struct vcpu *v, int cpu);
 
 void vcpu_runstate_get(struct vcpu *v, struct vcpu_runstate_info *runstate);
 uint64_t get_cpu_idle_time(unsigned int cpu);
+uint64_t get_cpu_hyp_time(unsigned int cpu);
+uint64_t get_cpu_guest_time(unsigned int cpu);
 
 /*
  * Used by idle loop to decide whether there is work to do:
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 4/6] xentop: show CPU load information
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
                   ` (2 preceding siblings ...)
  2019-07-26 10:37 ` [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface Andrii Anisov
@ 2019-07-26 10:37 ` Andrii Anisov
  2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: сall enter_hypervisor_head only when it is needed Andrii Anisov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Andrii Anisov, Wei Liu

From: Andrii Anisov <andrii_anisov@epam.com>

Let xentop request and show information about CPU load
(hypervisor, guest and idle information)

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 tools/xenstat/libxenstat/src/xenstat.c      | 38 +++++++++++++++++++++++++++++
 tools/xenstat/libxenstat/src/xenstat.h      |  9 +++++++
 tools/xenstat/libxenstat/src/xenstat_priv.h |  3 +++
 tools/xenstat/xentop/xentop.c               | 30 +++++++++++++++++++++++
 4 files changed, 80 insertions(+)

diff --git a/tools/xenstat/libxenstat/src/xenstat.c b/tools/xenstat/libxenstat/src/xenstat.c
index bba143e..e4029d2 100644
--- a/tools/xenstat/libxenstat/src/xenstat.c
+++ b/tools/xenstat/libxenstat/src/xenstat.c
@@ -148,6 +148,9 @@ static inline unsigned long long parse(char *s, char *match)
 xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 {
 #define DOMAIN_CHUNK_SIZE 256
+	xc_cpuinfo_t *cpuinfo;
+	int max_cpus, nr_cpus;
+
 	xenstat_node *node;
 	xc_physinfo_t physinfo = { 0 };
 	xc_domaininfo_t domaininfo[DOMAIN_CHUNK_SIZE];
@@ -177,6 +180,26 @@ xenstat_node *xenstat_get_node(xenstat_handle * handle, unsigned int flags)
 	    * handle->page_size;
 
 	node->freeable_mb = 0;
+
+	max_cpus = node->num_cpus;
+
+	cpuinfo = calloc(max_cpus, sizeof(xc_cpuinfo_t));
+	if (!cpuinfo)
+		return NULL;
+
+	if (xc_getcpuinfo(handle->xc_handle, max_cpus, cpuinfo, &nr_cpus) < 0) {
+		free(cpuinfo);
+		return NULL;
+	}
+
+	for ( i = 0; i < nr_cpus; i++) {
+		node->idle_time += cpuinfo[i].idletime;
+		node->hyp_time += cpuinfo[i].hyptime;
+		node->guest_time += cpuinfo[i].guesttime;
+	}
+
+	free(cpuinfo);
+
 	/* malloc(0) is not portable, so allocate a single domain.  This will
 	 * be resized below. */
 	node->domains = malloc(sizeof(xenstat_domain));
@@ -346,6 +369,21 @@ unsigned long long xenstat_node_cpu_hz(xenstat_node * node)
 	return node->cpu_hz;
 }
 
+unsigned long long xenstat_node_idle_time(xenstat_node * node)
+{
+	return node->idle_time;
+}
+
+unsigned long long xenstat_node_guest_time(xenstat_node * node)
+{
+	return node->guest_time;
+}
+
+unsigned long long xenstat_node_hyp_time(xenstat_node * node)
+{
+	return node->hyp_time;
+}
+
 /* Get the domain ID for this domain */
 unsigned xenstat_domain_id(xenstat_domain * domain)
 {
diff --git a/tools/xenstat/libxenstat/src/xenstat.h b/tools/xenstat/libxenstat/src/xenstat.h
index 76a660f..5b34461 100644
--- a/tools/xenstat/libxenstat/src/xenstat.h
+++ b/tools/xenstat/libxenstat/src/xenstat.h
@@ -80,6 +80,15 @@ unsigned int xenstat_node_num_cpus(xenstat_node * node);
 /* Get information about the CPU speed */
 unsigned long long xenstat_node_cpu_hz(xenstat_node * node);
 
+/* Get information about the CPU idle time */
+unsigned long long xenstat_node_idle_time(xenstat_node * node);
+
+/* Get information about the CPU guest execution time */
+unsigned long long xenstat_node_guest_time(xenstat_node * node);
+
+/* Get information about the CPU hypervisor execution time */
+unsigned long long xenstat_node_hyp_time(xenstat_node * node);
+
 /*
  * Domain functions - extract information from a xenstat_domain
  */
diff --git a/tools/xenstat/libxenstat/src/xenstat_priv.h b/tools/xenstat/libxenstat/src/xenstat_priv.h
index 4eb44a8..78ad8c4 100644
--- a/tools/xenstat/libxenstat/src/xenstat_priv.h
+++ b/tools/xenstat/libxenstat/src/xenstat_priv.h
@@ -45,6 +45,9 @@ struct xenstat_node {
 	unsigned int flags;
 	unsigned long long cpu_hz;
 	unsigned int num_cpus;
+	unsigned long long hyp_time;
+	unsigned long long guest_time;
+	unsigned long long idle_time;
 	unsigned long long tot_mem;
 	unsigned long long free_mem;
 	unsigned int num_domains;
diff --git a/tools/xenstat/xentop/xentop.c b/tools/xenstat/xentop/xentop.c
index af11ebf..aa6e9b0 100644
--- a/tools/xenstat/xentop/xentop.c
+++ b/tools/xenstat/xentop/xentop.c
@@ -930,6 +930,34 @@ void adjust_field_widths(xenstat_domain *domain)
 		fields[FIELD_VBD_WSECT-1].default_width = length;
 }
 
+void do_utilization(void)
+{
+	double us_elapsed = 0.0,
+		   us_hyp = 0.0,
+		   us_guest = 0.0,
+		   us_idle = 0.0;
+
+	/* Can't calculate CPU percentage without a current and a previous sample.*/
+	if(prev_node != NULL && cur_node != NULL) {
+
+		/* Calculate the time elapsed in microseconds */
+		us_elapsed = ((curtime.tv_sec-oldtime.tv_sec)*1000000.0
+				  +(curtime.tv_usec - oldtime.tv_usec));
+
+		/* In the following, nanoseconds must be multiplied by 1000.0 to
+		 * convert to microseconds, then divided by 100.0 to get a percentage,
+		 * resulting in a multiplication by 10.0 */
+		us_idle = ((xenstat_node_idle_time(cur_node) -
+				   xenstat_node_idle_time(prev_node))/10.0)/us_elapsed;
+		us_guest = ((xenstat_node_guest_time(cur_node) -
+				   xenstat_node_guest_time(prev_node))/10.0)/us_elapsed;
+		us_hyp = ((xenstat_node_hyp_time(cur_node) -
+				   xenstat_node_hyp_time(prev_node))/10.0)/us_elapsed;
+	}
+
+	print("%%CPU(s): %6.1f gu, %6.1f hy, %6.1f id \n",
+		  us_guest, us_hyp, us_idle);
+}
 
 /* Section printing functions */
 /* Prints the top summary, above the domain table */
@@ -972,6 +1000,8 @@ void do_summary(void)
 	used = xenstat_node_tot_mem(cur_node);
 	freeable_mb = 0;
 
+	do_utilization();
+
 	/* Dump node memory and cpu information */
 	if ( freeable_mb <= 0 )
 	     print("Mem: %lluk total, %lluk used, %lluk free    ",
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 5/6] arm64: сall enter_hypervisor_head only when it is needed
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
                   ` (3 preceding siblings ...)
  2019-07-26 10:37 ` [Xen-devel] [RFC 4/6] xentop: show CPU load information Andrii Anisov
@ 2019-07-26 10:37 ` Andrii Anisov
  2019-07-26 10:44   ` Andrii Anisov
  2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: call " Andrii Anisov
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Volodymyr Babchuk, Julien Grall, Stefano Stabellini, Andrii Anisov

From: Andrii Anisov <andrii_anisov@epam.com>

On ARM64 we know exactly if trap happened from hypervisor or guest, so
we do not need to take that decision. This reduces a condition for
all enter_hypervisor_head calls and the function call for traps from
the hypervisor mode.

Currently, it is implemented for ARM64 only. Integrating the stuff
with ARM32 requires moving ` if ( guest_mode(regs) )` condition
into ARM32 specific traps.c

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/arch/arm/arm64/entry.S |  6 ++--
 xen/arch/arm/traps.c       | 75 ++++++++++++++++++++++++----------------------
 2 files changed, 43 insertions(+), 38 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 8f28789..21c710d 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -211,7 +211,7 @@ hyp_irq:
         entry   hyp=1
         msr     daifclr, #4
         mov     x0, sp
-        bl      do_trap_irq
+        bl      do_trap_hyp_irq
         exit    hyp=1
 
 guest_sync:
@@ -321,7 +321,7 @@ guest_irq:
                     SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
         msr     daifclr, #4
         mov     x0, sp
-        bl      do_trap_irq
+        bl      do_trap_guest_irq
 1:
         exit    hyp=0, compat=0
 
@@ -364,7 +364,7 @@ guest_irq_compat:
                     SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
         msr     daifclr, #4
         mov     x0, sp
-        bl      do_trap_irq
+        bl      do_trap_guest_irq
 1:
         exit    hyp=0, compat=1
 
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 5a9dc66..13726db 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2011,48 +2011,45 @@ static inline bool needs_ssbd_flip(struct vcpu *v)
              cpu_require_ssbd_mitigation();
 }
 
-static void enter_hypervisor_head(struct cpu_user_regs *regs)
+static void enter_hypervisor_head(void)
 {
-    if ( guest_mode(regs) )
-    {
-        struct vcpu *v = current;
+    struct vcpu *v = current;
 
-        ASSERT(!local_irq_is_enabled());
+    ASSERT(!local_irq_is_enabled());
 
-        /* If the guest has disabled the workaround, bring it back on. */
-        if ( needs_ssbd_flip(v) )
-            arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
+    /* If the guest has disabled the workaround, bring it back on. */
+    if ( needs_ssbd_flip(v) )
+        arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
 
-        /*
-         * If we pended a virtual abort, preserve it until it gets cleared.
-         * See ARM ARM DDI 0487A.j D1.14.3 (Virtual Interrupts) for details,
-         * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
-         * (alias of HCR.VA) is cleared to 0."
-         */
-        if ( v->arch.hcr_el2 & HCR_VA )
-            v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
+    /*
+     * If we pended a virtual abort, preserve it until it gets cleared.
+     * See ARM ARM DDI 0487A.j D1.14.3 (Virtual Interrupts) for details,
+     * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
+     * (alias of HCR.VA) is cleared to 0."
+     */
+    if ( v->arch.hcr_el2 & HCR_VA )
+        v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
 
 #ifdef CONFIG_NEW_VGIC
-        /*
-         * We need to update the state of our emulated devices using level
-         * triggered interrupts before syncing back the VGIC state.
-         *
-         * TODO: Investigate whether this is necessary to do on every
-         * trap and how it can be optimised.
-         */
-        vtimer_update_irqs(v);
-        vcpu_update_evtchn_irq(v);
+    /*
+     * We need to update the state of our emulated devices using level
+     * triggered interrupts before syncing back the VGIC state.
+     *
+     * TODO: Investigate whether this is necessary to do on every
+     * trap and how it can be optimised.
+     */
+    vtimer_update_irqs(v);
+    vcpu_update_evtchn_irq(v);
 #endif
 
-        vgic_sync_from_lrs(v);
-    }
+    vgic_sync_from_lrs(v);
 }
 
 void do_trap_guest_sync(struct cpu_user_regs *regs)
 {
     const union hsr hsr = { .bits = regs->hsr };
 
-    enter_hypervisor_head(regs);
+    enter_hypervisor_head();
     local_irq_enable();
 
     switch ( hsr.ec )
@@ -2188,7 +2185,6 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
 {
     const union hsr hsr = { .bits = regs->hsr };
 
-    enter_hypervisor_head(regs);
     local_irq_enable();
 
     switch ( hsr.ec )
@@ -2227,7 +2223,6 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
 
 void do_trap_hyp_serror(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
     local_irq_enable();
 
     __do_trap_serror(regs, VABORT_GEN_BY_GUEST(regs));
@@ -2235,21 +2230,31 @@ void do_trap_hyp_serror(struct cpu_user_regs *regs)
 
 void do_trap_guest_serror(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
+    enter_hypervisor_head();
     local_irq_enable();
 
     __do_trap_serror(regs, true);
 }
 
-void do_trap_irq(struct cpu_user_regs *regs)
+void do_trap_guest_irq(struct cpu_user_regs *regs)
+{
+    enter_hypervisor_head();
+    gic_interrupt(regs, 0);
+}
+
+void do_trap_guest_fiq(struct cpu_user_regs *regs)
+{
+    enter_hypervisor_head();
+    gic_interrupt(regs, 1);
+}
+
+void do_trap_hyp_irq(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
     gic_interrupt(regs, 0);
 }
 
-void do_trap_fiq(struct cpu_user_regs *regs)
+void do_trap_hyp_fiq(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
     gic_interrupt(regs, 1);
 }
 
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
                   ` (4 preceding siblings ...)
  2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: сall enter_hypervisor_head only when it is needed Andrii Anisov
@ 2019-07-26 10:37 ` " Andrii Anisov
  2019-07-26 10:59   ` Julien Grall
  2019-07-26 10:37 ` [Xen-devel] [RFC 6/6] schedule: account all the hypervisor time to the idle vcpu Andrii Anisov
  2019-07-26 11:56 ` [Xen-devel] [RFC 0/6] XEN scheduling hardening Dario Faggioli
  7 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Volodymyr Babchuk, Julien Grall, Stefano Stabellini, Andrii Anisov

From: Andrii Anisov <andrii_anisov@epam.com>

On ARM64 we know exactly if trap happened from hypervisor or guest, so
we do not need to take that decision. This reduces a condition for
all enter_hypervisor_head calls and the function call for traps from
the hypervisor mode.

Currently, it is implemented for ARM64 only. Integrating the stuff
with ARM32 requires moving ` if ( guest_mode(regs) )` condition
into ARM32 specific traps.c

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/arch/arm/arm64/entry.S |  6 ++--
 xen/arch/arm/traps.c       | 75 ++++++++++++++++++++++++----------------------
 2 files changed, 43 insertions(+), 38 deletions(-)

diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S
index 8f28789..21c710d 100644
--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -211,7 +211,7 @@ hyp_irq:
         entry   hyp=1
         msr     daifclr, #4
         mov     x0, sp
-        bl      do_trap_irq
+        bl      do_trap_hyp_irq
         exit    hyp=1
 
 guest_sync:
@@ -321,7 +321,7 @@ guest_irq:
                     SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
         msr     daifclr, #4
         mov     x0, sp
-        bl      do_trap_irq
+        bl      do_trap_guest_irq
 1:
         exit    hyp=0, compat=0
 
@@ -364,7 +364,7 @@ guest_irq_compat:
                     SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT)
         msr     daifclr, #4
         mov     x0, sp
-        bl      do_trap_irq
+        bl      do_trap_guest_irq
 1:
         exit    hyp=0, compat=1
 
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 5a9dc66..13726db 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2011,48 +2011,45 @@ static inline bool needs_ssbd_flip(struct vcpu *v)
              cpu_require_ssbd_mitigation();
 }
 
-static void enter_hypervisor_head(struct cpu_user_regs *regs)
+static void enter_hypervisor_head(void)
 {
-    if ( guest_mode(regs) )
-    {
-        struct vcpu *v = current;
+    struct vcpu *v = current;
 
-        ASSERT(!local_irq_is_enabled());
+    ASSERT(!local_irq_is_enabled());
 
-        /* If the guest has disabled the workaround, bring it back on. */
-        if ( needs_ssbd_flip(v) )
-            arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
+    /* If the guest has disabled the workaround, bring it back on. */
+    if ( needs_ssbd_flip(v) )
+        arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
 
-        /*
-         * If we pended a virtual abort, preserve it until it gets cleared.
-         * See ARM ARM DDI 0487A.j D1.14.3 (Virtual Interrupts) for details,
-         * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
-         * (alias of HCR.VA) is cleared to 0."
-         */
-        if ( v->arch.hcr_el2 & HCR_VA )
-            v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
+    /*
+     * If we pended a virtual abort, preserve it until it gets cleared.
+     * See ARM ARM DDI 0487A.j D1.14.3 (Virtual Interrupts) for details,
+     * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
+     * (alias of HCR.VA) is cleared to 0."
+     */
+    if ( v->arch.hcr_el2 & HCR_VA )
+        v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
 
 #ifdef CONFIG_NEW_VGIC
-        /*
-         * We need to update the state of our emulated devices using level
-         * triggered interrupts before syncing back the VGIC state.
-         *
-         * TODO: Investigate whether this is necessary to do on every
-         * trap and how it can be optimised.
-         */
-        vtimer_update_irqs(v);
-        vcpu_update_evtchn_irq(v);
+    /*
+     * We need to update the state of our emulated devices using level
+     * triggered interrupts before syncing back the VGIC state.
+     *
+     * TODO: Investigate whether this is necessary to do on every
+     * trap and how it can be optimised.
+     */
+    vtimer_update_irqs(v);
+    vcpu_update_evtchn_irq(v);
 #endif
 
-        vgic_sync_from_lrs(v);
-    }
+    vgic_sync_from_lrs(v);
 }
 
 void do_trap_guest_sync(struct cpu_user_regs *regs)
 {
     const union hsr hsr = { .bits = regs->hsr };
 
-    enter_hypervisor_head(regs);
+    enter_hypervisor_head();
     local_irq_enable();
 
     switch ( hsr.ec )
@@ -2188,7 +2185,6 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
 {
     const union hsr hsr = { .bits = regs->hsr };
 
-    enter_hypervisor_head(regs);
     local_irq_enable();
 
     switch ( hsr.ec )
@@ -2227,7 +2223,6 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
 
 void do_trap_hyp_serror(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
     local_irq_enable();
 
     __do_trap_serror(regs, VABORT_GEN_BY_GUEST(regs));
@@ -2235,21 +2230,31 @@ void do_trap_hyp_serror(struct cpu_user_regs *regs)
 
 void do_trap_guest_serror(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
+    enter_hypervisor_head();
     local_irq_enable();
 
     __do_trap_serror(regs, true);
 }
 
-void do_trap_irq(struct cpu_user_regs *regs)
+void do_trap_guest_irq(struct cpu_user_regs *regs)
+{
+    enter_hypervisor_head();
+    gic_interrupt(regs, 0);
+}
+
+void do_trap_guest_fiq(struct cpu_user_regs *regs)
+{
+    enter_hypervisor_head();
+    gic_interrupt(regs, 1);
+}
+
+void do_trap_hyp_irq(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
     gic_interrupt(regs, 0);
 }
 
-void do_trap_fiq(struct cpu_user_regs *regs)
+void do_trap_hyp_fiq(struct cpu_user_regs *regs)
 {
-    enter_hypervisor_head(regs);
     gic_interrupt(regs, 1);
 }
 
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Xen-devel] [RFC 6/6] schedule: account all the hypervisor time to the idle vcpu
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
                   ` (5 preceding siblings ...)
  2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: call " Andrii Anisov
@ 2019-07-26 10:37 ` Andrii Anisov
  2019-07-26 11:56 ` [Xen-devel] [RFC 0/6] XEN scheduling hardening Dario Faggioli
  7 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Meng Xu, Jan Beulich, Dario Faggioli,
	Volodymyr Babchuk

From: Andrii Anisov <andrii_anisov@epam.com>

Account for a guest:
 - guest running time
 - guest sync traps serving time (hypercalls, trapped emulated iomems, etc)
 - vcpu jobs in leave_hypervisor_tail

Account for the hyp:
 - IRQ processing
 - Softirq processing

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/arch/arm/traps.c       | 49 ++++++++++++++++++++++++++----
 xen/common/sched_credit.c  |  2 +-
 xen/common/sched_credit2.c |  4 +--
 xen/common/sched_rt.c      |  2 +-
 xen/common/schedule.c      | 74 +++++++++++++++++++++++++++++++++++++++-------
 xen/include/xen/sched.h    |  5 ++++
 6 files changed, 116 insertions(+), 20 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 13726db..f978b94 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2064,7 +2064,7 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
         if ( !check_conditional_instr(regs, hsr) )
         {
             advance_pc(regs, hsr);
-            return;
+            break;
         }
         if ( hsr.wfi_wfe.ti ) {
             /* Yield the VCPU for WFE */
@@ -2126,10 +2126,16 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
         perfc_incr(trap_hvc32);
 #ifndef NDEBUG
         if ( (hsr.iss & 0xff00) == 0xff00 )
-            return do_debug_trap(regs, hsr.iss & 0x00ff);
+        {
+            do_debug_trap(regs, hsr.iss & 0x00ff);
+            break;
+        }
 #endif
         if ( hsr.iss == 0 )
-            return do_trap_hvc_smccc(regs);
+        {
+            do_trap_hvc_smccc(regs);
+            break;
+        }
         nr = regs->r12;
         do_trap_hypercall(regs, &nr, hsr);
         regs->r12 = (uint32_t)nr;
@@ -2141,10 +2147,16 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
         perfc_incr(trap_hvc64);
 #ifndef NDEBUG
         if ( (hsr.iss & 0xff00) == 0xff00 )
-            return do_debug_trap(regs, hsr.iss & 0x00ff);
+        {
+            do_debug_trap(regs, hsr.iss & 0x00ff);
+            break;
+        }
 #endif
         if ( hsr.iss == 0 )
-            return do_trap_hvc_smccc(regs);
+        {
+            do_trap_hvc_smccc(regs);
+            break;
+        }
         do_trap_hypercall(regs, &regs->x16, hsr);
         break;
     case HSR_EC_SMC64:
@@ -2179,6 +2191,11 @@ void do_trap_guest_sync(struct cpu_user_regs *regs)
                 hsr.bits, hsr.ec, hsr.len, hsr.iss);
         inject_undef_exception(regs, hsr);
     }
+
+    local_irq_disable();
+    hyp_tacc_head(1);
+
+    /*we will call tacc tail from the leave_hypervisor_tail*/
 }
 
 void do_trap_hyp_sync(struct cpu_user_regs *regs)
@@ -2219,6 +2236,7 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs)
                hsr.bits, hsr.ec, hsr.len, hsr.iss);
         do_unexpected_trap("Hypervisor", regs);
     }
+
 }
 
 void do_trap_hyp_serror(struct cpu_user_regs *regs)
@@ -2234,28 +2252,47 @@ void do_trap_guest_serror(struct cpu_user_regs *regs)
     local_irq_enable();
 
     __do_trap_serror(regs, true);
+
+    local_irq_disable();
+    hyp_tacc_head(2);
 }
 
 void do_trap_guest_irq(struct cpu_user_regs *regs)
 {
+    hyp_tacc_head(3);
+
     enter_hypervisor_head();
     gic_interrupt(regs, 0);
+
+    /*we will call tacc tail from the leave_hypervisor_tail*/
 }
 
 void do_trap_guest_fiq(struct cpu_user_regs *regs)
 {
+    hyp_tacc_head(4);
+
     enter_hypervisor_head();
     gic_interrupt(regs, 1);
+
+    /*we will call tacc tail from the leave_hypervisor_tail*/
 }
 
 void do_trap_hyp_irq(struct cpu_user_regs *regs)
 {
+    hyp_tacc_head(5);
+
     gic_interrupt(regs, 0);
+
+    hyp_tacc_tail(5);
 }
 
 void do_trap_hyp_fiq(struct cpu_user_regs *regs)
 {
+    hyp_tacc_head(6);
+
     gic_interrupt(regs, 1);
+
+    hyp_tacc_tail(6);
 }
 
 static void check_for_pcpu_work(void)
@@ -2318,6 +2355,8 @@ void leave_hypervisor_tail(void)
      */
     SYNCHRONIZE_SERROR(SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT);
 
+    hyp_tacc_tail(1234);
+
     /*
      * The hypervisor runs with the workaround always present.
      * If the guest wants it disabled, so be it...
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 3c0d7c7..b8d866b 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1856,7 +1856,7 @@ csched_schedule(
                     (unsigned char *)&d);
     }
 
-    runtime = now - current->runstate.state_entry_time;
+    runtime = current->runtime;
     if ( runtime < 0 ) /* Does this ever happen? */
         runtime = 0;
 
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 8e4381d..2d11a5f 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -3285,7 +3285,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
      * no point forcing it to do so until rate limiting expires.
      */
     if ( !yield && prv->ratelimit_us && vcpu_runnable(scurr->vcpu) &&
-         (now - scurr->vcpu->runstate.state_entry_time) <
+          scurr->vcpu->runtime <
           MICROSECS(prv->ratelimit_us) )
     {
         if ( unlikely(tb_init_done) )
@@ -3296,7 +3296,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
             } d;
             d.dom = scurr->vcpu->domain->domain_id;
             d.vcpu = scurr->vcpu->vcpu_id;
-            d.runtime = now - scurr->vcpu->runstate.state_entry_time;
+            d.runtime = scurr->vcpu->runtime;
             __trace_var(TRC_CSCHED2_RATELIMIT, 1,
                         sizeof(d),
                         (unsigned char *)&d);
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 0acfc3d..f1de511 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -947,7 +947,7 @@ burn_budget(const struct scheduler *ops, struct rt_vcpu *svc, s_time_t now)
         return;
 
     /* burn at nanoseconds level */
-    delta = now - svc->last_start;
+    delta = svc->vcpu->runtime;
     /*
      * delta < 0 only happens in nested virtualization;
      * TODO: how should we handle delta < 0 in a better way?
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 9e8805d..d3246f9 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1504,20 +1504,16 @@ static void schedule(void)
              (now - next->runstate.state_entry_time) : 0,
              next_slice.time);
 
-    ASSERT(prev->runstate.state == RUNSTATE_running);
-
     TRACE_4D(TRC_SCHED_SWITCH,
              prev->domain->domain_id, prev->vcpu_id,
              next->domain->domain_id, next->vcpu_id);
 
-    vcpu_runstate_change(
-        prev,
-        ((prev->pause_flags & VPF_blocked) ? RUNSTATE_blocked :
-         (vcpu_runnable(prev) ? RUNSTATE_runnable : RUNSTATE_offline)),
-        now);
-
-    ASSERT(next->runstate.state != RUNSTATE_running);
-    vcpu_runstate_change(next, RUNSTATE_running, now);
+    if ( !vcpu_runnable(prev) )
+        vcpu_runstate_change(
+            prev,
+            ((prev->pause_flags & VPF_blocked) ? RUNSTATE_blocked :
+             RUNSTATE_offline),
+            now);
 
     /*
      * NB. Don't add any trace records from here until the actual context
@@ -1526,6 +1522,7 @@ static void schedule(void)
 
     ASSERT(!next->is_running);
     next->is_running = 1;
+    next->runtime = 0;
 
     pcpu_schedule_unlock_irq(lock, cpu);
 
@@ -1541,6 +1538,58 @@ static void schedule(void)
     context_switch(prev, next);
 }
 
+DEFINE_PER_CPU(int, hyp_tacc_cnt);
+
+void hyp_tacc_head(int place)
+{
+    //printk("\thead cpu %u, place %d, cnt %d\n", smp_processor_id(), place, this_cpu(hyp_tacc_cnt));
+
+    ASSERT(this_cpu(hyp_tacc_cnt) >= 0);
+
+    if ( this_cpu(hyp_tacc_cnt) == 0 )
+    {
+        s_time_t now = NOW();
+        spin_lock(per_cpu(schedule_data,smp_processor_id()).schedule_lock);
+        /*
+         * Stop time accounting for guest (guest vcpu)
+         */
+        ASSERT( (current->runstate.state_entry_time & XEN_RUNSTATE_UPDATE) == 0);
+        current->runtime += now - current->runstate.state_entry_time;
+        vcpu_runstate_change(current, RUNSTATE_runnable, now);
+        /*
+         * Start time accounting for hyp (idle vcpu)
+         */
+        vcpu_runstate_change(idle_vcpu[smp_processor_id()], RUNSTATE_running, now);
+        spin_unlock(per_cpu(schedule_data,smp_processor_id()).schedule_lock);
+    }
+
+    this_cpu(hyp_tacc_cnt)++;
+}
+
+void hyp_tacc_tail(int place)
+{
+    //printk("\t\t\t\ttail cpu %u, place %d, cnt %d\n", smp_processor_id(), place, this_cpu(hyp_tacc_cnt));
+
+    ASSERT(this_cpu(hyp_tacc_cnt) > 0);
+
+    if (this_cpu(hyp_tacc_cnt) == 1)
+    {
+        s_time_t now = NOW();
+        spin_lock(per_cpu(schedule_data,smp_processor_id()).schedule_lock);
+        /*
+         * Stop time accounting for guest (guest vcpu)
+         */
+        vcpu_runstate_change(idle_vcpu[smp_processor_id()], RUNSTATE_runnable, now);
+        /*
+         * Start time accounting for hyp (idle vcpu)
+         */
+        vcpu_runstate_change(current, RUNSTATE_running, now);
+        spin_unlock(per_cpu(schedule_data,smp_processor_id()).schedule_lock);
+    }
+
+    this_cpu(hyp_tacc_cnt)--;
+}
+
 void context_saved(struct vcpu *prev)
 {
     /* Clear running flag /after/ writing context to memory. */
@@ -1597,8 +1646,9 @@ static int cpu_schedule_up(unsigned int cpu)
     sd->curr = idle_vcpu[cpu];
     init_timer(&sd->s_timer, s_timer_fn, NULL, cpu);
     atomic_set(&sd->urgent_count, 0);
+    per_cpu(hyp_tacc_cnt, cpu) = 1;
 
-    /* Boot CPU is dealt with later in schedule_init(). */
+    /* Boot CPU is dealt with later in scheduler_init(). */
     if ( cpu == 0 )
         return 0;
 
@@ -1654,6 +1704,8 @@ static void cpu_schedule_down(unsigned int cpu)
     sd->sched_priv = NULL;
 
     kill_timer(&sd->s_timer);
+
+    per_cpu(hyp_tacc_cnt, cpu) = 0;
 }
 
 static int cpu_schedule_callback(
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 5e28797..9391318 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -174,6 +174,8 @@ struct vcpu
     } runstate_guest; /* guest address */
 #endif
 
+    s_time_t runtime;
+
     /* Has the FPU been initialised? */
     bool             fpu_initialised;
     /* Has the FPU been used since it was last saved? */
@@ -998,6 +1000,9 @@ extern void dump_runq(unsigned char key);
 
 void arch_do_physinfo(struct xen_sysctl_physinfo *pi);
 
+void hyp_tacc_head(int place);
+void hyp_tacc_tail(int place);
+
 #endif /* __SCHED_H__ */
 
 /*
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel]  [RFC 5/6] arm64: сall enter_hypervisor_head only when it is needed
  2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: сall enter_hypervisor_head only when it is needed Andrii Anisov
@ 2019-07-26 10:44   ` Andrii Anisov
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 10:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Volodymyr Babchuk, Julien Grall, Stefano Stabellini, Andrii Anisov

Sorry guys,

I was eliminating cyrillic "с" in the commit title and occasionally left this patch in the folder.
Please ignore exactly this patch.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-07-26 10:37 ` [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path Andrii Anisov
@ 2019-07-26 10:48   ` Julien Grall
  2019-07-30 17:35     ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-07-26 10:48 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

Hi,

On 26/07/2019 11:37, Andrii Anisov wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> This makes function enter_hypervisor_head() being executed with
> irqs locked.

This is the 3rd time you send this patch... and still no proper explanation why 
this is done nor the impact on keeping the interrupts disabled longer than 
necessary.

Resending the patch without things addressed is only going to make it worst. If 
you have any doubt of what I am asking then ask.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: call " Andrii Anisov
@ 2019-07-26 10:59   ` Julien Grall
  2019-07-30 17:35     ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-07-26 10:59 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

Hi,

On 26/07/2019 11:37, Andrii Anisov wrote:
> From: Andrii Anisov <andrii_anisov@epam.com>
> 
> On ARM64 we know exactly if trap happened from hypervisor or guest, so
> we do not need to take that decision. This reduces a condition for
> all enter_hypervisor_head calls and the function call for traps from
> the hypervisor mode.

One condition lost but ...

> 
> Currently, it is implemented for ARM64 only. Integrating the stuff
> with ARM32 requires moving ` if ( guest_mode(regs) )` condition
> into ARM32 specific traps.

... one more divergence between arm32 and arm64.

There are probably dozens of more conditions in the code that are not necessary 
for one of the architectures. Yet there are value to keep everything common 
because the benefits outweigh the likely non performance improvement.

So I am not convinced that this has any value for Xen.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
                   ` (6 preceding siblings ...)
  2019-07-26 10:37 ` [Xen-devel] [RFC 6/6] schedule: account all the hypervisor time to the idle vcpu Andrii Anisov
@ 2019-07-26 11:56 ` Dario Faggioli
  2019-07-26 12:14   ` Juergen Gross
  2019-07-29 14:28   ` Andrii Anisov
  7 siblings, 2 replies; 49+ messages in thread
From: Dario Faggioli @ 2019-07-26 11:56 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Andrii Anisov, George Dunlap,
	Andrew Cooper, Tim Deegan, Julien Grall, Jan Beulich

[-- Attachment #1.1: Type: text/plain, Size: 4214 bytes --]

[Adding George plus others x86, ARM and core-Xen people]

Hi Andrii,

First of all, thanks a lot for this series!

The problem you mention is a long standing one, and I'm glad we're
eventually starting to properly look into it.

I already have one comment: I think I can see from where this come
from, but I don't think 'XEN scheduling hardening' is what we're doing
in this series... I'd go for something like "xen: sched: improve idle
and vcpu time accounting precision", or something like that.

On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
> One of the scheduling problems is a misleading CPU idle time concept.
> Now
> for the CPU idle time, it is taken an idle vcpu run time. But idle
> vcpu run
> time includes IRQ processing, softirqs processing, tasklets
> processing, etc.
> Those tasks are not actual idle and they accounting may mislead CPU
> freq
> governors who rely on the CPU idle time. 
>
Indeed! And I agree this is quite bad.

> The other problem is that pure hypervisor tasks execution time is
> charged from
> the guest vcpu budget. 
>
Yep, equally bad.

> For example, IRQ and softirq processing time are charged
> from the current vcpu budget, which is likely the guest vcpu. This is
> quite
> unfair and may break scheduling reliability. 
> It is proposed to charge guest
> vcpus for the guest actual run time and time to serve guest's
> hypercalls and
> access to emulated iomem. All the rest is calculated as the
> hypervisor run time
> (IRQ and softirq processing, branch prediction hardening, etc.)
> 
Right.

> While the series is the early RFC, several points are still
> untouched:
>  - Now the time elapsed from the last rescheduling is not fully
> charged from
>    the current vcpu budget. Are there any changes needed in the
> existing
>    scheduling algorithms?
>
I'll think about it, but out of the top of my head, I don't see how
this can be a problem. Scheduling algorithms (should!) base their logic
and their calculations on actual vcpus' runtime, not much on idle
vcpus' one.

>  - How to avoid the absolute top priority of tasklets (what is obeyed
> by all
>    schedulers so far). Should idle vcpu be scheduled as the normal
> guest vcpus
>    (through queues, priorities, etc)?
>
Now, this is something to think about, and try to understand if
anything would break if we go for it. I mean, I see why you'd want to
do that, but tasklets and softirqs works the way they do, in Xen, since
when they were introduced, I believe.

Therefore, even if there wouldn't be any subsystem explicitly relying
on the current behavior (which should be verified), I think we are at
high risk of breaking things, if we change.

That's not to mean it would not be a good change, or that it is
impossible... It's, rather, just to raise some awareness. :-)

>  - Idle vcpu naming is quite misleading. It is a kind of system
> (hypervisor)
>    task which is responsible for some hypervisor work. Should it be
>    renamed/reconsidered?
> 
Well, that's a design question, even for this very series, isn't it? I
mean, I see two ways of achieving proper idle time accounting:
1) you leave things as they are --i.e., idle does not only do idling, 
   it also does all these other things, but you make sure you don't 
   count the time they take as idle time;
2) you move all these activities out of idle, and in some other 
   context, and you let idle just do the idling. At that point, time 
   accounted to idle will be only actual idle time, as the time it 
   took to Xen to do all the other things is now accounted to the new 
   execution context which is running them.

So, which path this path series takes (I believe 1), and which path you
(and others) believe is better?

(And, yes, discussing this is why I've added, apart from George, some
other x86, ARM, and core-Xen people)

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 2/6] schedule: account true system idle time
  2019-07-26 10:37 ` [Xen-devel] [RFC 2/6] schedule: account true system idle time Andrii Anisov
@ 2019-07-26 12:00   ` Dario Faggioli
  2019-07-26 12:42     ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-07-26 12:00 UTC (permalink / raw)
  To: andrii.anisov, xen-devel
  Cc: george.dunlap, andrii_anisov, julien.grall, sstabellini,
	Volodymyr_Babchuk

[-- Attachment #1.1: Type: text/plain, Size: 1714 bytes --]

On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
> From: Andrii Anisov <andrii_anisov@epam.com>
> 
> Currently the idle time is being accounted as a idle vcpu runtime.
> This is not entirely correct, because the entity named idle vcpu is
> in fact a hypervisor tasks worker. E.g. some softirqs are processed
> by the idle vcpu.
>
That's all very true, and, as discussed both via mail and in person,
I'm all for it.

About the implementation.

> So lets change idle vcpu time accounting and specify system idle time
> as a idle vcpu blocked time. 
>
This, for one, doesn't really look right to me. You're trying to make
things more clear and more precise... and that's by hiding real idle
time in the idle_vcpu blocked time metric? :-D :-P

Jokes apart, I see how it is rather easy to do something like this, so
I understand it being done like this in an RFC patch, but I don't think
it's correct.

And, on an even more general perspective, the fact that the hypervisor,
when scheduling the idle vcpu, runs softirq, tasklets, etc, it's a
generic concept, not an arch specific one. So, we really should find a
way to implement this in common code, not in arch code.

Maybe, but I'm just thinking out loud, and I need to think more about
this, we can do things the other way round. I.e., we measure the time
that it takes to run softirq and tasklets, and we subtract it from
idle_vcpu runtime?

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-26 11:56 ` [Xen-devel] [RFC 0/6] XEN scheduling hardening Dario Faggioli
@ 2019-07-26 12:14   ` Juergen Gross
  2019-07-29 11:53     ` Dario Faggioli
  2019-07-29 14:47     ` Andrii Anisov
  2019-07-29 14:28   ` Andrii Anisov
  1 sibling, 2 replies; 49+ messages in thread
From: Juergen Gross @ 2019-07-26 12:14 UTC (permalink / raw)
  To: Dario Faggioli, Andrii Anisov, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, George Dunlap, Andrew Cooper,
	Tim Deegan, Julien Grall, Jan Beulich

On 26.07.19 13:56, Dario Faggioli wrote:
> [Adding George plus others x86, ARM and core-Xen people]
> 
> Hi Andrii,
> 
> First of all, thanks a lot for this series!
> 
> The problem you mention is a long standing one, and I'm glad we're
> eventually starting to properly look into it.
> 
> I already have one comment: I think I can see from where this come
> from, but I don't think 'XEN scheduling hardening' is what we're doing
> in this series... I'd go for something like "xen: sched: improve idle
> and vcpu time accounting precision", or something like that.
> 
> On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
>> One of the scheduling problems is a misleading CPU idle time concept.
>> Now
>> for the CPU idle time, it is taken an idle vcpu run time. But idle
>> vcpu run
>> time includes IRQ processing, softirqs processing, tasklets
>> processing, etc.
>> Those tasks are not actual idle and they accounting may mislead CPU
>> freq
>> governors who rely on the CPU idle time.
>>
> Indeed! And I agree this is quite bad.
> 
>> The other problem is that pure hypervisor tasks execution time is
>> charged from
>> the guest vcpu budget.
>>
> Yep, equally bad.
> 
>> For example, IRQ and softirq processing time are charged
>> from the current vcpu budget, which is likely the guest vcpu. This is
>> quite
>> unfair and may break scheduling reliability.
>> It is proposed to charge guest
>> vcpus for the guest actual run time and time to serve guest's
>> hypercalls and
>> access to emulated iomem. All the rest is calculated as the
>> hypervisor run time
>> (IRQ and softirq processing, branch prediction hardening, etc.)
>>
> Right.
> 
>> While the series is the early RFC, several points are still
>> untouched:
>>   - Now the time elapsed from the last rescheduling is not fully
>> charged from
>>     the current vcpu budget. Are there any changes needed in the
>> existing
>>     scheduling algorithms?
>>
> I'll think about it, but out of the top of my head, I don't see how
> this can be a problem. Scheduling algorithms (should!) base their logic
> and their calculations on actual vcpus' runtime, not much on idle
> vcpus' one.
> 
>>   - How to avoid the absolute top priority of tasklets (what is obeyed
>> by all
>>     schedulers so far). Should idle vcpu be scheduled as the normal
>> guest vcpus
>>     (through queues, priorities, etc)?
>>
> Now, this is something to think about, and try to understand if
> anything would break if we go for it. I mean, I see why you'd want to
> do that, but tasklets and softirqs works the way they do, in Xen, since
> when they were introduced, I believe.
> 
> Therefore, even if there wouldn't be any subsystem explicitly relying
> on the current behavior (which should be verified), I think we are at
> high risk of breaking things, if we change.

We'd break things IMO.

Tasklets are sometimes used to perform async actions which can't be done
in guest vcpu context. Like switching a domain to shadow mode for L1TF
mitigation, or marshalling all cpus for stop_machine(). You don't want
to be able to block tasklets, you want them to run as soon as possible.

> 
> That's not to mean it would not be a good change, or that it is
> impossible... It's, rather, just to raise some awareness. :-)
> 
>>   - Idle vcpu naming is quite misleading. It is a kind of system
>> (hypervisor)
>>     task which is responsible for some hypervisor work. Should it be
>>     renamed/reconsidered?
>>
> Well, that's a design question, even for this very series, isn't it? I
> mean, I see two ways of achieving proper idle time accounting:
> 1) you leave things as they are --i.e., idle does not only do idling,
>     it also does all these other things, but you make sure you don't
>     count the time they take as idle time;
> 2) you move all these activities out of idle, and in some other
>     context, and you let idle just do the idling. At that point, time
>     accounted to idle will be only actual idle time, as the time it
>     took to Xen to do all the other things is now accounted to the new
>     execution context which is running them.

And here we are coming back to the idea of a "hypervisor domain" I
suggested about 10 years ago and which was rejected at that time...


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface
  2019-07-26 10:37 ` [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface Andrii Anisov
@ 2019-07-26 12:15   ` Dario Faggioli
  2019-07-26 13:06     ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-07-26 12:15 UTC (permalink / raw)
  To: andrii.anisov, xen-devel
  Cc: sstabellini, andrii_anisov, wl, konrad.wilk, George.Dunlap, tim,
	ian.jackson, julien.grall, Jan Beulich, andrew.cooper3

[-- Attachment #1.1: Type: text/plain, Size: 1486 bytes --]

On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 0a38d4a..9e8805d 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -217,6 +217,28 @@ uint64_t get_cpu_idle_time(unsigned int cpu)
>      return state.time[RUNSTATE_blocked];
>  }
>  
> +uint64_t get_cpu_guest_time(unsigned int cpu)
> +{
> +    struct vcpu_runstate_info state = { 0 };
> +    struct vcpu *v = idle_vcpu[cpu];
> +
> +    if ( cpu_online(cpu) && v )
> +        vcpu_runstate_get(v, &state);
> +
> +    return state.time[RUNSTATE_runnable];
> +}
> +
Yep, I think being able to know time spent running guests could be
useful.

> +uint64_t get_cpu_hyp_time(unsigned int cpu)
> +{
> +    struct vcpu_runstate_info state = { 0 };
> +    struct vcpu *v = idle_vcpu[cpu];
> +
> +    if ( cpu_online(cpu) && v )
> +        vcpu_runstate_get(v, &state);
> +
> +    return state.time[RUNSTATE_running];
> +}
> +
>
I confirm what I said about patch 1: idle time being the time idle_vcpu
spent in RUNSTATE_blocked, and hypervisor time being the time idle_vcpu
spent in RUNSTATE_running sounds quite confusing to me.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 2/6] schedule: account true system idle time
  2019-07-26 12:00   ` Dario Faggioli
@ 2019-07-26 12:42     ` Andrii Anisov
  2019-07-29 11:40       ` Dario Faggioli
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 12:42 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: george.dunlap, andrii_anisov, julien.grall, sstabellini,
	Volodymyr_Babchuk

Hello Dario,

On 26.07.19 15:00, Dario Faggioli wrote:
> On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
>> From: Andrii Anisov <andrii_anisov@epam.com>
>>
>> Currently the idle time is being accounted as a idle vcpu runtime.
>> This is not entirely correct, because the entity named idle vcpu is
>> in fact a hypervisor tasks worker. E.g. some softirqs are processed
>> by the idle vcpu.
>>
> That's all very true, and, as discussed both via mail and in person,
> I'm all for it.

Thank you for you interest. And I hope to have some productive discussion here. :)

> and that's by hiding real idle
> time in the idle_vcpu blocked time metric? :-D :-P

Yes, I do. You should be noticed I told about idle_vcpu renaming in the cover letter.
So if you treat current idle_vcpu as a hypervisor_vcpu, you will see that getting it blocked on wait for event strictly match the idle concept.

> Jokes apart,

So let it be :)

> I see how it is rather easy to do something like this, so
> I understand it being done like this in an RFC patch, but I don't think
> it's correct.

This is the VERY RFC with the minimal changes to the existing code and adopting existing approaches.
This topic is really complex and requires wide discussion, so this series is rather an invitation to the discussion.

> And, on an even more general perspective, the fact that the hypervisor,
> when scheduling the idle vcpu, runs softirq, tasklets, etc, it's a
> generic concept, not an arch specific one. So, we really should find a
> way to implement this in common code, not in arch code.

Yes, in terms of this patch, idle_vcpu_runstate_change() better be moved to common/schedule.c.

> Maybe, but I'm just thinking out loud, and I need to think more about
> this, we can do things the other way round. I.e., we measure the time
> that it takes to run softirq and tasklets, and we subtract it from
> idle_vcpu runtime?

In the patch "schedule: account all the hypervisor time to the idle vcpu" I extend what I think should be accounted for the hypervisor run time. And subtraction approach will result in more complex code over there.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface
  2019-07-26 12:15   ` Dario Faggioli
@ 2019-07-26 13:06     ` Andrii Anisov
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-07-26 13:06 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: sstabellini, andrii_anisov, wl, konrad.wilk, George.Dunlap, tim,
	ian.jackson, julien.grall, Jan Beulich, andrew.cooper3



On 26.07.19 15:15, Dario Faggioli wrote:
> Yep, I think being able to know time spent running guests could be
> useful.

Well, my intention was to see hypervisor run and true idle time.

With this full series I see the distinct difference in xentop depending on the type of load in domains:

On my regular system (HW less Dom0, Linux with UI aka DomD, Android with PV drivers aka DomA), I see following:

Idle system:

xentop - 10:10:42   Xen 4.13-unstable
3 domains: 1 running, 2 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
%CPU(s):    7.0 gu,    2.6 hy,  390.4 id
Mem: 8257536k total, 8257536k used, 99020k free    CPUs: 4 @ 8MHz
       NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
       DomA --b---         76    3.3    6258456   75.8    6259712      75.8     4    0        0        0    0        0        0        0          0          0    0
   Domain-0 -----r         14    1.0     262144    3.2   no limit       n/a     4    0        0        0    0        0        0        0          0          0    0
       DomD --b---        111    2.8    1181972   14.3    1246208      15.1     4    0        0        0    0        0        0        0          0          0    0


System with CPU burners in all domains:

xentop - 10:12:19   Xen 4.13-unstable
3 domains: 3 running, 0 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
%CPU(s):  389.1 gu,   10.9 hy,    0.0 id
Mem: 8257536k total, 8257536k used, 99020k free    CPUs: 4 @ 8MHz
       NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
       DomA -----r        115  129.7    6258456   75.8    6259712      75.8     4    0        0        0    0        0        0        0          0          0    0
   Domain-0 -----r        120  129.8     262144    3.2   no limit       n/a     4    0        0        0    0        0        0        0          0          0    0
       DomD -----r        163  129.6    1181972   14.3    1246208      15.1     4    0        0        0    0        0        0        0          0          0    0


System with GPU load run both in DomD and DomA:

xentop - 10:14:26   Xen 4.13-unstable
3 domains: 2 running, 1 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
%CPU(s):  165.7 gu,   51.4 hy,  182.9 id
Mem: 8257536k total, 8257536k used, 99020k free    CPUs: 4 @ 8MHz
       NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
       DomA --b---        250   60.8    6258456   75.8    6259712      75.8     4    0        0        0    0        0        0        0          0          0    0
   Domain-0 -----r        159    2.1     262144    3.2   no limit       n/a     4    0        0        0    0        0        0        0          0          0    0
       DomD -----r        275  102.7    1181972   14.3    1246208      15.1     4    0        0        0    0        0        0        0          0          0    0


You can see that rise of CPU used by hypervisor itself in high IRQ use-case (GPU load).

> I confirm what I said about patch 1: idle time being the time idle_vcpu
> spent in RUNSTATE_blocked, and hypervisor time being the time idle_vcpu
> spent in RUNSTATE_running sounds quite confusing to me.

As I said before, think of idle_vcpu as hypervisor_vcpu ;)

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 2/6] schedule: account true system idle time
  2019-07-26 12:42     ` Andrii Anisov
@ 2019-07-29 11:40       ` Dario Faggioli
  2019-08-01  8:23         ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-07-29 11:40 UTC (permalink / raw)
  To: andrii.anisov, xen-devel
  Cc: george.dunlap, andrii_anisov, julien.grall, sstabellini,
	Volodymyr_Babchuk

[-- Attachment #1.1: Type: text/plain, Size: 2888 bytes --]

On Fri, 2019-07-26 at 15:42 +0300, Andrii Anisov wrote:
> Hello Dario,
> 
Hi,

> On 26.07.19 15:00, Dario Faggioli wrote:
> > On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
> > I see how it is rather easy to do something like this, so
> > I understand it being done like this in an RFC patch, but I don't
> > think
> > it's correct.
> 
> This is the VERY RFC with the minimal changes to the existing code
> and adopting existing approaches.
>
Sure, I know, and this is fine.

> This topic is really complex and requires wide discussion, so this
> series is rather an invitation to the discussion.
> 
Absolutely.

> > And, on an even more general perspective, the fact that the
> > hypervisor,
> > when scheduling the idle vcpu, runs softirq, tasklets, etc, it's a
> > generic concept, not an arch specific one. So, we really should
> > find a
> > way to implement this in common code, not in arch code.
> 
> Yes, in terms of this patch, idle_vcpu_runstate_change() better be
> moved to common/schedule.c.
> 
I think we should, first of all, think, if using runstates and
runstates manipulation functions is really the best way forward here.

And, if that reveals to be the case, I feel like runstates would need
to be extended to be able to deal with want we want to achieve.

I'll think more about this, and try to form an idea...

> > Maybe, but I'm just thinking out loud, and I need to think more
> > about
> > this, we can do things the other way round. I.e., we measure the
> > time
> > that it takes to run softirq and tasklets, and we subtract it from
> > idle_vcpu runtime?
> 
> In the patch "schedule: account all the hypervisor time to the idle
> vcpu" I extend what I think should be accounted for the hypervisor
> run time. And subtraction approach will result in more complex code
> over there.
> 
Yep, I quickly looked at it, but I need to review it carefully, to
properly understand it, come up with comments, think about
alternatives, etc.

Will do that soon, hopefully.

For now, just consider that, IMO, the big value of this series (or, if
you want, the discussion this series is aimed at starting) is getting
Xen toward having a better, more accurate and more fine grained time
accounting and reporting.

If we time. e.g., interrupts, softirqs and tasklets, we can store such
metrics too, and, if we want, report a breakout of the time spent in
hypervisor... something like (in xentop, as you're doing already, or
somewhere else):

 hyp=12%(irq=4%+softirq=1%+tasklet=5%+other=2%) 

Anyway, thanks again and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-26 12:14   ` Juergen Gross
@ 2019-07-29 11:53     ` Dario Faggioli
  2019-07-29 12:13       ` Juergen Gross
  2019-07-29 14:47     ` Andrii Anisov
  1 sibling, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-07-29 11:53 UTC (permalink / raw)
  To: andrii.anisov, Juergen Gross, xen-devel
  Cc: sstabellini, andrii_anisov, george.dunlap, andrew.cooper3, tim,
	julien.grall, Jan Beulich

[-- Attachment #1.1: Type: text/plain, Size: 2391 bytes --]

On Fri, 2019-07-26 at 14:14 +0200, Juergen Gross wrote:
> On 26.07.19 13:56, Dario Faggioli wrote:
> > On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
> > >   - How to avoid the absolute top priority of tasklets (what is
> > > obeyed
> > > by all
> > >     schedulers so far). Should idle vcpu be scheduled as the
> > > normal
> > > guest vcpus
> > >     (through queues, priorities, etc)?
> > > 
> > Therefore, even if there wouldn't be any subsystem explicitly
> > relying
> > on the current behavior (which should be verified), I think we are
> > at
> > high risk of breaking things, if we change.
> 
> We'd break things IMO.
> 
> Tasklets are sometimes used to perform async actions which can't be
> done
> in guest vcpu context. Like switching a domain to shadow mode for
> L1TF
> mitigation, or marshalling all cpus for stop_machine(). You don't
> want
> to be able to block tasklets, you want them to run as soon as
> possible.
> 
Yep, stop-machine was precisely what I had in mind, but as Juergen
says, there's more.

As said, I suggest we defer this problem or, in general, we treat it
outside of this series.

> > 2) you move all these activities out of idle, and in some other
> >     context, and you let idle just do the idling. At that point,
> > time
> >     accounted to idle will be only actual idle time, as the time it
> >     took to Xen to do all the other things is now accounted to the
> > new
> >     execution context which is running them.
> 
> And here we are coming back to the idea of a "hypervisor domain" I
> suggested about 10 years ago and which was rejected at that time...
> 
It's pretty much what Andrii is proposing already, when he says he'd
consider idle_vcpu an 'hypervisor vcpu'. Or at least a naturale
extension of that.

I don't know what was the occasion for proposing it, and the argument
against it, 10 years ago, so I won't comment on that. :-D

Let's see if something like that end up making sense for this work. I'm
unconvinced, for now, but I'm still looking at and thinking about the
code. :-)

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-29 11:53     ` Dario Faggioli
@ 2019-07-29 12:13       ` Juergen Gross
  0 siblings, 0 replies; 49+ messages in thread
From: Juergen Gross @ 2019-07-29 12:13 UTC (permalink / raw)
  To: Dario Faggioli, andrii.anisov, xen-devel
  Cc: sstabellini, andrii_anisov, george.dunlap, andrew.cooper3, tim,
	julien.grall, Jan Beulich

On 29.07.19 13:53, Dario Faggioli wrote:
> On Fri, 2019-07-26 at 14:14 +0200, Juergen Gross wrote:
>> On 26.07.19 13:56, Dario Faggioli wrote:
>>> On Fri, 2019-07-26 at 13:37 +0300, Andrii Anisov wrote:
>>>>    - How to avoid the absolute top priority of tasklets (what is
>>>> obeyed
>>>> by all
>>>>      schedulers so far). Should idle vcpu be scheduled as the
>>>> normal
>>>> guest vcpus
>>>>      (through queues, priorities, etc)?
>>>>
>>> Therefore, even if there wouldn't be any subsystem explicitly
>>> relying
>>> on the current behavior (which should be verified), I think we are
>>> at
>>> high risk of breaking things, if we change.
>>
>> We'd break things IMO.
>>
>> Tasklets are sometimes used to perform async actions which can't be
>> done
>> in guest vcpu context. Like switching a domain to shadow mode for
>> L1TF
>> mitigation, or marshalling all cpus for stop_machine(). You don't
>> want
>> to be able to block tasklets, you want them to run as soon as
>> possible.
>>
> Yep, stop-machine was precisely what I had in mind, but as Juergen
> says, there's more.
> 
> As said, I suggest we defer this problem or, in general, we treat it
> outside of this series.
> 
>>> 2) you move all these activities out of idle, and in some other
>>>      context, and you let idle just do the idling. At that point,
>>> time
>>>      accounted to idle will be only actual idle time, as the time it
>>>      took to Xen to do all the other things is now accounted to the
>>> new
>>>      execution context which is running them.
>>
>> And here we are coming back to the idea of a "hypervisor domain" I
>> suggested about 10 years ago and which was rejected at that time...
>>
> It's pretty much what Andrii is proposing already, when he says he'd
> consider idle_vcpu an 'hypervisor vcpu'. Or at least a naturale
> extension of that.

The main difference is its priority and the possibility to allow it to
become blocked.

> I don't know what was the occasion for proposing it, and the argument
> against it, 10 years ago, so I won't comment on that. :-D

It was in the discussion of my initial submission of cpupools. It was
one alternative thought to solve the continue_hypercall_on_cpu()
problem. In the end that was done via tasklets. :-)


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-26 11:56 ` [Xen-devel] [RFC 0/6] XEN scheduling hardening Dario Faggioli
  2019-07-26 12:14   ` Juergen Gross
@ 2019-07-29 14:28   ` Andrii Anisov
  1 sibling, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-07-29 14:28 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Andrii Anisov, George Dunlap,
	Andrew Cooper, Tim Deegan, Julien Grall, Jan Beulich

Hello Dario

On 26.07.19 14:56, Dario Faggioli wrote:
> [Adding George plus others x86, ARM and core-Xen people]
> 
> Hi Andrii,
> 
> First of all, thanks a lot for this series!
> 
> The problem you mention is a long standing one, and I'm glad we're
> eventually starting to properly look into it.
> 
> I already have one comment: I think I can see from where this come
> from, but I don't think 'XEN scheduling hardening' is what we're doing
> in this series... I'd go for something like "xen: sched: improve idle
> and vcpu time accounting precision", or something like that.

I do not really stick at the naming. Will rename on the next version.

>> While the series is the early RFC, several points are still
>> untouched:
>>   - Now the time elapsed from the last rescheduling is not fully
>> charged from
>>     the current vcpu budget. Are there any changes needed in the
>> existing
>>     scheduling algorithms?
>>
> I'll think about it, but out of the top of my head, I don't see how
> this can be a problem. Scheduling algorithms (should!) base their logic
> and their calculations on actual vcpus' runtime, not much on idle
> vcpus' one.

IMO RTDS and ARINC653 scheduling algorithms are not affected because they are operating with the absolute value of time spent by vcpu and a future event (nearest deadline or major frame end).
But I have my doubts about credit schedulers (credit, credit2). Now we have an entity which unconditionally steals time from some periods. Wouldn't it affect calculation of domains budget proportions with the respect to the domains weight/cap?


>>   - How to avoid the absolute top priority of tasklets (what is obeyed
>> by all
>>     schedulers so far). Should idle vcpu be scheduled as the normal
>> guest vcpus
>>     (through queues, priorities, etc)?
>>
> Now, this is something to think about, and try to understand if
> anything would break if we go for it. I mean, I see why you'd want to
> do that, but tasklets and softirqs works the way they do, in Xen, since
> when they were introduced, I believe.
> 
> Therefore, even if there wouldn't be any subsystem explicitly relying
> on the current behavior (which should be verified), I think we are at
> high risk of breaking things, if we change.
> 
> That's not to mean it would not be a good change, or that it is
> impossible... It's, rather, just to raise some awareness. :-)

I understand that this area is conservative and hard to change.
But the current scheduling in XEN is quite non-deterministic. And, IMO, with that mess XEN can not go into any safety certified system.

>>   - Idle vcpu naming is quite misleading. It is a kind of system
>> (hypervisor)
>>     task which is responsible for some hypervisor work. Should it be
>>     renamed/reconsidered?
>>
> Well, that's a design question, even for this very series, isn't it? I
> mean, I see two ways of achieving proper idle time accounting:
> 1) you leave things as they are --i.e., idle does not only do idling,
>     it also does all these other things, but you make sure you don't
>     count the time they take as idle time;
> 2) you move all these activities out of idle, and in some other
>     context, and you let idle just do the idling. At that point, time
>     accounted to idle will be only actual idle time, as the time it
>     took to Xen to do all the other things is now accounted to the new
>     execution context which is running them.
> 
> So, which path this path series takes (I believe 1), and which path you
> (and others) believe is better?

This have to be discussed.
I would stress again this is the set of minimal changes following existing approaches (e.g. I don't like runstate usage here)

> (And, yes, discussing this is why I've added, apart from George, some
> other x86, ARM, and core-Xen people)

Thank you.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-26 12:14   ` Juergen Gross
  2019-07-29 11:53     ` Dario Faggioli
@ 2019-07-29 14:47     ` Andrii Anisov
  2019-07-29 18:46       ` Dario Faggioli
  1 sibling, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-29 14:47 UTC (permalink / raw)
  To: Juergen Gross, Dario Faggioli, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, George Dunlap, Andrew Cooper,
	Tim Deegan, Julien Grall, Jan Beulich

Hello Juergen,

On 26.07.19 15:14, Juergen Gross wrote:
>>>   - How to avoid the absolute top priority of tasklets (what is obeyed
>>> by all
>>>     schedulers so far). Should idle vcpu be scheduled as the normal
>>> guest vcpus
>>>     (through queues, priorities, etc)?
>>>
>> Now, this is something to think about, and try to understand if
>> anything would break if we go for it. I mean, I see why you'd want to
>> do that, but tasklets and softirqs works the way they do, in Xen, since
>> when they were introduced, I believe.
>>
>> Therefore, even if there wouldn't be any subsystem explicitly relying
>> on the current behavior (which should be verified), I think we are at
>> high risk of breaking things, if we change.
> 
> We'd break things IMO.
> 
> Tasklets are sometimes used to perform async actions which can't be done
> in guest vcpu context.

OK, that stuff can not be done in the guest vcpu context. But why can't it be done with the guest's priority?

> Like switching a domain to shadow mode for L1TF
> mitigation,

Sorry I'm not really aware what L1TF mitigation is.
But

> or marshalling all cpus for stop_machine().

I think I faced some time ago. When fixed a noticeable lag in video playback at the moment of the other domain reboot.

> You don't want
> to be able to block tasklets, you want them to run as soon as possible.

Should it really be done in the way of breaking scheduling? What if the system has RT requirements?

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 0/6] XEN scheduling hardening
  2019-07-29 14:47     ` Andrii Anisov
@ 2019-07-29 18:46       ` Dario Faggioli
  0 siblings, 0 replies; 49+ messages in thread
From: Dario Faggioli @ 2019-07-29 18:46 UTC (permalink / raw)
  To: andrii.anisov, Juergen Gross, xen-devel
  Cc: sstabellini, andrii_anisov, george.dunlap, andrew.cooper3, tim,
	julien.grall, Jan Beulich

[-- Attachment #1.1: Type: text/plain, Size: 1634 bytes --]

On Mon, 2019-07-29 at 17:47 +0300, Andrii Anisov wrote:
> On 26.07.19 15:14, Juergen Gross wrote:
> > > > 
> > Tasklets are sometimes used to perform async actions which can't be
> > done
> > in guest vcpu context.
> > [...]
> > Like switching a domain to shadow mode for L1TF
> > mitigation,
> 
> Sorry I'm not really aware what L1TF mitigation is.
> But
> 
> > or marshalling all cpus for stop_machine().
> 
> I think I faced some time ago. When fixed a noticeable lag in video
> playback at the moment of the other domain reboot.
> 
No, stop_machine() is not about a VM being shutdown/killed/stopped.
It's, let's say, about all the (physical) CPU in the host having to do
something, in a coordinated way.

Check the comment in xen/include/xen/stop_machine.h

> > You don't want
> > to be able to block tasklets, you want them to run as soon as
> > possible.
> 
> Should it really be done in the way of breaking scheduling? What if
> the system has RT requirements?
> 
It's possible, I guess, that we can implement some of the things that
are done in tasklets (either stop_machine() or something else)
differently, and in a way that is less disruptive for scheduling and
determinism.

But, if it is, it's not going to be as easy as <<let's run tasklet at
domain priority, and be done with it>>, I'm afraid. :-(

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-07-26 10:48   ` Julien Grall
@ 2019-07-30 17:35     ` Andrii Anisov
  2019-07-30 20:10       ` Julien Grall
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-30 17:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

On 26.07.19 13:48, Julien Grall wrote:
> This is the 3rd time you send this patch... and still no proper explanation why this is done nor the impact on keeping the interrupts disabled longer than necessary.

I know it is the third time for this patch. Yet it is in the RFC series again.
In this series I think I need interrupts locked until I start time accounting for hypervisor. Time accounting is started by `tacc_head()` function. I prefer to have it called from C, because it is more convenient and obvious for those who are less familiar with the ARM code.

> Resending the patch without things addressed is only going to make it worst.

I'm still convinced the patch would improve interrupt latency for high interrupt rate use cases.
But I understand that I have no experiment to prove the effect, so I'm not willing to push through the patch.

Also, I have a question to you about another aspect of this patch. In the function `enter_hypervisor_head()` there is a check for a disabled workaround and turning it on. If we have the interrupts enabled until there, we have good chances to go through the interrupt processing `do_IRQ()` before WA enabled. Is it still OK?

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-07-26 10:59   ` Julien Grall
@ 2019-07-30 17:35     ` Andrii Anisov
  2019-07-31 11:02       ` Julien Grall
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-07-30 17:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov


On 26.07.19 13:59, Julien Grall wrote:
> Hi,
> 
> On 26/07/2019 11:37, Andrii Anisov wrote:
>> From: Andrii Anisov <andrii_anisov@epam.com>
>>
>> On ARM64 we know exactly if trap happened from hypervisor or guest, so
>> we do not need to take that decision. This reduces a condition for
>> all enter_hypervisor_head calls and the function call for traps from
>> the hypervisor mode.
> 
> One condition lost but ...

...In the hot path (actually at any trap).
And the function call for traps from hyp.

>> Currently, it is implemented for ARM64 only. Integrating the stuff
>> with ARM32 requires moving ` if ( guest_mode(regs) )` condition
>> into ARM32 specific traps.
> 
> ... one more divergence between arm32 and arm64.
> 
> There are probably dozens of more conditions in the code that are not necessary for one of the architectures.
> Yet there are value to keep everything common because the benefits outweigh the likely non performance improvement.

I'm not seeing any benefits in calling `enter_hypervisor_head()` from functions named `do_trap_hyp_*`. That code is confusing for me.
IMHO, dividing `do_trap_irq/fiq()` into guest and hyp specific functions is not a big deal. Even for ARM32. Moreover, it will make more obvious the fact that nothing from `enter_hypervisor_head()` is done for IRQ traps taken from hyp.

> So I am not convinced that this has any value for Xen.

OK, I wouldn't insist.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-07-30 17:35     ` Andrii Anisov
@ 2019-07-30 20:10       ` Julien Grall
  2019-08-01  6:45         ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-07-30 20:10 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

Hi Andrii,

On 7/30/19 6:35 PM, Andrii Anisov wrote:
> On 26.07.19 13:48, Julien Grall wrote:
>> This is the 3rd time you send this patch... and still no proper 
>> explanation why this is done nor the impact on keeping the interrupts 
>> disabled longer than necessary.
> 
> I know it is the third time for this patch. Yet it is in the RFC series 
> again.

So? RFC does not mean you have to ignore previous comments... You could 
have at least acknowledge my points...

> In this series I think I need interrupts locked until I start time 
> accounting for hypervisor. Time accounting is started by `tacc_head()` 
> function. I prefer to have it called from C, because it is more 
> convenient and obvious for those who are less familiar with the ARM code.
> 
>> Resending the patch without things addressed is only going to make it 
>> worst.
> 
> I'm still convinced the patch would improve interrupt latency for high 
> interrupt rate use cases.
> But I understand that I have no experiment to prove the effect, so I'm 
> not willing to push through the patch.

The only thing I ask is justification in your commit message rather than 
throwing things and expecting the reviewer to understand why you do 
that. I would recommend to refresh yourself how to submit a patch series 
[1].

> 
> Also, I have a question to you about another aspect of this patch. In 
> the function `enter_hypervisor_head()` there is a check for a disabled 
> workaround and turning it on. If we have the interrupts enabled until 
> there, we have good chances to go through the interrupt processing 
> `do_IRQ()` before WA enabled. Is it still OK?

Hmmm, that's correct.

Cheers,

[1] https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-07-30 17:35     ` Andrii Anisov
@ 2019-07-31 11:02       ` Julien Grall
  2019-07-31 11:33         ` Andre Przywara
  2019-08-01  7:33         ` Andrii Anisov
  0 siblings, 2 replies; 49+ messages in thread
From: Julien Grall @ 2019-07-31 11:02 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov, Andre Przywara

Hi Andrii,

On 30/07/2019 18:35, Andrii Anisov wrote:
> 
> On 26.07.19 13:59, Julien Grall wrote:
>> Hi,
>>
>> On 26/07/2019 11:37, Andrii Anisov wrote:
>>> From: Andrii Anisov <andrii_anisov@epam.com>
>>>
>>> On ARM64 we know exactly if trap happened from hypervisor or guest, so
>>> we do not need to take that decision. This reduces a condition for
>>> all enter_hypervisor_head calls and the function call for traps from
>>> the hypervisor mode.
>>
>> One condition lost but ...
> 
> ...In the hot path (actually at any trap).

Everything is in the hot path here, yet there are a lot of other branches. So 
why this branch in particular?

As I have mentioned a few times before, there are a difference between the 
theory and the practice. In theory, removing a branch looks nice. But in 
practice this may not be the case.

In this particular case, I don't believe this is going to have a real impact on 
the performance.

The PSTATE has been saved a few instructions before in cpu_user_regs, so there
are high chance the value will still be in the L1 cache.

The compiler may also decide to do the direct branch when not in guest_mode. A 
trap from the hypervisor is mostly for interrupts. So there are chance this is 
not going to have a real impact on the overall of the interrupt handling.

If you are really worry of the impact of branch then there are a few more 
important places (with a greater benefits) to look:
     1) It seems the compiler use a jump table for the switch case in 
do_trap_guest_sync(), so it will result to multiple direct branch everytime.
     2) Indirect branch have a non-negligible cost compare to direct branch. 
This is a lot used in the interrupt code (see gic_hw_ops->read_irq()). All of 
them are known at boot time, so they could be replace with direct branch. x86 
recently introduced alternative_call() for this purpose. This could be re-used 
on Arm.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-07-31 11:02       ` Julien Grall
@ 2019-07-31 11:33         ` Andre Przywara
  2019-08-01  7:33         ` Andrii Anisov
  1 sibling, 0 replies; 49+ messages in thread
From: Andre Przywara @ 2019-07-31 11:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Stefano Stabellini, Andrii Anisov, Volodymyr Babchuk,
	Andrii Anisov

On Wed, 31 Jul 2019 12:02:20 +0100
Julien Grall <julien.grall@arm.com> wrote:

Hi,

> On 30/07/2019 18:35, Andrii Anisov wrote:
> > 
> > On 26.07.19 13:59, Julien Grall wrote:  
> >> Hi,
> >>
> >> On 26/07/2019 11:37, Andrii Anisov wrote:  
> >>> From: Andrii Anisov <andrii_anisov@epam.com>
> >>>
> >>> On ARM64 we know exactly if trap happened from hypervisor or guest, so
> >>> we do not need to take that decision. This reduces a condition for
> >>> all enter_hypervisor_head calls and the function call for traps from
> >>> the hypervisor mode.  
> >>
> >> One condition lost but ...  
> > 
> > ...In the hot path (actually at any trap).  
> 
> Everything is in the hot path here, yet there are a lot of other branches. So 
> why this branch in particular?
> 
> As I have mentioned a few times before, there are a difference between the 
> theory and the practice. In theory, removing a branch looks nice. But in 
> practice this may not be the case.
> 
> In this particular case, I don't believe this is going to have a real impact on 
> the performance.
> 
> The PSTATE has been saved a few instructions before in cpu_user_regs, so there
> are high chance the value will still be in the L1 cache.

I agree on this, and second the idea of *not* micro-optimising code just for the sake of it. If you have numbers that back this up, it would be a different story.

> The compiler may also decide to do the direct branch when not in guest_mode. A 
> trap from the hypervisor is mostly for interrupts. So there are chance this is 
> not going to have a real impact on the overall of the interrupt handling.
> 
> If you are really worry of the impact of branch then there are a few more 
> important places (with a greater benefits) to look:
>      1) It seems the compiler use a jump table for the switch case in 
> do_trap_guest_sync(), so it will result to multiple direct branch everytime.

I don't think it's worth to "fix" this issue. The compiler has done this for a reason, and I would guess it figured that this is cheaper than other ways of solving this. If you are really paranoid about this, I would try to compile this with tuning for your particular core (-mtune), so that the compiler can throw in more micro-architectural knowledge about the cost of certain instructions.

>      2) Indirect branch have a non-negligible cost compare to direct branch. 
> This is a lot used in the interrupt code (see gic_hw_ops->read_irq()). All of 
> them are known at boot time, so they could be replace with direct branch. x86 
> recently introduced alternative_call() for this purpose. This could be re-used 
> on Arm.

This is indeed something I was always worried about: It looks cheap and elegant in the C source code, but is potentially expensive on hardware. The particular snippet is:
...
  249024:       d5033fdf        isb
  249028:       f9401e80        ldr     x0, [x20, #56]
  24902c:       f9407801        ldr     x1, [x0, #240]
  249030:       2a1303e0        mov     w0, w19
  249034:       d63f0020        blr     x1
...
In case of an interrupt, the first load will probably miss the cache, and the CPU is stuck now, because due to the dependencies it can't do much else. It can't even predict the branch and speculatively execute anything, because the destination address is yet another dependent load away.
This might not matter for little cores like A53s, because they wouldn't speculate anyway. But better cores (A72, for instance) would most likely benefit from an optimisation in this area.

Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-07-30 20:10       ` Julien Grall
@ 2019-08-01  6:45         ` Andrii Anisov
  2019-08-01  9:37           ` Julien Grall
  2019-08-01 11:19           ` Dario Faggioli
  0 siblings, 2 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-08-01  6:45 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

Hello Julien,

On 30.07.19 23:10, Julien Grall wrote:

>> In this series I think I need interrupts locked until I start time accounting for hypervisor. Time accounting is started by `tacc_head()` function. I prefer to have it called from C, because it is more convenient and obvious for those who are less familiar with the ARM code.

Here is the question to you: what is the best place (and way) to start hypervisor time tracking?

>>
>>> Resending the patch without things addressed is only going to make it worst.
>>
>> I'm still convinced the patch would improve interrupt latency for high interrupt rate use cases.
>> But I understand that I have no experiment to prove the effect, so I'm not willing to push through the patch.
> 
> The only thing I ask is justification in your commit message rather than throwing things and expecting the reviewer to understand why you do that. I would recommend to refresh yourself how to submit a patch series [1].

I'll follow you recommendation.

>> Also, I have a question to you about another aspect of this patch. In the function `enter_hypervisor_head()` there is a check for a disabled workaround and turning it on. If we have the interrupts enabled until there, we have good chances to go through the interrupt processing `do_IRQ()` before WA enabled. Is it still OK?
> 
> Hmmm, that's correct.

Sorry I did not get your point. What exactly is correct? My observation of the scenario where we can go through the big piece of the hypervisor without WA enabled? Or processing IRQs without WA enabled is the correct way to do?

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-07-31 11:02       ` Julien Grall
  2019-07-31 11:33         ` Andre Przywara
@ 2019-08-01  7:33         ` Andrii Anisov
  2019-08-01 10:17           ` Julien Grall
  1 sibling, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-08-01  7:33 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov, Andre Przywara



On 31.07.19 14:02, Julien Grall wrote:
> Everything is in the hot path here, yet there are a lot of other branches. So why this branch in particular?

This branch and function call is particular because I find this piece of code quite confusing:

>> I'm not seeing any benefits in calling `enter_hypervisor_head()` from functions named `do_trap_hyp_*`. That code is confusing for me.
>> IMHO, dividing `do_trap_irq/fiq()` into guest and hyp specific functions is not a big deal. Even for ARM32. Moreover, it will make more obvious the fact that nothing from `enter_hypervisor_head()` is done for IRQ traps taken from hyp.

And I believe this patch balances patch "xen/arm: Re-enable interrupt later in the trap path" what you NAKed because of increased IRQ latency.
While them together make the code more straight forward and clear, because:
  - you start all C-coded common trap handlers with IRQs locked, and free from asm code misunderstanding
  - all common trap handlers are distinct in their naming, purpose and action. In terms of calling `enter_hypervisor_head()` only from the traps taken from guest.

> If you are really worry of the impact of branch then there are a few more important places (with a greater benefits) to look:
>      1) It seems the compiler use a jump table for the switch case in do_trap_guest_sync(), so it will result to multiple direct branch everytime.
>      2) Indirect branch have a non-negligible cost compare to direct branch. This is a lot used in the interrupt code (see gic_hw_ops->read_irq()). All of them are known at boot time, so they could be replace with direct branch. x86 recently introduced alternative_call() for this purpose. This could be re-used on Arm.

I remember this. But I'm not ready to code it. I admit that I have not yet good understanding about how alternatives work.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 2/6] schedule: account true system idle time
  2019-07-29 11:40       ` Dario Faggioli
@ 2019-08-01  8:23         ` Andrii Anisov
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-08-01  8:23 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: george.dunlap, andrii_anisov, julien.grall, sstabellini,
	Volodymyr_Babchuk

Hello Dario,

On 29.07.19 14:40, Dario Faggioli wrote:
>> Yes, in terms of this patch, idle_vcpu_runstate_change() better be
>> moved to common/schedule.c.
>>
> I think we should, first of all, think, if using runstates and
> runstates manipulation functions is really the best way forward here.
> 
> And, if that reveals to be the case, I feel like runstates would need
> to be extended to be able to deal with want we want to achieve.
> 
> I'll think more about this, and try to form an idea...

I think runstate does not suite. Its format is linked to the existing interface and is not flexible enough for possible changes.
 From other hand, we would need reordering in runstate time calculation as well.

> If we time. e.g., interrupts, softirqs and tasklets, we can store such
> metrics too, and, if we want, report a breakout of the time spent in
> hypervisor... something like (in xentop, as you're doing already, or
> somewhere else):
> 
>   hyp=12%(irq=4%+softirq=1%+tasklet=5%+other=2%)

One more downside of runstate - it will not hold all that stuff.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-01  6:45         ` Andrii Anisov
@ 2019-08-01  9:37           ` Julien Grall
  2019-08-02  8:28             ` Andrii Anisov
  2019-08-01 11:19           ` Dario Faggioli
  1 sibling, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-08-01  9:37 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

Hi,

On 01/08/2019 07:45, Andrii Anisov wrote:
> On 30.07.19 23:10, Julien Grall wrote:
> 
>>> In this series I think I need interrupts locked until I start time accounting 
>>> for hypervisor. Time accounting is started by `tacc_head()` function. I 
>>> prefer to have it called from C, because it is more convenient and obvious 
>>> for those who are less familiar with the ARM code.
> 
> Here is the question to you: what is the best place (and way) to start 
> hypervisor time tracking?

Looking at the patch, hypervisor time accounting is for:
     1) softirqs
     2) hardirqs

For hardirqs, you always enter in C with interrupt disabled. So this can be 
called directly from there.

For softirqs, they are quite a few places where do_sofirq() is called. So you 
either want to track the time in the function directly or on each callers.

I am not sure which one is the best way.

> 
>>>
>>>> Resending the patch without things addressed is only going to make it worst.
>>>
>>> I'm still convinced the patch would improve interrupt latency for high 
>>> interrupt rate use cases.
>>> But I understand that I have no experiment to prove the effect, so I'm not 
>>> willing to push through the patch.
>>
>> The only thing I ask is justification in your commit message rather than 
>> throwing things and expecting the reviewer to understand why you do that. I 
>> would recommend to refresh yourself how to submit a patch series [1].
> 
> I'll follow you recommendation.
> 
>>> Also, I have a question to you about another aspect of this patch. In the 
>>> function `enter_hypervisor_head()` there is a check for a disabled workaround 
>>> and turning it on. If we have the interrupts enabled until there, we have 
>>> good chances to go through the interrupt processing `do_IRQ()` before WA 
>>> enabled. Is it still OK?
>>
>> Hmmm, that's correct.
> 
> Sorry I did not get your point. What exactly is correct? My observation of the 
> scenario where we can go through the big piece of the hypervisor without WA 
> enabled? Or processing IRQs without WA enabled is the correct way to do?

"big piece" is somewhat half-correct.... All the hypercalls will be correctly 
protected, so the problem is only if you receive an interrupt before SSBD is 
enabled.

I would move the enablement in assembly code as part of entry.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-08-01  7:33         ` Andrii Anisov
@ 2019-08-01 10:17           ` Julien Grall
  2019-08-02 13:50             ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-08-01 10:17 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov, Andre Przywara

Hi Andrii,

On 01/08/2019 08:33, Andrii Anisov wrote:
> 
> 
> On 31.07.19 14:02, Julien Grall wrote:
>> Everything is in the hot path here, yet there are a lot of other branches. So 
>> why this branch in particular?
> 
> This branch and function call is particular because I find this piece of code 
> quite confusing:

All the commit message is based on "performance improvement".... Now you are 
selling it as this is confusing. What are the real reasons for this patch?

> 
>>> I'm not seeing any benefits in calling `enter_hypervisor_head()` from 
>>> functions named `do_trap_hyp_*`. That code is confusing for me.
>>> IMHO, dividing `do_trap_irq/fiq()` into guest and hyp specific functions is 
>>> not a big deal. Even for ARM32. Moreover, it will make more obvious the fact 
>>> that nothing from `enter_hypervisor_head()` is done for IRQ traps taken from 
>>> hyp.
> 
> And I believe this patch balances patch "xen/arm: Re-enable interrupt later in 
> the trap path" what you NAKed because of increased IRQ latency.

I never NAKed the patch as you keep claiming it. You are sending a patch without 
any justification three time in a row, so it is normal to request more details 
and to be slightly annoyed.

If you expect me to ack a patch without understanding the implications, then I 
am afraid this is not going to happen. Additionally, it is important to keep 
track of the reasoning of we can come back in 2 years time and find out quickly 
why interrupts are masked for a long period of time.

As I pointed out Xen supports multiple use-cases. I am concerned you are trying 
to optimize for your use-case and disregard the rest. I have actually requested 
more details on your use case to understand a bit more where you are coming 
from. See my e-mail [1].

I know I wrote the patch but it was from debugging other than real improvement. 
 From my understanding, you are using to optimize the case where all LRs are 
full. Is it something you have seen during your testing?

If so, how many LRs does the platform provide? Can you provide more details on 
your use case? I don't need the full details, but roughly the number of 
interrupts used and often they trigger.

Additionally, it would be good to know the usage over the time.  You could 
modify Xen to record the number of LRs used to each entry.

> While them together make the code more straight forward and clear, because:
>   - you start all C-coded common trap handlers with IRQs locked, and free from 
> asm code misunderstanding

Well, there are only one (two if you count the FIQ) case where interrupts are 
kept disabled, this is when receiving an interrupt. I don't see it as a good 
enough justification to have to impose that to all the handlers.

>   - all common trap handlers are distinct in their naming, purpose and action. 
> In terms of calling `enter_hypervisor_head()` only from the traps taken from guest.

There are nearly no difference between receiving an interrupt while running the 
guest mode and while running in hypervisor mode. So what do you really gain with 
the duplication?

Cheers,

[1] https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02335.html

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-01  6:45         ` Andrii Anisov
  2019-08-01  9:37           ` Julien Grall
@ 2019-08-01 11:19           ` Dario Faggioli
  2019-08-02  7:50             ` Andrii Anisov
  1 sibling, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-08-01 11:19 UTC (permalink / raw)
  To: Andrii Anisov, Julien Grall, xen-devel
  Cc: Andrii Anisov, Stefano Stabellini, Volodymyr Babchuk

[-- Attachment #1.1: Type: text/plain, Size: 1506 bytes --]

On Thu, 2019-08-01 at 09:45 +0300, Andrii Anisov wrote:
> Hello Julien,
> 
> On 30.07.19 23:10, Julien Grall wrote:
> 
> > > In this series I think I need interrupts locked until I start
> > > time accounting for hypervisor. Time accounting is started by
> > > `tacc_head()` function. I prefer to have it called from C,
> > > because it is more convenient and obvious for those who are less
> > > familiar with the ARM code.
> 
> Here is the question to you: what is the best place (and way) to
> start hypervisor time tracking?
> 
This is actually quite an important question... And I'd like to throw
it back at you (Andrii)! :-D :-P :-)

In fact, I was about to ask myself something similar, such as, can we
take a bit of a step back and define:

- what's the, let's say, accounting granularity we want? "Just" guest
and hypervisor? Or more fine grained?

- assuming we "just" want hypervisor and guest, which are the
events/turning points at which we should switch "timing accounting
bucket"? Can we make a list?

And I'd be fine for such list to be generic, in the first place (e.g.,
hypercall, IRQ, etc)... Then we'll turn the entries into actual
locations in the code, as a second step.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-01 11:19           ` Dario Faggioli
@ 2019-08-02  7:50             ` Andrii Anisov
  2019-08-02  9:15               ` Julien Grall
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-08-02  7:50 UTC (permalink / raw)
  To: Dario Faggioli, Julien Grall, xen-devel
  Cc: Andrii Anisov, Stefano Stabellini, Volodymyr Babchuk

Hello Dario

On 01.08.19 14:19, Dario Faggioli wrote:
> On Thu, 2019-08-01 at 09:45 +0300, Andrii Anisov wrote:
>> Hello Julien,
>>
>> On 30.07.19 23:10, Julien Grall wrote:
>>
>>>> In this series I think I need interrupts locked until I start
>>>> time accounting for hypervisor. Time accounting is started by
>>>> `tacc_head()` function. I prefer to have it called from C,
>>>> because it is more convenient and obvious for those who are less
>>>> familiar with the ARM code.
>>
>> Here is the question to you: what is the best place (and way) to
>> start hypervisor time tracking?
>>
> This is actually quite an important question... And I'd like to throw
> it back at you (Andrii)! :-D :-P :-)

At this series I start time accounting for hypervisor after the trap, before interrupts being enabled. It is done for all traps except synchronous traps from guest, what are hypecalls and io emulation. For synchronous traps, I start hyp accounting after the guest's request has been served, and we start softirq processing (actually all the stuff `leave_hypervisor_tail()` does). I believe it should be so.

> In fact, I was about to ask myself something similar, such as, can we
> take a bit of a step back and define:
> 
> - what's the, let's say, accounting granularity we want? "Just" guest
> and hypervisor? Or more fine grained?

As for me hypervisor/guest/idle is quite fine granularity for the beginning.
Such approach might be enough to implement fair time accounting.
Yet we might need something more sophisticated for RT scheduling. E.g. guest's IRQ time tracking, to not let some guest to spam the system with its IRQs.

> - assuming we "just" want hypervisor and guest, which are the
> events/turning points at which we should switch "timing accounting
> bucket"? Can we make a list?
> And I'd be fine for such list to be generic, in the first place (e.g.,
> hypercall, IRQ, etc)... Then we'll turn the entries into actual
> locations in the code, as a second step.


I can make such a list, how it is done in this series:

Guest -[async trap (IRQ)]-> Hyp : switch to hyp time accounting
Guest -[sync trap (hypercall, io emulation)]-> HYP : switch to hyp time accounting *after* serving sync trap (hypercall, io emulation)

Hyp -[any trap]-> Hyp : if (trapped from guest sync trap serving) then {switch to hyp time accounting}
Hyp -[return from trap]-> Hyp : if (returning to guest sync trap serving) then {switch to guest time accounting}

Hyp -[return from trap]-> Guest : switch to guest time accounting

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-01  9:37           ` Julien Grall
@ 2019-08-02  8:28             ` Andrii Anisov
  2019-08-02  9:03               ` Julien Grall
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-08-02  8:28 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov



On 01.08.19 12:37, Julien Grall wrote:
> Hi,
> 
> On 01/08/2019 07:45, Andrii Anisov wrote:
>> On 30.07.19 23:10, Julien Grall wrote:
>>
>>>> In this series I think I need interrupts locked until I start time accounting for hypervisor. Time accounting is started by `tacc_head()` function. I prefer to have it called from C, because it is more convenient and obvious for those who are less familiar with the ARM code.
>>
>> Here is the question to you: what is the best place (and way) to start hypervisor time tracking?
> 
> Looking at the patch, hypervisor time accounting is for:
>      1) softirqs
>      2) hardirqs
> 
> For hardirqs, you always enter in C with interrupt disabled. So this can be called directly from there.
> 
> For softirqs, they are quite a few places where do_sofirq() is called. So you either want to track the time in the function directly or on each callers.


Softirq? What about the rest of `leave_hypervisor_tail()`?


> "big piece" is somewhat half-correct.... All the hypercalls will be correctly protected, so the problem is only if you receive an interrupt before SSBD is enabled.
> 
> I would move the enablement in assembly code as part of entry.

That's it. I suppose the function `enter_hypervisor_head()` was introduced and named as a part of entry, while in fact is not the part.
And I guess you were confused with it when introducing that WA.
As well as I was some time ago [1].

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-10/msg02248.html

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02  8:28             ` Andrii Anisov
@ 2019-08-02  9:03               ` Julien Grall
  2019-08-02 12:24                 ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-08-02  9:03 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov



On 02/08/2019 09:28, Andrii Anisov wrote:
> 
> 
> On 01.08.19 12:37, Julien Grall wrote:
>> Hi,
>>
>> On 01/08/2019 07:45, Andrii Anisov wrote:
>>> On 30.07.19 23:10, Julien Grall wrote:
>>>
>>>>> In this series I think I need interrupts locked until I start time 
>>>>> accounting for hypervisor. Time accounting is started by `tacc_head()` 
>>>>> function. I prefer to have it called from C, because it is more convenient 
>>>>> and obvious for those who are less familiar with the ARM code.
>>>
>>> Here is the question to you: what is the best place (and way) to start 
>>> hypervisor time tracking?
>>
>> Looking at the patch, hypervisor time accounting is for:
>>      1) softirqs
>>      2) hardirqs
>>
>> For hardirqs, you always enter in C with interrupt disabled. So this can be 
>> called directly from there.
>>
>> For softirqs, they are quite a few places where do_sofirq() is called. So you 
>> either want to track the time in the function directly or on each callers.
> 
> 
> Softirq? What about the rest of `leave_hypervisor_tail()`?

A fair amount of leave_hypervisor_tail() deal with the guest itself (i.e vGIC, 
P2M...), so I think they should be accounted to the guest time. The only bits in 
leave_hypervisor_tail() that should be accounted to hypervisor time is 
check_for_pcpu_work().

> 
> 
>> "big piece" is somewhat half-correct.... All the hypercalls will be correctly 
>> protected, so the problem is only if you receive an interrupt before SSBD is 
>> enabled.
>>
>> I would move the enablement in assembly code as part of entry.
> 
> That's it.
> I suppose the function `enter_hypervisor_head()` was introduced and 
> named as a part of entry, while in fact is not the part.
> And I guess you were confused with it when introducing that WA.
> As well as I was some time ago [1].

The macro entry.S will deal with anything that needs to be done before any other 
part of the hypervisor is run. All the interrupts (debug, asynchronous, IRQ, 
FIQ) should be masked.

enter_hypervisor_head() can be executed at any time and it does not matter the 
order.

Cheers,

> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2018-10/msg02248.html
> 

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02  7:50             ` Andrii Anisov
@ 2019-08-02  9:15               ` Julien Grall
  2019-08-02 13:07                 ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-08-02  9:15 UTC (permalink / raw)
  To: Andrii Anisov, Dario Faggioli, xen-devel
  Cc: Andrii Anisov, Stefano Stabellini, Volodymyr Babchuk

Hi,

On 02/08/2019 08:50, Andrii Anisov wrote:
> Hello Dario
> 
> On 01.08.19 14:19, Dario Faggioli wrote:
>> On Thu, 2019-08-01 at 09:45 +0300, Andrii Anisov wrote:
>>> Hello Julien,
>>>
>>> On 30.07.19 23:10, Julien Grall wrote:
>>>
>>>>> In this series I think I need interrupts locked until I start
>>>>> time accounting for hypervisor. Time accounting is started by
>>>>> `tacc_head()` function. I prefer to have it called from C,
>>>>> because it is more convenient and obvious for those who are less
>>>>> familiar with the ARM code.
>>>
>>> Here is the question to you: what is the best place (and way) to
>>> start hypervisor time tracking?
>>>
>> This is actually quite an important question... And I'd like to throw
>> it back at you (Andrii)! :-D :-P :-)
> 
> At this series I start time accounting for hypervisor after the trap, before 
> interrupts being enabled. It is done for all traps except synchronous traps from 
> guest, what are hypecalls and io emulation. For synchronous traps, I start hyp 
> accounting after the guest's request has been served, and we start softirq 
> processing (actually all the stuff `leave_hypervisor_tail()` does). I believe it 
> should be so.
> 
>> In fact, I was about to ask myself something similar, such as, can we
>> take a bit of a step back and define:
>>
>> - what's the, let's say, accounting granularity we want? "Just" guest
>> and hypervisor? Or more fine grained?
> 
> As for me hypervisor/guest/idle is quite fine granularity for the beginning.
> Such approach might be enough to implement fair time accounting.
> Yet we might need something more sophisticated for RT scheduling. E.g. guest's 
> IRQ time tracking, to not let some guest to spam the system with its IRQs.
> 
>> - assuming we "just" want hypervisor and guest, which are the
>> events/turning points at which we should switch "timing accounting
>> bucket"? Can we make a list?
>> And I'd be fine for such list to be generic, in the first place (e.g.,
>> hypercall, IRQ, etc)... Then we'll turn the entries into actual
>> locations in the code, as a second step.
> 
> 
> I can make such a list, how it is done in this series:

 From the list below it is not clear what is the split between hypervisor time 
and guest time. See some of the examples below.

> 
> Guest -[async trap (IRQ)]-> Hyp : switch to hyp time accounting

Why all the interrupts should be accounted to the hypervisor? Why not accounting 
guest time for any interrupt routed to the current guest?

> Guest -[sync trap (hypercall, io emulation)]-> HYP : switch to hyp time 
> accounting *after* serving sync trap (hypercall, io emulation)

Why so? Some of the work after servicing an hypercall/IO emulation are still 
guest specific. For instance, we may update the hardware state for the guest 
(see vgic_sync_to_lrs()). We may also have defer work that take a long time (see 
check_for_pcpu_work()).

IHMO, the only part worth accounting to the hypervisor mode is 
check_for_pcpu_work().

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02  9:03               ` Julien Grall
@ 2019-08-02 12:24                 ` Andrii Anisov
  2019-08-02 13:22                   ` Julien Grall
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-08-02 12:24 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov



On 02.08.19 12:03, Julien Grall wrote:
> A fair amount of leave_hypervisor_tail() deal with the guest itself (i.e vGIC, P2M...)

All that stuff is what hypervisor does for the guest. And does behind the guest's back.

> , so I think they should be accounted to the guest time.
This point is arguable. That's why we have a discussion here to agree on the time accounting approach, what will directly affect scheduling accuracy.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02  9:15               ` Julien Grall
@ 2019-08-02 13:07                 ` Andrii Anisov
  2019-08-02 13:49                   ` Julien Grall
  2019-08-03  0:55                   ` Dario Faggioli
  0 siblings, 2 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-08-02 13:07 UTC (permalink / raw)
  To: Julien Grall, Dario Faggioli, xen-devel
  Cc: Andrii Anisov, Stefano Stabellini, Volodymyr Babchuk



On 02.08.19 12:15, Julien Grall wrote:
>> I can make such a list, how it is done in this series:
> 
>  From the list below it is not clear what is the split between hypervisor time and guest time. See some of the examples below.

I guess your question is *why* do I split hyp/guest time in such a way.

So for the guest I count time spent in the guest mode. Plus time spent in hypervisor mode to serve explicit requests by guest.

That time may be quite deterministic from the guest's point of view.

But the time spent by hypervisor to handle interrupts, update the hardware state is not requested by the guest itself. It is a virtualization overhead. And the overhead heavily depends on the system configuration (e.g. how many guests are running).
That overhead may be accounted for a guest or for hyp, depending on the model agreed.

My idea is as following:
Accounting that overhead for guests is quite OK for server applications, you put server overhead time on guests and charge money from their budget.
Yet for RT applications you will have more accurate view on the guest execution time if you drop that overhead.

Our target is XEN in safety critical systems. So I chosen more deterministic (from my point of view) approach.

Well, I suppose we may add granularity to the time accounting, and then decide at the scheduler level what we count for the guest execution time.

But it is so far from the end, and we are here to discuss and agree the stuff.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02 12:24                 ` Andrii Anisov
@ 2019-08-02 13:22                   ` Julien Grall
  0 siblings, 0 replies; 49+ messages in thread
From: Julien Grall @ 2019-08-02 13:22 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov

Hi Andrii,

On 02/08/2019 13:24, Andrii Anisov wrote:
> 
> 
> On 02.08.19 12:03, Julien Grall wrote:
>> A fair amount of leave_hypervisor_tail() deal with the guest itself (i.e vGIC, 
>> P2M...)
> 
> All that stuff is what hypervisor does for the guest. And does behind the 
> guest's back.
Please define "guest's back".

Technically a guest accessing an IO does not know that the access will be 
emulated. So this should also take into account as "guest's back".

An hypercall is initiated by the guest directly, so I agree this is not done on 
guest's back.

Some of the work done in leave_hypervisor_tail() is an extension of IO 
emulation. They are not done directly in the IO emulation because they may take 
a long time and get preemption. So I don't see any difference with "IO emulation".

Regarding the vGIC. This is a bit more a grey area. While it is an overhead of 
virtualization, this is indirectly initiated by the guest. Indeed, you would 
only configure the vGIC if you receive an interrupt generated by one of the 
device assigned.

> 
>> , so I think they should be accounted to the guest time.
> This point is arguable. That's why we have a discussion here to agree on the 
> time accounting approach, what will directly affect scheduling accuracy.

Note the "I think" in my answer. So this is my opinion and your input is expected.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02 13:07                 ` Andrii Anisov
@ 2019-08-02 13:49                   ` Julien Grall
  2019-08-03  1:39                     ` Dario Faggioli
  2019-08-03  0:55                   ` Dario Faggioli
  1 sibling, 1 reply; 49+ messages in thread
From: Julien Grall @ 2019-08-02 13:49 UTC (permalink / raw)
  To: Andrii Anisov, Dario Faggioli, xen-devel
  Cc: Andrii Anisov, Stefano Stabellini, Volodymyr Babchuk

Hi,

/!\/!\/!\

I am not a scheduler expert so my view maybe be wrong. Dario feel free to 
correct me :).

/!\/!\/!\

On 02/08/2019 14:07, Andrii Anisov wrote:
> 
> 
> On 02.08.19 12:15, Julien Grall wrote:
>>> I can make such a list, how it is done in this series:
>>
>>  From the list below it is not clear what is the split between hypervisor time 
>> and guest time. See some of the examples below.
> 
> I guess your question is *why* do I split hyp/guest time in such a way.
> 
> So for the guest I count time spent in the guest mode. Plus time spent in 
> hypervisor mode to serve explicit requests by guest.
> 
> That time may be quite deterministic from the guest's point of view.
> 
> But the time spent by hypervisor to handle interrupts, update the hardware state 
> is not requested by the guest itself. It is a virtualization overhead. And the 
> overhead heavily depends on the system configuration (e.g. how many guests are 
> running).

While context switch cost will depend on your system configuration. The HW state 
synchronization on entry to the hypervisor and exit from the hypervisor will 
always be there. This is even if you have one guest running or partitioning your 
system.

Furthermore, Xen is implementing a voluntary preemption model. The main 
preemption point for Arm is on return to the guest. So if you have work 
initiated by the guest that takes long, then you need may want to defer until 
you can preempt without much trouble.

Your definition of "virtualization overhead" is somewhat unclear. A guest is not 
aware that a device may be emulated. So emulating any I/O is per se an overhead.

> That overhead may be accounted for a guest or for hyp, depending on the model 
> agreed.

There are some issues to account some of the work on exit to the hypervisor 
time. Let's take the example of the P2M, this task is a deferred work from an 
system register emulation because we need preemption.

The task can be long running (several hundred milliseconds). A scheduler may 
only take into account the guest time and consider that vCPU does not need to be 
unscheduled. You are at the risk a vCPU will hog a pCPU and delay any other 
vCPU. This is not something ideal even for RT task.

Other work done on exit (e.g syncing the vGIC state to HW) will be less a 
concern where they are accounted because it cannot possibly hog a pCPU.

I understand you want to get the virtualization overhead. It feels to me, this 
needs to be a different category (i.e neither hypervisor time, nor guest time).

> 
> My idea is as following:
> Accounting that overhead for guests is quite OK for server applications, you put 
> server overhead time on guests and charge money from their budget.
> Yet for RT applications you will have more accurate view on the guest execution 
> time if you drop that overhead.
> 
> Our target is XEN in safety critical systems. So I chosen more deterministic 
> (from my point of view) approach.

See above, I believe you are building an secure system with accounting some of 
the guest work to the hypervisor.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 5/6] arm64: call enter_hypervisor_head only when it is needed
  2019-08-01 10:17           ` Julien Grall
@ 2019-08-02 13:50             ` Andrii Anisov
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-08-02 13:50 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Volodymyr Babchuk, Stefano Stabellini, Andrii Anisov, Andre Przywara



On 01.08.19 13:17, Julien Grall wrote:
> All the commit message is based on "performance improvement".... Now you are selling it as this is confusing. 

Sorry Julien, I have no more arguments for you.
I'll drop these two patches for the next iteration.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02 13:07                 ` Andrii Anisov
  2019-08-02 13:49                   ` Julien Grall
@ 2019-08-03  0:55                   ` Dario Faggioli
  2019-08-06 13:09                     ` Andrii Anisov
  1 sibling, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-08-03  0:55 UTC (permalink / raw)
  To: andrii.anisov, julien.grall, xen-devel
  Cc: andrii_anisov, sstabellini, Volodymyr_Babchuk

[-- Attachment #1.1: Type: text/plain, Size: 4624 bytes --]

On Fri, 2019-08-02 at 16:07 +0300, Andrii Anisov wrote:
> On 02.08.19 12:15, Julien Grall wrote:
> >  From the list below it is not clear what is the split between
> > hypervisor time and guest time. See some of the examples below.
> 
> I guess your question is *why* do I split hyp/guest time in such a
> way.
> 
> So for the guest I count time spent in the guest mode. Plus time
> spent in hypervisor mode to serve explicit requests by guest.
> 
From an accuracy, but also from a fairness perspective:
- what a guest does directly (in guest mode)
- what the hypervisor does, on behalf of a guest, no matter whether
requested explicitly or not
should all be accounted to the guest. In the sense that the guest
should be charged for it.

Actually, the concepts of "guest time" and "hypervisor time" are
actually orthogonal from the accounting, at least ideally.

In fact, when a guest does an hypercall, the time that we spend inside
Xen for performing the hypercal itself:
* is hypervisor time
* the guest that did the hypercall should be charged for it.

If we don't charge the guest for these activity, in theory, a guest can
start doing a lot of hypercalls and generating a lot of interrupts...
since most of the time is spent in the hypervisor, it's runtime (from
the scheduler point of view) increase only a little, and the scheduler
will continue to run it, and it will continue to generate hypercalls
and interrupts, until it starve/DoS the system!

In fact, this right now can't happen because we always charge guests
for the time spent doing these things. The problem is that we often
charge _the_wrong_ guest. This somewhat manages to prevent (or make it
very unlikely) a DoS situation, but is indeed unfair, and may cause
problems (especially in RT scenarios).

> That time may be quite deterministic from the guest's point of view.
> 
> But the time spent by hypervisor to handle interrupts, update the
> hardware state is not requested by the guest itself. It is a
> virtualization overhead. 
>
Yes, but still, when it is the guest that causes such overhead, it is
important that the guest itself gets to pay for it.

Just as an example (although you don't have this problem on ARM), if I
have an HVM, ideally I would charge to the guest the time that QEMU
executes in dom0!

On the other hand, the time that we spend in the scheduler, for
instance, doing load balancing among the various runqueues, or the time
that we spend in Xen (on x86) for time synchronization rendezvouses,
they should not be charged to any guest.

> And the overhead heavily depends on the system configuration (e.g.
> how many guests are running).
> That overhead may be accounted for a guest or for hyp, depending on
> the model agreed.
> 
Load balancing within the scheduler, indeed depends on how busy the
system is, and I agree that time should be accounted against any guest.

Saving and restoring the register state of a guest, I don't think it
depends on how many other guests there are around, and I think should
be accounted against the guest itself.

> My idea is as following:
> Accounting that overhead for guests is quite OK for server
> applications, you put server overhead time on guests and charge money
> from their budget.
>
I disagree. The benefits of more accurate and correct time accounting
and charging are not workload or use case dependent. If we decide to
charge the guest for hypercalls it does and interrupts it receives,
then we should do that, both for servers and for embedded RT systems.

> Yet for RT applications you will have more accurate view on the guest
> execution time if you drop that overhead.
> 

> Our target is XEN in safety critical systems. So I chosen more
> deterministic (from my point of view) approach.
> 
As said, I believe this is one of those cases, where we want an unified
approach. And not because it's easier, or because "Xen has to work both
on servers and embedded" (which, BTW, is true). But because it is the
right thing to do, IMO.

> Well, I suppose we may add granularity to the time accounting, and
> then decide at the scheduler level what we count for the guest
> execution time.
> 
> But it is so far from the end, and we are here to discuss and agree
> the stuff.
> 
Indeed. :-)

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-02 13:49                   ` Julien Grall
@ 2019-08-03  1:39                     ` Dario Faggioli
  0 siblings, 0 replies; 49+ messages in thread
From: Dario Faggioli @ 2019-08-03  1:39 UTC (permalink / raw)
  To: andrii.anisov, julien.grall, xen-devel
  Cc: andrii_anisov, sstabellini, Volodymyr_Babchuk

[-- Attachment #1.1: Type: text/plain, Size: 3698 bytes --]

On Fri, 2019-08-02 at 14:49 +0100, Julien Grall wrote:
> /!\/!\/!\
> 
> I am not a scheduler expert so my view maybe be wrong. Dario feel
> free to 
> correct me :).
> 
> /!\/!\/!\
> 
:-)

> On 02/08/2019 14:07, Andrii Anisov wrote:
> > On 02.08.19 12:15, Julien Grall wrote:
> > > > 
> > But the time spent by hypervisor to handle interrupts, update the
> > hardware state 
> > is not requested by the guest itself. It is a virtualization
> > overhead. And the 
> > overhead heavily depends on the system configuration (e.g. how many
> > guests are 
> > running).
> 
> While context switch cost will depend on your system configuration.
> The HW state 
> synchronization on entry to the hypervisor and exit from the
> hypervisor will 
> always be there. This is even if you have one guest running or
> partitioning your 
> system.
> 
This might be a good way of thinking to this problem. The
overhead/hypervisor time that is always there, even if you are running
only one guest, with static and strict resource partitioning (e.g.,
hypercalls, IRQs), you want it charged to the guest.

The overhead/hypervisor time coming from operations that you do because
you have multiple guests and/or non-static partitioning (e.g.,
scheduling, load balancing), you don't want it charged to any specific
guest.

Note that, we're talking, in both cases of "hypervisor time".

> There are some issues to account some of the work on exit to the
> hypervisor 
> time. Let's take the example of the P2M, this task is a deferred work
> from an 
> system register emulation because we need preemption.
> 
> The task can be long running (several hundred milliseconds). A
> scheduler may 
> only take into account the guest time and consider that vCPU does not
> need to be 
> unscheduled. You are at the risk a vCPU will hog a pCPU and delay any
> other 
> vCPU. This is not something ideal even for RT task.
> 
Yes, this is indeed an example of what I was also describing in my
other email.

> Other work done on exit (e.g syncing the vGIC state to HW) will be
> less a 
> concern where they are accounted because it cannot possibly hog a
> pCPU.
> 
Indeed. But it'd still be good to charge the proper entity for it, if
possible. :-)

> I understand you want to get the virtualization overhead. It feels to
> me, this 
> needs to be a different category (i.e neither hypervisor time, nor
> guest time).
> 
IMO, what we need to do is separate the concept of guest/hypervisor
time, from the fact that we account/charge someone or not (and if yes,
who).

E.g., hypercalls are hypervisor time and (in most cases) you want to
charge a guest making the hypercalls for it. OTOH, running QEMU (e.g.,
in dom0) is guest time, and you want to charge the guest for which QEMU
is acting as a DM for it (not dom0).

Of course, some parts of this (e.g., the QEMU running in dom0 one) are
going to be very difficult, if possible at all, to implement. But
still, this would be the idea, IMO.

> > Our target is XEN in safety critical systems. So I chosen more
> > deterministic 
> > (from my point of view) approach.
> 
> See above, I believe you are building an secure system with
> accounting some of 
> the guest work to the hypervisor.
> 
Yep, I do agree with Julien here. Doing the accounting right, you get
both a more robust and a more fair system.

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-03  0:55                   ` Dario Faggioli
@ 2019-08-06 13:09                     ` Andrii Anisov
  2019-08-08 14:07                       ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-08-06 13:09 UTC (permalink / raw)
  To: Dario Faggioli, julien.grall, xen-devel
  Cc: andrii_anisov, sstabellini, Volodymyr_Babchuk

Hello Dario,

Please see my comments below:

On 03.08.19 03:55, Dario Faggioli wrote:
> On Fri, 2019-08-02 at 16:07 +0300, Andrii Anisov wrote:
>> On 02.08.19 12:15, Julien Grall wrote:
>>>   From the list below it is not clear what is the split between
>>> hypervisor time and guest time. See some of the examples below.
>>
>> I guess your question is *why* do I split hyp/guest time in such a
>> way.
>>
>> So for the guest I count time spent in the guest mode. Plus time
>> spent in hypervisor mode to serve explicit requests by guest.
>>
>  From an accuracy, but also from a fairness perspective:
> - what a guest does directly (in guest mode)
> - what the hypervisor does, on behalf of a guest, no matter whether
> requested explicitly or not
> should all be accounted to the guest. In the sense that the guest
> should be charged for it.

For the interrupts and implicit overhead I'd give an example (for ARM, and a bit simplified):

In IRQ trap path there is a function `enter_hypervisor_head()`, what does synchronize GIC state of interrupted VCPU to its VGIC representation (manipulates peripheral registers, goes through queues, etc.). Lets imagine we have running a VCPU which belongs to domain A, and it is interrupted by the int belongs to domain B. From what domain budget should be charged `enter_hypervisor_head()` execution time?
 From budget of domain A? But it was not initiated by domain A.
 From budget of domain B? But `enter_hypervisor_head()` execution time depends on domain A configuration and workload.

If you see this example as very simple, please add nested interrupts and guest switch on returning from hyp. And remember there is some mandatory non-softirq work in `leave_hypervisor_tail()`.

> Actually, the concepts of "guest time" and "hypervisor time" are
> actually orthogonal from the accounting, at least ideally.
> 
> In fact, when a guest does an hypercall, the time that we spend inside
> Xen for performing the hypercal itself:
> * is hypervisor time
> * the guest that did the hypercall should be charged for it.
> 
> If we don't charge the guest for these activity, in theory, a guest can
> start doing a lot of hypercalls and generating a lot of interrupts...
> since most of the time is spent in the hypervisor, it's runtime (from
> the scheduler point of view) increase only a little, and the scheduler
> will continue to run it, and it will continue to generate hypercalls
> and interrupts, until it starve/DoS the system!
> 
> In fact, this right now can't happen because we always charge guests
> for the time spent doing these things. The problem is that we often
> charge _the_wrong_ guest. This somewhat manages to prevent (or make it
> very unlikely) a DoS situation, but is indeed unfair, and may cause
> problems (especially in RT scenarios).
> 
>> That time may be quite deterministic from the guest's point of view.
>>
>> But the time spent by hypervisor to handle interrupts, update the
>> hardware state is not requested by the guest itself. It is a
>> virtualization overhead.
>>
> Yes, but still, when it is the guest that causes such overhead, it is
> important that the guest itself gets to pay for it.
> 
> Just as an example (although you don't have this problem on ARM), if I
> have an HVM, ideally I would charge to the guest the time that QEMU
> executes in dom0!
> 
> On the other hand, the time that we spend in the scheduler, for
> instance, doing load balancing among the various runqueues, or the time
> that we spend in Xen (on x86) for time synchronization rendezvouses,
> they should not be charged to any guest.
> 
>> And the overhead heavily depends on the system configuration (e.g.
>> how many guests are running).
>> That overhead may be accounted for a guest or for hyp, depending on
>> the model agreed.
>>
> Load balancing within the scheduler, indeed depends on how busy the
> system is, and I agree that time should be accounted against any guest.

Agree.

> Saving and restoring the register state of a guest, I don't think it
> depends on how many other guests there are around, and I think should
> be accounted against the guest itself.

I'd brief saving restoring register state of a guest to the guest context switch.
So one particular guest context switch does not depend on the system load. But the number of context switches directly depends on how many guests are available and how busy they are.
So the guest context switch overhead varies and might affect sensitive guests if they are charged for it.

>> My idea is as following:
>> Accounting that overhead for guests is quite OK for server
>> applications, you put server overhead time on guests and charge money
>> from their budget.
>>
> I disagree. The benefits of more accurate and correct time accounting
> and charging are not workload or use case dependent.

I would agree, for the ideal system.
But more accurate and correct time accounting and charging is more expensive in runtime, or, sometimes, impossible to implement.
This causes the fact that "time accounting and charging" is use case dependent.

> If we decide to
> charge the guest for hypercalls it does and interrupts it receives,
> then we should do that, both for servers and for embedded RT systems.
> As said, I believe this is one of those cases, where we want an unified
> approach.

I'm totally agree with guests charged for hypercalls. But I doubt interrupts.

> And not because it's easier, or because "Xen has to work both
> on servers and embedded" (which, BTW, is true). But because it is the
> right thing to do, IMO.

p.s. I've spent some more time looking through Linux kernel, time accounting implementation. They have IRQ time accounting configurable. Here is their justification [1].
p.p.s. I'm looking through freertos as well to get wider look on the available approaches

[1] http://lkml.iu.edu/hypermail//linux/kernel/1010.0/01175.html

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-06 13:09                     ` Andrii Anisov
@ 2019-08-08 14:07                       ` Andrii Anisov
  2019-08-13 14:45                         ` Dario Faggioli
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Anisov @ 2019-08-08 14:07 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: andrii_anisov, julien.grall, sstabellini, Volodymyr_Babchuk


On 06.08.19 16:09, Andrii Anisov wrote:
> p.p.s. I'm looking through freertos as well to get wider look on the available approaches

OK, basically Free-RTOS does not account the IRQ time separately. Yet its scheduling is very implementation dependent.
Any ideas about other open-source examples available for investigation?

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-08 14:07                       ` Andrii Anisov
@ 2019-08-13 14:45                         ` Dario Faggioli
  2019-08-15 18:25                           ` Andrii Anisov
  0 siblings, 1 reply; 49+ messages in thread
From: Dario Faggioli @ 2019-08-13 14:45 UTC (permalink / raw)
  To: andrii.anisov, xen-devel
  Cc: andrii_anisov, julien.grall, sstabellini, Volodymyr_Babchuk

[-- Attachment #1.1: Type: text/plain, Size: 794 bytes --]

On Thu, 2019-08-08 at 17:07 +0300, Andrii Anisov wrote:
> On 06.08.19 16:09, Andrii Anisov wrote:
> > p.p.s. I'm looking through freertos as well to get wider look on
> > the available approaches
> 
> OK, basically Free-RTOS does not account the IRQ time separately. Yet
> its scheduling is very implementation dependent.
>
What do you mean with "Yet its scheduling is very implementation
dependant"?

> Any ideas about other open-source examples available for
> investigation?
> 
Zephyr, maybe?

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path
  2019-08-13 14:45                         ` Dario Faggioli
@ 2019-08-15 18:25                           ` Andrii Anisov
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Anisov @ 2019-08-15 18:25 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: andrii_anisov, julien.grall, sstabellini, Volodymyr_Babchuk

Hello Dario,


On 13.08.19 17:45, Dario Faggioli wrote:
> What do you mean with "Yet its scheduling is very implementation
> dependant"?

The freertos provides an interface to its scheduling internals like `vTaskSwitchContext` and `xTaskIncrementTick()`, which should be called by the platform-specific code. I thought it is possible to alternate scheduling behavior calling those interfaces differently.
Though, on the second thought, it would not provide enough freedom to build something really different.

> Zephyr, maybe?

OK, will look through it a bit later.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, back to index

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-26 10:37 [Xen-devel] [RFC 0/6] XEN scheduling hardening Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path Andrii Anisov
2019-07-26 10:48   ` Julien Grall
2019-07-30 17:35     ` Andrii Anisov
2019-07-30 20:10       ` Julien Grall
2019-08-01  6:45         ` Andrii Anisov
2019-08-01  9:37           ` Julien Grall
2019-08-02  8:28             ` Andrii Anisov
2019-08-02  9:03               ` Julien Grall
2019-08-02 12:24                 ` Andrii Anisov
2019-08-02 13:22                   ` Julien Grall
2019-08-01 11:19           ` Dario Faggioli
2019-08-02  7:50             ` Andrii Anisov
2019-08-02  9:15               ` Julien Grall
2019-08-02 13:07                 ` Andrii Anisov
2019-08-02 13:49                   ` Julien Grall
2019-08-03  1:39                     ` Dario Faggioli
2019-08-03  0:55                   ` Dario Faggioli
2019-08-06 13:09                     ` Andrii Anisov
2019-08-08 14:07                       ` Andrii Anisov
2019-08-13 14:45                         ` Dario Faggioli
2019-08-15 18:25                           ` Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 2/6] schedule: account true system idle time Andrii Anisov
2019-07-26 12:00   ` Dario Faggioli
2019-07-26 12:42     ` Andrii Anisov
2019-07-29 11:40       ` Dario Faggioli
2019-08-01  8:23         ` Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 3/6] sysctl: extend XEN_SYSCTL_getcpuinfo interface Andrii Anisov
2019-07-26 12:15   ` Dario Faggioli
2019-07-26 13:06     ` Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 4/6] xentop: show CPU load information Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: сall enter_hypervisor_head only when it is needed Andrii Anisov
2019-07-26 10:44   ` Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 5/6] arm64: call " Andrii Anisov
2019-07-26 10:59   ` Julien Grall
2019-07-30 17:35     ` Andrii Anisov
2019-07-31 11:02       ` Julien Grall
2019-07-31 11:33         ` Andre Przywara
2019-08-01  7:33         ` Andrii Anisov
2019-08-01 10:17           ` Julien Grall
2019-08-02 13:50             ` Andrii Anisov
2019-07-26 10:37 ` [Xen-devel] [RFC 6/6] schedule: account all the hypervisor time to the idle vcpu Andrii Anisov
2019-07-26 11:56 ` [Xen-devel] [RFC 0/6] XEN scheduling hardening Dario Faggioli
2019-07-26 12:14   ` Juergen Gross
2019-07-29 11:53     ` Dario Faggioli
2019-07-29 12:13       ` Juergen Gross
2019-07-29 14:47     ` Andrii Anisov
2019-07-29 18:46       ` Dario Faggioli
2019-07-29 14:28   ` Andrii Anisov

Xen-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/xen-devel/0 xen-devel/git/0.git
	git clone --mirror https://lore.kernel.org/xen-devel/1 xen-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 xen-devel xen-devel/ https://lore.kernel.org/xen-devel \
		xen-devel@lists.xenproject.org xen-devel@archiver.kernel.org
	public-inbox-index xen-devel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.xenproject.lists.xen-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox