* [PATCH 0/5] xen/wait: Improvements
@ 2022-07-18  7:18 Andrew Cooper
  2022-07-18  7:18 ` [PATCH 1/5] xen/wait: Drop vestigial remnants of TRAP_regs_partial Andrew Cooper
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Juergen Gross, Dario Faggioli

This started out as patch 2 for a different task, and quickly identified some
technical debt, long overdue for cleaning up.

Andrew Cooper (5):
  xen/wait: Drop vestigial remnants of TRAP_regs_partial
  xen/wait: Extend the description of how this logic actually works
  xen/wait: Minor asm improvements
  xen/wait: Use relative stack adjustments
  xen/wait: Remove VCPU_AFFINITY_WAIT

 xen/common/domain.c     |   2 -
 xen/common/sched/core.c |   4 +-
 xen/common/wait.c       | 117 +++++++++++++++++++++++++-----------------------
 xen/include/xen/sched.h |   1 -
 4 files changed, 63 insertions(+), 61 deletions(-)

-- 
2.11.0




* [PATCH 1/5] xen/wait: Drop vestigial remnants of TRAP_regs_partial
  2022-07-18  7:18 [PATCH 0/5] xen/wait: Improvements Andrew Cooper
@ 2022-07-18  7:18 ` Andrew Cooper
  2022-07-18  9:53   ` Jan Beulich
  2022-07-18  7:18 ` [PATCH 2/5] xen/wait: Extend the description of how this logic actually works Andrew Cooper
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

The preservation of entry_vector was introduced in ecf9846a6a20 ("x86:
save/restore only partial register state where possible"), which introduced
TRAP_regs_partial, but was missed in f9eb74789af7 ("x86/entry: Remove support
for partial cpu_user_regs frames"), which removed TRAP_regs_partial.

Fixes: f9eb74789af7 ("x86/entry: Remove support for partial cpu_user_regs frames")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/common/wait.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/xen/common/wait.c b/xen/common/wait.c
index 9276d76464fb..3ebb884fe738 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -124,7 +124,6 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
     struct cpu_info *cpu_info = get_cpu_info();
     struct vcpu *curr = current;
     unsigned long dummy;
-    u32 entry_vector = cpu_info->guest_cpu_user_regs.entry_vector;
 
     ASSERT(wqv->esp == 0);
 
@@ -169,8 +168,6 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
         for ( ; ; )
             do_softirq();
     }
-
-    cpu_info->guest_cpu_user_regs.entry_vector = entry_vector;
 }
 
 static void __finish_wait(struct waitqueue_vcpu *wqv)
-- 
2.11.0




* [PATCH 2/5] xen/wait: Extend the description of how this logic actually works
  2022-07-18  7:18 [PATCH 0/5] xen/wait: Improvements Andrew Cooper
  2022-07-18  7:18 ` [PATCH 1/5] xen/wait: Drop vestigial remnants of TRAP_regs_partial Andrew Cooper
@ 2022-07-18  7:18 ` Andrew Cooper
  2022-07-18 10:00   ` Jan Beulich
  2022-07-18  7:18 ` [PATCH 3/5] xen/wait: Minor asm improvements Andrew Cooper
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/common/wait.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/xen/common/wait.c b/xen/common/wait.c
index 3ebb884fe738..4dcfa17a8a3f 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -137,7 +137,19 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
             do_softirq();
     }
 
-    /* Hand-rolled setjmp(). */
+    /*
+     * Hand-rolled setjmp().
+     *
+     * __prepare_to_wait() is the leaf of a deep calltree.  Preserve the GPRs,
+     * bounds check what we want to stash in wqv->stack, copy the active stack
+     * (up to cpu_info) into wqv->stack, then return normally.  Our caller
+     * will shortly schedule() and discard the current context.
+     *
+     * The copy out is performed with a rep movsb.  When
+     * check_wakeup_from_wait() longjmp()'s back into us, %rsp is pre-adjusted
+     * to be suitable and %rsi/%rdi are swapped, so the rep movsb instead
+     * copies in from wqv->stack over the active stack.
+     */
     asm volatile (
         "push %%rax; push %%rbx; push %%rdx; push %%rbp;"
         "push %%r8;  push %%r9;  push %%r10; push %%r11;"
@@ -199,9 +211,18 @@ void check_wakeup_from_wait(void)
     }
 
     /*
-     * Hand-rolled longjmp().  Returns to __prepare_to_wait(), and lands on a
-     * `rep movs` instruction.  All other GPRs are restored from the stack, so
-     * are available for use here.
+     * Hand-rolled longjmp().
+     *
+     * check_wakeup_from_wait() is always called with a shallow stack,
+     * immediately after the vCPU has been rescheduled.
+     *
+     * Adjust %rsp to be the correct depth for the (deeper) stack we want to
+     * restore, then prepare %rsi, %rdi and %rcx such that when we intercept
+     * the rep movs in __prepare_to_wait(), it copies from wqv->stack over the
+     * active stack.
+     *
+     * All other GPRs are available for use; they're either restored from
+     * wqv->stack or explicitly clobbered.
      */
     asm volatile (
         "mov %1,%%"__OP"sp; jmp .L_wq_resume;"
-- 
2.11.0
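
As an aside for readers unfamiliar with the pattern: the standard
setjmp()/longjmp() pair gives a rough user-space analogue of the control flow
described in the comments above.  A minimal sketch (illustrative only; unlike
the C library version, the hand-rolled Xen variant must additionally copy the
stack contents, so that the resume survives schedule() reusing the stack):

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf ctx;

    int main(void)
    {
        if ( setjmp(ctx) == 0 )
        {
            /* First pass: context saved, execution continues normally. */
            printf("saved; pretending to schedule()\n");
            longjmp(ctx, 1);   /* stands in for check_wakeup_from_wait() */
        }

        /* Second pass: longjmp() lands here, as if setjmp() returned again. */
        printf("resumed\n");
        return 0;
    }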




* [PATCH 3/5] xen/wait: Minor asm improvements
  2022-07-18  7:18 [PATCH 0/5] xen/wait: Improvements Andrew Cooper
  2022-07-18  7:18 ` [PATCH 1/5] xen/wait: Drop vestigial remnants of TRAP_regs_partial Andrew Cooper
  2022-07-18  7:18 ` [PATCH 2/5] xen/wait: Extend the description of how this logic actually works Andrew Cooper
@ 2022-07-18  7:18 ` Andrew Cooper
  2022-07-18 10:04   ` Jan Beulich
  2022-07-18  7:18 ` [PATCH 4/5] xen/wait: Use relative stack adjustments Andrew Cooper
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

There is no point preserving all registers.  Instead, preserve an arbitrary 6
registers, and list the rest as clobbered.  This does not alter the register
scheduling at all, but does reduce the amount of state needing saving.

Use a named parameter for the page size, instead of needing to count operands
to identify parameter 3.  Adjust the formatting of the parameters slightly to
simplify the diff of the subsequent patch.
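
A minimal standalone sketch of both techniques, using a hypothetical
copy_bounded() helper rather than the real wait.c asm (x86-64 only,
illustrative):

    static inline void copy_bounded(void *dst, const void *src,
                                    unsigned long len)
    {
        unsigned long dummy;

        asm volatile ( "cmp %[sz], %[n];"
                       "ja 1f;"           /* Bail if len > 4k */
                       "rep movsb;"
                       "1:"
                       /* Outputs are tied to the inputs below. */
                       : "=D" (dummy), "=S" (dummy), [n] "=c" (dummy)
                       : "0" (dst), "1" (src), "2" (len),
                         [sz] "i" (4096)  /* named, not "parameter 6" */
                       : "memory" );
    }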

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/common/wait.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/xen/common/wait.c b/xen/common/wait.c
index 4dcfa17a8a3f..4bc030d1a09d 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -151,13 +151,12 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
      * copies in from wqv->stack over the active stack.
      */
     asm volatile (
-        "push %%rax; push %%rbx; push %%rdx; push %%rbp;"
-        "push %%r8;  push %%r9;  push %%r10; push %%r11;"
-        "push %%r12; push %%r13; push %%r14; push %%r15;"
+        "push %%rbx; push %%rbp; push %%r12;"
+        "push %%r13; push %%r14; push %%r15;"
 
         "sub %%esp,%%ecx;"
-        "cmp %3,%%ecx;"
-        "ja .L_skip;"
+        "cmp %[sz], %%ecx;"
+        "ja .L_skip;"       /* Bail if >4k */
         "mov %%rsp,%%rsi;"
 
         /* check_wakeup_from_wait() longjmp()'s to this point. */
@@ -165,12 +164,12 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
         "mov %%rsp,%%rsi;"
 
         ".L_skip:"
-        "pop %%r15; pop %%r14; pop %%r13; pop %%r12;"
-        "pop %%r11; pop %%r10; pop %%r9;  pop %%r8;"
-        "pop %%rbp; pop %%rdx; pop %%rbx; pop %%rax"
+        "pop %%r15; pop %%r14; pop %%r13;"
+        "pop %%r12; pop %%rbp; pop %%rbx;"
         : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
-        : "i" (PAGE_SIZE), "0" (0), "1" (cpu_info), "2" (wqv->stack)
-        : "memory" );
+        : "0" (0), "1" (cpu_info), "2" (wqv->stack),
+          [sz] "i" (PAGE_SIZE)
+        : "memory", "rax", "rdx", "r8", "r9", "r10", "r11" );
 
     if ( unlikely(wqv->esp == 0) )
     {
@@ -224,11 +223,12 @@ void check_wakeup_from_wait(void)
      * All other GPRs are available for use; they're either restored from
      * wqv->stack or explicitly clobbered.
      */
-    asm volatile (
-        "mov %1,%%"__OP"sp; jmp .L_wq_resume;"
-        : : "S" (wqv->stack), "D" (wqv->esp),
-          "c" ((char *)get_cpu_info() - (char *)wqv->esp)
-        : "memory" );
+    asm volatile ( "mov %%rdi, %%rsp;"
+                   "jmp .L_wq_resume;"
+                   :
+                   : "S" (wqv->stack), "D" (wqv->esp),
+                     "c" ((char *)get_cpu_info() - (char *)wqv->esp)
+                   : "memory" );
     unreachable();
 }
 
-- 
2.11.0




* [PATCH 4/5] xen/wait: Use relative stack adjustments
  2022-07-18  7:18 [PATCH 0/5] xen/wait: Improvements Andrew Cooper
                   ` (2 preceding siblings ...)
  2022-07-18  7:18 ` [PATCH 3/5] xen/wait: Minor asm improvements Andrew Cooper
@ 2022-07-18  7:18 ` Andrew Cooper
  2022-07-18 10:29   ` Jan Beulich
  2022-07-18  7:18 ` [PATCH 5/5] xen/wait: Remove VCPU_AFFINITY_WAIT Andrew Cooper
  2022-07-18 11:11 ` [PATCH 0/5] xen/wait: Improvements Jan Beulich
  5 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

The waitqueue's esp field is overloaded.  It serves both as an indication that
the waitqueue is in use, and as a direction to check_wakeup_from_wait() as to
where to adjust the stack pointer to, but using an absolute pointer comes at
the cost of requiring the vCPU to wake up on the same pCPU it went to sleep
on.

Instead, have the waitqueue just keep track of how much data is on wqv->stack.
This is no practical change in __prepare_to_wait() (it already calculated the
delta), but the result is now split out of the (also overloaded) %rsi output
parameter by using a separate register.

check_wakeup_from_wait() has a bit more work to do.  It now needs to calculate
the adjustment to %rsp rather than having the new %rsp provided as a
parameter.
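
The arithmetic is easy to sanity-check in isolation.  A user-space sketch
with made-up numbers (nothing below is real Xen stack layout):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uintptr_t cpu_info  = 0x8000;  /* stack "top" marker when saving */
        uintptr_t saved_rsp = 0x7f40;  /* %rsp at the moment of saving */
        unsigned int used = cpu_info - saved_rsp;  /* stored in wqv->used */

        /* On wakeup the stack is shallow (and cpu_info may belong to a
         * different pCPU; the relative maths is unaffected). */
        uintptr_t wake_rsp = 0x7fe0;

        /* var = wqv->used - (cpu_info - %rsp); then %rsp -= var. */
        uintptr_t new_rsp = wake_rsp - (used - (cpu_info - wake_rsp));

        printf("restored rsp %#lx, expected %#lx\n",
               (unsigned long)new_rsp, (unsigned long)saved_rsp);
        return 0;
    }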

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/common/wait.c | 44 ++++++++++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/xen/common/wait.c b/xen/common/wait.c
index 4bc030d1a09d..4f1daf650bc4 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -32,8 +32,8 @@ struct waitqueue_vcpu {
      * Xen/x86 does not have per-vcpu hypervisor stacks. So we must save the
      * hypervisor context before sleeping (descheduling), setjmp/longjmp-style.
      */
-    void *esp;
     char *stack;
+    unsigned int used;
 #endif
 };
 
@@ -121,11 +121,11 @@ void wake_up_all(struct waitqueue_head *wq)
 
 static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
 {
-    struct cpu_info *cpu_info = get_cpu_info();
     struct vcpu *curr = current;
     unsigned long dummy;
+    unsigned int used;
 
-    ASSERT(wqv->esp == 0);
+    ASSERT(wqv->used == 0);
 
     /* Save current VCPU affinity; force wakeup on *this* CPU only. */
     if ( vcpu_temporary_affinity(curr, smp_processor_id(), VCPU_AFFINITY_WAIT) )
@@ -154,24 +154,25 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
         "push %%rbx; push %%rbp; push %%r12;"
         "push %%r13; push %%r14; push %%r15;"
 
-        "sub %%esp,%%ecx;"
+        "sub %%esp, %%ecx;" /* ecx = delta to cpu_info */
         "cmp %[sz], %%ecx;"
         "ja .L_skip;"       /* Bail if >4k */
-        "mov %%rsp,%%rsi;"
+
+        "mov %%ecx, %%eax;"
+        "mov %%rsp, %%rsi;" /* Copy from the stack, into wqv->stack */
 
         /* check_wakeup_from_wait() longjmp()'s to this point. */
         ".L_wq_resume: rep movsb;"
-        "mov %%rsp,%%rsi;"
 
         ".L_skip:"
         "pop %%r15; pop %%r14; pop %%r13;"
         "pop %%r12; pop %%rbp; pop %%rbx;"
-        : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
-        : "0" (0), "1" (cpu_info), "2" (wqv->stack),
+        : "=a" (used), "=D" (dummy),     "=c" (dummy),         "=&S" (dummy)
+        : "a" (0),     "D" (wqv->stack), "c" (get_cpu_info()),
           [sz] "i" (PAGE_SIZE)
-        : "memory", "rax", "rdx", "r8", "r9", "r10", "r11" );
+        : "memory", "rdx", "r8", "r9", "r10", "r11" );
 
-    if ( unlikely(wqv->esp == 0) )
+    if ( unlikely(used > PAGE_SIZE) )
     {
         gdprintk(XENLOG_ERR, "Stack too large in %s\n", __func__);
         domain_crash(curr->domain);
@@ -179,11 +180,13 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
         for ( ; ; )
             do_softirq();
     }
+
+    wqv->used = used;
 }
 
 static void __finish_wait(struct waitqueue_vcpu *wqv)
 {
-    wqv->esp = NULL;
+    wqv->used = 0;
     vcpu_temporary_affinity(current, NR_CPUS, VCPU_AFFINITY_WAIT);
 }
 
@@ -191,10 +194,11 @@ void check_wakeup_from_wait(void)
 {
     struct vcpu *curr = current;
     struct waitqueue_vcpu *wqv = curr->waitqueue_vcpu;
+    unsigned long tmp;
 
     ASSERT(list_empty(&wqv->list));
 
-    if ( likely(wqv->esp == NULL) )
+    if ( likely(!wqv->used) )
         return;
 
     /* Check if we are still pinned. */
@@ -220,14 +224,22 @@ void check_wakeup_from_wait(void)
      * the rep movs in __prepare_to_wait(), it copies from wqv->stack over the
      * active stack.
      *
+     * We are also bound by __prepare_to_wait()'s output constraints, so %eax
+     * needs to be wqv->used.
+     *
      * All other GPRs are available for use; they're either restored from
      * wqv->stack or explicitly clobbered.
      */
-    asm volatile ( "mov %%rdi, %%rsp;"
+    asm volatile ( "sub %%esp, %k[var];" /* var = delta to cpu_info */
+                   "neg %k[var];"
+                   "add %%ecx, %k[var];" /* var = -delta + wqv->used */
+
+                   "sub %[var], %%rsp;"  /* Adjust %rsp down to make room */
+                   "mov %%rsp, %%rdi;"   /* Copy from wqv->stack, into the stack */
                    "jmp .L_wq_resume;"
-                   :
-                   : "S" (wqv->stack), "D" (wqv->esp),
-                     "c" ((char *)get_cpu_info() - (char *)wqv->esp)
+                   : "=D" (tmp), [var] "=&r" (tmp)
+                   : "S" (wqv->stack), "c" (wqv->used), "a" (wqv->used),
+                     "[var]" (get_cpu_info())
                    : "memory" );
     unreachable();
 }
-- 
2.11.0




* [PATCH 5/5] xen/wait: Remove VCPU_AFFINITY_WAIT
  2022-07-18  7:18 [PATCH 0/5] xen/wait: Improvements Andrew Cooper
                   ` (3 preceding siblings ...)
  2022-07-18  7:18 ` [PATCH 4/5] xen/wait: Use relative stack adjustments Andrew Cooper
@ 2022-07-18  7:18 ` Andrew Cooper
  2022-07-18 10:48   ` Jan Beulich
  2022-07-18 11:11 ` [PATCH 0/5] xen/wait: Improvements Jan Beulich
  5 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18  7:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Juergen Gross, Dario Faggioli

With the waitqueue logic updated to not use an absolute stack pointer
reference, the vCPU can safely be resumed anywhere.

Remove VCPU_AFFINITY_WAIT completely, getting rid of two domain crashes, and a
logical corner case where resetting the vcpu with an outstanding waitqueue
would crash the domain.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Juergen Gross <jgross@suse.com>
CC: Dario Faggioli <dfaggioli@suse.com>
---
 xen/common/domain.c     |  2 --
 xen/common/sched/core.c |  4 +---
 xen/common/wait.c       | 23 -----------------------
 xen/include/xen/sched.h |  1 -
 4 files changed, 1 insertion(+), 29 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 618410e3b257..323b92102cce 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1428,8 +1428,6 @@ int vcpu_reset(struct vcpu *v)
     v->is_initialised  = 0;
     if ( v->affinity_broken & VCPU_AFFINITY_OVERRIDE )
         vcpu_temporary_affinity(v, NR_CPUS, VCPU_AFFINITY_OVERRIDE);
-    if ( v->affinity_broken & VCPU_AFFINITY_WAIT )
-        vcpu_temporary_affinity(v, NR_CPUS, VCPU_AFFINITY_WAIT);
     clear_bit(_VPF_blocked, &v->pause_flags);
     clear_bit(_VPF_in_reset, &v->pause_flags);
 
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index f689b55783f7..cff8e59aba7c 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1610,12 +1610,10 @@ void watchdog_domain_destroy(struct domain *d)
 /*
  * Pin a vcpu temporarily to a specific CPU (or restore old pinning state if
  * cpu is NR_CPUS).
- * Temporary pinning can be done due to two reasons, which may be nested:
+ * Temporary pinning can be done for a number of reasons, which may be nested:
  * - VCPU_AFFINITY_OVERRIDE (requested by guest): is allowed to fail in case
  *   of a conflict (e.g. in case cpupool doesn't include requested CPU, or
  *   another conflicting temporary pinning is already in effect.
- * - VCPU_AFFINITY_WAIT (called by wait_event()): only used to pin vcpu to the
- *   CPU it is just running on. Can't fail if used properly.
  */
 int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t reason)
 {
diff --git a/xen/common/wait.c b/xen/common/wait.c
index 4f1daf650bc4..bd6f09662ac0 100644
--- a/xen/common/wait.c
+++ b/xen/common/wait.c
@@ -127,16 +127,6 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
 
     ASSERT(wqv->used == 0);
 
-    /* Save current VCPU affinity; force wakeup on *this* CPU only. */
-    if ( vcpu_temporary_affinity(curr, smp_processor_id(), VCPU_AFFINITY_WAIT) )
-    {
-        gdprintk(XENLOG_ERR, "Unable to set vcpu affinity\n");
-        domain_crash(curr->domain);
-
-        for ( ; ; )
-            do_softirq();
-    }
-
     /*
      * Hand-rolled setjmp().
      *
@@ -187,7 +177,6 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
 static void __finish_wait(struct waitqueue_vcpu *wqv)
 {
     wqv->used = 0;
-    vcpu_temporary_affinity(current, NR_CPUS, VCPU_AFFINITY_WAIT);
 }
 
 void check_wakeup_from_wait(void)
@@ -201,18 +190,6 @@ void check_wakeup_from_wait(void)
     if ( likely(!wqv->used) )
         return;
 
-    /* Check if we are still pinned. */
-    if ( unlikely(!(curr->affinity_broken & VCPU_AFFINITY_WAIT)) )
-    {
-        gdprintk(XENLOG_ERR, "vcpu affinity lost\n");
-        domain_crash(curr->domain);
-
-        /* Re-initiate scheduler and don't longjmp(). */
-        raise_softirq(SCHEDULE_SOFTIRQ);
-        for ( ; ; )
-            do_softirq();
-    }
-
     /*
      * Hand-rolled longjmp().
      *
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b9515eb497de..ba859a4abed3 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -223,7 +223,6 @@ struct vcpu
     /* VCPU need affinity restored */
     uint8_t          affinity_broken;
 #define VCPU_AFFINITY_OVERRIDE    0x01
-#define VCPU_AFFINITY_WAIT        0x02
 
     /* A hypercall has been preempted. */
     bool             hcall_preempted;
-- 
2.11.0




* Re: [PATCH 1/5] xen/wait: Drop vestigial remnants of TRAP_regs_partial
  2022-07-18  7:18 ` [PATCH 1/5] xen/wait: Drop vestigial remnants of TRAP_regs_partial Andrew Cooper
@ 2022-07-18  9:53   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2022-07-18  9:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 18.07.2022 09:18, Andrew Cooper wrote:
> The preservation of entry_vector was introduced in ecf9846a6a20 ("x86:
> save/restore only partial register state where possible"), which introduced
> TRAP_regs_partial, but was missed in f9eb74789af7 ("x86/entry: Remove support
> for partial cpu_user_regs frames"), which removed TRAP_regs_partial.
> 
> Fixes: f9eb74789af7 ("x86/entry: Remove support for partial cpu_user_regs frames")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 2/5] xen/wait: Extend the description of how this logic actually works
  2022-07-18  7:18 ` [PATCH 2/5] xen/wait: Extend the description of how this logic actually works Andrew Cooper
@ 2022-07-18 10:00   ` Jan Beulich
  2022-07-18 10:11     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 10:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 18.07.2022 09:18, Andrew Cooper wrote:
> @@ -199,9 +211,18 @@ void check_wakeup_from_wait(void)
>      }
>  
>      /*
> -     * Hand-rolled longjmp().  Returns to __prepare_to_wait(), and lands on a
> -     * `rep movs` instruction.  All other GPRs are restored from the stack, so
> -     * are available for use here.
> +     * Hand-rolled longjmp().
> +     *
> +     * check_wakeup_from_wait() is always called with a shallow stack,
> +     * immediately after the vCPU has been rescheduled.
> +     *
> +     * Adjust %rsp to be the correct depth for the (deeper) stack we want to
> +     * restore, then prepare %rsi, %rdi and %rcx such that when we intercept
> +     * the rep movs in __prepare_to_wait(), it copies from wqv->stack over the
> +     * active stack.

I'm struggling with the use of "intercept" here, but I guess that's just
because I'm not a native speaker.

> +     * All other GPRs are available for use; they're either restored from
> +     * wqv->stack or explicitly clobbered.

You talking of "other GPRs" - there aren't any which are explicitly
clobbered. It's only the previously named ones which are. Hence I'd like
to ask that the respective parts of the sentence be dropped. Then
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 3/5] xen/wait: Minor asm improvements
  2022-07-18  7:18 ` [PATCH 3/5] xen/wait: Minor asm improvements Andrew Cooper
@ 2022-07-18 10:04   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 10:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 18.07.2022 09:18, Andrew Cooper wrote:
> --- a/xen/common/wait.c
> +++ b/xen/common/wait.c
> @@ -151,13 +151,12 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>       * copies in from wqv->stack over the active stack.
>       */
>      asm volatile (
> -        "push %%rax; push %%rbx; push %%rdx; push %%rbp;"
> -        "push %%r8;  push %%r9;  push %%r10; push %%r11;"
> -        "push %%r12; push %%r13; push %%r14; push %%r15;"
> +        "push %%rbx; push %%rbp; push %%r12;"
> +        "push %%r13; push %%r14; push %%r15;"
>  
>          "sub %%esp,%%ecx;"
> -        "cmp %3,%%ecx;"
> -        "ja .L_skip;"
> +        "cmp %[sz], %%ecx;"
> +        "ja .L_skip;"       /* Bail if >4k */
>          "mov %%rsp,%%rsi;"
>  
>          /* check_wakeup_from_wait() longjmp()'s to this point. */
> @@ -165,12 +164,12 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>          "mov %%rsp,%%rsi;"
>  
>          ".L_skip:"
> -        "pop %%r15; pop %%r14; pop %%r13; pop %%r12;"
> -        "pop %%r11; pop %%r10; pop %%r9;  pop %%r8;"
> -        "pop %%rbp; pop %%rdx; pop %%rbx; pop %%rax"
> +        "pop %%r15; pop %%r14; pop %%r13;"
> +        "pop %%r12; pop %%rbp; pop %%rbx;"
>          : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
> -        : "i" (PAGE_SIZE), "0" (0), "1" (cpu_info), "2" (wqv->stack)
> -        : "memory" );
> +        : "0" (0), "1" (cpu_info), "2" (wqv->stack),
> +          [sz] "i" (PAGE_SIZE)
> +        : "memory", "rax", "rdx", "r8", "r9", "r10", "r11" );
>  
>      if ( unlikely(wqv->esp == 0) )
>      {
> @@ -224,11 +223,12 @@ void check_wakeup_from_wait(void)
>       * All other GPRs are available for use; they're either restored from
>       * wqv->stack or explicitly clobbered.
>       */

Ah, this is where the comment stops being misleading. I think it would
be better if you had it correct there and adjusted it here.

> -    asm volatile (
> -        "mov %1,%%"__OP"sp; jmp .L_wq_resume;"
> -        : : "S" (wqv->stack), "D" (wqv->esp),
> -          "c" ((char *)get_cpu_info() - (char *)wqv->esp)
> -        : "memory" );
> +    asm volatile ( "mov %%rdi, %%rsp;"
> +                   "jmp .L_wq_resume;"

Minor remark: No need for trailing semicolons like this one. Anyway:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 2/5] xen/wait: Extend the description of how this logic actually works
  2022-07-18 10:00   ` Jan Beulich
@ 2022-07-18 10:11     ` Andrew Cooper
  2022-07-18 10:49       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18 10:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 18/07/2022 11:00, Jan Beulich wrote:
> On 18.07.2022 09:18, Andrew Cooper wrote:
>> @@ -199,9 +211,18 @@ void check_wakeup_from_wait(void)
>>      }
>>  
>>      /*
>> -     * Hand-rolled longjmp().  Returns to __prepare_to_wait(), and lands on a
>> -     * `rep movs` instruction.  All other GPRs are restored from the stack, so
>> -     * are available for use here.
>> +     * Hand-rolled longjmp().
>> +     *
>> +     * check_wakeup_from_wait() is always called with a shallow stack,
>> +     * immediately after the vCPU has been rescheduled.
>> +     *
>> +     * Adjust %rsp to be the correct depth for the (deeper) stack we want to
>> +     * restore, then prepare %rsi, %rdi and %rcx such that when we intercept
>> +     * the rep movs in __prepare_to_wait(), it copies from wqv->stack over the
>> +     * active stack.
> I'm struggling with the use of "intercept" here, but I guess that's just
> because I'm not a native speaker.

"intercept" is the same terminology used in the middle of
__prepare_to_wait()'s block.

It's because we have a weird setup where this is (now) a noreturn
function merging into the middle of a function which already executed once.

I'm happy to change it if it's unclear, but I can't think of a better
description.

>> +     * All other GPRs are available for use; they're either restored from
>> +     * wqv->stack or explicitly clobbered.
> You talking of "other GPRs" - there aren't any which are explicitly
> clobbered. It's only the previously named ones which are. Hence I'd like
> to ask that the respective parts of the sentence be dropped. Then
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

It becomes true in the next patch.  I'll try and shuffle things.

~Andrew


* Re: [PATCH 4/5] xen/wait: Use relative stack adjustments
  2022-07-18  7:18 ` [PATCH 4/5] xen/wait: Use relative stack adjustments Andrew Cooper
@ 2022-07-18 10:29   ` Jan Beulich
  2022-07-18 10:41     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 10:29 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 18.07.2022 09:18, Andrew Cooper wrote:
> @@ -121,11 +121,11 @@ void wake_up_all(struct waitqueue_head *wq)
>  
>  static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>  {
> -    struct cpu_info *cpu_info = get_cpu_info();
>      struct vcpu *curr = current;
>      unsigned long dummy;
> +    unsigned int used;
>  
> -    ASSERT(wqv->esp == 0);
> +    ASSERT(wqv->used == 0);

Minor: Use ! like you do further down?

> @@ -154,24 +154,25 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>          "push %%rbx; push %%rbp; push %%r12;"
>          "push %%r13; push %%r14; push %%r15;"
>  
> -        "sub %%esp,%%ecx;"
> +        "sub %%esp, %%ecx;" /* ecx = delta to cpu_info */
>          "cmp %[sz], %%ecx;"
>          "ja .L_skip;"       /* Bail if >4k */

According to the inputs, %eax is still 0 when bailing here, so the
check below won't find "used > PAGE_SIZE". I further wonder why you
don't store directly into wqv->used, and go through %eax instead.

> -        "mov %%rsp,%%rsi;"
> +
> +        "mov %%ecx, %%eax;"
> +        "mov %%rsp, %%rsi;" /* Copy from the stack, into wqv->stack */
>  
>          /* check_wakeup_from_wait() longjmp()'s to this point. */
>          ".L_wq_resume: rep movsb;"
> -        "mov %%rsp,%%rsi;"
>  
>          ".L_skip:"
>          "pop %%r15; pop %%r14; pop %%r13;"
>          "pop %%r12; pop %%rbp; pop %%rbx;"
> -        : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
> -        : "0" (0), "1" (cpu_info), "2" (wqv->stack),
> +        : "=a" (used), "=D" (dummy),     "=c" (dummy),         "=&S" (dummy)

You can't validly drop & from =D and =c. If you want to stick to
going through %eax, I think that one wants & added as well and ...

> +        : "a" (0),     "D" (wqv->stack), "c" (get_cpu_info()),

... the (unused) input here dropped.

> @@ -220,14 +224,22 @@ void check_wakeup_from_wait(void)
>       * the rep movs in __prepare_to_wait(), it copies from wqv->stack over the
>       * active stack.
>       *
> +     * We are also bound by __prepare_to_wait()'s output constraints, so %eax
> +     * needs to be wqv->used.
> +     *
>       * All other GPRs are available for use; they're either restored from
>       * wqv->stack or explicitly clobbered.
>       */
> -    asm volatile ( "mov %%rdi, %%rsp;"
> +    asm volatile ( "sub %%esp, %k[var];" /* var = delta to cpu_info */
> +                   "neg %k[var];"
> +                   "add %%ecx, %k[var];" /* var = -delta + wqv->used */
> +
> +                   "sub %[var], %%rsp;"  /* Adjust %rsp down to make room */
> +                   "mov %%rsp, %%rdi;"   /* Copy from wqv->stack, into the stack */
>                     "jmp .L_wq_resume;"
> -                   :
> -                   : "S" (wqv->stack), "D" (wqv->esp),
> -                     "c" ((char *)get_cpu_info() - (char *)wqv->esp)
> +                   : "=D" (tmp), [var] "=&r" (tmp)
> +                   : "S" (wqv->stack), "c" (wqv->used), "a" (wqv->used),

If you want to stick to going through %eax, then I think you need to
make it an output here: "+a" (wqv->used), so it is clear that the
register is blocked from any other use throughout the asm(). Or you
could use "=&a" and copy %ecx into %eax inside the asm(). Both may
cause the compiler to emit dead code updating wqv->used right after
the asm(), so I think not going through %eax is the more desirable
approach (but I may well be overlooking a reason why directly
dealing with wqv->used in __prepare_to_wait() isn't an option).

Strictly speaking (in particular if [right now] there wasn't just a
branch after updating %rdi) you also again want "=&D" here.

Jan



* Re: [PATCH 4/5] xen/wait: Use relative stack adjustments
  2022-07-18 10:29   ` Jan Beulich
@ 2022-07-18 10:41     ` Andrew Cooper
  2022-07-18 10:51       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18 10:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 18/07/2022 11:29, Jan Beulich wrote:
> On 18.07.2022 09:18, Andrew Cooper wrote:
>> -        "mov %%rsp,%%rsi;"
>> +
>> +        "mov %%ecx, %%eax;"
>> +        "mov %%rsp, %%rsi;" /* Copy from the stack, into wqv->stack */
>>  
>>          /* check_wakeup_from_wait() longjmp()'s to this point. */
>>          ".L_wq_resume: rep movsb;"
>> -        "mov %%rsp,%%rsi;"
>>  
>>          ".L_skip:"
>>          "pop %%r15; pop %%r14; pop %%r13;"
>>          "pop %%r12; pop %%rbp; pop %%rbx;"
>> -        : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
>> -        : "0" (0), "1" (cpu_info), "2" (wqv->stack),
>> +        : "=a" (used), "=D" (dummy),     "=c" (dummy),         "=&S" (dummy)
> You can't validly drop & from =D and =c.

On the contrary, GCC demands it.

& is only relevant, and only permitted, when there is not an explicit
input tied to the same register.

When there is an explicit input tied to the same register, of course it
can't be reused for another input before being used as an output.

https://godbolt.org/z/4vWP4PKc5
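
Reduced further, for the archives (illustrative, matching the rule described
above):

    /* An input tied via "0": the register already holds the input on
     * entry, so no earlyclobber on the output. */
    static inline unsigned long dbl(unsigned long x)
    {
        asm ( "add %0, %0" : "=r" (x) : "0" (x) );
        return x;
    }

    /* No tie: the scratch output is written before all inputs have been
     * read, so it does want the '&'. */
    static inline unsigned long sum3(unsigned long a, unsigned long b,
                                     unsigned long c)
    {
        unsigned long out;

        asm ( "mov %1, %0;"
              "add %2, %0;"
              "add %3, %0"
              : "=&r" (out)
              : "r" (a), "r" (b), "r" (c) );

        return out;
    }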

~Andrew


* Re: [PATCH 5/5] xen/wait: Remove VCPU_AFFINITY_WAIT
  2022-07-18  7:18 ` [PATCH 5/5] xen/wait: Remove VCPU_AFFINITY_WAIT Andrew Cooper
@ 2022-07-18 10:48   ` Jan Beulich
  2022-07-18 11:16     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 10:48 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Roger Pau Monné, Wei Liu, Juergen Gross, Dario Faggioli, Xen-devel

On 18.07.2022 09:18, Andrew Cooper wrote:
> With the waitqueue logic updated to not use an absolute stack pointer
> reference, the vCPU can safely be resumed anywhere.
> 
> Remove VCPU_AFFINITY_WAIT completely, getting rid of two domain crashes,

I understand you mean two domain_crash() invocations here, but ...

> and a
> logical corner case where resetting the vcpu with an outstanding waitqueue
> would crash the domain.

... some other domain crash here?

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

I assume you've checked thoroughly that calling code hasn't
grown dependencies on execution coming back on the same CPU?

Jan



* Re: [PATCH 2/5] xen/wait: Extend the description of how this logic actually works
  2022-07-18 10:11     ` Andrew Cooper
@ 2022-07-18 10:49       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 10:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 18.07.2022 12:11, Andrew Cooper wrote:
> On 18/07/2022 11:00, Jan Beulich wrote:
>> On 18.07.2022 09:18, Andrew Cooper wrote:
>>> @@ -199,9 +211,18 @@ void check_wakeup_from_wait(void)
>>>      }
>>>  
>>>      /*
>>> -     * Hand-rolled longjmp().  Returns to __prepare_to_wait(), and lands on a
>>> -     * `rep movs` instruction.  All other GPRs are restored from the stack, so
>>> -     * are available for use here.
>>> +     * Hand-rolled longjmp().
>>> +     *
>>> +     * check_wakeup_from_wait() is always called with a shallow stack,
>>> +     * immediately after the vCPU has been rescheduled.
>>> +     *
>>> +     * Adjust %rsp to be the correct depth for the (deeper) stack we want to
>>> +     * restore, then prepare %rsi, %rdi and %rcx such that when we intercept
>>> +     * the rep movs in __prepare_to_wait(), it copies from wqv->stack over the
>>> +     * active stack.
>> I'm struggling with the use of "intercept" here, but I guess that's just
>> because I'm not a native speaker.
> 
> "intercept" is the same terminology used in the middle of
> __prepare_to_wait()'s block.
> 
> It's because we have a weird setup where this is (now) a noreturn
> function merging into the middle of a function which already executed once.
> 
> I'm happy to change it if it's unclear, but I can't think of a better
> description.

"... when we go back to ..."? (But I'm not meaning to insist on a
wording change.)

Jan



* Re: [PATCH 4/5] xen/wait: Use relative stack adjustments
  2022-07-18 10:41     ` Andrew Cooper
@ 2022-07-18 10:51       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 10:51 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 18.07.2022 12:41, Andrew Cooper wrote:
> On 18/07/2022 11:29, Jan Beulich wrote:
>> On 18.07.2022 09:18, Andrew Cooper wrote:
>>> -        "mov %%rsp,%%rsi;"
>>> +
>>> +        "mov %%ecx, %%eax;"
>>> +        "mov %%rsp, %%rsi;" /* Copy from the stack, into wqv->stack */
>>>  
>>>          /* check_wakeup_from_wait() longjmp()'s to this point. */
>>>          ".L_wq_resume: rep movsb;"
>>> -        "mov %%rsp,%%rsi;"
>>>  
>>>          ".L_skip:"
>>>          "pop %%r15; pop %%r14; pop %%r13;"
>>>          "pop %%r12; pop %%rbp; pop %%rbx;"
>>> -        : "=&S" (wqv->esp), "=&c" (dummy), "=&D" (dummy)
>>> -        : "0" (0), "1" (cpu_info), "2" (wqv->stack),
>>> +        : "=a" (used), "=D" (dummy),     "=c" (dummy),         "=&S" (dummy)
>> You can't validly drop & from =D and =c.
> 
> On the contrary, GCC demands it.
> 
> & is only relevant, and only permitted, when there is not an explicit
> input tied to the same register.
> 
> When there is an explicit input tied to the same register, of course it
> can't be reused for another input before being used as an output.

Oh, sorry - I neglected to take into account this adding of inputs.

Jan



* Re: [PATCH 0/5] xen/wait: Improvements
  2022-07-18  7:18 [PATCH 0/5] xen/wait: Improvements Andrew Cooper
                   ` (4 preceding siblings ...)
  2022-07-18  7:18 ` [PATCH 5/5] xen/wait: Remove VCPU_AFFINITY_WAIT Andrew Cooper
@ 2022-07-18 11:11 ` Jan Beulich
  2022-07-18 11:24   ` Andrew Cooper
  5 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2022-07-18 11:11 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini, Volodymyr Babchuk, Bertrand Marquis
  Cc: Xen-devel, Andrew Cooper

On 18.07.2022 09:18, Andrew Cooper wrote:
> This started out as patch 2 for a different task, and quickly identified some
> technical debt, long overdue for cleaning up.
> 
> Andrew Cooper (5):
>   xen/wait: Drop vestigial remnants of TRAP_regs_partial
>   xen/wait: Extend the description of how this logic actually works
>   xen/wait: Minor asm improvements
>   xen/wait: Use relative stack adjustments
>   xen/wait: Remove VCPU_AFFINITY_WAIT

While going through this series I came to notice that we build wait.c even
on Arm, and I couldn't convince myself that wait_event() is actually not
reachable there when e.g. there's an Arm-specific vm_event.c. I would
generally prefer if non-functioning code paths were actually compiled out.
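
A hypothetical one-liner, in the style of the patches above (assuming
xen/common/Makefile currently lists the object unconditionally as
"obj-y += wait.o"):

    -obj-y += wait.o
    +obj-$(CONFIG_X86) += wait.o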

Thoughts?

Jan



* Re: [PATCH 5/5] xen/wait: Remove VCPU_AFFINITY_WAIT
  2022-07-18 10:48   ` Jan Beulich
@ 2022-07-18 11:16     ` Andrew Cooper
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18 11:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Roger Pau Monne, Wei Liu, Juergen Gross, Dario Faggioli, Xen-devel

On 18/07/2022 11:48, Jan Beulich wrote:
> On 18.07.2022 09:18, Andrew Cooper wrote:
>> With the waitqueue logic updated to not use an absolute stack pointer
>> reference, the vCPU can safely be resumed anywhere.
>>
>> Remove VCPU_AFFINITY_WAIT completely, getting rid of two domain crashes,
> I understand you mean two domain_crash() invocations here, but ...
>
>> and a
>> logical corner case where resetting the vcpu with an oustanding waitqueue
>> would crash the domain.
> ... some other domain crash here?

One of the two above.  It's more that resetting (would have) broken the
affinity and would have triggered the domain crash.

>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> I assume you've checked thoroughly that calling code hasn't
> grown dependencies on execution coming back on the same CPU?

Urgh yes, my trivial test case didn't encounter it, but anything with an
smp_processor_id() stashed on the stack is going to end up unhappy.
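
Schematically (hypothetical caller; event_pending() and the per-cpu counter
are made up, and the wait_event() call shape is approximate):

    static void process(struct waitqueue_head *wq)
    {
        unsigned int cpu = smp_processor_id();  /* stashed on the stack */

        wait_event(*wq, event_pending());  /* may now resume on another pCPU */

        /* 'cpu' is stale if we migrated; this touches another pCPU's data. */
        per_cpu(counter, cpu)++;
    }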

I'm going to have to retract half this series.  (I'll follow up on the
0/$N with the longer term plan to remove this mess).

~Andrew


* Re: [PATCH 0/5] xen/wait: Improvements
  2022-07-18 11:11 ` [PATCH 0/5] xen/wait: Improvements Jan Beulich
@ 2022-07-18 11:24   ` Andrew Cooper
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2022-07-18 11:24 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Bertrand Marquis
  Cc: Xen-devel

On 18/07/2022 12:11, Jan Beulich wrote:
> On 18.07.2022 09:18, Andrew Cooper wrote:
>> This started out as patch 2 for a different task, and quickly identified some
>> technical debt, long overdue for cleaning up.
>>
>> Andrew Cooper (5):
>>   xen/wait: Drop vestigial remnants of TRAP_regs_partial
>>   xen/wait: Extend the description of how this logic actually works
>>   xen/wait: Minor asm improvements
>>   xen/wait: Use relative stack adjustments
>>   xen/wait: Remove VCPU_AFFINITY_WAIT
> While going through this series I came to notice that we build wait.c even
> on Arm, and I couldn't convince myself that wait_event() is actually not
> reachable there when e.g. there's an Arm-specific vm_event.c. I would
> generally prefer if non-functioning code paths were actually compiled out.
>
> Thoughts?

I've been wanting to delete wait.c fully for a long time.

It is only needed because the event/paging/access interfaces still use a
single 4k shared ring, and even then, only when you happen to fill the
4k ring, which is 11 vCPUs on x86 last I checked.

I could easily believe that limit has never been tripped on ARM.

There are plenty of perf wins to be had by building a new Xen=>agent
interface using acquire_resource, which include being able to guarantee
that we never need to pause a vCPU to wait for space in the ring to post
an event.

With the interface fixed, wait.c ceases to have any use whatsoever.

~Andrew
