xen-devel.lists.xenproject.org archive mirror
* [Xen-devel] Live-Patch application failure in core-scheduling mode
@ 2020-02-05 16:03 Sergey Dyasli
  2020-02-05 16:35 ` Jürgen Groß
  2020-02-06  9:57 ` Jürgen Groß
  0 siblings, 2 replies; 19+ messages in thread
From: Sergey Dyasli @ 2020-02-05 16:03 UTC (permalink / raw)
  To: Xen-devel
  Cc: Juergen Gross, sergey.dyasli@citrix.com >> Sergey Dyasli,
	George Dunlap, Dario Faggioli, Ross Lagerwall, Jan Beulich

Hello,

I'm currently investigating a Live-Patch application failure in core-
scheduling mode. Below is an example of what I usually get (it's
easily reproducible):

    (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
    (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
    (XEN) [  342.558343] bad cpus: 6 9

    (XEN) [  342.559293] CPU:    6
    (XEN) [  342.559562] Xen call trace:
    (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
    (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
    (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
    (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60

    (XEN) [  342.559761] CPU:    9
    (XEN) [  342.560026] Xen call trace:
    (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
    (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
    (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
    (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
    (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0

The first HT sibling is waiting for the second in the LP-application
context while the second waits for the first in the scheduler context.

Any suggestions on how to improve this situation are welcome.

--
Thanks,
Sergey

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-05 16:03 [Xen-devel] Live-Patch application failure in core-scheduling mode Sergey Dyasli
@ 2020-02-05 16:35 ` Jürgen Groß
  2020-02-06  9:57 ` Jürgen Groß
  1 sibling, 0 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-05 16:35 UTC (permalink / raw)
  To: Sergey Dyasli, Xen-devel
  Cc: Ross Lagerwall, George Dunlap, Jan Beulich, Dario Faggioli

On 05.02.20 17:03, Sergey Dyasli wrote:
> Hello,
> 
> I'm currently investigating a Live-Patch application failure in core-
> scheduling mode and this is an example of what I usually get:
> (it's easily reproducible)
> 
>      (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>      (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>      (XEN) [  342.558343] bad cpus: 6 9
> 
>      (XEN) [  342.559293] CPU:    6
>      (XEN) [  342.559562] Xen call trace:
>      (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>      (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>      (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>      (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
> 
>      (XEN) [  342.559761] CPU:    9
>      (XEN) [  342.560026] Xen call trace:
>      (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>      (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>      (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>      (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>      (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
> 
> The first HT sibling is waiting for the second in the LP-application
> context while the second waits for the first in the scheduler context.
> 
> Any suggestions on how to improve this situation are welcome.

Working on it. Should be doable.


Juergen


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-05 16:03 [Xen-devel] Live-Patch application failure in core-scheduling mode Sergey Dyasli
  2020-02-05 16:35 ` Jürgen Groß
@ 2020-02-06  9:57 ` Jürgen Groß
  2020-02-06 11:05   ` Sergey Dyasli
  1 sibling, 1 reply; 19+ messages in thread
From: Jürgen Groß @ 2020-02-06  9:57 UTC (permalink / raw)
  To: Sergey Dyasli, Xen-devel
  Cc: Ross Lagerwall, George Dunlap, Jan Beulich, Dario Faggioli

[-- Attachment #1: Type: text/plain, Size: 1789 bytes --]

On 05.02.20 17:03, Sergey Dyasli wrote:
> Hello,
> 
> I'm currently investigating a Live-Patch application failure in core-
> scheduling mode and this is an example of what I usually get:
> (it's easily reproducible)
> 
>      (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>      (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>      (XEN) [  342.558343] bad cpus: 6 9
> 
>      (XEN) [  342.559293] CPU:    6
>      (XEN) [  342.559562] Xen call trace:
>      (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>      (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>      (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>      (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
> 
>      (XEN) [  342.559761] CPU:    9
>      (XEN) [  342.560026] Xen call trace:
>      (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>      (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>      (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>      (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>      (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
> 
> The first HT sibling is waiting for the second in the LP-application
> context while the second waits for the first in the scheduler context.
> 
> Any suggestions on how to improve this situation are welcome.

Can you test the attached patch, please? It is only tested to boot, so
I did no livepatch tests with it.


Juergen

[-- Attachment #2: 0001-xen-do-live-patching-only-from-main-idle-loop.patch --]
[-- Type: text/x-patch, Size: 8724 bytes --]

From c458aa88bf17b3ac885926de5204d8a23a2ca82d Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Thu, 6 Feb 2020 08:18:06 +0100
Subject: [PATCH] xen: do live patching only from main idle loop

One of the main design goals of core scheduling is to avoid actions
which are not directly related to the domain currently running on a
given cpu or core. Live patching is one of those actions which are
allowed to take place on a cpu only when the idle scheduling unit is
active on that cpu.

Unfortunately live patching tries to force the cpus into the idle loop
just by raising the schedule softirq, which will no longer be
guaranteed to work with core scheduling active. Additionally there are
still some places in the hypervisor calling check_for_livepatch_work()
without being in the idle loop.

It is easy to force a cpu into the main idle loop by scheduling a
tasklet on it. So switch live patching to use tasklets for switching to
idle and raising scheduling events. Additionally the calls of
check_for_livepatch_work() outside the main idle loop can be dropped.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/arm/domain.c       |  9 ++++-----
 xen/arch/arm/traps.c        |  6 ------
 xen/arch/x86/domain.c       |  9 ++++-----
 xen/arch/x86/hvm/svm/svm.c  |  2 +-
 xen/arch/x86/hvm/vmx/vmcs.c |  2 +-
 xen/arch/x86/pv/domain.c    |  2 +-
 xen/arch/x86/setup.c        |  2 +-
 xen/common/livepatch.c      | 39 ++++++++++++++++++++++++++++++++++-----
 8 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index aa3df3b3ba..6627be2922 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -72,7 +72,11 @@ void idle_loop(void)
 
         /* Are we here for running vcpu context tasklets, or for idling? */
         if ( unlikely(tasklet_work_to_do(cpu)) )
+        {
             do_tasklet();
+            /* Livepatch work is always kicked off via a tasklet. */
+            check_for_livepatch_work();
+        }
         /*
          * Test softirqs twice --- first to see if should even try scrubbing
          * and then, after it is done, whether softirqs became pending
@@ -83,11 +87,6 @@ void idle_loop(void)
             do_idle();
 
         do_softirq();
-        /*
-         * We MUST be last (or before dsb, wfi). Otherwise after we get the
-         * softirq we would execute dsb,wfi (and sleep) and not patch.
-         */
-        check_for_livepatch_work();
     }
 }
 
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 6f9bec22d3..30c4c1830b 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -23,7 +23,6 @@
 #include <xen/iocap.h>
 #include <xen/irq.h>
 #include <xen/lib.h>
-#include <xen/livepatch.h>
 #include <xen/mem_access.h>
 #include <xen/mm.h>
 #include <xen/param.h>
@@ -2239,11 +2238,6 @@ static void check_for_pcpu_work(void)
     {
         local_irq_enable();
         do_softirq();
-        /*
-         * Must be the last one - as the IPI will trigger us to come here
-         * and we want to patch the hypervisor with almost no stack.
-         */
-        check_for_livepatch_work();
         local_irq_disable();
     }
 }
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f53ae5ff86..2bc7c4fb2d 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -141,7 +141,11 @@ static void idle_loop(void)
 
         /* Are we here for running vcpu context tasklets, or for idling? */
         if ( unlikely(tasklet_work_to_do(cpu)) )
+        {
             do_tasklet();
+            /* Livepatch work is always kicked off via a tasklet. */
+            check_for_livepatch_work();
+        }
         /*
          * Test softirqs twice --- first to see if should even try scrubbing
          * and then, after it is done, whether softirqs became pending
@@ -151,11 +155,6 @@ static void idle_loop(void)
                     !softirq_pending(cpu) )
             pm_idle();
         do_softirq();
-        /*
-         * We MUST be last (or before pm_idle). Otherwise after we get the
-         * softirq we would execute pm_idle (and sleep) and not patch.
-         */
-        check_for_livepatch_work();
     }
 }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b7f67f9f03..32d8d847f2 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1032,7 +1032,7 @@ static void noreturn svm_do_resume(struct vcpu *v)
 
     hvm_do_resume(v);
 
-    reset_stack_and_jump(svm_asm_do_resume);
+    reset_stack_and_jump_nolp(svm_asm_do_resume);
 }
 
 void svm_vmenter_helper(const struct cpu_user_regs *regs)
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 65445afeb0..4c23645454 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1890,7 +1890,7 @@ void vmx_do_resume(struct vcpu *v)
     if ( host_cr4 != read_cr4() )
         __vmwrite(HOST_CR4, read_cr4());
 
-    reset_stack_and_jump(vmx_asm_do_vmentry);
+    reset_stack_and_jump_nolp(vmx_asm_do_vmentry);
 }
 
 static inline unsigned long vmr(unsigned long field)
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index c3473b9a47..7dbd8dbfa2 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -62,7 +62,7 @@ custom_runtime_param("pcid", parse_pcid);
 static void noreturn continue_nonidle_domain(struct vcpu *v)
 {
     check_wakeup_from_wait();
-    reset_stack_and_jump(ret_from_intr);
+    reset_stack_and_jump_nolp(ret_from_intr);
 }
 
 static int setup_compat_l4(struct vcpu *v)
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index e50e1f86b3..3bed0a9492 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -632,7 +632,7 @@ static void __init noreturn reinit_bsp_stack(void)
     stack_base[0] = stack;
     memguard_guard_stack(stack);
 
-    reset_stack_and_jump(init_done);
+    reset_stack_and_jump_nolp(init_done);
 }
 
 /*
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 5e09dc990b..861a227dbd 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -17,6 +17,7 @@
 #include <xen/spinlock.h>
 #include <xen/string.h>
 #include <xen/symbols.h>
+#include <xen/tasklet.h>
 #include <xen/version.h>
 #include <xen/virtual_region.h>
 #include <xen/vmap.h>
@@ -69,6 +70,7 @@ static struct livepatch_work livepatch_work;
  * Having an per-cpu lessens the load.
  */
 static DEFINE_PER_CPU(bool_t, work_to_do);
+static DEFINE_PER_CPU(struct tasklet, livepatch_tasklet);
 
 static int get_name(const struct xen_livepatch_name *name, char *n)
 {
@@ -1582,17 +1584,16 @@ static int schedule_work(struct payload *data, uint32_t cmd, uint32_t timeout)
     smp_wmb();
 
     livepatch_work.do_work = 1;
-    this_cpu(work_to_do) = 1;
+    tasklet_schedule_on_cpu(&this_cpu(livepatch_tasklet), smp_processor_id());
 
     put_cpu_maps();
 
     return 0;
 }
 
-static void reschedule_fn(void *unused)
+static void tasklet_fn(void *unused)
 {
     this_cpu(work_to_do) = 1;
-    raise_softirq(SCHEDULE_SOFTIRQ);
 }
 
 static int livepatch_spin(atomic_t *counter, s_time_t timeout,
@@ -1652,7 +1653,7 @@ void check_for_livepatch_work(void)
     if ( atomic_inc_and_test(&livepatch_work.semaphore) )
     {
         struct payload *p;
-        unsigned int cpus;
+        unsigned int cpus, i;
         bool action_done = false;
 
         p = livepatch_work.data;
@@ -1682,7 +1683,9 @@ void check_for_livepatch_work(void)
         {
             dprintk(XENLOG_DEBUG, LIVEPATCH "%s: CPU%u - IPIing the other %u CPUs\n",
                     p->name, cpu, cpus);
-            smp_call_function(reschedule_fn, NULL, 0);
+            for_each_online_cpu ( i )
+                if ( i != cpu )
+                    tasklet_schedule_on_cpu(&per_cpu(livepatch_tasklet, i), i);
         }
 
         timeout = livepatch_work.timeout + NOW();
@@ -2116,8 +2119,34 @@ static void livepatch_printall(unsigned char key)
     spin_unlock(&payload_lock);
 }
 
+static int cpu_callback(
+    struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+    unsigned int cpu = (unsigned long)hcpu;
+
+    if ( action == CPU_UP_PREPARE )
+        tasklet_init(&per_cpu(livepatch_tasklet, cpu), tasklet_fn, NULL);
+
+    return NOTIFY_DONE;
+}
+
+static struct notifier_block cpu_nfb = {
+    .notifier_call = cpu_callback
+};
+
 static int __init livepatch_init(void)
 {
+    unsigned int cpu;
+
+    for_each_online_cpu ( cpu )
+    {
+        void *hcpu = (void *)(long)cpu;
+
+        cpu_callback(&cpu_nfb, CPU_UP_PREPARE, hcpu);
+    }
+
+    register_cpu_notifier(&cpu_nfb);
+
     register_keyhandler('x', livepatch_printall, "print livepatch info", 1);
 
     arch_livepatch_init();
-- 
2.16.4



* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-06  9:57 ` Jürgen Groß
@ 2020-02-06 11:05   ` Sergey Dyasli
  2020-02-06 14:02     ` Sergey Dyasli
  0 siblings, 1 reply; 19+ messages in thread
From: Sergey Dyasli @ 2020-02-06 11:05 UTC (permalink / raw)
  To: Jürgen Groß, Xen-devel
  Cc: Ross Lagerwall, sergey.dyasli@citrix.com >> Sergey Dyasli,
	George Dunlap, Jan Beulich, Dario Faggioli

On 06/02/2020 09:57, Jürgen Groß wrote:
> On 05.02.20 17:03, Sergey Dyasli wrote:
>> Hello,
>>
>> I'm currently investigating a Live-Patch application failure in core-
>> scheduling mode and this is an example of what I usually get:
>> (it's easily reproducible)
>>
>>      (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>      (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>      (XEN) [  342.558343] bad cpus: 6 9
>>
>>      (XEN) [  342.559293] CPU:    6
>>      (XEN) [  342.559562] Xen call trace:
>>      (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>      (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>      (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>      (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>
>>      (XEN) [  342.559761] CPU:    9
>>      (XEN) [  342.560026] Xen call trace:
>>      (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>      (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>      (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>      (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>      (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>
>> The first HT sibling is waiting for the second in the LP-application
>> context while the second waits for the first in the scheduler context.
>>
>> Any suggestions on how to improve this situation are welcome.
>
> Can you test the attached patch, please? It is only tested to boot, so
> I did no livepatch tests with it.

Thank you for the patch! It seems to fix the issue in my manual testing.
I'm going to submit automatic LP testing for both thread/core modes.

--
Thanks,
Sergey


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-06 11:05   ` Sergey Dyasli
@ 2020-02-06 14:02     ` Sergey Dyasli
  2020-02-06 14:29       ` Jürgen Groß
  2020-02-07  8:04       ` Jürgen Groß
  0 siblings, 2 replies; 19+ messages in thread
From: Sergey Dyasli @ 2020-02-06 14:02 UTC (permalink / raw)
  To: Jürgen Groß, Xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Andrew Cooper,
	George Dunlap, Dario Faggioli, Ross Lagerwall, Jan Beulich

On 06/02/2020 11:05, Sergey Dyasli wrote:
> On 06/02/2020 09:57, Jürgen Groß wrote:
>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>> Hello,
>>>
>>> I'm currently investigating a Live-Patch application failure in core-
>>> scheduling mode and this is an example of what I usually get:
>>> (it's easily reproducible)
>>>
>>>      (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>      (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>      (XEN) [  342.558343] bad cpus: 6 9
>>>
>>>      (XEN) [  342.559293] CPU:    6
>>>      (XEN) [  342.559562] Xen call trace:
>>>      (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>      (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>      (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>      (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>
>>>      (XEN) [  342.559761] CPU:    9
>>>      (XEN) [  342.560026] Xen call trace:
>>>      (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>      (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>      (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>      (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>      (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>
>>> The first HT sibling is waiting for the second in the LP-application
>>> context while the second waits for the first in the scheduler context.
>>>
>>> Any suggestions on how to improve this situation are welcome.
>>
>> Can you test the attached patch, please? It is only tested to boot, so
>> I did no livepatch tests with it.
>
> Thank you for the patch! It seems to fix the issue in my manual testing.
> I'm going to submit automatic LP testing for both thread/core modes.

Andrew suggested testing late ucode loading as well, and so I did.
It uses stop_machine() to rendezvous cpus, and it failed with a similar
backtrace for a problematic CPU. But in this case the system crashed,
since there is no timeout involved:

    (XEN) [  155.025168] Xen call trace:
    (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
    (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
    (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
    (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
    (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20

It looks like your patch provides a workaround for the LP case, but
other cases like stop_machine() remain broken, since the underlying
issue with the scheduler is still there.

--
Thanks,
Sergey


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-06 14:02     ` Sergey Dyasli
@ 2020-02-06 14:29       ` Jürgen Groß
  2020-02-07  8:04       ` Jürgen Groß
  1 sibling, 0 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-06 14:29 UTC (permalink / raw)
  To: Sergey Dyasli, Xen-devel
  Cc: Ross Lagerwall, Andrew Cooper, George Dunlap, Jan Beulich,
	Dario Faggioli

On 06.02.20 15:02, Sergey Dyasli wrote:
> On 06/02/2020 11:05, Sergey Dyasli wrote:
>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>> Hello,
>>>>
>>>> I'm currently investigating a Live-Patch application failure in core-
>>>> scheduling mode and this is an example of what I usually get:
>>>> (it's easily reproducible)
>>>>
>>>>       (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>       (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>       (XEN) [  342.558343] bad cpus: 6 9
>>>>
>>>>       (XEN) [  342.559293] CPU:    6
>>>>       (XEN) [  342.559562] Xen call trace:
>>>>       (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>       (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>       (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>       (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>
>>>>       (XEN) [  342.559761] CPU:    9
>>>>       (XEN) [  342.560026] Xen call trace:
>>>>       (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>       (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>       (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>       (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>       (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>
>>>> The first HT sibling is waiting for the second in the LP-application
>>>> context while the second waits for the first in the scheduler context.
>>>>
>>>> Any suggestions on how to improve this situation are welcome.
>>>
>>> Can you test the attached patch, please? It is only tested to boot, so
>>> I did no livepatch tests with it.
>>
>> Thank you for the patch! It seems to fix the issue in my manual testing.
>> I'm going to submit automatic LP testing for both thread/core modes.
> 
> Andrew suggested to test late ucode loading as well and so I did.
> It uses stop_machine() to rendezvous cpus and it failed with a similar
> backtrace for a problematic CPU. But in this case the system crashed
> since there is no timeout involved:
> 
>      (XEN) [  155.025168] Xen call trace:
>      (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>      (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>      (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>      (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>      (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
> 
> It looks like your patch provides a workaround for LP case, but other
> cases like stop_machine() remain broken since the underlying issue with
> the scheduler is still there.

Ah, that was actually a very good hint!

When analyzing your initial problems with reboot and cpu offlining, I
looked into those cases in detail and concluded that stop_machine_run()
was called inside a tasklet there (which is true).

Unfortunately there are some cases like ucode loading which don't do
that, so those cases need to be considered as well.

Writing another patch...


Juergen


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-06 14:02     ` Sergey Dyasli
  2020-02-06 14:29       ` Jürgen Groß
@ 2020-02-07  8:04       ` Jürgen Groß
  2020-02-07  8:23         ` Jan Beulich
  2020-02-11  9:07         ` Sergey Dyasli
  1 sibling, 2 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-07  8:04 UTC (permalink / raw)
  To: Sergey Dyasli, Xen-devel
  Cc: Ross Lagerwall, Andrew Cooper, George Dunlap, Jan Beulich,
	Dario Faggioli

[-- Attachment #1: Type: text/plain, Size: 3343 bytes --]

On 06.02.20 15:02, Sergey Dyasli wrote:
> On 06/02/2020 11:05, Sergey Dyasli wrote:
>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>> Hello,
>>>>
>>>> I'm currently investigating a Live-Patch application failure in core-
>>>> scheduling mode and this is an example of what I usually get:
>>>> (it's easily reproducible)
>>>>
>>>>       (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>       (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>       (XEN) [  342.558343] bad cpus: 6 9
>>>>
>>>>       (XEN) [  342.559293] CPU:    6
>>>>       (XEN) [  342.559562] Xen call trace:
>>>>       (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>       (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>       (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>       (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>
>>>>       (XEN) [  342.559761] CPU:    9
>>>>       (XEN) [  342.560026] Xen call trace:
>>>>       (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>       (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>       (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>       (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>       (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>
>>>> The first HT sibling is waiting for the second in the LP-application
>>>> context while the second waits for the first in the scheduler context.
>>>>
>>>> Any suggestions on how to improve this situation are welcome.
>>>
>>> Can you test the attached patch, please? It is only tested to boot, so
>>> I did no livepatch tests with it.
>>
>> Thank you for the patch! It seems to fix the issue in my manual testing.
>> I'm going to submit automatic LP testing for both thread/core modes.
> 
> Andrew suggested to test late ucode loading as well and so I did.
> It uses stop_machine() to rendezvous cpus and it failed with a similar
> backtrace for a problematic CPU. But in this case the system crashed
> since there is no timeout involved:
> 
>      (XEN) [  155.025168] Xen call trace:
>      (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>      (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>      (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>      (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>      (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
> 
> It looks like your patch provides a workaround for LP case, but other
> cases like stop_machine() remain broken since the underlying issue with
> the scheduler is still there.

And here is the fix for ucode loading (that was in fact the only case
where stop_machine_run() wasn't already called in a tasklet).

I have done a manual test loading new ucode with core scheduling
active.


Juergen

[-- Attachment #2: 0001-xen-make-sure-stop_machine_run-is-always-called-in-a.patch --]
[-- Type: text/x-patch, Size: 3563 bytes --]

From 4bfa45935c791c28814565cd261f4d5ff640653c Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Date: Thu, 6 Feb 2020 15:39:32 +0100
Subject: [PATCH] xen: make sure stop_machine_run() is always called in a
 tasklet

With core scheduling active it is mandatory for stop_machine_run() to
be called in a tasklet only.

Add a BUG_ON() to stop_machine_run() to verify this requirement, and
adapt the one call site which was missing it (ucode loading).

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
 xen/common/stop_machine.c |  1 +
 2 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index c0fb690f79..3efdf8269a 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -561,30 +561,18 @@ static int do_microcode_update(void *patch)
     return ret;
 }
 
-int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
+struct ucode_buf {
+    unsigned long len;
+    char buffer[];
+};
+
+static long microcode_update_helper(void *data)
 {
     int ret;
-    void *buffer;
+    struct ucode_buf *buffer = data;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
 
-    if ( len != (uint32_t)len )
-        return -E2BIG;
-
-    if ( microcode_ops == NULL )
-        return -EINVAL;
-
-    buffer = xmalloc_bytes(len);
-    if ( !buffer )
-        return -ENOMEM;
-
-    ret = copy_from_guest(buffer, buf, len);
-    if ( ret )
-    {
-        xfree(buffer);
-        return -EFAULT;
-    }
-
     /* cpu_online_map must not change during update */
     if ( !get_cpu_maps() )
     {
@@ -606,7 +594,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         return -EPERM;
     }
 
-    patch = parse_blob(buffer, len);
+    patch = parse_blob(buffer->buffer, buffer->len);
     xfree(buffer);
     if ( IS_ERR(patch) )
     {
@@ -699,6 +687,32 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     return ret;
 }
 
+int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
+{
+    int ret;
+    struct ucode_buf *buffer;
+
+    if ( len != (uint32_t)len )
+        return -E2BIG;
+
+    if ( microcode_ops == NULL )
+        return -EINVAL;
+
+    buffer = xmalloc_bytes(len + sizeof(*buffer));
+    if ( !buffer )
+        return -ENOMEM;
+
+    ret = copy_from_guest(buffer->buffer, buf, len);
+    if ( ret )
+    {
+        xfree(buffer);
+        return -EFAULT;
+    }
+    buffer->len = len;
+
+    return continue_hypercall_on_cpu(0, microcode_update_helper, buffer);
+}
+
 static int __init microcode_init(void)
 {
     /*
diff --git a/xen/common/stop_machine.c b/xen/common/stop_machine.c
index 33d9602217..fe7f7d4447 100644
--- a/xen/common/stop_machine.c
+++ b/xen/common/stop_machine.c
@@ -74,6 +74,7 @@ int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
     int ret;
 
     BUG_ON(!local_irq_is_enabled());
+    BUG_ON(!is_idle_vcpu(current));
 
     /* cpu_online_map must not change. */
     if ( !get_cpu_maps() )
-- 
2.16.4


[-- Attachment #3: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  8:04       ` Jürgen Groß
@ 2020-02-07  8:23         ` Jan Beulich
  2020-02-07  8:42           ` Jürgen Groß
  2020-02-11  9:07         ` Sergey Dyasli
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2020-02-07  8:23 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Xen-devel

On 07.02.2020 09:04, Jürgen Groß wrote:
> On 06.02.20 15:02, Sergey Dyasli wrote:
>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>> Hello,
>>>>>
>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>> scheduling mode and this is an example of what I usually get:
>>>>> (it's easily reproducible)
>>>>>
>>>>>       (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>       (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>       (XEN) [  342.558343] bad cpus: 6 9
>>>>>
>>>>>       (XEN) [  342.559293] CPU:    6
>>>>>       (XEN) [  342.559562] Xen call trace:
>>>>>       (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>       (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>       (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>       (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>
>>>>>       (XEN) [  342.559761] CPU:    9
>>>>>       (XEN) [  342.560026] Xen call trace:
>>>>>       (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>       (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>       (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>       (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>       (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>
>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>> context while the second waits for the first in the scheduler context.
>>>>>
>>>>> Any suggestions on how to improve this situation are welcome.
>>>>
>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>> I did no livepatch tests with it.
>>>
>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>> I'm going to submit automatic LP testing for both thread/core modes.
>>
>> Andrew suggested to test late ucode loading as well and so I did.
>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>> backtrace for a problematic CPU. But in this case the system crashed
>> since there is no timeout involved:
>>
>>      (XEN) [  155.025168] Xen call trace:
>>      (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>      (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>      (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>      (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>      (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>
>> It looks like your patch provides a workaround for LP case, but other
>> cases like stop_machine() remain broken since the underlying issue with
>> the scheduler is still there.
> 
> And here is the fix for ucode loading (that was in fact the only case
> where stop_machine_run() wasn't already called in a tasklet).

This is a rather odd restriction, and hence will need explaining.
Without it being entirely clear that there's no alternative to
it, I don't think I'd be fine with re-introduction of
continue_hypercall_on_cpu(0, ...) into ucode loading.

Also two remarks on the patch itself: struct ucode_buf's len
field can be unsigned int, seeing the very first check done in
microcode_update(). And instead of xmalloc_bytes() please see
whether you can make use of xmalloc_flex_struct() there.

Jan


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  8:23         ` Jan Beulich
@ 2020-02-07  8:42           ` Jürgen Groß
  2020-02-07  8:49             ` Jan Beulich
  2020-02-08 12:19             ` Andrew Cooper
  0 siblings, 2 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-07  8:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Xen-devel

On 07.02.20 09:23, Jan Beulich wrote:
> On 07.02.2020 09:04, Jürgen Groß wrote:
>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>> (it's easily reproducible)
>>>>>>
>>>>>>        (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>        (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>        (XEN) [  342.558343] bad cpus: 6 9
>>>>>>
>>>>>>        (XEN) [  342.559293] CPU:    6
>>>>>>        (XEN) [  342.559562] Xen call trace:
>>>>>>        (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>        (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>        (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>        (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>
>>>>>>        (XEN) [  342.559761] CPU:    9
>>>>>>        (XEN) [  342.560026] Xen call trace:
>>>>>>        (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>        (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>        (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>        (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>        (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>
>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>
>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>
>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>> I did no livepatch tests with it.
>>>>
>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>
>>> Andrew suggested to test late ucode loading as well and so I did.
>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>> backtrace for a problematic CPU. But in this case the system crashed
>>> since there is no timeout involved:
>>>
>>>       (XEN) [  155.025168] Xen call trace:
>>>       (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>       (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>       (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>       (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>       (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>
>>> It looks like your patch provides a workaround for LP case, but other
>>> cases like stop_machine() remain broken since the underlying issue with
>>> the scheduler is still there.
>>
>> And here is the fix for ucode loading (that was in fact the only case
>> where stop_machine_run() wasn't already called in a tasklet).
> 
> This is a rather odd restriction, and hence will need explaining.

stop_machine_run() is using a tasklet on each online cpu (excluding the
one it was called on) for doing a rendezvous of all cpus. With tasklets
always being executed on idle vcpus it is mandatory for
stop_machine_run() to be called on an idle vcpu as well when core
scheduling is active, as otherwise a deadlock will occur. This is being
accomplished by the use of continue_hypercall_on_cpu().

> Without it being entirely clear that there's no alternative to
> it, I don't think I'd be fine with re-introduction of
> continue_hypercall_on_cpu(0, ...) into ucode loading.

I don't see a viable alternative. As the hypercall needs to wait until
the loading has been performed in order to report the result, I can't
see how else this could be done.

> 
> Also two remarks on the patch itself: struct ucode_buf's len
> field can be unsigned int, seeing the very first check done in
> microcode_update(). And instead of xmalloc_bytes() please see
> whether you can make use of xmalloc_flex_struct() there.

Both are fine with me.


Juergen


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  8:42           ` Jürgen Groß
@ 2020-02-07  8:49             ` Jan Beulich
  2020-02-07  9:25               ` Jürgen Groß
  2020-02-08 12:19             ` Andrew Cooper
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2020-02-07  8:49 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Xen-devel

On 07.02.2020 09:42, Jürgen Groß wrote:
> On 07.02.20 09:23, Jan Beulich wrote:
>> On 07.02.2020 09:04, Jürgen Groß wrote:
>>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>>> (it's easily reproducible)
>>>>>>>
>>>>>>>        (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>>        (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>>        (XEN) [  342.558343] bad cpus: 6 9
>>>>>>>
>>>>>>>        (XEN) [  342.559293] CPU:    6
>>>>>>>        (XEN) [  342.559562] Xen call trace:
>>>>>>>        (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>>        (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>        (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>        (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>>
>>>>>>>        (XEN) [  342.559761] CPU:    9
>>>>>>>        (XEN) [  342.560026] Xen call trace:
>>>>>>>        (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>>        (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>>        (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>        (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>        (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>>
>>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>>
>>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>>
>>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>>> I did no livepatch tests with it.
>>>>>
>>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>>
>>>> Andrew suggested to test late ucode loading as well and so I did.
>>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>>> backtrace for a problematic CPU. But in this case the system crashed
>>>> since there is no timeout involved:
>>>>
>>>>       (XEN) [  155.025168] Xen call trace:
>>>>       (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>>       (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>>       (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>>       (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>       (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>>
>>>> It looks like your patch provides a workaround for LP case, but other
>>>> cases like stop_machine() remain broken since the underlying issue with
>>>> the scheduler is still there.
>>>
>>> And here is the fix for ucode loading (that was in fact the only case
>>> where stop_machine_run() wasn't already called in a tasklet).
>>
>> This is a rather odd restriction, and hence will need explaining.
> 
> stop_machine_run() is using a tasklet on each online cpu (excluding the
> one it was called on) for doing a rendezvous of all cpus. With tasklets
> always being executed on idle vcpus it is mandatory for
> stop_machine_run() to be called on an idle vcpu as well when core
> scheduling is active, as otherwise a deadlock will occur. This is being
> accomplished by the use of continue_hypercall_on_cpu().

Well, it's this "a deadlock" which is too vague for me. What exactly is
it that deadlocks, and where (if not obvious from the description of
that case) is the connection to core scheduling? Fundamentally such an
issue would seem to call for an adjustment to core scheduling logic,
not placing of new restrictions on other pre-existing code.

Jan


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  8:49             ` Jan Beulich
@ 2020-02-07  9:25               ` Jürgen Groß
  2020-02-07  9:51                 ` Jan Beulich
  2020-02-07 11:44                 ` Roger Pau Monné
  0 siblings, 2 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-07  9:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Xen-devel

On 07.02.20 09:49, Jan Beulich wrote:
> On 07.02.2020 09:42, Jürgen Groß wrote:
>> On 07.02.20 09:23, Jan Beulich wrote:
>>> On 07.02.2020 09:04, Jürgen Groß wrote:
>>>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>>>> (it's easily reproducible)
>>>>>>>>
>>>>>>>>         (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>>>         (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>>>         (XEN) [  342.558343] bad cpus: 6 9
>>>>>>>>
>>>>>>>>         (XEN) [  342.559293] CPU:    6
>>>>>>>>         (XEN) [  342.559562] Xen call trace:
>>>>>>>>         (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>>>         (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>         (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>         (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>>>
>>>>>>>>         (XEN) [  342.559761] CPU:    9
>>>>>>>>         (XEN) [  342.560026] Xen call trace:
>>>>>>>>         (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>>>         (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>>>         (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>         (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>         (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>>>
>>>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>>>
>>>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>>>
>>>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>>>> I did no livepatch tests with it.
>>>>>>
>>>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>>>
>>>>> Andrew suggested to test late ucode loading as well and so I did.
>>>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>>>> backtrace for a problematic CPU. But in this case the system crashed
>>>>> since there is no timeout involved:
>>>>>
>>>>>        (XEN) [  155.025168] Xen call trace:
>>>>>        (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>>>        (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>>>        (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>>>        (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>        (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>>>
>>>>> It looks like your patch provides a workaround for LP case, but other
>>>>> cases like stop_machine() remain broken since the underlying issue with
>>>>> the scheduler is still there.
>>>>
>>>> And here is the fix for ucode loading (that was in fact the only case
>>>> where stop_machine_run() wasn't already called in a tasklet).
>>>
>>> This is a rather odd restriction, and hence will need explaining.
>>
>> stop_machine_run() is using a tasklet on each online cpu (excluding the
>> one it was called on) for doing a rendezvous of all cpus. With tasklets
>> always being executed on idle vcpus it is mandatory for
>> stop_machine_run() to be called on an idle vcpu as well when core
>> scheduling is active, as otherwise a deadlock will occur. This is being
>> accomplished by the use of continue_hypercall_on_cpu().
> 
> Well, it's this "a deadlock" which is too vague for me. What exactly is
> it that deadlocks, and where (if not obvious from the description of
> that case) is the connection to core scheduling? Fundamentally such an
> issue would seem to call for an adjustment to core scheduling logic,
> not placing of new restrictions on other pre-existing code.

This is the main objective of core scheduling: on all siblings of a
core only vcpus of exactly one domain are allowed to be active.

As tasklets only run on idle vcpus, and stop_machine_run() activates
tasklets on all cpus but the one it has been called on in order to
rendezvous, stop_machine_run() must be called on an idle vcpu, too:
otherwise there is no way for the scheduler to activate the idle vcpu
for the tasklet on the sibling of the cpu stop_machine_run() has been
called on.

The needed adjustment to core scheduling would render it basically
useless as it could no longer fulfill its main objective.

A fully preemptive hypervisor would be another solution, but I guess
this is not a viable way to go.


Juergen


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  9:25               ` Jürgen Groß
@ 2020-02-07  9:51                 ` Jan Beulich
  2020-02-07  9:58                   ` Jürgen Groß
  2020-02-07 11:44                 ` Roger Pau Monné
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2020-02-07  9:51 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Xen-devel

On 07.02.2020 10:25, Jürgen Groß wrote:
> On 07.02.20 09:49, Jan Beulich wrote:
>> On 07.02.2020 09:42, Jürgen Groß wrote:
>>> On 07.02.20 09:23, Jan Beulich wrote:
>>>> On 07.02.2020 09:04, Jürgen Groß wrote:
>>>>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>>>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>>>>> (it's easily reproducible)
>>>>>>>>>
>>>>>>>>>         (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>>>>         (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>>>>         (XEN) [  342.558343] bad cpus: 6 9
>>>>>>>>>
>>>>>>>>>         (XEN) [  342.559293] CPU:    6
>>>>>>>>>         (XEN) [  342.559562] Xen call trace:
>>>>>>>>>         (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>>>>         (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>>         (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>>         (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>>>>
>>>>>>>>>         (XEN) [  342.559761] CPU:    9
>>>>>>>>>         (XEN) [  342.560026] Xen call trace:
>>>>>>>>>         (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>>>>         (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>>>>         (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>>         (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>>         (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>>>>
>>>>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>>>>
>>>>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>>>>
>>>>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>>>>> I did no livepatch tests with it.
>>>>>>>
>>>>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>>>>
>>>>>> Andrew suggested to test late ucode loading as well and so I did.
>>>>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>>>>> backtrace for a problematic CPU. But in this case the system crashed
>>>>>> since there is no timeout involved:
>>>>>>
>>>>>>        (XEN) [  155.025168] Xen call trace:
>>>>>>        (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>>>>        (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>>>>        (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>>>>        (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>        (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>>>>
>>>>>> It looks like your patch provides a workaround for LP case, but other
>>>>>> cases like stop_machine() remain broken since the underlying issue with
>>>>>> the scheduler is still there.
>>>>>
>>>>> And here is the fix for ucode loading (that was in fact the only case
>>>>> where stop_machine_run() wasn't already called in a tasklet).
>>>>
>>>> This is a rather odd restriction, and hence will need explaining.
>>>
>>> stop_machine_run() is using a tasklet on each online cpu (excluding the
>>> one it was called on) for doing a rendezvous of all cpus. With tasklets
>>> always being executed on idle vcpus it is mandatory for
>>> stop_machine_run() to be called on an idle vcpu as well when core
>>> scheduling is active, as otherwise a deadlock will occur. This is being
>>> accomplished by the use of continue_hypercall_on_cpu().
>>
>> Well, it's this "a deadlock" which is too vague for me. What exactly is
>> it that deadlocks, and where (if not obvious from the description of
>> that case) is the connection to core scheduling? Fundamentally such an
>> issue would seem to call for an adjustment to core scheduling logic,
>> not placing of new restrictions on other pre-existing code.
> 
> This is the main objective of core scheduling: on all siblings of a
> core only vcpus of exactly one domain are allowed to be active.
> 
> As tasklets are only running on idle vcpus and stop_machine_run()
> is activating tasklets on all cpus but the one it has been called on
> to rendezvous, it is mandatory for stop_machine_run() to be called on
> an idle vcpu, too, as otherwise there is no way for scheduling to
> activate the idle vcpu for the tasklet on the sibling of the cpu
> stop_machine_run() has been called on.

I can follow all this, but it needs spelling out in the description
of the patch, I think. "only running on idle vcpus" isn't very
precise though, as this ignores softirq tasklets. Which got me to
think of an alternative (faod: without having thought through at
all whether this would indeed be viable): What if stop-machine used
softirq tasklets instead of "ordinary" ones?

Jan


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  9:51                 ` Jan Beulich
@ 2020-02-07  9:58                   ` Jürgen Groß
  0 siblings, 0 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-07  9:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Xen-devel

On 07.02.20 10:51, Jan Beulich wrote:
> On 07.02.2020 10:25, Jürgen Groß wrote:
>> On 07.02.20 09:49, Jan Beulich wrote:
>>> On 07.02.2020 09:42, Jürgen Groß wrote:
>>>> On 07.02.20 09:23, Jan Beulich wrote:
>>>>> On 07.02.2020 09:04, Jürgen Groß wrote:
>>>>>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>>>>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>>>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>>>>>> (it's easily reproducible)
>>>>>>>>>>
>>>>>>>>>>          (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>>>>>          (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>>>>>          (XEN) [  342.558343] bad cpus: 6 9
>>>>>>>>>>
>>>>>>>>>>          (XEN) [  342.559293] CPU:    6
>>>>>>>>>>          (XEN) [  342.559562] Xen call trace:
>>>>>>>>>>          (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>>>>>          (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>>>          (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>>>          (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>>>>>
>>>>>>>>>>          (XEN) [  342.559761] CPU:    9
>>>>>>>>>>          (XEN) [  342.560026] Xen call trace:
>>>>>>>>>>          (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>>>>>          (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>>>>>          (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>>>          (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>>>          (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>>>>>
>>>>>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>>>>>
>>>>>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>>>>>
>>>>>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>>>>>> I did no livepatch tests with it.
>>>>>>>>
>>>>>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>>>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>>>>>
>>>>>>> Andrew suggested to test late ucode loading as well and so I did.
>>>>>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>>>>>> backtrace for a problematic CPU. But in this case the system crashed
>>>>>>> since there is no timeout involved:
>>>>>>>
>>>>>>>         (XEN) [  155.025168] Xen call trace:
>>>>>>>         (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>>>>>         (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>>>>>         (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>>>>>         (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>         (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>>>>>
>>>>>>> It looks like your patch provides a workaround for LP case, but other
>>>>>>> cases like stop_machine() remain broken since the underlying issue with
>>>>>>> the scheduler is still there.
>>>>>>
>>>>>> And here is the fix for ucode loading (that was in fact the only case
>>>>>> where stop_machine_run() wasn't already called in a tasklet).
>>>>>
>>>>> This is a rather odd restriction, and hence will need explaining.
>>>>
>>>> stop_machine_run() is using a tasklet on each online cpu (excluding the
>>>> one it was called on) for doing a rendezvous of all cpus. With tasklets
>>>> always being executed on idle vcpus it is mandatory for
>>>> stop_machine_run() to be called on an idle vcpu as well when core
>>>> scheduling is active, as otherwise a deadlock will occur. This is being
>>>> accomplished by the use of continue_hypercall_on_cpu().
>>>
>>> Well, it's this "a deadlock" which is too vague for me. What exactly is
>>> it that deadlocks, and where (if not obvious from the description of
>>> that case) is the connection to core scheduling? Fundamentally such an
>>> issue would seem to call for an adjustment to core scheduling logic,
>>> not placing of new restrictions on other pre-existing code.
>>
>> This is the main objective of core scheduling: on all siblings of a
>> core only vcpus of exactly one domain are allowed to be active.
>>
>> As tasklets are only running on idle vcpus and stop_machine_run()
>> is activating tasklets on all cpus but the one it has been called on
>> to rendezvous, it is mandatory for stop_machine_run() to be called on
>> an idle vcpu, too, as otherwise there is no way for scheduling to
>> activate the idle vcpu for the tasklet on the sibling of the cpu
>> stop_machine_run() has been called on.
> 
> I can follow all this, but it needs spelling out in the description
> of the patch, I think. "only running on idle vcpus" isn't very
> precise though, as this ignores softirq tasklets. Which got me to
> think of an alternative (faod: without having thought through at
> all whether this would indeed be viable): What if stop-machine used
> softirq tasklets instead of "ordinary" ones?

This would break its use for entering ACPI S3 state where it relies on
all guest vcpus being descheduled.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  9:25               ` Jürgen Groß
  2020-02-07  9:51                 ` Jan Beulich
@ 2020-02-07 11:44                 ` Roger Pau Monné
  2020-02-07 12:58                   ` Jürgen Groß
  1 sibling, 1 reply; 19+ messages in thread
From: Roger Pau Monné @ 2020-02-07 11:44 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Jan Beulich, Xen-devel

On Fri, Feb 07, 2020 at 10:25:05AM +0100, Jürgen Groß wrote:
> On 07.02.20 09:49, Jan Beulich wrote:
> > On 07.02.2020 09:42, Jürgen Groß wrote:
> > > On 07.02.20 09:23, Jan Beulich wrote:
> > > > On 07.02.2020 09:04, Jürgen Groß wrote:
> > > > > On 06.02.20 15:02, Sergey Dyasli wrote:
> > > > > > On 06/02/2020 11:05, Sergey Dyasli wrote:
> > > > > > > On 06/02/2020 09:57, Jürgen Groß wrote:
> > > > > > > > On 05.02.20 17:03, Sergey Dyasli wrote:
> > > > > > > > > Hello,
> > > > > > > > > 
> > > > > > > > > I'm currently investigating a Live-Patch application failure in core-
> > > > > > > > > scheduling mode and this is an example of what I usually get:
> > > > > > > > > (it's easily reproducible)
> > > > > > > > > 
> > > > > > > > >         (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
> > > > > > > > >         (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
> > > > > > > > >         (XEN) [  342.558343] bad cpus: 6 9
> > > > > > > > > 
> > > > > > > > >         (XEN) [  342.559293] CPU:    6
> > > > > > > > >         (XEN) [  342.559562] Xen call trace:
> > > > > > > > >         (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
> > > > > > > > >         (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
> > > > > > > > >         (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
> > > > > > > > >         (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
> > > > > > > > > 
> > > > > > > > >         (XEN) [  342.559761] CPU:    9
> > > > > > > > >         (XEN) [  342.560026] Xen call trace:
> > > > > > > > >         (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
> > > > > > > > >         (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
> > > > > > > > >         (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
> > > > > > > > >         (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
> > > > > > > > >         (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
> > > > > > > > > 
> > > > > > > > > The first HT sibling is waiting for the second in the LP-application
> > > > > > > > > context while the second waits for the first in the scheduler context.
> > > > > > > > > 
> > > > > > > > > Any suggestions on how to improve this situation are welcome.
> > > > > > > > 
> > > > > > > > Can you test the attached patch, please? It is only tested to boot, so
> > > > > > > > I did no livepatch tests with it.
> > > > > > > 
> > > > > > > Thank you for the patch! It seems to fix the issue in my manual testing.
> > > > > > > I'm going to submit automatic LP testing for both thread/core modes.
> > > > > > 
> > > > > > Andrew suggested to test late ucode loading as well and so I did.
> > > > > > It uses stop_machine() to rendezvous cpus and it failed with a similar
> > > > > > backtrace for a problematic CPU. But in this case the system crashed
> > > > > > since there is no timeout involved:
> > > > > > 
> > > > > >        (XEN) [  155.025168] Xen call trace:
> > > > > >        (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
> > > > > >        (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
> > > > > >        (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
> > > > > >        (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
> > > > > >        (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
> > > > > > 
> > > > > > It looks like your patch provides a workaround for LP case, but other
> > > > > > cases like stop_machine() remain broken since the underlying issue with
> > > > > > the scheduler is still there.
> > > > > 
> > > > > And here is the fix for ucode loading (that was in fact the only case
> > > > > where stop_machine_run() wasn't already called in a tasklet).
> > > > 
> > > > This is a rather odd restriction, and hence will need explaining.
> > > 
> > > stop_machine_run() is using a tasklet on each online cpu (excluding the
> > > one it was called on) for doing a rendezvous of all cpus. With tasklets
> > > always being executed on idle vcpus it is mandatory for
> > > stop_machine_run() to be called on an idle vcpu as well when core
> > > scheduling is active, as otherwise a deadlock will occur. This is being
> > > accomplished by the use of continue_hypercall_on_cpu().
> > 
> > Well, it's this "a deadlock" which is too vague for me. What exactly is
> > it that deadlocks, and where (if not obvious from the description of
> > that case) is the connection to core scheduling? Fundamentally such an
> > issue would seem to call for an adjustment to core scheduling logic,
> > not placing of new restrictions on other pre-existing code.
> 
> This is the main objective of core scheduling: on all siblings of a
> core only vcpus of exactly one domain are allowed to be active.
> 
> As tasklets are only running on idle vcpus and stop_machine_run()
> is activating tasklets on all cpus but the one it has been called on
> to rendezvous, it is mandatory for stop_machine_run() to be called on
> an idle vcpu, too, as otherwise there is no way for scheduling to
> activate the idle vcpu for the tasklet on the sibling of the cpu
> stop_machine_run() has been called on.

Could there also be issues with other rendezvous not running in
tasklet context?

One triggered by on_selected_cpus for example?

Thanks, Roger.



* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07 11:44                 ` Roger Pau Monné
@ 2020-02-07 12:58                   ` Jürgen Groß
  0 siblings, 0 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-07 12:58 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Sergey Dyasli, Andrew Cooper, George Dunlap, Dario Faggioli,
	Ross Lagerwall, Jan Beulich, Xen-devel

On 07.02.20 12:44, Roger Pau Monné wrote:
> On Fri, Feb 07, 2020 at 10:25:05AM +0100, Jürgen Groß wrote:
>> On 07.02.20 09:49, Jan Beulich wrote:
>>> On 07.02.2020 09:42, Jürgen Groß wrote:
>>>> On 07.02.20 09:23, Jan Beulich wrote:
>>>>> On 07.02.2020 09:04, Jürgen Groß wrote:
>>>>>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>>>>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>>>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>>>>>> (it's easily reproducible)
>>>>>>>>>>
>>>>>>>>>>          (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>>>>>          (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>>>>>          (XEN) [  342.558343] bad cpus: 6 9
>>>>>>>>>>
>>>>>>>>>>          (XEN) [  342.559293] CPU:    6
>>>>>>>>>>          (XEN) [  342.559562] Xen call trace:
>>>>>>>>>>          (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>>>>>          (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>>>          (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>>>          (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>>>>>
>>>>>>>>>>          (XEN) [  342.559761] CPU:    9
>>>>>>>>>>          (XEN) [  342.560026] Xen call trace:
>>>>>>>>>>          (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>>>>>          (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>>>>>          (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>>>>>          (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>>>>          (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>>>>>
>>>>>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>>>>>
>>>>>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>>>>>
>>>>>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>>>>>> I did no livepatch tests with it.
>>>>>>>>
>>>>>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>>>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>>>>>
>>>>>>> Andrew suggested to test late ucode loading as well and so I did.
>>>>>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>>>>>> backtrace for a problematic CPU. But in this case the system crashed
>>>>>>> since there is no timeout involved:
>>>>>>>
>>>>>>>         (XEN) [  155.025168] Xen call trace:
>>>>>>>         (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>>>>>         (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>>>>>         (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>>>>>         (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>>         (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>>>>>
>>>>>>> It looks like your patch provides a workaround for LP case, but other
>>>>>>> cases like stop_machine() remain broken since the underlying issue with
>>>>>>> the scheduler is still there.
>>>>>>
>>>>>> And here is the fix for ucode loading (that was in fact the only case
>>>>>> where stop_machine_run() wasn't already called in a tasklet).
>>>>>
>>>>> This is a rather odd restriction, and hence will need explaining.
>>>>
>>>> stop_machine_run() is using a tasklet on each online cpu (excluding the
>>>> one it was called on) for doing a rendezvous of all cpus. With tasklets
>>>> always being executed on idle vcpus it is mandatory for
>>>> stop_machine_run() to be called on an idle vcpu as well when core
>>>> scheduling is active, as otherwise a deadlock will occur. This is being
>>>> accomplished by the use of continue_hypercall_on_cpu().
>>>
>>> Well, it's this "a deadlock" which is too vague for me. What exactly is
>>> it that deadlocks, and where (if not obvious from the description of
>>> that case) is the connection to core scheduling? Fundamentally such an
>>> issue would seem to call for an adjustment to core scheduling logic,
>>> not placing of new restrictions on other pre-existing code.
>>
>> This is the main objective of core scheduling: on all siblings of a
>> core only vcpus of exactly one domain are allowed to be active.
>>
>> As tasklets are only running on idle vcpus and stop_machine_run()
>> is activating tasklets on all cpus but the one it has been called on
>> to rendezvous, it is mandatory for stop_machine_run() to be called on
>> an idle vcpu, too, as otherwise there is no way for scheduling to
>> activate the idle vcpu for the tasklet on the sibling of the cpu
>> stop_machine_run() has been called on.
> 
> Could there also be issues with other rendezvous not running in
> tasklet context?
> 
> One triggered by on_selected_cpus for example?

I don't think so. The tasklets are special here as they will only be
started when the whole core is idle. on_selected_cpus() uses a softirq,
which can be serviced while any vcpu is active.


Juergen



* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  8:42           ` Jürgen Groß
  2020-02-07  8:49             ` Jan Beulich
@ 2020-02-08 12:19             ` Andrew Cooper
  2020-02-08 12:29               ` Jürgen Groß
  1 sibling, 1 reply; 19+ messages in thread
From: Andrew Cooper @ 2020-02-08 12:19 UTC (permalink / raw)
  To: Jürgen Groß, Jan Beulich
  Cc: Sergey Dyasli, Dario Faggioli, Ross Lagerwall, George Dunlap, Xen-devel

On 07/02/2020 08:42, Jürgen Groß wrote:
>
>> Without it being entirely clear that there's no alternative to
>> it, I don't think I'd be fine with re-introduction of
>> continue_hypercall_on_cpu(0, ...) into ucode loading.
>
> I don't see a viable alternative. 

Sorry to interject in the middle of a conversation, but I'd like to make
something very clear.

continue_hypercall_on_cpu(0, ...) is, and has always been fundamentally
broken for microcode updates.  It causes real crashes on real systems,
and that is why the mechanism was replaced.

Changing back to it is going to break customer systems.

It is necessary to have the full system quiesced in practice, because
for a given piece of microcode, we don't know whether it's a cross-thread
load (the common case which most people are familiar with), whether it
is a cross-core load (yes - it turns out this does exist - it
highlighted a bug in testing), and whether there is an uncore/pcode/etc
update included as well.

I haven't come across a cross-socket load yet (and it likely doesn't
exist, given some aspects of loading which I think would be prohibitive
in this case), but there really are systems where loading microcode on
core 0 will flush and reload the MSROMs on all other cores in the
package, under the feet of whatever else is going on there.  This
includes making things like MSR_SPEC_CTRL disappear transiently.

We don't necessarily need to use stop_machine(), or use it exactly like
we currently do, but we do need a global rendezvous.

~Andrew



* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-08 12:19             ` Andrew Cooper
@ 2020-02-08 12:29               ` Jürgen Groß
  0 siblings, 0 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-08 12:29 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Sergey Dyasli, Dario Faggioli, Ross Lagerwall, George Dunlap, Xen-devel

On 08.02.20 13:19, Andrew Cooper wrote:
> On 07/02/2020 08:42, Jürgen Groß wrote:
>>
>>> Without it being entirely clear that there's no alternative to
>>> it, I don't think I'd be fine with re-introduction of
>>> continue_hypercall_on_cpu(0, ...) into ucode loading.
>>
>> I don't see a viable alternative.
> 
> Sorry to interject in the middle of a conversation, but I'd like to make
> something very clear.
> 
> continue_hypercall_on_cpu(0, ...) is, and has always been fundamentally
> broken for microcode updates.  It causes real crashes on real systems,
> and that is why the mechanism was replaced.
> 
> Changing back to it is going to break customer systems.
> 
> It is necessary to have the full system quiesced in practice, because
> for a given piece of microcode, we don't know whether it's a cross-thread
> load (the common case which most people are familiar with), whether it
> is a cross-core load (yes - it turns out this does exist - it
> highlighted a bug in testing), and whether there is an uncore/pcode/etc
> update included as well.
> 
> I haven't come across a cross-socket load yet (and it likely doesn't
> exist, given some aspects of loading which I think would be prohibitive
> in this case), but there really are systems where loading microcode on
> core 0 will flush and reload the MSROMs on all other cores in the
> package, under the feet of whatever else is going on there.  This
> includes making things like MSR_SPEC_CTRL disappear transiently.
> 
> We don't necessarily need to use stop_machine(), or use it exactly like
> we currently do, but we do need a global rendezvous.

Did you look at the patch?

It uses continue_hypercall_on_cpu(0, ...) to call stop_machine_run()
from a tasklet. So there is a global rendezvous. It's just the start
of the rendezvous which is moved into a tasklet. That's all.


Juergen



* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-07  8:04       ` Jürgen Groß
  2020-02-07  8:23         ` Jan Beulich
@ 2020-02-11  9:07         ` Sergey Dyasli
  2020-02-11  9:23           ` Jürgen Groß
  1 sibling, 1 reply; 19+ messages in thread
From: Sergey Dyasli @ 2020-02-11  9:07 UTC (permalink / raw)
  To: Jürgen Groß, Xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Andrew Cooper,
	George Dunlap, Dario Faggioli, Ross Lagerwall, Jan Beulich

On 07/02/2020 08:04, Jürgen Groß wrote:
> On 06.02.20 15:02, Sergey Dyasli wrote:
>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>> Hello,
>>>>>
>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>> scheduling mode and this is an example of what I usually get:
>>>>> (it's easily reproducible)
>>>>>
>>>>>       (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>       (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>       (XEN) [  342.558343] bad cpus: 6 9
>>>>>
>>>>>       (XEN) [  342.559293] CPU:    6
>>>>>       (XEN) [  342.559562] Xen call trace:
>>>>>       (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>       (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>       (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>       (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>
>>>>>       (XEN) [  342.559761] CPU:    9
>>>>>       (XEN) [  342.560026] Xen call trace:
>>>>>       (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>       (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>       (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>       (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>       (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>
>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>> context while the second waits for the first in the scheduler context.
>>>>>
>>>>> Any suggestions on how to improve this situation are welcome.
>>>>
>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>> I did no livepatch tests with it.
>>>
>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>> I'm going to submit automatic LP testing for both thread/core modes.
>>
>> Andrew suggested to test late ucode loading as well and so I did.
>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>> backtrace for a problematic CPU. But in this case the system crashed
>> since there is no timeout involved:
>>
>>      (XEN) [  155.025168] Xen call trace:
>>      (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>      (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>      (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>      (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>      (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>
>> It looks like your patch provides a workaround for LP case, but other
>> cases like stop_machine() remain broken since the underlying issue with
>> the scheduler is still there.
>
> And here is the fix for ucode loading (that was in fact the only case
> where stop_machine_run() wasn't already called in a tasklet).
>
> I have done a manual test loading new ucode with core scheduling
> active.

The patch seems to fix the issue, thanks!
Do you plan to post the 2 patches to the ML now for proper review?

--
Sergey



* Re: [Xen-devel] Live-Patch application failure in core-scheduling mode
  2020-02-11  9:07         ` Sergey Dyasli
@ 2020-02-11  9:23           ` Jürgen Groß
  0 siblings, 0 replies; 19+ messages in thread
From: Jürgen Groß @ 2020-02-11  9:23 UTC (permalink / raw)
  To: Sergey Dyasli, Xen-devel
  Cc: Ross Lagerwall, Andrew Cooper, George Dunlap, Jan Beulich,
	Dario Faggioli

On 11.02.20 10:07, Sergey Dyasli wrote:
> On 07/02/2020 08:04, Jürgen Groß wrote:
>> On 06.02.20 15:02, Sergey Dyasli wrote:
>>> On 06/02/2020 11:05, Sergey Dyasli wrote:
>>>> On 06/02/2020 09:57, Jürgen Groß wrote:
>>>>> On 05.02.20 17:03, Sergey Dyasli wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I'm currently investigating a Live-Patch application failure in core-
>>>>>> scheduling mode and this is an example of what I usually get:
>>>>>> (it's easily reproducible)
>>>>>>
>>>>>>        (XEN) [  342.528305] livepatch: lp: CPU8 - IPIing the other 15 CPUs
>>>>>>        (XEN) [  342.558340] livepatch: lp: Timed out on semaphore in CPU quiesce phase 13/15
>>>>>>        (XEN) [  342.558343] bad cpus: 6 9
>>>>>>
>>>>>>        (XEN) [  342.559293] CPU:    6
>>>>>>        (XEN) [  342.559562] Xen call trace:
>>>>>>        (XEN) [  342.559565]    [<ffff82d08023f304>] R common/schedule.c#sched_wait_rendezvous_in+0xa4/0x270
>>>>>>        (XEN) [  342.559568]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>        (XEN) [  342.559571]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>        (XEN) [  342.559574]    [<ffff82d080278ec5>] F arch/x86/domain.c#guest_idle_loop+0x35/0x60
>>>>>>
>>>>>>        (XEN) [  342.559761] CPU:    9
>>>>>>        (XEN) [  342.560026] Xen call trace:
>>>>>>        (XEN) [  342.560029]    [<ffff82d080241661>] R _spin_lock_irq+0x11/0x40
>>>>>>        (XEN) [  342.560032]    [<ffff82d08023f323>] F common/schedule.c#sched_wait_rendezvous_in+0xc3/0x270
>>>>>>        (XEN) [  342.560036]    [<ffff82d08023f8aa>] F common/schedule.c#schedule+0x17a/0x260
>>>>>>        (XEN) [  342.560039]    [<ffff82d080240d5a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>>>>        (XEN) [  342.560042]    [<ffff82d080279db5>] F arch/x86/domain.c#idle_loop+0x55/0xb0
>>>>>>
>>>>>> The first HT sibling is waiting for the second in the LP-application
>>>>>> context while the second waits for the first in the scheduler context.
>>>>>>
>>>>>> Any suggestions on how to improve this situation are welcome.
>>>>>
>>>>> Can you test the attached patch, please? It is only tested to boot, so
>>>>> I did no livepatch tests with it.
>>>>
>>>> Thank you for the patch! It seems to fix the issue in my manual testing.
>>>> I'm going to submit automatic LP testing for both thread/core modes.
>>>
>>> Andrew suggested to test late ucode loading as well and so I did.
>>> It uses stop_machine() to rendezvous cpus and it failed with a similar
>>> backtrace for a problematic CPU. But in this case the system crashed
>>> since there is no timeout involved:
>>>
>>>       (XEN) [  155.025168] Xen call trace:
>>>       (XEN) [  155.040095]    [<ffff82d0802417f2>] R _spin_unlock_irq+0x22/0x30
>>>       (XEN) [  155.069549]    [<ffff82d08023f3c2>] S common/schedule.c#sched_wait_rendezvous_in+0xa2/0x270
>>>       (XEN) [  155.109696]    [<ffff82d08023f728>] F common/schedule.c#sched_slave+0x198/0x260
>>>       (XEN) [  155.145521]    [<ffff82d080240e1a>] F common/softirq.c#__do_softirq+0x5a/0x90
>>>       (XEN) [  155.180223]    [<ffff82d0803716f6>] F x86_64/entry.S#process_softirqs+0x6/0x20
>>>
>>> It looks like your patch provides a workaround for LP case, but other
>>> cases like stop_machine() remain broken since the underlying issue with
>>> the scheduler is still there.
>>
>> And here is the fix for ucode loading (that was in fact the only case
>> where stop_machine_run() wasn't already called in a tasklet).
>>
>> I have done a manual test loading new ucode with core scheduling
>> active.
> 
> The patch seems to fix the issue, thanks!
> Do you plan to post the 2 patches to the ML now for proper review?

Yes.


Juergen




Thread overview: 19+ messages
2020-02-05 16:03 [Xen-devel] Live-Patch application failure in core-scheduling mode Sergey Dyasli
2020-02-05 16:35 ` Jürgen Groß
2020-02-06  9:57 ` Jürgen Groß
2020-02-06 11:05   ` Sergey Dyasli
2020-02-06 14:02     ` Sergey Dyasli
2020-02-06 14:29       ` Jürgen Groß
2020-02-07  8:04       ` Jürgen Groß
2020-02-07  8:23         ` Jan Beulich
2020-02-07  8:42           ` Jürgen Groß
2020-02-07  8:49             ` Jan Beulich
2020-02-07  9:25               ` Jürgen Groß
2020-02-07  9:51                 ` Jan Beulich
2020-02-07  9:58                   ` Jürgen Groß
2020-02-07 11:44                 ` Roger Pau Monné
2020-02-07 12:58                   ` Jürgen Groß
2020-02-08 12:19             ` Andrew Cooper
2020-02-08 12:29               ` Jürgen Groß
2020-02-11  9:07         ` Sergey Dyasli
2020-02-11  9:23           ` Jürgen Groß
