* [PATCH v2 0/3] xen: Fix some bugs in scheduling
@ 2020-05-11 11:28 Juergen Gross
  2020-05-11 11:28 ` [PATCH v2 1/3] xen/sched: allow rcu work to happen when syncing cpus in core scheduling Juergen Gross
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Juergen Gross @ 2020-05-11 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu,
	Andrew Cooper, Ian Jackson, George Dunlap, Dario Faggioli,
	Jan Beulich, Roger Pau Monné

Some problems I found while hunting a bug with cpu on-/offlining in
core scheduling mode.

Juergen Gross (3):
  xen/sched: allow rcu work to happen when syncing cpus in core
    scheduling
  xen/sched: don't call sync_vcpu_execstate() in
    sched_unit_migrate_finish()
  xen/sched: fix latent races accessing vcpu->dirty_cpu

 xen/arch/x86/domain.c     | 16 +++++++++++-----
 xen/common/domain.c       |  2 +-
 xen/common/keyhandler.c   |  2 +-
 xen/common/sched/core.c   | 18 ++++++++++--------
 xen/include/xen/sched.h   |  2 +-
 xen/include/xen/softirq.h |  2 +-
 6 files changed, 25 insertions(+), 17 deletions(-)

-- 
2.26.1




* [PATCH v2 1/3] xen/sched: allow rcu work to happen when syncing cpus in core scheduling
  2020-05-11 11:28 [PATCH v2 0/3] xen: Fix some bugs in scheduling Juergen Gross
@ 2020-05-11 11:28 ` Juergen Gross
  2020-05-11 11:28 ` [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish() Juergen Gross
  2020-05-11 11:28 ` [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu Juergen Gross
  2 siblings, 0 replies; 12+ messages in thread
From: Juergen Gross @ 2020-05-11 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Sergey Dyasli, Stefano Stabellini, Julien Grall,
	Wei Liu, Andrew Cooper, Ian Jackson, George Dunlap,
	Dario Faggioli, Jan Beulich

With RCU barriers moved from tasklets to normal RCU processing, cpu
offlining in core scheduling might deadlock, as RCU processing and
core scheduling both require cpu synchronization at the same time.

Fix that by bailing out of the core scheduling synchronization in case
of pending RCU work. Additionally the RCU softirq is now required to
have a higher priority than the scheduling softirqs in order to do RCU
processing before entering the scheduler again, as bailing out of the
core scheduling synchronization requires raising another softirq,
SCHED_SLAVE, which would otherwise bypass RCU processing again.
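
To illustrate why the enum reordering below establishes that priority,
here is a simplified sketch (not the actual Xen dispatch loop;
softirq_handlers is a stand-in for the real per-softirq handlers) of
how pending softirqs are serviced lowest-numbered bit first:

    /*
     * Sketch: softirqs are processed in ascending enum order, so an
     * entry placed earlier in the enum always runs first.  With
     * RCU_SOFTIRQ moved ahead of SCHED_SLAVE_SOFTIRQ, pending RCU
     * work is handled before the scheduler is re-entered.
     */
    static void do_softirq_sketch(unsigned long pending)
    {
        while ( pending )
        {
            unsigned int nr = find_first_set_bit(pending); /* lowest bit */

            pending &= ~(1UL << nr);
            softirq_handlers[nr](); /* hypothetical handler table */
        }
    }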

Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Dario Faggioli <dfaggioli@suse.com>
---
V2:
- add BUILD_BUG_ON() and comment (Dario Faggioli)
---
 xen/common/sched/core.c   | 13 ++++++++++---
 xen/include/xen/softirq.h |  2 +-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index d94b95285f..5df66cbf9b 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2457,13 +2457,20 @@ static struct sched_unit *sched_wait_rendezvous_in(struct sched_unit *prev,
             v = unit2vcpu_cpu(prev, cpu);
         }
         /*
-         * Coming from idle might need to do tasklet work.
+         * Check for any work to be done which might need cpu synchronization.
+         * This is either pending RCU work, or tasklet work when coming from
+         * idle. It is mandatory that RCU softirqs are of higher priority
+         * than scheduling ones as otherwise a deadlock might occur.
          * In order to avoid deadlocks we can't do that here, but have to
-         * continue the idle loop.
+         * schedule the previous vcpu again, which will lead to the desired
+         * processing to be done.
          * Undo the rendezvous_in_cnt decrement and schedule another call of
          * sched_slave().
          */
-        if ( is_idle_unit(prev) && sched_tasklet_check_cpu(cpu) )
+        BUILD_BUG_ON(RCU_SOFTIRQ > SCHED_SLAVE_SOFTIRQ ||
+                     RCU_SOFTIRQ > SCHEDULE_SOFTIRQ);
+        if ( rcu_pending(cpu) ||
+             (is_idle_unit(prev) && sched_tasklet_check_cpu(cpu)) )
         {
             struct vcpu *vprev = current;
 
diff --git a/xen/include/xen/softirq.h b/xen/include/xen/softirq.h
index b4724f5c8b..1f6c4783da 100644
--- a/xen/include/xen/softirq.h
+++ b/xen/include/xen/softirq.h
@@ -4,10 +4,10 @@
 /* Low-latency softirqs come first in the following list. */
 enum {
     TIMER_SOFTIRQ = 0,
+    RCU_SOFTIRQ,
     SCHED_SLAVE_SOFTIRQ,
     SCHEDULE_SOFTIRQ,
     NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ,
-    RCU_SOFTIRQ,
     TASKLET_SOFTIRQ,
     NR_COMMON_SOFTIRQS
 };
-- 
2.26.1




* [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish()
  2020-05-11 11:28 [PATCH v2 0/3] xen: Fix some bugs in scheduling Juergen Gross
  2020-05-11 11:28 ` [PATCH v2 1/3] xen/sched: allow rcu work to happen when syncing cpus in core scheduling Juergen Gross
@ 2020-05-11 11:28 ` Juergen Gross
  2020-05-14 13:57   ` Jan Beulich
  2020-05-11 11:28 ` [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu Juergen Gross
  2 siblings, 1 reply; 12+ messages in thread
From: Juergen Gross @ 2020-05-11 11:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, George Dunlap, Dario Faggioli

With support of core scheduling, sched_unit_migrate_finish() gained a
call of sync_vcpu_execstate(), as that was believed to be needed for
any vcpu migration.

In case a vcpu is migrated away from a physical cpu for a short period
of time only, this might not be true, so drop the call and let the lazy
state syncing do its job.
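
For reference, the lazy syncing relied upon here works roughly like
this (a simplified sketch of the existing logic, not new code):

    /*
     * Sketch: a vcpu's register state stays on the pcpu it last ran
     * on (recorded in v->dirty_cpu) and is only written back when
     * that pcpu wants to run another vcpu, or when another cpu
     * explicitly requests it via a flush IPI.
     */
    void lazy_sync_sketch(struct vcpu *v)
    {
        if ( v->dirty_cpu == smp_processor_id() )
            sync_local_execstate();             /* sync own state */
        else if ( vcpu_cpu_dirty(v) )
            /* Remote cpu syncs from the flush IPI handler. */
            flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
    }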

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- new patch
---
 xen/common/sched/core.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 5df66cbf9b..cb49a8bc02 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1078,12 +1078,7 @@ static void sched_unit_migrate_finish(struct sched_unit *unit)
     sched_spin_unlock_double(old_lock, new_lock, flags);
 
     if ( old_cpu != new_cpu )
-    {
-        /* Vcpus are moved to other pcpus, commit their states to memory. */
-        for_each_sched_unit_vcpu ( unit, v )
-            sync_vcpu_execstate(v);
         sched_move_irqs(unit);
-    }
 
     /* Wake on new CPU. */
     for_each_sched_unit_vcpu ( unit, v )
-- 
2.26.1




* [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu
  2020-05-11 11:28 [PATCH v2 0/3] xen: Fix some bugs in scheduling Juergen Gross
  2020-05-11 11:28 ` [PATCH v2 1/3] xen/sched: allow rcu work to happen when syncing cpus in core scheduling Juergen Gross
  2020-05-11 11:28 ` [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish() Juergen Gross
@ 2020-05-11 11:28 ` Juergen Gross
  2020-05-14  7:10   ` Jan Beulich
  2 siblings, 1 reply; 12+ messages in thread
From: Juergen Gross @ 2020-05-11 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu,
	Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich,
	Roger Pau Monné

The dirty_cpu field of struct vcpu denotes which cpu still holds data
of a vcpu. All accesses to this field should be atomic in case the
vcpu could just be running, as it is accessed without any lock held in
most cases. Especially sync_local_execstate() and context_switch()
running concurrently for the same vcpu risk failing.

There are some instances where accesses are not done atomically, and
even worse where multiple accesses are done when a single one would be
mandated.

Correct that in order to avoid potential problems.

Add some assertions to verify dirty_cpu is handled properly.
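
To illustrate the multiple-access problem, compare the two patterns
(simplified from the sync_vcpu_execstate() hunk below):

    /*
     * Racy: two separate reads of v->dirty_cpu may observe different
     * values if the vcpu is rescheduled in between, so the IPI can
     * end up targeting the wrong cpu.
     */
    if ( vcpu_cpu_dirty(v) )                                /* read #1 */
        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE); /* #2 */

    /* Fixed: take a single atomic snapshot and use it throughout. */
    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);

    if ( is_vcpu_dirty_cpu(dirty_cpu) )
        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);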

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- convert all accesses to v->dirty_cpu to atomic ones (Jan Beulich)
- drop cast (Julien Grall)
---
 xen/arch/x86/domain.c   | 16 +++++++++++-----
 xen/common/domain.c     |  2 +-
 xen/common/keyhandler.c |  2 +-
 xen/include/xen/sched.h |  2 +-
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a4428190d5..2e5717b983 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -183,7 +183,7 @@ void startup_cpu_idle_loop(void)
 
     ASSERT(is_idle_vcpu(v));
     cpumask_set_cpu(v->processor, v->domain->dirty_cpumask);
-    v->dirty_cpu = v->processor;
+    write_atomic(&v->dirty_cpu, v->processor);
 
     reset_stack_and_jump(idle_loop);
 }
@@ -1769,6 +1769,7 @@ static void __context_switch(void)
 
     if ( !is_idle_domain(pd) )
     {
+        ASSERT(read_atomic(&p->dirty_cpu) == cpu);
         memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
         vcpu_save_fpu(p);
         pd->arch.ctxt_switch->from(p);
@@ -1832,7 +1833,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 {
     unsigned int cpu = smp_processor_id();
     const struct domain *prevd = prev->domain, *nextd = next->domain;
-    unsigned int dirty_cpu = next->dirty_cpu;
+    unsigned int dirty_cpu = read_atomic(&next->dirty_cpu);
 
     ASSERT(prev != next);
     ASSERT(local_irq_is_enabled());
@@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
     {
         /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
         flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
+        ASSERT(!vcpu_cpu_dirty(next));
     }
 
     _update_runstate_area(prev);
@@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
 
 void sync_vcpu_execstate(struct vcpu *v)
 {
-    if ( v->dirty_cpu == smp_processor_id() )
+    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
+
+    if ( dirty_cpu == smp_processor_id() )
         sync_local_execstate();
-    else if ( vcpu_cpu_dirty(v) )
+    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
     {
         /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
-        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
+        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
     }
+    ASSERT(!is_vcpu_dirty_cpu(dirty_cpu) ||
+           read_atomic(&v->dirty_cpu) != dirty_cpu);
 }
 
 static int relinquish_memory(
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 7cc9526139..70ff05eefc 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -158,7 +158,7 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
 
     v->domain = d;
     v->vcpu_id = vcpu_id;
-    v->dirty_cpu = VCPU_CPU_CLEAN;
+    write_atomic(&v->dirty_cpu, VCPU_CPU_CLEAN);
 
     spin_lock_init(&v->virq_lock);
 
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 87bd145374..68364e987d 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -316,7 +316,7 @@ static void dump_domains(unsigned char key)
                        vcpu_info(v, evtchn_upcall_pending),
                        !vcpu_event_delivery_is_enabled(v));
                 if ( vcpu_cpu_dirty(v) )
-                    printk("dirty_cpu=%u", v->dirty_cpu);
+                    printk("dirty_cpu=%u", read_atomic(&v->dirty_cpu));
                 printk("\n");
                 printk("    pause_count=%d pause_flags=%lx\n",
                        atomic_read(&v->pause_count), v->pause_flags);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 6101761d25..ac53519d7f 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)
 
 static inline bool vcpu_cpu_dirty(const struct vcpu *v)
 {
-    return is_vcpu_dirty_cpu(v->dirty_cpu);
+    return is_vcpu_dirty_cpu(read_atomic(&v->dirty_cpu));
 }
 
 void vcpu_block(void);
-- 
2.26.1




* Re: [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu
  2020-05-11 11:28 ` [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu Juergen Gross
@ 2020-05-14  7:10   ` Jan Beulich
  2020-05-14  8:50     ` Jürgen Groß
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2020-05-14  7:10 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 11.05.2020 13:28, Juergen Gross wrote:
> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>  
>  void sync_vcpu_execstate(struct vcpu *v)
>  {
> -    if ( v->dirty_cpu == smp_processor_id() )
> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
> +
> +    if ( dirty_cpu == smp_processor_id() )
>          sync_local_execstate();
> -    else if ( vcpu_cpu_dirty(v) )
> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>      {
>          /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>      }
> +    ASSERT(!is_vcpu_dirty_cpu(dirty_cpu) ||
> +           read_atomic(&v->dirty_cpu) != dirty_cpu);

Repeating my v1.1 comments:

"However, having stared at it for a while now - is this race
 free? I can see this being fine in the (initial) case of
 dirty_cpu == smp_processor_id(), but if this is for a foreign
 CPU, can't the vCPU have gone back to that same CPU again in
 the meantime?"

and later

"There is a time window from late in flush_mask() to the assertion
 you add. All sorts of things can happen during this window on
 other CPUs. IOW what guarantees the vCPU not getting unpaused or
 its affinity getting changed yet another time?"

You did reply that by what is now patch 2 this race can be
eliminated, but I have to admit I don't see why this would be.
Hence at the very least I'd expect justification in either the
description or a code comment as to why there's no race left
(and also no race to be expected to be re-introduced by code
changes elsewhere - very unlikely races are, by their nature,
unlikely to be hit during code development and the associated
testing, hence I'd like there to be sufficiently close to a
guarantee here).

My reservations here may in part be due to not following the
reasoning for patch 2, which therefore I'll have to rely on the
scheduler maintainers to judge on.

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -158,7 +158,7 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
>  
>      v->domain = d;
>      v->vcpu_id = vcpu_id;
> -    v->dirty_cpu = VCPU_CPU_CLEAN;
> +    write_atomic(&v->dirty_cpu, VCPU_CPU_CLEAN);

This is not strictly necessary (the vCPU won't be acted upon by
any other entity in the system just yet), and with this I'd like
to suggest to drop this change again.

Jan



* Re: [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu
  2020-05-14  7:10   ` Jan Beulich
@ 2020-05-14  8:50     ` Jürgen Groß
  2020-05-14  9:24       ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: Jürgen Groß @ 2020-05-14  8:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 14.05.20 09:10, Jan Beulich wrote:
> On 11.05.2020 13:28, Juergen Gross wrote:
>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>   
>>   void sync_vcpu_execstate(struct vcpu *v)
>>   {
>> -    if ( v->dirty_cpu == smp_processor_id() )
>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>> +
>> +    if ( dirty_cpu == smp_processor_id() )
>>           sync_local_execstate();
>> -    else if ( vcpu_cpu_dirty(v) )
>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>       }
>> +    ASSERT(!is_vcpu_dirty_cpu(dirty_cpu) ||
>> +           read_atomic(&v->dirty_cpu) != dirty_cpu);
> 
> Repeating my v1.1 comments:
> 
> "However, having stared at it for a while now - is this race
>   free? I can see this being fine in the (initial) case of
>   dirty_cpu == smp_processor_id(), but if this is for a foreign
>   CPU, can't the vCPU have gone back to that same CPU again in
>   the meantime?"
> 
> and later
> 
> "There is a time window from late in flush_mask() to the assertion
>   you add. All sorts of things can happen during this window on
>   other CPUs. IOW what guarantees the vCPU not getting unpaused or
>   its affinity getting changed yet another time?"
> 
> You did reply that by what is now patch 2 this race can be
> eliminated, but I have to admit I don't see why this would be.
> Hence at the very least I'd expect justification in either the
> description or a code comment as to why there's no race left
> (and also no race to be expected to be re-introduced by code
> changes elsewhere - very unlikely races are, by their nature,
> unlikely to be hit during code development and the associated
> testing, hence I'd like there to be sufficiently close to a
> guarantee here).
> 
> My reservations here may in part be due to not following the
> reasoning for patch 2, which therefore I'll have to rely on the
> scheduler maintainers to judge on.

sync_vcpu_execstate() isn't called for a running or runnable vcpu any
longer. I can add an ASSERT() and a comment explaining it if you like
that better.
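
Something like (sketch only, untested):

    void sync_vcpu_execstate(struct vcpu *v)
    {
        /*
         * All callers need to make sure the vcpu can no longer be
         * scheduled, as otherwise the state sync could race with the
         * vcpu being activated again.
         */
        ASSERT(!v->is_running && !vcpu_runnable(v));
        ...
    }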

> 
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -158,7 +158,7 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
>>   
>>       v->domain = d;
>>       v->vcpu_id = vcpu_id;
>> -    v->dirty_cpu = VCPU_CPU_CLEAN;
>> +    write_atomic(&v->dirty_cpu, VCPU_CPU_CLEAN);
> 
> This is not strictly necessary (the vCPU won't be acted upon by
> any other entity in the system just yet), and with this I'd like
> to suggest to drop this change again.

Fine with me.


Juergen



* Re: [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu
  2020-05-14  8:50     ` Jürgen Groß
@ 2020-05-14  9:24       ` Jan Beulich
  2020-05-14  9:29         ` Jürgen Groß
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2020-05-14  9:24 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 14.05.2020 10:50, Jürgen Groß wrote:
> On 14.05.20 09:10, Jan Beulich wrote:
>> On 11.05.2020 13:28, Juergen Gross wrote:
>>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>>     void sync_vcpu_execstate(struct vcpu *v)
>>>   {
>>> -    if ( v->dirty_cpu == smp_processor_id() )
>>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>>> +
>>> +    if ( dirty_cpu == smp_processor_id() )
>>>           sync_local_execstate();
>>> -    else if ( vcpu_cpu_dirty(v) )
>>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>>       {
>>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>>       }
>>> +    ASSERT(!is_vcpu_dirty_cpu(dirty_cpu) ||
>>> +           read_atomic(&v->dirty_cpu) != dirty_cpu);
>>
>> Repeating my v1.1 comments:
>>
>> "However, having stared at it for a while now - is this race
>>   free? I can see this being fine in the (initial) case of
>>   dirty_cpu == smp_processor_id(), but if this is for a foreign
>>   CPU, can't the vCPU have gone back to that same CPU again in
>>   the meantime?"
>>
>> and later
>>
>> "There is a time window from late in flush_mask() to the assertion
>>   you add. All sorts of things can happen during this window on
>>   other CPUs. IOW what guarantees the vCPU not getting unpaused or
>>   its affinity getting changed yet another time?"
>>
>> You did reply that by what is now patch 2 this race can be
>> eliminated, but I have to admit I don't see why this would be.
>> Hence at the very least I'd expect justification in either the
>> description or a code comment as to why there's no race left
>> (and also no race to be expected to be re-introduced by code
>> changes elsewhere - very unlikely races are, by their nature,
>> unlikely to be hit during code development and the associated
>> testing, hence I'd like there to be sufficiently close to a
>> guarantee here).
>>
>> My reservations here may in part be due to not following the
>> reasoning for patch 2, which therefore I'll have to rely on the
>> scheduler maintainers to judge on.
> 
> sync_vcpu_execstate() isn't called for a running or runnable vcpu any
> longer. I can add an ASSERT() and a comment explaining it if you like
> that better.

This would help (hopefully people adding new uses of the function
would run into this assertion/comment), but for example the uses
in mapcache_current_vcpu() or do_tasklet_work() look to be pretty
hard to prove they can't happen for a runnable vCPU.

Jan



* Re: [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu
  2020-05-14  9:24       ` Jan Beulich
@ 2020-05-14  9:29         ` Jürgen Groß
  2020-05-14 13:58           ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: Jürgen Groß @ 2020-05-14  9:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 14.05.20 11:24, Jan Beulich wrote:
> On 14.05.2020 10:50, Jürgen Groß wrote:
>> On 14.05.20 09:10, Jan Beulich wrote:
>>> On 11.05.2020 13:28, Juergen Gross wrote:
>>>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>>>      void sync_vcpu_execstate(struct vcpu *v)
>>>>    {
>>>> -    if ( v->dirty_cpu == smp_processor_id() )
>>>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>>>> +
>>>> +    if ( dirty_cpu == smp_processor_id() )
>>>>            sync_local_execstate();
>>>> -    else if ( vcpu_cpu_dirty(v) )
>>>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>>>        {
>>>>            /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>>>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>>>        }
>>>> +    ASSERT(!is_vcpu_dirty_cpu(dirty_cpu) ||
>>>> +           read_atomic(&v->dirty_cpu) != dirty_cpu);
>>>
>>> Repeating my v1.1 comments:
>>>
>>> "However, having stared at it for a while now - is this race
>>>    free? I can see this being fine in the (initial) case of
>>>    dirty_cpu == smp_processor_id(), but if this is for a foreign
>>>    CPU, can't the vCPU have gone back to that same CPU again in
>>>    the meantime?"
>>>
>>> and later
>>>
>>> "There is a time window from late in flush_mask() to the assertion
>>>    you add. All sorts of things can happen during this window on
>>>    other CPUs. IOW what guarantees the vCPU not getting unpaused or
>>>    its affinity getting changed yet another time?"
>>>
>>> You did reply that by what is now patch 2 this race can be
>>> eliminated, but I have to admit I don't see why this would be.
>>> Hence at the very least I'd expect justification in either the
>>> description or a code comment as to why there's no race left
>>> (and also no race to be expected to be re-introduced by code
>>> changes elsewhere - very unlikely races are, by their nature,
>>> unlikely to be hit during code development and the associated
>>> testing, hence I'd like there to be sufficiently close to a
>>> guarantee here).
>>>
>>> My reservations here may in part be due to not following the
>>> reasoning for patch 2, which therefore I'll have to rely on the
>>> scheduler maintainers to judge on.
>>
>> sync_vcpu_execstate() isn't called for a running or runnable vcpu any
>> longer. I can add an ASSERT() and a comment explaining it if you like
>> that better.
> 
> This would help (hopefully people adding new uses of the function
> would run into this assertion/comment), but for example the uses
> in mapcache_current_vcpu() or do_tasklet_work() look to be pretty
> hard to prove they can't happen for a runnable vCPU.

Those call sync_local_execstate(), not sync_vcpu_execstate().


Juergen



* Re: [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish()
  2020-05-11 11:28 ` [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish() Juergen Gross
@ 2020-05-14 13:57   ` Jan Beulich
  2020-05-14 14:21     ` Jürgen Groß
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2020-05-14 13:57 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, George Dunlap, Dario Faggioli

On 11.05.2020 13:28, Juergen Gross wrote:
> With support of core scheduling, sched_unit_migrate_finish() gained a
> call of sync_vcpu_execstate(), as that was believed to be needed for
> any vcpu migration.
> 
> In case a vcpu is migrated away from a physical cpu for a short period
> of time only, this might not be true, so drop the call and let the lazy
> state syncing do its job.

Replying here instead of on the patch 3 thread (and I'm sorry
for mixing up function names there): By saying "for a short
period of time only", do you imply without ever getting scheduled
on the new (temporary) CPU? If so, I think I understand this
change now, but then this could do with saying so here. If not,
I'm afraid I'm still lost.

Jan



* Re: [PATCH v2 3/3] xen/sched: fix latent races accessing vcpu->dirty_cpu
  2020-05-14  9:29         ` Jürgen Groß
@ 2020-05-14 13:58           ` Jan Beulich
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2020-05-14 13:58 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 14.05.2020 11:29, Jürgen Groß wrote:
> On 14.05.20 11:24, Jan Beulich wrote:
>> On 14.05.2020 10:50, Jürgen Groß wrote:
>>> On 14.05.20 09:10, Jan Beulich wrote:
>>>> On 11.05.2020 13:28, Juergen Gross wrote:
>>>>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>>>>      void sync_vcpu_execstate(struct vcpu *v)
>>>>>    {
>>>>> -    if ( v->dirty_cpu == smp_processor_id() )
>>>>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>>>>> +
>>>>> +    if ( dirty_cpu == smp_processor_id() )
>>>>>            sync_local_execstate();
>>>>> -    else if ( vcpu_cpu_dirty(v) )
>>>>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>>>>        {
>>>>>            /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>>>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>>>>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>>>>        }
>>>>> +    ASSERT(!is_vcpu_dirty_cpu(dirty_cpu) ||
>>>>> +           read_atomic(&v->dirty_cpu) != dirty_cpu);
>>>>
>>>> Repeating my v1.1 comments:
>>>>
>>>> "However, having stared at it for a while now - is this race
>>>>    free? I can see this being fine in the (initial) case of
>>>>    dirty_cpu == smp_processor_id(), but if this is for a foreign
>>>>    CPU, can't the vCPU have gone back to that same CPU again in
>>>>    the meantime?"
>>>>
>>>> and later
>>>>
>>>> "There is a time window from late in flush_mask() to the assertion
>>>>    you add. All sorts of things can happen during this window on
>>>>    other CPUs. IOW what guarantees the vCPU not getting unpaused or
>>>>    its affinity getting changed yet another time?"
>>>>
>>>> You did reply that by what is now patch 2 this race can be
>>>> eliminated, but I have to admit I don't see why this would be.
>>>> Hence at the very least I'd expect justification in either the
>>>> description or a code comment as to why there's no race left
>>>> (and also no race to be expected to be re-introduced by code
>>>> changes elsewhere - very unlikely races are, by their nature,
>>>> unlikely to be hit during code development and the associated
>>>> testing, hence I'd like there to be sufficiently close to a
>>>> guarantee here).
>>>>
>>>> My reservations here may in part be due to not following the
>>>> reasoning for patch 2, which therefore I'll have to rely on the
>>>> scheduler maintainers to judge on.
>>>
>>> sync_vcpu_execstate() isn't called for a running or runnable vcpu any
>>> longer. I can add an ASSERT() and a comment explaining it if you like
>>> that better.
>>
>> This would help (hopefully people adding new uses of the function
>> would run into this assertion/comment), but for example the uses
>> in mapcache_current_vcpu() or do_tasklet_work() look to be pretty
>> hard to prove they can't happen for a runnable vCPU.
> 
> Those call sync_local_execstate(), not sync_vcpu_execstate().

Ouch, as said on the other sub-thread - I'm sorry for mixing those up.

Jan



* Re: [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish()
  2020-05-14 13:57   ` Jan Beulich
@ 2020-05-14 14:21     ` Jürgen Groß
  2020-05-14 14:59       ` Jan Beulich
  0 siblings, 1 reply; 12+ messages in thread
From: Jürgen Groß @ 2020-05-14 14:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, George Dunlap, Dario Faggioli

On 14.05.20 15:57, Jan Beulich wrote:
> On 11.05.2020 13:28, Juergen Gross wrote:
>> With support of core scheduling, sched_unit_migrate_finish() gained a
>> call of sync_vcpu_execstate(), as that was believed to be needed for
>> any vcpu migration.
>>
>> In case a vcpu is migrated away from a physical cpu for a short period
>> of time only, this might not be true, so drop the call and let the lazy
>> state syncing do its job.
> 
> Replying here instead of on the patch 3 thread (and I'm sorry
> for mixing up function names there): By saying "for a short
> period of time only", do you imply without ever getting scheduled
> on the new (temporary) CPU? If so, I think I understand this
> change now, but then this could do with saying here. If not, I'm
> afraid I'm still lost.

I'll change the commit message to:

... for a short period of time only without ever being scheduled on the
selected new cpu ...


Juergen



* Re: [PATCH v2 2/3] xen/sched: don't call sync_vcpu_execstate() in sched_unit_migrate_finish()
  2020-05-14 14:21     ` Jürgen Groß
@ 2020-05-14 14:59       ` Jan Beulich
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2020-05-14 14:59 UTC (permalink / raw)
  To: Jürgen Groß; +Cc: xen-devel, George Dunlap, Dario Faggioli

On 14.05.2020 16:21, Jürgen Groß wrote:
> On 14.05.20 15:57, Jan Beulich wrote:
>> On 11.05.2020 13:28, Juergen Gross wrote:
>>> With support of core scheduling, sched_unit_migrate_finish() gained a
>>> call of sync_vcpu_execstate(), as that was believed to be needed for
>>> any vcpu migration.
>>>
>>> In case a vcpu is migrated away from a physical cpu for a short period
>>> of time only, this might not be true, so drop the call and let the lazy
>>> state syncing do its job.
>>
>> Replying here instead of on the patch 3 thread (and I'm sorry
>> for mixing up function names there): By saying "for a short
>> period of time only", do you imply without ever getting scheduled
>> on the new (temporary) CPU? If so, I think I understand this
>> change now, but then this could do with saying here. If not, I'm
>> afraid I'm still lost.
> 
> I'll change the commit message to:
> 
> ... for a short period of time only without ever being scheduled on the
> selected new cpu ...

And then
Reviewed-by: Jan Beulich <jbeulich@suse.com>
for both this one and patch 3 (ideally with the one unnecessary hunk
dropped).

Jan


