All of lore.kernel.org
* [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
@ 2020-04-30 15:28 Juergen Gross
  2020-05-02 11:36 ` Julien Grall
  2020-05-04 11:51 ` Jan Beulich
  0 siblings, 2 replies; 10+ messages in thread
From: Juergen Gross @ 2020-04-30 15:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu,
	Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich,
	Roger Pau Monné

The dirty_cpu field of struct vcpu denotes which cpu still holds data
of a vcpu. All accesses to this field should be atomic in case the
vcpu could just be running, as it is accessed without any lock held
in most cases.

There are some instances where accesses are not atomically done, and
even worse where multiple accesses are done when a single one would
be mandated.

Correct that in order to avoid potential problems.

Add some assertions to verify dirty_cpu is handled properly.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/domain.c   | 14 ++++++++++----
 xen/include/xen/sched.h |  2 +-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a4428190d5..f0579a56d1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1769,6 +1769,7 @@ static void __context_switch(void)
 
     if ( !is_idle_domain(pd) )
     {
+        ASSERT(read_atomic(&p->dirty_cpu) == cpu);
         memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
         vcpu_save_fpu(p);
         pd->arch.ctxt_switch->from(p);
@@ -1832,7 +1833,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 {
     unsigned int cpu = smp_processor_id();
     const struct domain *prevd = prev->domain, *nextd = next->domain;
-    unsigned int dirty_cpu = next->dirty_cpu;
+    unsigned int dirty_cpu = read_atomic(&next->dirty_cpu);
 
     ASSERT(prev != next);
     ASSERT(local_irq_is_enabled());
@@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
     {
         /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
         flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
+        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);
     }
 
     _update_runstate_area(prev);
@@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
 
 void sync_vcpu_execstate(struct vcpu *v)
 {
-    if ( v->dirty_cpu == smp_processor_id() )
+    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
+
+    if ( dirty_cpu == smp_processor_id() )
         sync_local_execstate();
-    else if ( vcpu_cpu_dirty(v) )
+    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
     {
         /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
-        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
+        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
     }
+    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
+           dirty_cpu == VCPU_CPU_CLEAN);
 }
 
 static int relinquish_memory(
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 195e7ee583..81628e2d98 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)
 
 static inline bool vcpu_cpu_dirty(const struct vcpu *v)
 {
-    return is_vcpu_dirty_cpu(v->dirty_cpu);
+    return is_vcpu_dirty_cpu(read_atomic((unsigned int *)&v->dirty_cpu));
 }
 
 void vcpu_block(void);
-- 
2.16.4
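As background for the discussion below: read_atomic() guarantees a
single, untorn load of the field, while a plain access leaves the
compiler free to re-read or tear the value. A rough model in plain C
(an illustration only, not the actual Xen implementation;
model_read_atomic is a made-up name):

```c
/* Rough, hypothetical model of what read_atomic() provides for an
 * unsigned int field: the volatile-qualified access forces exactly one
 * load, so the compiler cannot re-read the field or split the access. */
static inline unsigned int model_read_atomic(const unsigned int *p)
{
    return *(const volatile unsigned int *)p;
}
```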



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-04-30 15:28 [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu Juergen Gross
@ 2020-05-02 11:36 ` Julien Grall
  2020-05-02 11:45   ` Jürgen Groß
  2020-05-04 11:51 ` Jan Beulich
  1 sibling, 1 reply; 10+ messages in thread
From: Julien Grall @ 2020-05-02 11:36 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné

Hi Juergen,

They are less theoretical than we would want. :/ There was a great 
series of articles on LWN [1] about compiler optimizations last year.

There are at least a few TOCTOU races in the code where you could end 
up with cpumask_of(VCPU_CPU_CLEAN).
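Such a TOCTOU window can be sketched as follows (a hypothetical
miniature, not Xen code; the field and both helpers are simulated):

```c
#define VCPU_CPU_CLEAN (~0u)

/* Shared field, updated concurrently in the real code. */
static unsigned int dirty_cpu_sim;

/* BUGGY pattern: the field is loaded once for the check and again for
 * the use, so it can become VCPU_CPU_CLEAN in between and a caller
 * would compute cpumask_of(VCPU_CPU_CLEAN). */
static unsigned int racy_dirty_cpu(void)
{
    if ( dirty_cpu_sim != VCPU_CPU_CLEAN )   /* check: load #1 */
        return dirty_cpu_sim;                /* use:   load #2 */
    return 0;                                /* clean: nothing to flush */
}

/* FIXED pattern: take one snapshot, then check and use the snapshot. */
static unsigned int safe_dirty_cpu(void)
{
    unsigned int snap = *(volatile unsigned int *)&dirty_cpu_sim;

    if ( snap != VCPU_CPU_CLEAN )
        return snap;
    return 0;
}
```

The patch converts exactly this shape: snapshot once via read_atomic(),
then test and use the snapshot.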

On 30/04/2020 16:28, Juergen Gross wrote:
> The dirty_cpu field of struct vcpu denotes which cpu still holds data
> of a vcpu. All accesses to this field should be atomic in case the
> vcpu could just be running, as it is accessed without any lock held
> in most cases.

Looking at the patch below, I am not sure why the issue would only 
happen while the vCPU is running. For instance, in the case of 
context_switch(), 'next' should not be running.

Instead, I think, the race would happen if the vCPU state is 
synchronized (__sync_local_execstate()) at the same time as 
context_switch(). Am I correct?

> 
> There are some instances where accesses are not atomically done, and
> even worse where multiple accesses are done when a single one would
> be mandated.
> 
> Correct that in order to avoid potential problems.
> 
> Add some assertions to verify dirty_cpu is handled properly.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   xen/arch/x86/domain.c   | 14 ++++++++++----
>   xen/include/xen/sched.h |  2 +-
>   2 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index a4428190d5..f0579a56d1 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1769,6 +1769,7 @@ static void __context_switch(void)
>   
>       if ( !is_idle_domain(pd) )
>       {
> +        ASSERT(read_atomic(&p->dirty_cpu) == cpu);
>           memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
>           vcpu_save_fpu(p);
>           pd->arch.ctxt_switch->from(p);
> @@ -1832,7 +1833,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>   {
>       unsigned int cpu = smp_processor_id();
>       const struct domain *prevd = prev->domain, *nextd = next->domain;
> -    unsigned int dirty_cpu = next->dirty_cpu;
> +    unsigned int dirty_cpu = read_atomic(&next->dirty_cpu);
>   
>       ASSERT(prev != next);
>       ASSERT(local_irq_is_enabled());
> @@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>       {
>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>           flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
> +        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);
>       }
>   
>       _update_runstate_area(prev);
> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>   
>   void sync_vcpu_execstate(struct vcpu *v)
>   {
> -    if ( v->dirty_cpu == smp_processor_id() )
> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
> +
> +    if ( dirty_cpu == smp_processor_id() )
>           sync_local_execstate();
> -    else if ( vcpu_cpu_dirty(v) )
> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>       {
>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>       }
> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
> +           dirty_cpu == VCPU_CPU_CLEAN);
>   }
>   
>   static int relinquish_memory(
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 195e7ee583..81628e2d98 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)
>   
>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>   {
> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int *)&v->dirty_cpu));

Is the cast necessary?

>   }
>   
>   void vcpu_block(void);
> 

Cheers,

[1] https://lwn.net/Articles/793253/

-- 
Julien Grall



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-02 11:36 ` Julien Grall
@ 2020-05-02 11:45   ` Jürgen Groß
  2020-05-02 12:34     ` Julien Grall
  0 siblings, 1 reply; 10+ messages in thread
From: Jürgen Groß @ 2020-05-02 11:45 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné

On 02.05.20 13:36, Julien Grall wrote:
> Hi Juergen,
> 
> They are less theoretical than we would want. :/ There was a great 
> series of articles on LWN [1] about compiler optimizations last year.
> 
> There are at least a few TOCTOU races in the code where you could end 
> up with cpumask_of(VCPU_CPU_CLEAN).

It is theoretical in the sense that I don't know of any failure
resulting from this.

> 
> On 30/04/2020 16:28, Juergen Gross wrote:
>> The dirty_cpu field of struct vcpu denotes which cpu still holds data
>> of a vcpu. All accesses to this field should be atomic in case the
>> vcpu could just be running, as it is accessed without any lock held
>> in most cases.
> 
> Looking at the patch below, I am not sure why the issue is happening 
> only when running. For instance, in the case of context_switch(), 'next' 
> should not be running.
> 
> Instead, I think, the race would happen if the vCPU state is 
> synchronized (__sync_local_execstate()) at the same time as 
> context_switch(). Am I correct?

Yes.

> 
>>
>> There are some instances where accesses are not atomically done, and
>> even worse where multiple accesses are done when a single one would
>> be mandated.
>>
>> Correct that in order to avoid potential problems.
>>
>> Add some assertions to verify dirty_cpu is handled properly.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   xen/arch/x86/domain.c   | 14 ++++++++++----
>>   xen/include/xen/sched.h |  2 +-
>>   2 files changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index a4428190d5..f0579a56d1 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1769,6 +1769,7 @@ static void __context_switch(void)
>>       if ( !is_idle_domain(pd) )
>>       {
>> +        ASSERT(read_atomic(&p->dirty_cpu) == cpu);
>>           memcpy(&p->arch.user_regs, stack_regs, 
>> CTXT_SWITCH_STACK_BYTES);
>>           vcpu_save_fpu(p);
>>           pd->arch.ctxt_switch->from(p);
>> @@ -1832,7 +1833,7 @@ void context_switch(struct vcpu *prev, struct 
>> vcpu *next)
>>   {
>>       unsigned int cpu = smp_processor_id();
>>       const struct domain *prevd = prev->domain, *nextd = next->domain;
>> -    unsigned int dirty_cpu = next->dirty_cpu;
>> +    unsigned int dirty_cpu = read_atomic(&next->dirty_cpu);
>>       ASSERT(prev != next);
>>       ASSERT(local_irq_is_enabled());
>> @@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct 
>> vcpu *next)
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI 
>> handler. */
>>           flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>> +        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);
>>       }
>>       _update_runstate_area(prev);
>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>   void sync_vcpu_execstate(struct vcpu *v)
>>   {
>> -    if ( v->dirty_cpu == smp_processor_id() )
>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>> +
>> +    if ( dirty_cpu == smp_processor_id() )
>>           sync_local_execstate();
>> -    else if ( vcpu_cpu_dirty(v) )
>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI 
>> handler. */
>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>       }
>> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
>> +           dirty_cpu == VCPU_CPU_CLEAN);
>>   }
>>   static int relinquish_memory(
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index 195e7ee583..81628e2d98 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int 
>> cpu)
>>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>>   {
>> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
>> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int 
>> *)&v->dirty_cpu));
> 
> Is the cast necessary?

Yes, that was the problem I encountered when building for Arm.

read_atomic() on Arm declares a local variable of the same type as its 
argument for storing the result. Due to the const qualifier of v this 
results in an assignment to a read-only variable.


Juergen



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-02 11:45   ` Jürgen Groß
@ 2020-05-02 12:34     ` Julien Grall
  2020-05-02 16:09       ` Julien Grall
  0 siblings, 1 reply; 10+ messages in thread
From: Julien Grall @ 2020-05-02 12:34 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné

Hi,

On 02/05/2020 12:45, Jürgen Groß wrote:
> On 02.05.20 13:36, Julien Grall wrote:
>> Hi Juergen,
>>
>> They are less theoretical than we would want. :/ There was a great 
>> series of articles on LWN [1] about compiler optimizations last year.
>>
>> There are at least a few TOCTOU races in the code where you could 
>> end up with cpumask_of(VCPU_CPU_CLEAN).
> 
> It is theoretical in the sense that I don't know of any failure
> resulting from this.

How about "latent" instead of "theoretical"?

> 
>>
>> On 30/04/2020 16:28, Juergen Gross wrote:
>>> The dirty_cpu field of struct vcpu denotes which cpu still holds data
>>> of a vcpu. All accesses to this field should be atomic in case the
>>> vcpu could just be running, as it is accessed without any lock held
>>> in most cases.
>>
>> Looking at the patch below, I am not sure why the issue is happening 
>> only when running. For instance, in the case of context_switch(), 
>> 'next' should not be running.
>>
>> Instead, I think, the race would happen if the vCPU state is 
>> synchronized (__sync_local_execstate()) at the same time as 
>> context_switch(). Am I correct?
> 
> Yes.

Would you mind adding this context in the commit message?

[...]

>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>> index 195e7ee583..81628e2d98 100644
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int 
>>> cpu)
>>>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>>>   {
>>> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
>>> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int 
>>> *)&v->dirty_cpu));
>>
>> Is the cast necessary?
> 
> Yes, that was the problem when building for ARM I encountered.
> 
> read_atomic() on ARM has a local variable of the same type as the
> read_atomic() parameter for storing the result. Due to the const
> attribute of v this results in assignment to a read-only variable.

Doh, we should be able to read from a const value without a cast. So I 
would argue this is a bug in the read_atomic() implementation on Arm. I 
will try to come up with a patch.

Cheers,

-- 
Julien Grall



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-02 12:34     ` Julien Grall
@ 2020-05-02 16:09       ` Julien Grall
  2020-05-04  7:33         ` Jürgen Groß
  0 siblings, 1 reply; 10+ messages in thread
From: Julien Grall @ 2020-05-02 16:09 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné



On 02/05/2020 13:34, Julien Grall wrote:
>>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>>> index 195e7ee583..81628e2d98 100644
>>>> --- a/xen/include/xen/sched.h
>>>> +++ b/xen/include/xen/sched.h
>>>> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned 
>>>> int cpu)
>>>>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>>>>   {
>>>> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
>>>> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int 
>>>> *)&v->dirty_cpu));
>>>
>>> Is the cast necessary?
>>
>> Yes, that was the problem I encountered when building for Arm.
>>
>> read_atomic() on ARM has a local variable of the same type as the
>> read_atomic() parameter for storing the result. Due to the const
>> attribute of v this results in assignment to a read-only variable.
> 
> Doh, we should be able to read from a const value without a cast. So I would 
> argue this is a bug in the read_atomic() implementation on Arm. I will 
> try to come up with a patch.

I have just sent a series [1] to address the issue reported here and a 
few more.

Cheers,

[1] <20200502160700.19573-1-julien@xen.org>

-- 
Julien Grall



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-02 16:09       ` Julien Grall
@ 2020-05-04  7:33         ` Jürgen Groß
  0 siblings, 0 replies; 10+ messages in thread
From: Jürgen Groß @ 2020-05-04  7:33 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Roger Pau Monné

On 02.05.20 18:09, Julien Grall wrote:
> 
> 
> On 02/05/2020 13:34, Julien Grall wrote:
>>>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>>>> index 195e7ee583..81628e2d98 100644
>>>>> --- a/xen/include/xen/sched.h
>>>>> +++ b/xen/include/xen/sched.h
>>>>> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned 
>>>>> int cpu)
>>>>>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>>>>>   {
>>>>> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
>>>>> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int 
>>>>> *)&v->dirty_cpu));
>>>>
>>>> Is the cast necessary?
>>>
>>> Yes, that was the problem I encountered when building for Arm.
>>>
>>> read_atomic() on ARM has a local variable of the same type as the
>>> read_atomic() parameter for storing the result. Due to the const
>>> attribute of v this results in assignment to a read-only variable.
>>
>> Doh, we should be able to read from a const value without a cast. So I would 
>> argue this is a bug in the read_atomic() implementation on Arm. I will 
>> try to come up with a patch.
> 
> I have just sent a series [1] to address the issue reported here and a 
> few more.

With that series V1 of this patch is fine again. :-)


Juergen



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-04-30 15:28 [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu Juergen Gross
  2020-05-02 11:36 ` Julien Grall
@ 2020-05-04 11:51 ` Jan Beulich
  2020-05-04 12:41   ` Jürgen Groß
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-05-04 11:51 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 30.04.2020 17:28, Juergen Gross wrote:
> The dirty_cpu field of struct vcpu denotes which cpu still holds data
> of a vcpu. All accesses to this field should be atomic in case the
> vcpu could just be running, as it is accessed without any lock held
> in most cases.
> 
> There are some instances where accesses are not atomically done, and
> even worse where multiple accesses are done when a single one would
> be mandated.
> 
> Correct that in order to avoid potential problems.

Beyond the changes you're making, what about the assignment in
startup_cpu_idle_loop()? And while less important, dump_domains()
also has a use that I think would better be converted for
completeness.

> @@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>      {
>          /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>          flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
> +        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);

ASSERT(!is_vcpu_dirty_cpu(read_atomic(&next->dirty_cpu))) ?

> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>  
>  void sync_vcpu_execstate(struct vcpu *v)
>  {
> -    if ( v->dirty_cpu == smp_processor_id() )
> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
> +
> +    if ( dirty_cpu == smp_processor_id() )
>          sync_local_execstate();
> -    else if ( vcpu_cpu_dirty(v) )
> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>      {
>          /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>      }
> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
> +           dirty_cpu == VCPU_CPU_CLEAN);

!is_vcpu_dirty_cpu(dirty_cpu) again? Also perhaps flip both
sides of the || (to have the cheaper one first), and maybe

    if ( is_vcpu_dirty_cpu(dirty_cpu) )
        ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu);

as the longer assertion string literal isn't really of that
much extra value.

However, having stared at it for a while now - is this race
free? I can see this being fine in the (initial) case of
dirty_cpu == smp_processor_id(), but if this is for a foreign
CPU, can't the vCPU have gone back to that same CPU again in
the meantime?

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)
>  
>  static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>  {
> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int *)&v->dirty_cpu));

As per your communication with Julien I understand the cast
will go away again.

Jan



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-04 11:51 ` Jan Beulich
@ 2020-05-04 12:41   ` Jürgen Groß
  2020-05-04 12:48     ` Jan Beulich
  0 siblings, 1 reply; 10+ messages in thread
From: Jürgen Groß @ 2020-05-04 12:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 04.05.20 13:51, Jan Beulich wrote:
> On 30.04.2020 17:28, Juergen Gross wrote:
>> The dirty_cpu field of struct vcpu denotes which cpu still holds data
>> of a vcpu. All accesses to this field should be atomic in case the
>> vcpu could just be running, as it is accessed without any lock held
>> in most cases.
>>
>> There are some instances where accesses are not atomically done, and
>> even worse where multiple accesses are done when a single one would
>> be mandated.
>>
>> Correct that in order to avoid potential problems.
> 
> Beyond the changes you're making, what about the assignment in
> startup_cpu_idle_loop()? And while less important, dump_domains()
> also has a use that I think would better be converted for
> completeness.

startup_cpu_idle_loop() is not important, as dirty_cpu is set there
before any scheduling activity can occur on the cpu. But I can change
both instances.

> 
>> @@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>           flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>> +        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);
> 
> ASSERT(!is_vcpu_dirty_cpu(read_atomic(&next->dirty_cpu))) ?

Yes, this is better.

> 
>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>   
>>   void sync_vcpu_execstate(struct vcpu *v)
>>   {
>> -    if ( v->dirty_cpu == smp_processor_id() )
>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>> +
>> +    if ( dirty_cpu == smp_processor_id() )
>>           sync_local_execstate();
>> -    else if ( vcpu_cpu_dirty(v) )
>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>       }
>> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
>> +           dirty_cpu == VCPU_CPU_CLEAN);
> 
> !is_vcpu_dirty_cpu(dirty_cpu) again? Also perhaps flip both
> sides of the || (to have the cheaper one first), and maybe
> 
>      if ( is_vcpu_dirty_cpu(dirty_cpu) )
>          ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu);
> 
> as the longer assertion string literal isn't really of that
> much extra value.

I can do that, provided we can be sure the compiler will drop the
test in a non-debug build.
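For what it's worth, the pattern in question can be sketched like this
(hypothetical names, not Xen code): with NDEBUG defined the XEN_ASSERT()
stand-in expands to a no-op, leaving the if with an empty body that any
optimizing compiler removes along with the comparison:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for Xen's ASSERT(): a no-op in non-debug builds. */
#ifdef NDEBUG
#define XEN_ASSERT(x) ((void)0)
#else
#define XEN_ASSERT(x) assert(x)
#endif

#define CPU_CLEAN (~0u)

/* Jan's suggested shape: only assert when the snapshot was dirty.
 * Returns whether the check was exercised, for illustration only. */
static bool assert_not_still_dirty(unsigned int cur, unsigned int snap)
{
    if ( snap != CPU_CLEAN )      /* dead code once XEN_ASSERT is empty */
    {
        XEN_ASSERT(cur != snap);  /* the vcpu must have moved on */
        return true;
    }
    return false;                 /* was already clean: nothing to check */
}
```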

> 
> However, having stared at it for a while now - is this race
> free? I can see this being fine in the (initial) case of
> dirty_cpu == smp_processor_id(), but if this is for a foreign
> CPU, can't the vCPU have gone back to that same CPU again in
> the meantime?

This should never happen. Either the vcpu in question is paused,
or it has been forced off the cpu due to not being allowed to run
there (e.g. affinity has been changed, or cpu is about to be
removed from cpupool). I can add a comment explaining it.

> 
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)
>>   
>>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>>   {
>> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
>> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int *)&v->dirty_cpu));
> 
> As per your communication with Julien I understand the cast
> will go away again.

Yes, I think so.


Juergen



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-04 12:41   ` Jürgen Groß
@ 2020-05-04 12:48     ` Jan Beulich
  2020-05-04 13:53       ` Jürgen Groß
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2020-05-04 12:48 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 04.05.2020 14:41, Jürgen Groß wrote:
> On 04.05.20 13:51, Jan Beulich wrote:
>> On 30.04.2020 17:28, Juergen Gross wrote:
>>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>>     void sync_vcpu_execstate(struct vcpu *v)
>>>   {
>>> -    if ( v->dirty_cpu == smp_processor_id() )
>>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>>> +
>>> +    if ( dirty_cpu == smp_processor_id() )
>>>           sync_local_execstate();
>>> -    else if ( vcpu_cpu_dirty(v) )
>>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>>       {
>>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>>       }
>>> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
>>> +           dirty_cpu == VCPU_CPU_CLEAN);
>>
>> !is_vcpu_dirty_cpu(dirty_cpu) again? Also perhaps flip both
>> sides of the || (to have the cheaper one first), and maybe
>>
>>      if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>          ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu);
>>
>> as the longer assertion string literal isn't really of that
>> much extra value.
> 
> I can do that, provided we can be sure the compiler will drop the
> test in a non-debug build.

Modern gcc will afaik; no idea about clang though.

>> However, having stared at it for a while now - is this race
>> free? I can see this being fine in the (initial) case of
>> dirty_cpu == smp_processor_id(), but if this is for a foreign
>> CPU, can't the vCPU have gone back to that same CPU again in
>> the meantime?
> 
> This should never happen. Either the vcpu in question is paused,
> or it has been forced off the cpu due to not being allowed to run
> there (e.g. affinity has been changed, or cpu is about to be
> removed from cpupool). I can add a comment explaining it.

There is a time window from late in flush_mask() to the assertion
you add. All sorts of things can happen during this window on
other CPUs. IOW what guarantees the vCPU not getting unpaused or
its affinity getting changed yet another time?

Jan



* Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
  2020-05-04 12:48     ` Jan Beulich
@ 2020-05-04 13:53       ` Jürgen Groß
  0 siblings, 0 replies; 10+ messages in thread
From: Jürgen Groß @ 2020-05-04 13:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel, Roger Pau Monné

On 04.05.20 14:48, Jan Beulich wrote:
> On 04.05.2020 14:41, Jürgen Groß wrote:
>> On 04.05.20 13:51, Jan Beulich wrote:
>>> On 30.04.2020 17:28, Juergen Gross wrote:
>>>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>>>      void sync_vcpu_execstate(struct vcpu *v)
>>>>    {
>>>> -    if ( v->dirty_cpu == smp_processor_id() )
>>>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>>>> +
>>>> +    if ( dirty_cpu == smp_processor_id() )
>>>>            sync_local_execstate();
>>>> -    else if ( vcpu_cpu_dirty(v) )
>>>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>>>        {
>>>>            /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>>>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>>>        }
>>>> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
>>>> +           dirty_cpu == VCPU_CPU_CLEAN);
>>>
>>> !is_vcpu_dirty_cpu(dirty_cpu) again? Also perhaps flip both
>>> sides of the || (to have the cheaper one first), and maybe
>>>
>>>       if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>>           ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu);
>>>
>>> as the longer assertion string literal isn't really of that
>>> much extra value.
>>
>> I can do that, provided we can be sure the compiler will drop the
>> test in a non-debug build.
> 
> Modern gcc will afaik; no idea about clang though.

I'll go with both tests inside the ASSERT(), as this will hurt debug
build only.

> 
>>> However, having stared at it for a while now - is this race
>>> free? I can see this being fine in the (initial) case of
>>> dirty_cpu == smp_processor_id(), but if this is for a foreign
>>> CPU, can't the vCPU have gone back to that same CPU again in
>>> the meantime?
>>
>> This should never happen. Either the vcpu in question is paused,
>> or it has been forced off the cpu due to not being allowed to run
>> there (e.g. affinity has been changed, or cpu is about to be
>> removed from cpupool). I can add a comment explaining it.
> 
> There is a time window from late in flush_mask() to the assertion
> you add. All sorts of things can happen during this window on
> other CPUs. IOW what guarantees the vCPU not getting unpaused or
> its affinity getting changed yet another time?

Hmm, very unlikely, but not impossible (especially in the nested virt
case).

This makes me wonder whether adding the call to sync_vcpu_execstate() to
sched_unit_migrate_finish() was really a good idea. There might be cases
where it is not necessary (e.g. in the case you are mentioning, but with
a much larger time window), or where it is more expensive than doing it
the lazy way.

Dropping this call of sync_vcpu_execstate() will remove the race you are
mentioning completely.


Juergen



end of thread, other threads:[~2020-05-04 13:53 UTC | newest]

Thread overview: 10+ messages
2020-04-30 15:28 [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu Juergen Gross
2020-05-02 11:36 ` Julien Grall
2020-05-02 11:45   ` Jürgen Groß
2020-05-02 12:34     ` Julien Grall
2020-05-02 16:09       ` Julien Grall
2020-05-04  7:33         ` Jürgen Groß
2020-05-04 11:51 ` Jan Beulich
2020-05-04 12:41   ` Jürgen Groß
2020-05-04 12:48     ` Jan Beulich
2020-05-04 13:53       ` Jürgen Groß
