All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jürgen Groß" <jgross@suse.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>, "Wei Liu" <wl@xen.org>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	xen-devel@lists.xenproject.org,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu
Date: Mon, 4 May 2020 14:41:20 +0200	[thread overview]
Message-ID: <9d4fd1cd-173f-5128-6a73-ac2c6d679f93@suse.com> (raw)
In-Reply-To: <d1b322c2-98d8-b3a3-1f48-2af89cf9407e@suse.com>

On 04.05.20 13:51, Jan Beulich wrote:
> On 30.04.2020 17:28, Juergen Gross wrote:
>> The dirty_cpu field of struct vcpu denotes which cpu still holds data
>> of a vcpu. All accesses to this field should be atomic in case the
>> vcpu could just be running, as it is accessed without any lock held
>> in most cases.
>>
>> There are some instances where accesses are not atomically done, and
>> even worse where multiple accesses are done when a single one would
>> be mandated.
>>
>> Correct that in order to avoid potential problems.
> 
> Beyond the changes you're making, what about the assignment in
> startup_cpu_idle_loop()? And while less important, dump_domains()
> also has a use that I think would better be converted for
> completeness.

startup_cpu_idle_loop() is not important, as it is set before any
scheduling activity might occur on a cpu. But I can change both
instances.

> 
>> @@ -1844,6 +1845,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>>           flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>> +        ASSERT(read_atomic(&next->dirty_cpu) == VCPU_CPU_CLEAN);
> 
> ASSERT(!is_vcpu_dirty_cpu(read_atomic(&next->dirty_cpu))) ?

Yes, this is better.

> 
>> @@ -1956,13 +1958,17 @@ void sync_local_execstate(void)
>>   
>>   void sync_vcpu_execstate(struct vcpu *v)
>>   {
>> -    if ( v->dirty_cpu == smp_processor_id() )
>> +    unsigned int dirty_cpu = read_atomic(&v->dirty_cpu);
>> +
>> +    if ( dirty_cpu == smp_processor_id() )
>>           sync_local_execstate();
>> -    else if ( vcpu_cpu_dirty(v) )
>> +    else if ( is_vcpu_dirty_cpu(dirty_cpu) )
>>       {
>>           /* Remote CPU calls __sync_local_execstate() from flush IPI handler. */
>> -        flush_mask(cpumask_of(v->dirty_cpu), FLUSH_VCPU_STATE);
>> +        flush_mask(cpumask_of(dirty_cpu), FLUSH_VCPU_STATE);
>>       }
>> +    ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu ||
>> +           dirty_cpu == VCPU_CPU_CLEAN);
> 
> !is_vcpu_dirty_cpu(dirty_cpu) again? Also perhaps flip both
> sides of the || (to have the cheaper one first), and maybe
> 
>      if ( is_vcpu_dirty_cpu(dirty_cpu) )
>          ASSERT(read_atomic(&v->dirty_cpu) != dirty_cpu);
> 
> as the longer assertion string literal isn't really of that
> much extra value.

I can do that, in case we can be sure the compiler will drop the
test in case of a non-debug build.

> 
> However, having stared at it for a while now - is this race
> free? I can see this being fine in the (initial) case of
> dirty_cpu == smp_processor_id(), but if this is for a foreign
> CPU, can't the vCPU have gone back to that same CPU again in
> the meantime?

This should never happen. Either the vcpu in question is paused,
or it has been forced off the cpu due to not being allowed to run
there (e.g. affinity has been changed, or cpu is about to be
removed from cpupool). I can add a comment explaining it.

> 
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -844,7 +844,7 @@ static inline bool is_vcpu_dirty_cpu(unsigned int cpu)
>>   
>>   static inline bool vcpu_cpu_dirty(const struct vcpu *v)
>>   {
>> -    return is_vcpu_dirty_cpu(v->dirty_cpu);
>> +    return is_vcpu_dirty_cpu(read_atomic((unsigned int *)&v->dirty_cpu));
> 
> As per your communication with Julien I understand the cast
> will go away again.

Yes, I think so.


Juergen


  reply	other threads:[~2020-05-04 12:41 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-30 15:28 [PATCH v1.1 2/3] xen/sched: fix theoretical races accessing vcpu->dirty_cpu Juergen Gross
2020-05-02 11:36 ` Julien Grall
2020-05-02 11:45   ` Jürgen Groß
2020-05-02 12:34     ` Julien Grall
2020-05-02 16:09       ` Julien Grall
2020-05-04  7:33         ` Jürgen Groß
2020-05-04 11:51 ` Jan Beulich
2020-05-04 12:41   ` Jürgen Groß [this message]
2020-05-04 12:48     ` Jan Beulich
2020-05-04 13:53       ` Jürgen Groß

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9d4fd1cd-173f-5128-6a73-ac2c6d679f93@suse.com \
    --to=jgross@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.