xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Julien Grall <julien@xen.org>
To: "Durrant, Paul" <pdurrant@amazon.co.uk>,
	Jan Beulich <jbeulich@suse.com>,
	 Stefano Stabellini <sstabellini@kernel.org>,
	"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	George Dunlap <george.dunlap@citrix.com>
Cc: "Xia, Hongyan" <hongyxia@amazon.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: Memory ordering question in the shutdown deferral code
Date: Mon, 21 Sep 2020 14:27:41 +0100	[thread overview]
Message-ID: <ad81f6ac-6127-bea8-a503-d16d3dc175df@xen.org> (raw)
In-Reply-To: <92a6373003e142e9943a4057024a2616@EX13D32EUC003.ant.amazon.com>



On 21/09/2020 13:55, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 21 September 2020 12:41
>> To: Jan Beulich <jbeulich@suse.com>; Stefano Stabellini <sstabellini@kernel.org>;
>> andrew.cooper3@citrix.com; George Dunlap <george.dunlap@citrix.com>; Durrant, Paul
>> <pdurrant@amazon.co.uk>
>> Cc: Xia, Hongyan <hongyxia@amazon.com>; xen-devel@lists.xenproject.org
>> Subject: RE: [EXTERNAL] Memory ordering question in the shutdown deferral code
>>
>> CAUTION: This email originated from outside of the organization. Do not click links or open
>> attachments unless you can confirm the sender and know the content is safe.
>>
>>
>>
>> (+ Xen-devel)
>>
>> Sorry I forgot to CC xen-devel.
>>
>> On 21/09/2020 12:38, Julien Grall wrote:
>>> Hi all,
>>>
>>> I have started to look at the deferral code (see
>>> vcpu_start_shutdown_deferral()) because we need it for LiveUpdate and
>>> Arm will soon use it.
>>>
>>> The current implementation is using an smp_mb() to ensure ordering
>>> between a write then a read. The code looks roughly (I have slightly
>>> adapted it to make my question more obvious):
>>>
>>> domain_shutdown()
>>>       d->is_shutting_down = 1;
>>>       smp_mb();
>>>       if ( !vcpu0->defer_shutdown )
>>>       {
>>>         vcpu_pause_nosync(v);
>>>         v->paused_for_shutdown = 1;
>>>       }
>>>
>>> vcpu_start_shutdown_deferral()
>>>       vcpu0->defer_shutdown = 1;
>>>       smp_mb();
>>>       if ( unlikely(d->is_shutting_down) )
>>>         vcpu_check_shutdown(v);
>>>
>>>       return vcpu0->defer_shutdown;
>>>
>>> smp_mb() should only guarantee ordering (this may be stronger on some
>>> arch), so I think there is a race between the two functions.
>>>
>>> It would be possible to pause the vCPU in domain_shutdown() because
>>> vcpu0->defer_shutdown wasn't yet seen.
>>>
>>> Equally, vcpu_start_shutdown_deferral() may not see d->is_shutting_down
>>> and therefore Xen may continue to send the I/O. Yet the vCPU will be
>>> paused so the I/O will never complete.
>>>
> 
> The barrier enforces global order, right?

It is not clear to me what you mean by "global ordering". This seems to 
suggest a very expensive synchronization barrier between all the processors.

 From an arch-agnostic PoV, smp_mb() will enforce an ordering between 
loads/stores but it doesn't guarantee *when* they will be observed.

> So, if domain_shutdown() pauses the vcpu then is_shutting_down must necessarily be visible all cpus.

That's not the guarantee provided by smp_mb() (see above).

> Thus vcpu_start_shutdown referral will execute vcpu_check_shutdown(), so I'm having a hard time seeing the race.
> 
>>> I am not fully familiar with the IOREQ code, but it sounds to me this is
>>> not the behavior that was intended. Can someone more familiar with the
>>> code confirm it?
>>>
> 
> No indeed. I think emulation should complete before the vcpu pauses.

I think this part is racy at least on non-x86 platform as x86 seems to 
implement smp_mb() with a strong memory barrier (mfence).

Cheers,

-- 
Julien Grall


  parent reply	other threads:[~2020-09-21 13:28 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <468576ba-8d3f-98e9-e65e-1128b5220d40@xen.org>
2020-09-21 11:40 ` Memory ordering question in the shutdown deferral code Julien Grall
2020-09-21 12:55   ` Durrant, Paul
2020-09-21 13:25     ` Xia, Hongyan
2020-09-21 13:27     ` Julien Grall [this message]
2020-09-21 13:32       ` Jan Beulich
2020-09-21 13:35         ` Durrant, Paul
2020-09-21 14:02           ` Jan Beulich
2020-09-23 22:57       ` Stefano Stabellini
2020-09-24 11:10         ` Julien Grall
2020-09-21 13:11   ` Jan Beulich
2020-09-21 14:22     ` Julien Grall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ad81f6ac-6127-bea8-a503-d16d3dc175df@xen.org \
    --to=julien@xen.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=hongyxia@amazon.com \
    --cc=jbeulich@suse.com \
    --cc=pdurrant@amazon.co.uk \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).