From: Hongyan Xia <hx242@xen.org>
To: Jan Beulich <jbeulich@suse.com>, Julien Grall <julien@xen.org>
Cc: Charles Arnold <CARNOLD@suse.com>,
	Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	Jim Fehlig <JFEHLIG@suse.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [Xen-devel] [PATCH] Revert "domctl: improve locking during domain destruction"
Date: Thu, 26 Mar 2020 14:39:55 +0000
Message-ID: <059d2dad459bbdafe11b21c93cc8e9ce42df4ecf.camel@xen.org>
In-Reply-To: <30f1ec6d-b5be-fcb1-c685-ed02961175c1@suse.com>

On Wed, 2020-03-25 at 08:11 +0100, Jan Beulich wrote:
> On 24.03.2020 19:39, Julien Grall wrote:
> > On 24/03/2020 16:13, Jan Beulich wrote:
> > > On 24.03.2020 16:21, Hongyan Xia wrote:
> > > > From: Hongyan Xia <hongyxia@amazon.com>
> > > > In contrast, after dropping that commit, parallel domain
> > > > destructions will just fail to take the domctl lock, creating a
> > > > hypercall continuation and backing off immediately, allowing the
> > > > thread that holds the lock to destroy a domain much more quickly
> > > > and allowing backed-off threads to process events and irqs.
> > > > 
> > > > On a 144-core server with 4TiB of memory, destroying 32 guests
> > > > (each with 4 vcpus and 122GiB memory) simultaneously takes:
> > > > 
> > > > before the revert: 29 minutes
> > > > after the revert: 6 minutes
> > > 
> > > This wants comparing against numbers demonstrating the bad effects
> > > of the global domctl lock. Iirc they were quite a bit higher than
> > > 6 min, perhaps depending on guest properties.
> > 
> > Your original commit message doesn't contain any clue about the cases
> > in which the domctl lock was an issue. So please provide information
> > on the setups where you think it will make things worse.
> 
> I never observed the issue myself - let's see whether one of the SUSE
> people possibly involved in this back then recalls (or has further
> pointers; Jim, Charles?), or whether any of the (partly former) Citrix
> folks do. My vague recollection is that the issue was the tool stack
> as a whole stalling for far too long, in particular when destroying
> very large guests. One important aspect not discussed in the commit
> message at all is that holding the domctl lock blocks basically _all_
> tool stack operations (including e.g. creation of new guests), whereas
> the new issue being addressed here is limited to just domain cleanup.

The best solution is to make the heap allocator itself scalable instead
of serialising it behind a single global lock, but that is not going to
be trivial.
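
To sketch what I mean (purely hypothetical code for illustration, not a
proposed patch to page_alloc.c; the names and layout below are made up),
the free lists could be guarded per NUMA node rather than by one global
heap lock, so teardowns freeing memory on different nodes stop
contending with each other:

/*
 * Hypothetical sketch only: per-node locks instead of one global heap
 * lock. Lock initialisation at boot is elided.
 */
#include <xen/spinlock.h>

#define SKETCH_MAX_NODES 64

static struct {
    spinlock_t lock;           /* protects this node's free pages only */
    unsigned long free_pages;
} node_heap[SKETCH_MAX_NODES];

static void sketch_free_pages(unsigned int node, unsigned long nr)
{
    spin_lock(&node_heap[node].lock);  /* contention limited to one node */
    node_heap[node].free_pages += nr;
    spin_unlock(&node_heap[node].lock);
}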

Of course, another option is to keep the domctl lock dropped in
domain_kill() but introduce a separate domain_kill lock, so that
competing domain_kill()s try to take that lock and back off with a
hypercall continuation. But that is kind of hacky (we would be
introducing a lock just to reduce spinlock contention elsewhere), so it
is probably not a solution but a workaround. A rough sketch of the idea
is below.
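
Concretely, it would look roughly like this (hypothetical sketch only,
not an actual patch; the lock and the helper standing in for the
existing teardown body are made up):

/* Hypothetical sketch of the domain_kill workaround described above. */
#include <xen/errno.h>
#include <xen/sched.h>
#include <xen/spinlock.h>

static DEFINE_SPINLOCK(domain_kill_lock);

/* Stand-in for the existing domain_kill() teardown body. */
static int do_domain_kill(struct domain *d);

int domain_kill(struct domain *d)
{
    int rc;

    /*
     * Competing destructions back off immediately instead of spinning,
     * so the caller can create a hypercall continuation and this CPU
     * can process events and irqs in the meantime.
     */
    if ( !spin_trylock(&domain_kill_lock) )
        return -ERESTART;

    rc = do_domain_kill(d);

    spin_unlock(&domain_kill_lock);
    return rc;
}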

Seeing the dramatic increase from 6 to 29 minutes in concurrent guest
destruction, though, I wonder whether the benefit of that commit really
outweighs this regression.

Hongyan



Thread overview: 8+ messages
2020-03-24 15:21 Hongyan Xia
2020-03-24 16:13 ` Jan Beulich
2020-03-24 18:39   ` Julien Grall
2020-03-25  7:11     ` Jan Beulich
2020-03-26 14:39       ` Hongyan Xia [this message]
2020-03-26 16:55       ` Jim Fehlig
2020-03-31 10:31         ` Julien Grall
2020-03-24 18:40 ` [Xen-devel] " Julien Grall
