All of lore.kernel.org
 help / color / mirror / Atom feed
From: Julien Grall <julien@xen.org>
To: Jan Beulich <jbeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>,
	Ian Jackson <iwj@xenproject.org>,
	Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
	xen-devel@lists.xenproject.org,
	Dmitry Isaikin <isaikin-dmitry@yandex.ru>,
	rjstone@amazon.co.uk, raphning@amazon.co.uk,
	Paul Durrant <paul@xen.org>
Subject: Re: [PATCH v1] domctl: hold domctl lock while domain is destroyed
Date: Fri, 17 Sep 2021 21:01:26 +0500	[thread overview]
Message-ID: <f6225dc6-0590-3456-8c48-7ab29aa00200@xen.org> (raw)
In-Reply-To: <0c860901-0992-74df-4a53-d75a0971d1f3@suse.com>

Hi Jan,

On 17/09/2021 14:47, Jan Beulich wrote:
> On 17.09.2021 11:41, Andrew Cooper wrote:
>> On 17/09/2021 10:27, Julien Grall wrote:
>>> Hi,
>>>
>>> (+ some AWS folks)
>>>
>>> On 17/09/2021 11:17, Jan Beulich wrote:
>>>> On 16.09.2021 19:52, Andrew Cooper wrote:
>>>>> On 16/09/2021 13:30, Jan Beulich wrote:
>>>>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>>>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>>>>>
>>>>>>> This significantly speeds up concurrent destruction of multiple
>>>>>>> domains on x86.
>>>>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl:
>>>>>> improve locking during domain destruction"). There it was found to
>>>>>> actually improve things;
>>>>>
>>>>> Was it?  I recall that it was simply an expectation that performance
>>>>> would be better...
>>>>
>>>> My recollection is that it was, for one of our customers.
>>>>
>>>>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>>>>
>>>> Interesting. I don't recall any mail to that effect.
>>>
>>> Here we go:
>>>
>>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fxen-devel%2Fde46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia%40amazon.com%2F&amp;data=04%7C01%7CAndrew.Cooper3%40citrix.com%7C8cf65b3fb3324abe7cf108d979bd7171%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C637674676843910175%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=si7eYIxSqsJY77sWuwsad5MzJDMzGF%2F8L0JxGrWTmtI%3D&amp;reserved=0
>>>
>>>
>>> We have been using the revert for quite a while in production and didn't
>>> notice any regression.
>>>
>>>>
>>>>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or
>>>>> incomplete, and it appears as if it wasn't necessarily a wise move in
>>>>> hindsight.
>>>>
>>>> Possible; I continue to think though that the present observation wants
>>>> properly understanding instead of more or less blindly undoing that
>>>> change.
>>>
>>> To be honest, I think this is the other way around. You wrote and merged
>>> a patch with the following justification:
>>>
>>> "
>>>      There is no need to hold the global domctl lock across domain_kill() -
>>>      the domain lock is fully sufficient here, and parallel cleanup after
>>>      multiple domains performs quite a bit better this way.
>>> "
>>>
>>> Clearly, the original commit message is lacking details on the exact
>>> setups and numbers. But we now have two stakeholders with proof that
>>> your patch is harmful to the setup you claim perform better with your
>>> patch.
>>>
>>> To me this is enough justification to revert the original patch. Anyone
>>> against the revert, should provide clear details of why the patch should
>>> not be reverted.
>>
>> I second a revert.
>>
>> I was concerned at the time that the claim was unsubstantiated, and now
>> there is plenty of evidence to counter the claim.
> 
> Well, I won't object to a proper revert. I still think we'd better get to
> the bottom of this, not the least because I thought there was agreement
> that mid to long term we should get rid of global locking wherever
> possible. Or are both of you saying that using a global lock here is
> obviously fine? And does either of you have at least a theory to explain
> the observation? I can only say that I find it puzzling.

I will quote what Hongyan wrote back on the first report:

"
The best solution is to make the heap scalable instead of a global
lock, but that is not going to be trivial.

Of course, another solution is to keep the domctl lock dropped in
domain_kill() but have another domain_kill lock so that competing
domain_kill()s will try to take that lock and back off with hypercall
continuation. But this is kind of hacky (we introduce a lock to reduce
spinlock contention elsewhere), which is probably not a solution but a
workaround.
"

Cheers,

-- 
Julien Grall


  reply	other threads:[~2021-09-17 16:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-16 11:10 [PATCH v1] domctl: hold domctl lock while domain is destroyed Dmitry Isaikin
2021-09-16 12:30 ` Jan Beulich
2021-09-16 13:08   ` Roger Pau Monné
2021-09-16 17:52   ` Andrew Cooper
2021-09-17  6:17     ` Jan Beulich
2021-09-17  9:27       ` Julien Grall
2021-09-17  9:41         ` Andrew Cooper
2021-09-17  9:47           ` Jan Beulich
2021-09-17 16:01             ` Julien Grall [this message]
2021-09-20  8:19               ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f6225dc6-0590-3456-8c48-7ab29aa00200@xen.org \
    --to=julien@xen.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=isaikin-dmitry@yandex.ru \
    --cc=iwj@xenproject.org \
    --cc=jbeulich@suse.com \
    --cc=paul@xen.org \
    --cc=raphning@amazon.co.uk \
    --cc=rjstone@amazon.co.uk \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.