From: Julien Grall <julien@xen.org>
To: Jan Beulich <jbeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>,
	Ian Jackson <iwj@xenproject.org>,
	Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
	xen-devel@lists.xenproject.org,
	Dmitry Isaikin <isaikin-dmitry@yandex.ru>,
	rjstone@amazon.co.uk, raphning@amazon.co.uk,
	Paul Durrant <paul@xen.org>
Subject: Re: [PATCH v1] domctl: hold domctl lock while domain is destroyed
Date: Fri, 17 Sep 2021 21:01:26 +0500
Message-ID: <f6225dc6-0590-3456-8c48-7ab29aa00200@xen.org>
In-Reply-To: <0c860901-0992-74df-4a53-d75a0971d1f3@suse.com>

Hi Jan,

On 17/09/2021 14:47, Jan Beulich wrote:
> On 17.09.2021 11:41, Andrew Cooper wrote:
>> On 17/09/2021 10:27, Julien Grall wrote:
>>> Hi,
>>>
>>> (+ some AWS folks)
>>>
>>> On 17/09/2021 11:17, Jan Beulich wrote:
>>>> On 16.09.2021 19:52, Andrew Cooper wrote:
>>>>> On 16/09/2021 13:30, Jan Beulich wrote:
>>>>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>>>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>>>>>
>>>>>>> This significantly speeds up concurrent destruction of multiple
>>>>>>> domains on x86.
>>>>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl:
>>>>>> improve locking during domain destruction"). There it was found to
>>>>>> actually improve things;
>>>>>
>>>>> Was it?  I recall that it was simply an expectation that performance
>>>>> would be better...
>>>>
>>>> My recollection is that it was, for one of our customers.
>>>>
>>>>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>>>>
>>>> Interesting. I don't recall any mail to that effect.
>>>
>>> Here we go:
>>>
>>> https://lore.kernel.org/xen-devel/de46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia@amazon.com/
>>>
>>>
>>> We have been using the revert for quite a while in production and didn't
>>> notice any regression.
>>>
>>>>
>>>>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or
>>>>> incomplete, and it appears as if it wasn't necessarily a wise move in
>>>>> hindsight.
>>>>
>>>> Possible; I continue to think though that the present observation wants
>>>> properly understanding instead of more or less blindly undoing that
>>>> change.
>>>
>>> To be honest, I think this is the other way around. You wrote and merged
>>> a patch with the following justification:
>>>
>>> "
>>>      There is no need to hold the global domctl lock across domain_kill() -
>>>      the domain lock is fully sufficient here, and parallel cleanup after
>>>      multiple domains performs quite a bit better this way.
>>> "
>>>
>>> Clearly, the original commit message is lacking details on the exact
>>> setups and numbers. But we now have two stakeholders with proof that
>>> your patch is harmful to the setup you claim performs better with your
>>> patch.
>>>
>>> To me this is enough justification to revert the original patch. Anyone
>>> against the revert, should provide clear details of why the patch should
>>> not be reverted.
>>
>> I second a revert.
>>
>> I was concerned at the time that the claim was unsubstantiated, and now
>> there is plenty of evidence to counter the claim.
> 
> Well, I won't object to a proper revert. I still think we'd better get to
> the bottom of this, not the least because I thought there was agreement
> that mid to long term we should get rid of global locking wherever
> possible. Or are both of you saying that using a global lock here is
> obviously fine? And does either of you have at least a theory to explain
> the observation? I can only say that I find it puzzling.

I will quote what Hongyan wrote back in the first report:

"
The best solution is to make the heap scalable instead of a global
lock, but that is not going to be trivial.

Of course, another solution is to keep the domctl lock dropped in
domain_kill() but have another domain_kill lock so that competing
domain_kill()s will try to take that lock and back off with hypercall
continuation. But this is kind of hacky (we introduce a lock to reduce
spinlock contention elsewhere), which is probably not a solution but a
workaround.
"
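
To make the second option concrete, here is a rough userspace sketch of
the "try the lock, otherwise back off" pattern. It is not Xen code:
sketch_domain_kill(), struct sketch_domain, domain_kill_lock,
relinquish_some() and SKETCH_ERESTART are all hypothetical names, a
pthread mutex stands in for a hypervisor spinlock, and the restart
return value stands in for whatever would trigger a hypercall
continuation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "restart this hypercall" error code. */
#define SKETCH_ERESTART 1000

/* Hypothetical dedicated lock serialising competing domain kills. */
static pthread_mutex_t domain_kill_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, heavily simplified domain state. */
struct sketch_domain {
    int pages_left;   /* resources that still have to be released */
    bool dead;        /* set once teardown has completed */
};

/* Release a bounded batch of resources; returns true when none remain. */
static bool relinquish_some(struct sketch_domain *d, int batch)
{
    d->pages_left = d->pages_left > batch ? d->pages_left - batch : 0;
    return d->pages_left == 0;
}

/*
 * One step of domain destruction. Returns 0 when the domain is fully
 * torn down, or -SKETCH_ERESTART when the caller should retry later
 * (the analogue of a hypercall continuation).
 */
int sketch_domain_kill(struct sketch_domain *d)
{
    assert(d != NULL);

    /* Back off instead of spinning if another kill is in progress. */
    if (pthread_mutex_trylock(&domain_kill_lock) != 0)
        return -SKETCH_ERESTART;

    /* Do a bounded amount of work while holding the lock. */
    bool done = relinquish_some(d, 64);
    if (done)
        d->dead = true;

    pthread_mutex_unlock(&domain_kill_lock);

    return done ? 0 : -SKETCH_ERESTART;
}
```

A caller would simply re-invoke sketch_domain_kill() until it returns
0. Only one destroyer makes progress at a time, so the contention moves
from the heap lock to a lock that is never spun on, which is why it is
a workaround rather than a fix.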

Cheers,

-- 
Julien Grall

