linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Usama Arif <usama.arif@bytedance.com>
To: David Woodhouse <dwmw2@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	kim.phillips@amd.com
Cc: arjan@linux.intel.com, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, x86@kernel.org,
	pbonzini@redhat.com, paulmck@kernel.org,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	rcu@vger.kernel.org, mimoja@mimoja.de, hewenliang4@huawei.com,
	thomas.lendacky@amd.com, seanjc@google.com,
	pmenzel@molgen.mpg.de, fam.zheng@bytedance.com,
	punit.agrawal@bytedance.com, simon.evans@bytedance.com,
	liangma@liangbit.com
Subject: Re: [External] Re: [PATCH v9 0/8] Parallel CPU bringup for x86_64
Date: Thu, 23 Feb 2023 19:24:26 +0000	[thread overview]
Message-ID: <97e1f2aa-d823-1aea-a41f-8024ba5075aa@bytedance.com> (raw)
In-Reply-To: <701ce2da00e559d517d4e48bd5d88ccae1198e44.camel@infradead.org>



On 23/02/2023 11:07, David Woodhouse wrote:
> On Wed, 2023-02-22 at 17:42 +0100, Thomas Gleixner wrote:
>> David!
>>
>> On Wed, Feb 22 2023 at 10:11, David Woodhouse wrote:
>>> On Wed, 2023-02-15 at 14:54 +0000, Usama Arif wrote:
>>> So the next thing that might be worth looking at is allowing the APs
>>> all to be running their hotplug thread simultaneously, bringing
>>> themselves from CPUHP_BRINGUP_CPU to CPUHP_AP_ONLINE. This series eats
>>> the initial INIT/SIPI/SIPI latency, but if there's any significant time
>>> in the AP hotplug thread, that could be worth parallelising.
>>
>> On a 112 CPU machine (64 cores, HT enabled) the bringup takes
>>
>> Setup and SIPIs sent:    49 ms
>> Bringup each CPU:       516 ms
>>
>> That's about 500 ms faster than a non-parallel bringup!
>>
>> Now looking at the 516 ms, which is ~4.7 ms/CPU. The vast majority of the
>> time is spent on the APs in
>>
>>       cpu_init() -> ucode_cpu_init()
>>
>> for the primary threads of each core. The secondary threads are quickly
>> (1us) out of ucode_cpu_init() because the primary thread already loaded
>> it.
>>
>> A microcode load on that machine takes ~7.5 ms per primary thread on
>> average which sums up to 7.5 * 55 = 412.5 ms
>>
>> The threaded bringup after CPU_AP_ONLINE takes about 100us per CPU.
> 
> Nice analysis; thanks!
> 
>> identify_secondary_cpu() is one of the longer functions which takes
>> ~125us / CPU summing up to 13ms
> 
> Hm, shouldn't that one already be parallelised by my 'part 2' patch?
> 
> It's called from smp_store_cpu_info(), from smp_callin(), which is
> called from somewhere in the middle of start_secondary().
> 
> And if the comments I helpfully added to that function for the benefit
> of our future selves are telling the truth, the AP is free to get that
> far once the BSP has set its bit in cpu_callout_mask, which happens in
> do_wait_cpu_initialized().
> 
> So
> https://git.infradead.org/users/dwmw2/linux.git/commitdiff/4b5731e05b0#patch3
> ought to parallelise that. But Usama emirically reported that 'part 2'
> didn't add any noticeable benefit, not even those 13ms? On a *larger*
> machine.
> 

So I am using a similar machine to Thomas 128 CPU machine (64 cores, HT 
enabled). I have microcode config disabled, so I guess I get similar 
numbers to Thomas, i.e. 100ms (516 - 412) ms. I do see a difference of 
~3ms with part2 which I thought is maybe within the margin of error for 
measuring, but I guess it isn't. After seeing the ~70ms that is cut with 
reusing timer calibration, I didnt really then focus much on part 2 
then. I guess that ~70ms is the "rest" from Thomas' table below?

Thanks,
Usama

> 
>> The TSC sync check for the first CPU on the second socket consumes
>> 20ms. That's only once per socket, intra socket is using MSR_TSC_ADJUST,
>> which is more or less free.
>>
>> So the 516 ms are wasted here:
>>
>>     total                                516 ms
>>     ucode_cpu_init()                     412 ms
>>     identify_secondary_cpu()              13 ms
>>     2ndsocket_tsc_sync                    20 ms
>>     threaded bringup                      12 ms
>>     rest                                  59 ms
>>
>> So the rest is about 530us per CPU, which is just the sum of many small
>> functions, lock contentions...
>>
>> Getting rid of the micro code overhead is possible. There is no reason
>> to serialize that between the cores. But it needs serialization vs. HT
>> siblings, which requires to move identify_secondary_cpu() and its caller
>> smp_store_cpu_info() ahead of the synchronization point and then have
>> serialization between the siblings. That's going to be a major surgery
>> and inspection effort to ensure that there are no hidden assumptions
>> about global hotplug serialization.
>>
>> So that would cut the total cost down to ~100ms plus the
>> preparatory/SIPI stage of 60ms which sums up to about 160ms and about
>> 1.5ms per CPU total.
>>
>> Further optimization starts to be questionable IMO. It's surely possible
>> somehow, but then you really have to go and inspect each and every
>> function in those code pathes, add local locking, etc. Not to talk about
>> the required mess in the core code to support that.
>>
>> The low hanging fruit which brings most is the identification/topology
>> muck and the microcode loading. That needs to be addressed first anyway.
> 
> Agreed, thanks.
> 

      parent reply	other threads:[~2023-02-23 19:25 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-15 14:54 [PATCH v9 0/8] Parallel CPU bringup for x86_64 Usama Arif
2023-02-15 14:54 ` [PATCH v9 1/8] x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel Usama Arif
2023-02-16 20:58   ` Kim Phillips
2023-02-15 14:54 ` [PATCH v9 2/8] cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h> Usama Arif
2023-02-15 14:54 ` [PATCH v9 3/8] cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU Usama Arif
2023-02-15 14:54 ` [PATCH v9 4/8] x86/smpboot: Reference count on smpboot_setup_warm_reset_vector() Usama Arif
2023-02-15 14:54 ` [PATCH v9 5/8] x86/smpboot: Split up native_cpu_up into separate phases and document them Usama Arif
2023-02-15 14:54 ` [PATCH v9 6/8] x86/smpboot: Support parallel startup of secondary CPUs Usama Arif
2023-02-15 14:54 ` [PATCH v9 7/8] x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel Usama Arif
2023-02-15 14:54 ` [PATCH v9 8/8] x86/smpboot: Serialize topology updates for secondary bringup Usama Arif
2023-02-16  6:34 ` [PATCH v9 0/8] Parallel CPU bringup for x86_64 Paul E. McKenney
2023-02-20 16:08 ` Oleksandr Natalenko
2023-02-20 16:20   ` David Woodhouse
2023-02-20 16:40     ` Oleksandr Natalenko
2023-02-20 20:31       ` David Woodhouse
2023-02-20 21:23         ` Oleksandr Natalenko
2023-02-20 21:34           ` Piotr Gorski
2023-02-20 21:39           ` David Woodhouse
2023-02-20 23:23             ` Kim Phillips
2023-02-20 23:30               ` David Woodhouse
2023-02-21  4:20                 ` Kim Phillips
2023-02-21  7:16                   ` David Woodhouse
2023-02-21  7:27                 ` Oleksandr Natalenko
2023-02-21  7:53                   ` David Woodhouse
2023-02-21  8:05                     ` Oleksandr Natalenko
2023-02-21  8:17                       ` David Woodhouse
2023-02-21  8:25                         ` Oleksandr Natalenko
2023-02-21  8:35                           ` David Woodhouse
2023-02-21  8:44                             ` Oleksandr Natalenko
2023-02-21  9:06                               ` David Woodhouse
2023-02-21  9:49                                 ` Oleksandr Natalenko
2023-02-21 10:27                                   ` David Woodhouse
2023-02-21 10:47                                     ` [External] " Usama Arif
2023-02-21 11:42                                       ` Oleksandr Natalenko
2023-02-21 11:54                                         ` Usama Arif
2023-02-21 13:22                                           ` David Woodhouse
2023-02-21 11:46                                     ` Oleksandr Natalenko
2023-02-21 11:49                                       ` David Woodhouse
2023-02-21 12:14                                         ` Oleksandr Natalenko
2023-02-21 19:10                                           ` David Woodhouse
2023-02-21 20:04                                             ` [External] " Usama Arif
2023-02-21 21:04                                               ` Oleksandr Natalenko
2023-02-21 21:41                                             ` Thomas Gleixner
2023-02-21 21:44                                               ` David Woodhouse
2023-02-21 23:18                                               ` David Woodhouse
2023-02-22  0:00                                                 ` [External] " Usama Arif
2023-02-22  8:19                                                   ` David Woodhouse
2023-02-22  9:46                                                     ` Thomas Gleixner
2023-02-22  9:51                                                       ` David Woodhouse
2023-02-22  9:31                                                 ` Thomas Gleixner
2023-02-20 22:22           ` Piotr Gorski
2023-02-20 22:23           ` [External] " Usama Arif
2023-02-20 22:41             ` Oleksandr Natalenko
2023-02-22 10:11 ` David Woodhouse
2023-02-22 11:11   ` [External] " Usama Arif
2023-02-22 12:08   ` Brian Gerst
2023-02-22 12:53     ` David Woodhouse
2023-02-22 16:42   ` Thomas Gleixner
2023-02-23 11:07     ` David Woodhouse
2023-02-23 14:37       ` Thomas Gleixner
2023-02-23 15:12         ` David Woodhouse
2023-02-23 19:24       ` Usama Arif [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=97e1f2aa-d823-1aea-a41f-8024ba5075aa@bytedance.com \
    --to=usama.arif@bytedance.com \
    --cc=arjan@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=fam.zheng@bytedance.com \
    --cc=hewenliang4@huawei.com \
    --cc=hpa@zytor.com \
    --cc=kim.phillips@amd.com \
    --cc=kvm@vger.kernel.org \
    --cc=liangma@liangbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mimoja@mimoja.de \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=pmenzel@molgen.mpg.de \
    --cc=punit.agrawal@bytedance.com \
    --cc=rcu@vger.kernel.org \
    --cc=seanjc@google.com \
    --cc=simon.evans@bytedance.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).