linux-kernel.vger.kernel.org archive mirror
From: Wanpeng Li <kernellwp@gmail.com>
To: Mike Galbraith <efault@gmx.de>
Cc: linux-kernel@vger.kernel.org, kvm <kvm@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Radim Krcmar <rkrcmar@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: unixbench context switch perfomance & cpu topology
Date: Tue, 23 Jan 2018 18:36:39 +0800
Message-ID: <CANRm+CxpP-MNLsJS+-bu19Avp96=8jXrc4et1HE+-7SbKgGbPQ@mail.gmail.com>
In-Reply-To: <1516628254.7500.19.camel@gmx.de>

2018-01-22 21:37 GMT+08:00 Mike Galbraith <efault@gmx.de>:
> On Mon, 2018-01-22 at 20:27 +0800, Wanpeng Li wrote:
>> 2018-01-22 20:08 GMT+08:00 Mike Galbraith <efault@gmx.de>:
>> > On Mon, 2018-01-22 at 19:47 +0800, Wanpeng Li wrote:
>> >> Hi all,
>> >>
>> >> We observe that unixbench context switch performance is heavily
>> >> influenced by the cpu topology exposed to the guest. The scores are
>> >> posted below (bigger is better). Both the guest and the host kernel are
>> >> 3.15-rc3 (we can also reproduce this with a centos 7.4 693 guest/host),
>> >> LLC is exposed to the guest, and kvm adaptive halt-polling is enabled
>> >> by default. We then start a guest w/ 8 logical cpus.
>> >>
>> >>
>> >>
>> >> unixbench context switch
>> >> -smp 8, sockets=8, cores=1, threads=1    382036
>> >> -smp 8, sockets=4, cores=2, threads=1    132480
>> >> -smp 8, sockets=2, cores=4, threads=1    128032
>> >> -smp 8, sockets=2, cores=2, threads=2    131767
>> >> -smp 8, sockets=1, cores=4, threads=2    132742
>> >> -smp 8, sockets=1, cores=4, threads=2 (guest w/ nohz=off idle=poll)    331471
>> >>
>> >> I observe a lot of reschedule IPIs sent from one vCPU to another.
>> >> The context switch workload switches between running and idle
>> >> frequently, which results in HLT instructions in the idle path. I
>> >> use idle=poll to avoid vmexits due to HLT and to avoid reschedule IPIs,
>> >> since the idle task checks the TIF_NEED_RESCHED flag in a loop;
>> >> nohz=off avoids programming the lapic timer and other nohz work.
>> >> Any idea why sockets=8 gets the best performance?
>> >
>> > Probably because with that topology, there is no shared llc, thus no
>> > cross-core scheduling, micro-benchmark waker/wakee are stacked.  If
>> > your benchmark does nothing but schedule, stacking makes beautiful (but
>> > utterly meaningless) numbers.
>>
>> The waker and wakee only sporadically run on the same logical cpu in
>> the guest (-smp 8, sockets=8, cores=1, threads=1) during the test. In
>> addition, binding the waker/wakee to one logical cpu in the guest (-smp
>> 8, sockets=1, cores=4, threads=2) gives performance as good as the
>> 8-socket setup.
>
> Here, with tip.today and that topology, context1 does stack up on one core.
>
>  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ P COMMAND
>  4218 root      20   0    4048    808    732 R 52.16 0.022   0:12.77 4 context1
>  4219 root      20   0    4048     80      0 S 47.18 0.002   0:11.96 4 context1
>
> There's a bit of bouncing, but the two stack right back up.  But
> whatever, what Peter said, the benchmark should pin itself to do this.

Thanks for giving it a try, Mike. :) Actually, kernelshark shows that
the two context1 tasks do not stack up on one logical cpu most of the
time. Do you have any idea why there are 4.5 times as many RESCHED
IPIs, as mentioned in another reply in this thread?
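
One way to compare RESCHED IPI counts between two topologies is to
snapshot the "RES:" row of /proc/interrupts in the guest before and
after a benchmark run and subtract. The sketch below is illustrative
only and is not the method used for the numbers in this thread; the
function names parse_resched_counts and resched_delta are hypothetical,
and it assumes the x86 /proc/interrupts layout in which the "RES:" row
holds one counter per online CPU followed by a text description.

```python
def parse_resched_counts(text):
    """Return the per-CPU rescheduling-interrupt counters found in the
    given /proc/interrupts text, or an empty list if there is no RES: row."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "RES:":
            counts = []
            for token in fields[1:]:
                # The counters come first; the trailing description
                # ("Rescheduling interrupts") is not numeric.
                if not token.isdigit():
                    break
                counts.append(int(token))
            return counts
    return []


def resched_delta(before, after):
    """Per-CPU difference between two snapshots taken on the same guest."""
    return [b - a for a, b in zip(before, after)]


# Hypothetical usage inside the guest:
#   with open("/proc/interrupts") as f:
#       before = parse_resched_counts(f.read())
#   ... run the context switch benchmark ...
#   with open("/proc/interrupts") as f:
#       after = parse_resched_counts(f.read())
#   print(sum(resched_delta(before, after)))
```

Summing the per-CPU deltas for each -smp configuration would give a
single number per topology that can be compared directly.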

Regards,
Wanpeng Li
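
For reference, the pinning experiment discussed above (binding waker
and wakee to one logical cpu versus leaving them free) can be
approximated with a small pipe ping-pong. This is a minimal sketch in
the spirit of context1, not the UnixBench source; the function name
pipe_pingpong and its parameters are hypothetical, and it assumes Linux
(os.sched_setaffinity and os.fork).

```python
import os
import time


def pipe_pingpong(iterations, parent_cpu=None, child_cpu=None):
    """Bounce one byte between two processes over a pair of pipes and
    return the elapsed time in seconds. A cpu argument of None leaves
    that process unpinned."""
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child (wakee): echo every byte back to the parent.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            if child_cpu is not None:
                os.sched_setaffinity(0, {child_cpu})
            for _ in range(iterations):
                os.read(p2c_r, 1)
                os.write(c2p_w, b"x")
        finally:
            os._exit(0)
    # Parent (waker): close the ends it does not use.
    os.close(p2c_r)
    os.close(c2p_w)
    saved_affinity = os.sched_getaffinity(0)
    try:
        if parent_cpu is not None:
            os.sched_setaffinity(0, {parent_cpu})
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(p2c_w, b"x")
            os.read(c2p_r, 1)
        elapsed = time.perf_counter() - start
    finally:
        # Restore the caller's affinity and reap the child.
        os.sched_setaffinity(0, saved_affinity)
        os.close(p2c_w)
        os.close(c2p_r)
        os.waitpid(pid, 0)
    return elapsed
```

Calling pipe_pingpong(n, cpu, cpu) stacks both tasks on one logical
cpu, while pipe_pingpong(n) leaves placement to the scheduler, so the
two timings can be compared under each guest topology.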
