From: Andrew Theurer <habanero@linux.vnet.ibm.com>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Rik van Riel <riel@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
Ingo Molnar <mingo@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Srikar <srikar@linux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
KVM <kvm@vger.kernel.org>, Jiannan Ouyang <ouyang@cs.pitt.edu>,
chegu vinod <chegu_vinod@hp.com>,
LKML <linux-kernel@vger.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>,
Gleb Natapov <gleb@redhat.com>, Andrew Jones <drjones@redhat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
Date: Wed, 10 Oct 2012 14:27:50 -0500 [thread overview]
Message-ID: <1349897270.22418.7.camel@oc2024037011.ibm.com> (raw)
In-Reply-To: <5075B3B6.3070802@linux.vnet.ibm.com>
On Wed, 2012-10-10 at 23:13 +0530, Raghavendra K T wrote:
> On 10/10/2012 07:54 PM, Andrew Theurer wrote:
> > I ran 'perf sched map' on the dbench workload for medium and large VMs,
> > and I thought I would share some of the results. I think it helps to
> > visualize what's going on regarding the yielding.
> >
> > These files are png bitmaps, generated from processing output from 'perf
> > sched map' (and perf data generated from 'perf sched record'). The Y
> > axis is the host cpus, each row being 10 pixels high. For these tests,
> > there are 80 host cpus, so the total height is 800 pixels. The X axis
> > is time (in microseconds), with each pixel representing 1 microsecond.
> > Each bitmap plots 30,000 microseconds. The bitmaps are quite wide
> > obviously, and zooming in/out while viewing is recommended.
> >
> > Each row (each host cpu) is assigned a color based on what thread is
> > running. vCPUs of the same VM are assigned a common color (like red,
> > blue, magenta, etc), and each vCPU has a unique brightness for that
> > color. There are a maximum of 12 assignable colors, so in any VMs >12
> > revert to vCPU color of gray. I would use more colors, but it becomes
> > harder to distinguish one color from another. The white color
> > represents missing data from perf, and black color represents any thread
> > which is not a vCPU.
> >
> > For the following tests, VMs were pinned to host NUMA nodes and to
> > specific cpus to help with consistency and operate within the
> > constraints of the last test (gang scheduler).
> >
> > Here is a good example of PLE. These are 10-way VMs, 16 of them (as
> > described above only 12 of the VMs have a color, rest are gray).
> >
> > https://docs.google.com/open?id=0B6tfUNlZ-14wdmFqUmE5QjJHMFU
>
> This looks very nice to visualize what is happening. Beginning of the
> graph looks little messy but later it is clear.
>
> >
> > If you zoom out and look at the whole bitmap, you may notice the 4ms
> > intervals of the scheduler. They are pretty well aligned across all
> > cpus. Normally, for cpu bound workloads, we would expect to see each
> > thread to run for 4 ms, then something else getting to run, and so on.
> > That is mostly true in this test. We have 2x over-commit and we
> > generally see the switching of threads at 4ms. One thing to note is
> > that not all vCPU threads for the same VM run at exactly the same time,
> > and that is expected and the whole reason for lock-holder preemption.
> > Now, if you zoom in on the bitmap, you should notice within the 4ms
> > intervals there is some task switching going on. This is most likely
> > because of the yield_to initiated by the PLE handler. In this case
> > there is not that much yielding to do. It's quite clean, and the
> > performance is quite good.
> >
> > Below is an example of PLE, but this time with 20-way VMs, 8 of them.
> > CPU over-commit is still 2x.
> >
> > https://docs.google.com/open?id=0B6tfUNlZ-14wdmFqUmE5QjJHMFU
>
> I think this link still 10x16. Could you paste the link again?
Oops
https://docs.google.com/open?id=0B6tfUNlZ-14wSGtYYzZtRTcyVjQ
>
> >
> > This one looks quite different. In short, it's a mess. The switching
> > between tasks can be lower than 10 microseconds. It basically never
> > recovers. There is constant yielding all the time.
> >
> > Below is again 8 x 20-way VMs, but this time I tried out Nikunj's gang
> > scheduling patches. While I am not recommending gang scheduling, I
> > think it's a good data point. The performance is 3.88x the PLE result.
> >
> > https://docs.google.com/open?id=0B6tfUNlZ-14wWXdscWcwNTVEY3M
> >
> > Note that the task switching intervals of 4ms are quite obvious again,
> > and this time all vCPUs from same VM run at the same time. It
> > represents the best possible outcome.
> >
> >
> > Anyway, I thought the bitmaps might help better visualize what's going
> > on.
> >
> > -Andrew
> >
> >
> >
> >
>
next prev parent reply other threads:[~2012-10-10 19:28 UTC|newest]
Thread overview: 126+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-21 11:59 [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler Raghavendra K T
2012-09-21 12:00 ` [PATCH RFC 1/2] kvm: Handle undercommitted guest case " Raghavendra K T
2012-09-21 13:02 ` Rik van Riel
2012-09-21 17:24 ` Raghavendra K T
2012-09-24 15:41 ` Avi Kivity
2012-09-24 16:06 ` Avi Kivity
2012-09-24 16:14 ` Peter Zijlstra
2012-09-24 16:25 ` Avi Kivity
2012-09-25 8:09 ` Raghavendra K T
2012-09-25 8:54 ` Avi Kivity
2012-09-25 13:49 ` Raghavendra K T
2012-09-27 7:44 ` Gleb Natapov
2012-09-27 8:59 ` Avi Kivity
2012-09-27 9:11 ` Gleb Natapov
2012-09-27 9:33 ` Avi Kivity
2012-09-27 9:58 ` Gleb Natapov
2012-09-27 10:04 ` Avi Kivity
2012-09-27 10:08 ` Gleb Natapov
2012-09-27 10:15 ` Avi Kivity
[not found] ` <CAJocwcf+8u84_yDC-PK0Yni93YSTWzYvr69nq6b3pNv1MwVJzQ@mail.gmail.com>
2012-09-27 8:50 ` Avi Kivity
2012-09-27 11:26 ` Raghavendra K T
2012-09-27 12:06 ` Avi Kivity
2012-09-28 18:18 ` Konrad Rzeszutek Wilk
2012-09-30 8:16 ` Avi Kivity
[not found] ` <CAJocwcc19F+PtsQ5okGMvYeVnkEigpZRpwWY9JgeRPFqfcVoXA@mail.gmail.com>
2012-09-28 6:16 ` Raghavendra K T
2012-09-30 8:18 ` Avi Kivity
2012-09-30 11:07 ` Gleb Natapov
2012-09-30 11:13 ` Avi Kivity
2012-10-03 14:17 ` Raghavendra K T
2012-10-03 14:56 ` Avi Kivity
2012-10-04 7:29 ` Gleb Natapov
2012-10-05 8:36 ` Raghavendra K T
2012-10-07 9:51 ` Avi Kivity
2012-09-25 7:36 ` Raghavendra K T
2012-09-25 8:12 ` Avi Kivity
2012-09-25 14:21 ` Takuya Yoshikawa
2012-09-27 8:43 ` Avi Kivity
2012-10-03 12:22 ` Raghavendra K T
2012-10-03 17:05 ` Avi Kivity
2012-10-04 10:49 ` Raghavendra K T
2012-10-04 12:41 ` Avi Kivity
2012-10-04 13:07 ` Peter Zijlstra
2012-10-04 15:00 ` Avi Kivity
2012-10-09 18:51 ` Raghavendra K T
2012-10-10 2:59 ` Andrew Theurer
2012-10-10 17:54 ` Raghavendra K T
2012-10-10 18:03 ` David Ahern
2012-10-10 18:14 ` Raghavendra K T
2012-10-10 19:36 ` Andrew Theurer
2012-10-15 12:10 ` Raghavendra K T
2012-10-15 14:34 ` Andrew Theurer
2012-10-19 8:30 ` Raghavendra K T
2012-10-19 13:31 ` Andrew Theurer
2012-10-10 14:24 ` Andrew Theurer
2012-10-10 17:43 ` Raghavendra K T
2012-10-10 19:27 ` Andrew Theurer [this message]
2012-10-11 17:13 ` Raghavendra K T
2012-10-11 10:39 ` Nikunj A Dadhania
2012-10-18 12:39 ` Avi Kivity
2012-10-19 8:19 ` Raghavendra K T
2012-10-04 14:41 ` Andrew Theurer
2012-10-05 9:06 ` Raghavendra K T
2012-10-05 9:02 ` Raghavendra K T
2012-09-24 11:33 ` Peter Zijlstra
2012-09-24 11:40 ` Raghavendra K T
2012-09-21 12:00 ` [PATCH RFC 2/2] kvm: Be courteous to other VMs in overcommitted scenario " Raghavendra K T
2012-09-21 13:22 ` Rik van Riel
2012-09-21 13:46 ` Takuya Yoshikawa
2012-09-21 13:52 ` Rik van Riel
2012-09-21 17:45 ` Raghavendra K T
2012-09-24 13:43 ` Takuya Yoshikawa
2012-09-24 15:26 ` Avi Kivity
2012-09-24 15:34 ` Peter Zijlstra
2012-09-24 15:43 ` Avi Kivity
2012-09-24 15:52 ` Peter Zijlstra
2012-09-24 15:58 ` Avi Kivity
2012-09-24 16:05 ` Peter Zijlstra
2012-09-24 16:10 ` Avi Kivity
2012-09-24 16:13 ` Peter Zijlstra
2012-09-24 16:21 ` Avi Kivity
2012-09-25 10:11 ` Avi Kivity
2012-09-21 13:18 ` [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios " Chegu Vinod
2012-09-21 17:36 ` Raghavendra K T
2012-09-24 8:42 ` Dor Laor
2012-09-24 12:02 ` Raghavendra K T
2012-09-25 15:00 ` Dor Laor
2012-09-26 12:27 ` Konrad Rzeszutek Wilk
2012-09-27 10:07 ` Raghavendra K T
2012-09-27 9:49 ` Raghavendra K T
2012-09-27 10:28 ` Andrew Jones
2012-09-27 10:44 ` Avi Kivity
2012-09-27 11:31 ` Raghavendra K T
2012-09-27 10:33 ` Dor Laor
2012-09-24 11:34 ` Peter Zijlstra
2012-09-24 11:52 ` Raghavendra K T
2012-09-24 12:36 ` Peter Zijlstra
2012-09-24 13:29 ` Raghavendra K T
2012-09-24 13:54 ` Peter Zijlstra
2012-09-24 14:16 ` Raghavendra K T
2012-09-25 13:40 ` Raghavendra K T
2012-09-27 8:36 ` Avi Kivity
2012-09-27 11:23 ` Raghavendra K T
2012-09-27 12:03 ` Avi Kivity
2012-09-27 12:25 ` Andrew Theurer
2012-09-28 5:38 ` Raghavendra K T
2012-09-28 5:45 ` H. Peter Anvin
2012-09-28 6:03 ` Raghavendra K T
2012-09-28 8:38 ` Peter Zijlstra
2012-09-28 11:40 ` Andrew Theurer
2012-09-28 14:11 ` Raghavendra K T
2012-09-28 14:13 ` Peter Zijlstra
2012-09-30 8:24 ` Avi Kivity
2012-10-03 14:29 ` Raghavendra K T
2012-10-03 17:25 ` Avi Kivity
2012-10-04 10:56 ` Raghavendra K T
2012-10-04 12:44 ` Avi Kivity
2012-10-05 9:04 ` Raghavendra K T
2012-09-24 15:51 ` Avi Kivity
2012-09-24 16:03 ` Peter Zijlstra
2012-09-24 16:20 ` Avi Kivity
2012-09-26 13:20 ` Andrew Jones
2012-09-26 13:26 ` Peter Zijlstra
2012-09-26 13:39 ` Andrew Jones
2012-09-26 13:45 ` Peter Zijlstra
2012-09-26 12:57 ` Andrew Jones
2012-09-27 10:21 ` Raghavendra K T
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1349897270.22418.7.camel@oc2024037011.ibm.com \
--to=habanero@linux.vnet.ibm.com \
--cc=avi@redhat.com \
--cc=chegu_vinod@hp.com \
--cc=drjones@redhat.com \
--cc=gleb@redhat.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mtosatti@redhat.com \
--cc=nikunj@linux.vnet.ibm.com \
--cc=ouyang@cs.pitt.edu \
--cc=peterz@infradead.org \
--cc=raghavendra.kt@linux.vnet.ibm.com \
--cc=riel@redhat.com \
--cc=srikar@linux.vnet.ibm.com \
--cc=srivatsa.vaddagiri@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).