linux-kernel.vger.kernel.org archive mirror
From: Andrew Theurer <habanero@linux.vnet.ibm.com>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Rik van Riel <riel@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Srikar <srikar@linux.vnet.ibm.com>,
	"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
	KVM <kvm@vger.kernel.org>, Jiannan Ouyang <ouyang@cs.pitt.edu>,
	chegu vinod <chegu_vinod@hp.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>,
	Gleb Natapov <gleb@redhat.com>, Andrew Jones <drjones@redhat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
Date: Tue, 09 Oct 2012 21:59:47 -0500
Message-ID: <1349837987.5551.182.camel@oc6622382223.ibm.com>
In-Reply-To: <20121009185108.GA2549@linux.vnet.ibm.com>

On Wed, 2012-10-10 at 00:21 +0530, Raghavendra K T wrote:
> * Avi Kivity <avi@redhat.com> [2012-10-04 17:00:28]:
> 
> > On 10/04/2012 03:07 PM, Peter Zijlstra wrote:
> > > On Thu, 2012-10-04 at 14:41 +0200, Avi Kivity wrote:
> > >> 
> > >> Again the numbers are ridiculously high for arch_local_irq_restore.
> > >> Maybe there's a bad perf/kvm interaction when we're injecting an
> > >> interrupt, I can't believe we're spending 84% of the time running the
> > >> popf instruction. 
> > > 
> > > Smells like a software fallback that doesn't do NMI, hrtimer based
> > > sampling typically hits popf where we re-enable interrupts.
> > 
> > Good nose, that's probably it.  Raghavendra, can you ensure that the PMU
> > is properly exposed?  'dmesg' in the guest will tell.  If it isn't, -cpu
> > host will expose it (and a good idea anyway to get best performance).
> > 
> 
> Hi Avi, you are right. The SandyBridge machine results were not valid.
> I cleaned up the services, enabled the PMU, and re-ran all the tests.
> 
> Here is the summary:
> We do get a good benefit from increasing ple_window. Although we don't
> see much benefit for kernbench and sysbench, for ebizzy we get a huge
> improvement in the 1x scenario (almost 2/3rd of the ple disabled case).
> 
> Let me know if you think we can increase the default ple_window
> itself to 16k.
> 
> I am experimenting with a V2 of this undercommit improvement patch
> series, but I think if you wish to increase the
> default ple_window, then we would have to measure the benefit of the
> patches with ple_window = 16k.
> 
> I can respin the whole series including this default ple_window change.
> 
> I also have the perf kvm top results for both ebizzy and kernbench.
> I think they are along expected lines now.
> 
> Improvements
> ================
> 
> 16 core PLE machine with 16 vcpu guest
> 
> base = 3.6.0-rc5 + ple handler optimization patches
> base_pleopt_16k = base + ple_window = 16k
> base_pleopt_32k = base + ple_window = 32k
> base_pleopt_nople = base + ple_gap = 0
> kernbench, hackbench, sysbench (time in sec lower is better)
> ebizzy (rec/sec higher is better)
> 
> % improvements w.r.t base (ple_window = 4k)
> ---------------+-----------------+-----------------+-------------------+
>                | base_pleopt_16k | base_pleopt_32k | base_pleopt_nople |
> ---------------+-----------------+-----------------+-------------------+
> kernbench_1x   |         0.42371 |         1.15164 |           0.09320 |
> kernbench_2x   |        -1.40981 |       -17.48282 |        -570.77053 |
> ---------------+-----------------+-----------------+-------------------+
> sysbench_1x    |        -0.92367 |         0.24241 |          -0.27027 |
> sysbench_2x    |        -2.22706 |        -0.30896 |          -1.27573 |
> sysbench_3x    |        -0.75509 |         0.09444 |          -2.97756 |
> ---------------+-----------------+-----------------+-------------------+
> ebizzy_1x      |        54.99976 |        67.29460 |          74.14076 |
> ebizzy_2x      |        -8.83386 |       -27.38403 |         -96.22066 |
> ---------------+-----------------+-----------------+-------------------+
> 
> perf kvm top observation for kernbench and ebizzy (nople, 4k, 32k window) 
> ========================================================================

Is the perf data for 1x overcommit?

> pleopt   ple_gap=0
> --------------------
> ebizzy : 18131 records/s
>    63.78%  [guest.kernel]  [g] _raw_spin_lock_irqsave
>     5.65%  [guest.kernel]  [g] smp_call_function_many
>     3.12%  [guest.kernel]  [g] clear_page
>     3.02%  [guest.kernel]  [g] down_read_trylock
>     1.85%  [guest.kernel]  [g] async_page_fault
>     1.81%  [guest.kernel]  [g] up_read
>     1.76%  [guest.kernel]  [g] native_apic_mem_write
>     1.70%  [guest.kernel]  [g] find_vma

Does 'perf kvm top' not give host samples at the same time?  Would be
nice to see the host overhead as a function of varying ple window.  I
would expect that to be the major difference between 4/16/32k window
sizes.
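
When comparing runs across window sizes, it may help to log the host's
PLE settings next to each result. A minimal sketch, assuming the host
uses the kvm_intel module and that it exposes ple_gap and ple_window
under /sys/module/kvm_intel/parameters (the helper name and the default
path are illustrative, not part of the patch series):

```python
from pathlib import Path

# Assumed location of the kvm_intel module parameters on the host.
PARAM_DIR = Path("/sys/module/kvm_intel/parameters")


def read_ple_params(param_dir=PARAM_DIR):
    """Return the current ple_gap and ple_window values as integers.

    A value is None when the file is missing (for example, kvm_intel is
    not loaded) or does not contain an integer.
    """
    values = {}
    for name in ("ple_gap", "ple_window"):
        try:
            values[name] = int((Path(param_dir) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


if __name__ == "__main__":
    # Print the settings so they can be pasted next to a benchmark result.
    print(read_ple_params())
```

Recording these values with every run makes it harder to mix up the
4k/16k/32k and ple_gap=0 configurations when the numbers are compared
later.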

A big concern I have (if this is 1x overcommit) for ebizzy is that it
has just terrible scalability to begin with.  I do not think we should
try to optimize such a bad workload.

> kernbench :Elapsed Time 29.4933 (27.6007)
>     5.72%  [guest.kernel]  [g] async_page_fault
>     3.48%  [guest.kernel]  [g] pvclock_clocksource_read
>     2.68%  [guest.kernel]  [g] copy_user_generic_unrolled
>     2.58%  [guest.kernel]  [g] clear_page
>     2.09%  [guest.kernel]  [g] page_cache_get_speculative
>     2.00%  [guest.kernel]  [g] do_raw_spin_lock
>     1.78%  [guest.kernel]  [g] unmap_single_vma
>     1.74%  [guest.kernel]  [g] kmem_cache_alloc

> 
> pleopt ple_window = 4k
> ---------------------------
> ebizzy: 10176 records/s
>    69.17%  [guest.kernel]  [g] _raw_spin_lock_irqsave
>     3.34%  [guest.kernel]  [g] clear_page
>     2.16%  [guest.kernel]  [g] down_read_trylock
>     1.94%  [guest.kernel]  [g] async_page_fault
>     1.89%  [guest.kernel]  [g] native_apic_mem_write
>     1.63%  [guest.kernel]  [g] smp_call_function_many
>     1.58%  [guest.kernel]  [g] SetPageLRU
>     1.37%  [guest.kernel]  [g] up_read
>     1.01%  [guest.kernel]  [g] find_vma
> 
> 
> kernbench: 29.9533
> Events: 240K cycles
>     6.04%  [guest.kernel]  [g] async_page_fault
>     4.17%  [guest.kernel]  [g] pvclock_clocksource_read
>     3.28%  [guest.kernel]  [g] clear_page
>     2.57%  [guest.kernel]  [g] copy_user_generic_unrolled
>     2.30%  [guest.kernel]  [g] do_raw_spin_lock
>     2.13%  [guest.kernel]  [g] _raw_spin_lock_irqsave
>     1.93%  [guest.kernel]  [g] page_cache_get_speculative
>     1.92%  [guest.kernel]  [g] unmap_single_vma
>     1.77%  [guest.kernel]  [g] kmem_cache_alloc
>     1.61%  [guest.kernel]  [g] __d_lookup_rcu
>     1.19%  [guest.kernel]  [g] find_vma
>     1.19%  [guest.kernel]  [g] __list_del_entry
> 
> 
> pleopt: ple_window=16k
> -------------------------
> ebizzy: 16990
>    62.35%  [guest.kernel]  [g] _raw_spin_lock_irqsave
>     5.22%  [guest.kernel]  [g] smp_call_function_many
>     3.57%  [guest.kernel]  [g] down_read_trylock
>     3.20%  [guest.kernel]  [g] clear_page
>     2.16%  [guest.kernel]  [g] up_read
>     1.89%  [guest.kernel]  [g] find_vma
>     1.86%  [guest.kernel]  [g] async_page_fault
>     1.81%  [guest.kernel]  [g] native_apic_mem_write
> 
> kernbench: 28.5
>     6.24%  [guest.kernel]  [g] async_page_fault
>     4.16%  [guest.kernel]  [g] pvclock_clocksource_read
>     3.33%  [guest.kernel]  [g] clear_page
>     2.50%  [guest.kernel]  [g] copy_user_generic_unrolled
>     2.08%  [guest.kernel]  [g] do_raw_spin_lock
>     1.98%  [guest.kernel]  [g] unmap_single_vma
>     1.89%  [guest.kernel]  [g] kmem_cache_alloc
>     1.82%  [guest.kernel]  [g] page_cache_get_speculative
>     1.46%  [guest.kernel]  [g] __d_lookup_rcu
>     1.42%  [guest.kernel]  [g] _raw_spin_lock_irqsave
>     1.15%  [guest.kernel]  [g] __list_del_entry
>     1.10%  [guest.kernel]  [g] find_vma
> 
> 
> 
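
To compare one symbol's share across the configurations above without
reading the listings by eye, the perf lines can be parsed into a table.
This is only a sketch: the regular expression assumes the
"percent [dso] [type] symbol" layout shown in the quoted output, and the
function name is made up for illustration.

```python
import re

# Matches lines such as:
#   >    63.78%  [guest.kernel]  [g] _raw_spin_lock_irqsave
# The leading "> " quote marker is optional.
LINE_RE = re.compile(
    r"^\s*(?:>\s*)?(\d+(?:\.\d+)?)%\s+\[(\S+)\]\s+\[(\w)\]\s+(\S+)\s*$"
)


def parse_perf_top(text):
    """Map symbol name -> percentage for every perf line found in text."""
    shares = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            shares[match.group(4)] = float(match.group(1))
    return shares


if __name__ == "__main__":
    # First line of each ebizzy listing quoted above.
    runs = {
        "ple_gap=0": ">    63.78%  [guest.kernel]  [g] _raw_spin_lock_irqsave",
        "ple_window=4k": ">    69.17%  [guest.kernel]  [g] _raw_spin_lock_irqsave",
        "ple_window=16k": ">    62.35%  [guest.kernel]  [g] _raw_spin_lock_irqsave",
    }
    for name, text in runs.items():
        share = parse_perf_top(text).get("_raw_spin_lock_irqsave")
        print(f"{name:16s} {share}")
```

Lines that do not match the layout (benchmark results, separators) are
skipped, so a whole quoted section can be passed in unchanged.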
> Detailed result for the run
> =============================
> patched = base_pleopt_16k 
> +-----------+-----------+-----------+------------+-----------+
>                               kernbench 
> +-----------+-----------+-----------+------------+-----------+
>    base        stddev       patched    stddev       %improve
> +-----------+-----------+-----------+------------+-----------+
> 1x    30.0440     1.1896    29.9167     1.6755	   0.42371
> 2x    62.0083     3.4884    62.8825     2.5509	  -1.40981
> +-----------+-----------+-----------+------------+-----------+
> +-----------+-----------+-----------+------------+-----------+
>                               sysbench 
> +-----------+-----------+-----------+------------+-----------+
> 1x     7.1779     0.0577     7.2442     0.0479	  -0.92367
> 2x    15.5362     0.3370    15.8822     0.3591	  -2.22706
> 3x    23.8249     0.1513    24.0048     0.1844	  -0.75509
> +-----------+-----------+-----------+------------+-----------+
> +-----------+-----------+-----------+------------+-----------+
>                               ebizzy 
> +-----------+-----------+-----------+------------+-----------+
> 1x 10358.0000   442.6598   16054.8750  252.5088    54.99976
> 2x  2705.5000   130.0286   2466.5000   120.0024	  -8.83386
> +-----------+-----------+-----------+------------+-----------+
> 
> patched = base_pleopt_32k
> +-----------+-----------+-----------+------------+-----------+
>                               kernbench 
> +-----------+-----------+-----------+------------+-----------+
>    base        stddev       patched    stddev       %improve
> +-----------+-----------+-----------+------------+-----------+
> 1x    30.0440     1.1896    29.6980     0.6760	   1.15164
> 2x    62.0083     3.4884    72.8491     4.4616	 -17.48282
> +-----------+-----------+-----------+------------+-----------+
> +-----------+-----------+-----------+------------+-----------+
>                               sysbench 
> +-----------+-----------+-----------+------------+-----------+
> 1x     7.1779     0.0577     7.1605     0.0447	   0.24241
> 2x    15.5362     0.3370    15.5842     0.1731	  -0.30896
> 3x    23.8249     0.1513    23.8024     0.2342	   0.09444
> +-----------+-----------+-----------+------------+-----------+
> +-----------+-----------+-----------+------------+-----------+
>                               ebizzy 
> +-----------+-----------+-----------+------------+-----------+
> 1x  10358.0000   442.6598   17328.3750   281.4569   67.29460
> 2x  2705.5000   130.0286    1964.6250   143.0793   -27.38403
> +-----------+-----------+-----------+------------+-----------+
> 
> patched = base_pleopt_nople
> +-----------+-----------+-----------+------------+-----------+
>                               kernbench 
> +-----------+-----------+-----------+------------+-----------+
>    base        stddev       patched    stddev       %improve
> +-----------+-----------+-----------+------------+-----------+
> 1x    30.0440     1.1896    30.0160     0.7523	   0.09320
> 2x    62.0083     3.4884   415.9334   189.9901	  -570.77053
> +-----------+-----------+-----------+------------+-----------+
> +-----------+-----------+-----------+------------+-----------+
>                               sysbench 
> +-----------+-----------+-----------+------------+-----------+
> 1x     7.1779     0.0577     7.1973     0.0354	  -0.27027
> 2x    15.5362     0.3370    15.7344     0.2315	  -1.27573
> 3x    23.8249     0.1513    24.5343     0.3437	  -2.97756
> +-----------+-----------+-----------+------------+-----------+
> +-----------+-----------+-----------+------------+-----------+
>                               ebizzy 
> +-----------+-----------+-----------+------------+-----------+
> 1x 10358.0000   442.6598 18037.5000   315.2074	   74.14076
> 2x  2705.5000   130.0286   102.2500   104.3521	  -96.22066
> +-----------+-----------+-----------+------------+-----------+
> 
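
For anyone re-deriving the %improve column: the quoted numbers are
consistent with (base - patched) / base for the time-based benchmarks
and (patched - base) / base for ebizzy. A small sketch of that
calculation follows; the function name and the higher_is_better flag
are illustrative, and the formula is inferred from the tables rather
than taken from the test scripts.

```python
def pct_improvement(base, patched, higher_is_better=False):
    """Percent improvement of patched over base.

    Time-based results (kernbench, sysbench): lower is better, so a
    drop in elapsed time gives a positive number.
    Throughput results (ebizzy): higher is better, so a rise in
    records/sec gives a positive number.
    """
    if higher_is_better:
        return (patched - base) / base * 100.0
    return (base - patched) / base * 100.0


if __name__ == "__main__":
    # Values taken from the base_pleopt_16k tables above.
    print(f"kernbench 1x: {pct_improvement(30.0440, 29.9167):.5f}")
    print(f"ebizzy 1x:    {pct_improvement(10358.0, 16054.875, True):.5f}")
```

With the 16k inputs this gives 0.42371 for kernbench 1x and 54.99976
for ebizzy 1x, matching the quoted tables.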


