From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
To: Avi Kivity <avi@redhat.com>
Cc: Rik van Riel <riel@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Srikar <srikar@linux.vnet.ibm.com>,
	"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
	KVM <kvm@vger.kernel.org>, Jiannan Ouyang <ouyang@cs.pitt.edu>,
	chegu vinod <chegu_vinod@hp.com>,
	"Andrew M. Theurer" <habanero@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>,
	Gleb Natapov <gleb@redhat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
Date: Fri, 05 Oct 2012 14:32:56 +0530
Message-ID: <506EA240.3090104@linux.vnet.ibm.com>
In-Reply-To: <506D83EE.2020303@redhat.com>

On 10/04/2012 06:11 PM, Avi Kivity wrote:
> On 10/04/2012 12:49 PM, Raghavendra K T wrote:
>> On 10/03/2012 10:35 PM, Avi Kivity wrote:
>>> On 10/03/2012 02:22 PM, Raghavendra K T wrote:
>>>>> So I think it's worth trying again with ple_window of 20000-40000.
>>>>>
>>>>
>>>> Hi Avi,
>>>>
>>>> I ran different benchmarks with larger ple_window values, and the
>>>> results do not seem encouraging for increasing ple_window.
>>>
>>> Thanks for testing! Comments below.
>>>
>>>> Results:
>>>> 16 core PLE machine with 16 vcpu guest.
>>>>
>>>> base kernel = 3.6-rc5 + ple handler optimization patch
>>>> base_pleopt_8k = base kernel + ple window = 8k
>>>> base_pleopt_16k = base kernel + ple window = 16k
>>>> base_pleopt_32k = base kernel + ple window = 32k
>>>>
>>>>
>>>> Percentage improvements of benchmarks w.r.t base_pleopt with
>>>> ple_window = 4096
>>>>
>>>>                 base_pleopt_8k    base_pleopt_16k    base_pleopt_32k
>>>> --------------------------------------------------------------------
>>>> kernbench_1x    -5.54915          -15.94529          -44.31562
>>>> kernbench_2x    -7.89399          -17.75039          -37.73498
>>>
>>> So, 44% degradation even with no overcommit?  That's surprising.
>>
>> Yes. Kernbench was run with #threads = #vcpu * 2 as usual. Is
>> spending 8 times the original ple_window cycles for 16 vcpus
>> significant?
>
> A PLE exit when not overcommitted cannot do any good; it is better to
> spin in the guest rather than look for candidates on the host.  In fact
> when we benchmark we often disable PLE completely.
>
>>
>>>
>>>> I also got perf top output to analyse the difference. Difference comes
>>>> because of flushtlb (and also spinlock).
>>>
>>> That's in the guest, yes?
>>
>> Yes. Perf is in guest.
>>
>>>
>>>>
>>>> Ebizzy run for 4k ple_window
>>>> -  87.20%  [kernel]  [k] arch_local_irq_restore
>>>>      - arch_local_irq_restore
>>>>         - 100.00% _raw_spin_unlock_irqrestore
>>>>            + 52.89% release_pages
>>>>            + 47.10% pagevec_lru_move_fn
>>>> -   5.71%  [kernel]  [k] arch_local_irq_restore
>>>>      - arch_local_irq_restore
>>>>         + 86.03% default_send_IPI_mask_allbutself_phys
>>>>         + 13.96% default_send_IPI_mask_sequence_phys
>>>> -   3.10%  [kernel]  [k] smp_call_function_many
>>>>        smp_call_function_many
>>>>
>>>>
>>>> Ebizzy run for 32k ple_window
>>>>
>>>> -  91.40%  [kernel]  [k] arch_local_irq_restore
>>>>      - arch_local_irq_restore
>>>>         - 100.00% _raw_spin_unlock_irqrestore
>>>>            + 53.13% release_pages
>>>>            + 46.86% pagevec_lru_move_fn
>>>> -   4.38%  [kernel]  [k] smp_call_function_many
>>>>        smp_call_function_many
>>>> -   2.51%  [kernel]  [k] arch_local_irq_restore
>>>>      - arch_local_irq_restore
>>>>         + 90.76% default_send_IPI_mask_allbutself_phys
>>>>         + 9.24% default_send_IPI_mask_sequence_phys
>>>>
>>>
>>> Both the 4k and the 32k results are crazy.  Why is
>>> arch_local_irq_restore() so prominent?  Do you have a very high
>>> interrupt rate in the guest?
>>
>> How do I measure whether I have a high interrupt rate in the guest?
>> From the /proc/interrupts numbers I am not able to judge :(
>
> 'vmstat 1'
>

Thank you, I'll save this. Apart from in and cs, I think the r column
(the number of processes waiting for run time) in vmstat would be
useful for me.
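
For my own notes, here is a minimal sketch of where those three columns
come from, assuming vmstat derives in, cs and r from the intr, ctxt and
procs_running lines of /proc/stat. The function names are made up for
illustration, and r here also counts the sampling process itself.

```python
import time


def parse_proc_stat(text):
    """Pick out the /proc/stat counters behind vmstat's in, cs and r."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # intr: first number is the total interrupt count since boot.
        # ctxt: total context switches since boot.
        # procs_running: runnable tasks at the time of the read.
        if parts[0] in ("intr", "ctxt", "procs_running"):
            fields[parts[0]] = int(parts[1])
    return fields


def rates(before, after, interval_s):
    """Turn two cumulative samples into per-second rates."""
    return {
        "in_per_s": (after["intr"] - before["intr"]) / interval_s,
        "cs_per_s": (after["ctxt"] - before["ctxt"]) / interval_s,
        "r": after["procs_running"],
    }


def sample(interval_s=1.0, path="/proc/stat"):
    """Read /proc/stat twice, interval_s apart, and report the rates."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return rates(before, after, interval_s)
```

Calling sample() inside the guest while ebizzy runs should give roughly
what one line of 'vmstat 1' shows for these columns.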

>>
>> I went back and got the results on a 32 core machine with a 32 vcpu guest.
>> Strangely, I got results supporting the claim that increasing ple_window
>> helps in the non-overcommitted scenario.
>>
>> 32 core 32 vcpu guest 1x scenarios.
>>
>> ple_gap = 0
>> kernbench: Elapsed Time 38.61
>> ebizzy: 7463 records/s
>>
>> ple_window = 4k
>> kernbench: Elapsed Time 43.5067
>> ebizzy:    2528 records/s
>>
>> ple_window = 32k
>> kernbench: Elapsed Time 39.4133
>> ebizzy: 7196 records/s
>
> So maybe something was wrong with the first measurement.
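
To make the 32 core comparison easier to read, here is a small sketch
that expresses the numbers quoted above relative to ple_gap = 0. The
dictionary layout and function names are only illustrative; the values
are the ones from the runs above. Note the sign convention: for
kernbench (elapsed time) a positive change is worse, for ebizzy
(records/s) a negative change is worse.

```python
# Results of the 32 core / 32 vcpu 1x runs quoted above.
RESULTS = {
    "ple_gap=0": {"kernbench_s": 38.61, "ebizzy_rec_s": 7463},
    "ple_window=4k": {"kernbench_s": 43.5067, "ebizzy_rec_s": 2528},
    "ple_window=32k": {"kernbench_s": 39.4133, "ebizzy_rec_s": 7196},
}


def percent_change(value, base):
    """Signed change of value relative to base, in percent."""
    return (value - base) / base * 100.0


def relative_to_baseline(results, baseline):
    """Percent change of every metric against the baseline configuration."""
    base = results[baseline]
    out = {}
    for config, metrics in results.items():
        if config == baseline:
            continue
        out[config] = {
            name: percent_change(value, base[name])
            for name, value in metrics.items()
        }
    return out


for config, deltas in relative_to_baseline(RESULTS, "ple_gap=0").items():
    for name, delta in deltas.items():
        print(f"{config:>15} {name:>13} {delta:+8.2f}%")
```

With these inputs, ple_window = 4k comes out at about 12.7% longer
kernbench time and about 66% fewer ebizzy records/s than ple_gap = 0,
while ple_window = 32k stays within roughly 2-4% of it.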

Maybe I was not clear. The first run was on an x240 (Sandy Bridge)
16 core machine.

I then ran on a 32 core x3850 to confirm the perf top results.
But yes, both had the

[    0.018997] Performance Events: Broken PMU hardware detected, using software events only.

problem, as rightly pointed out by you and PeterZ.

After using -cpu host, I see that it is fixed on the x240:

[    0.017997] Performance Events: 16-deep LBR, SandyBridge events, Intel PMU driver.
[    0.018868] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.

So I'll try it on the x240 again.

(Somehow, -cpu host on mx3850 resulted in
[    0.026995] Performance Events: unsupported p6 CPU model 26 no PMU driver, software events only.
I think qemu needs a fix, as pointed out in
http://www.mail-archive.com/kvm@vger.kernel.org/msg55836.html )
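
Before trusting guest perf numbers again, I could check the boot log
first. Below is a rough sketch that classifies the "Performance Events:"
line from guest dmesg text. It only knows the three message shapes seen
above; the function name and the return strings are my own, and other
kernels may word these messages differently.

```python
import re


def pmu_status(dmesg_text):
    """Return 'hardware', 'software-only' or 'unknown' for a guest dmesg."""
    # Mail clients and terminals may wrap the line, so collapse whitespace.
    flat = " ".join(dmesg_text.split())
    # Take the message up to its terminating period.
    match = re.search(r"Performance Events: ([^.]*)\.", flat)
    if not match:
        return "unknown"
    message = match.group(1)
    # Check the fallback first: the "no PMU driver" message also
    # contains the words "PMU driver".
    if "software events only" in message:
        return "software-only"
    if "PMU driver" in message:
        return "hardware"
    return "unknown"
```

If this reports software-only in the guest, the perf top percentages
there come from software events rather than hardware PMU counters.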


>
>>
>>
>> perf top for ebizzy for above:
>> ple_gap = 0
>> -  84.74%  [kernel]  [k] arch_local_irq_restore
>>     - arch_local_irq_restore
>>        - 100.00% _raw_spin_unlock_irqrestore
>>           + 50.96% release_pages
>>           + 49.02% pagevec_lru_move_fn
>> -   6.57%  [kernel]  [k] arch_local_irq_restore
>>     - arch_local_irq_restore
>>        + 92.54% default_send_IPI_mask_allbutself_phys
>>        + 7.46% default_send_IPI_mask_sequence_phys
>> -   1.54%  [kernel]  [k] smp_call_function_many
>>       smp_call_function_many
>
> Again the numbers are ridiculously high for arch_local_irq_restore.
> Maybe there's a bad perf/kvm interaction when we're injecting an
> interrupt, I can't believe we're spending 84% of the time running the
> popf instruction.
>
>>
>> ple_window = 32k
>> -  84.47%  [kernel]  [k] arch_local_irq_restore
>>     + arch_local_irq_restore
>> -   6.46%  [kernel]  [k] arch_local_irq_restore
>>     - arch_local_irq_restore
>>        + 93.51% default_send_IPI_mask_allbutself_phys
>>        + 6.49% default_send_IPI_mask_sequence_phys
>> -   1.80%  [kernel]  [k] smp_call_function_many
>>     - smp_call_function_many
>>        + 99.98% native_flush_tlb_others
>>
>>
>> ple_window = 4k
>> -  91.35%  [kernel]  [k] arch_local_irq_restore
>>     - arch_local_irq_restore
>>        - 100.00% _raw_spin_unlock_irqrestore
>>           + 53.19% release_pages
>>           + 46.81% pagevec_lru_move_fn
>> -   3.90%  [kernel]  [k] smp_call_function_many
>>       smp_call_function_many
>> -   2.94%  [kernel]  [k] arch_local_irq_restore
>>     - arch_local_irq_restore
>>        + 93.12% default_send_IPI_mask_allbutself_phys
>>        + 6.88% default_send_IPI_mask_sequence_phys
>>
>> Let me know if I can try something here..
>> /me confused :(
>>
>
> I'm even more confused.  Please try 'perf kvm' from the host, it does
> fewer dirty tricks with the PMU and so may be more accurate.
>

I will try perf kvm from the host this time.
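
As a note to myself, a sketch of the host-side invocation I have in
mind, assuming the guest's /proc/kallsyms and /proc/modules have been
copied to the host so that perf can resolve guest symbols. The file
paths are placeholders and the helper name is made up; only the perf kvm
options themselves come from the perf-kvm man page.

```python
import shlex


def perf_kvm_top_argv(guest_kallsyms, guest_modules):
    """Build the argv for a host-side 'perf kvm ... top' run."""
    return [
        "perf", "kvm",
        "--host", "--guest",                  # profile both sides
        f"--guestkallsyms={guest_kallsyms}",  # copy of guest /proc/kallsyms
        f"--guestmodules={guest_modules}",    # copy of guest /proc/modules
        "top",
    ]


# Placeholder paths; adjust to wherever the guest files were copied.
argv = perf_kvm_top_argv("/tmp/guest.kallsyms", "/tmp/guest.modules")
print(shlex.join(argv))
```

The printed command can be pasted into a root shell on the host, or the
list can be passed to subprocess.run() directly.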

