All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle
@ 2011-08-26 23:31 Scott Wood
  2011-09-05 22:30 ` Alexander Graf
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Scott Wood @ 2011-08-26 23:31 UTC (permalink / raw)
  To: kvm-ppc

KVM will respond to setting MSR[WE], no matter what is in HID0.

With KVM we also know it will take effect right away, so no need for
a loop, TLF_NAPPING, etc.

KVM paravirt's initcall is moved before SMP init, to avoid any chance
of a race when updating the idle loop handler.

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/kernel/kvm.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index eb95a03..5e13500 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -30,6 +30,7 @@
 #include <asm/cacheflush.h>
 #include <asm/disassemble.h>
 #include <asm/ppc-opcode.h>
+#include <asm/machdep.h>
 
 #define KVM_MAGIC_PAGE		(-4096L)
 #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
@@ -774,6 +775,13 @@ static __init void kvm_free_tmp(void)
 	}
 }
 
+#ifdef CONFIG_E500
+static void kvm_msrwe_idle(void)
+{
+	mtmsr(mfmsr() | MSR_WE | MSR_EE);
+}
+#endif
+
 static int __init kvm_guest_init(void)
 {
 	if (!kvm_para_available())
@@ -789,6 +797,17 @@ static int __init kvm_guest_init(void)
 	/* Enable napping */
 	powersave_nap = 1;
 #endif
+#ifdef CONFIG_E500
+	/*
+	 * Skip the overhead of HID0 accesses that KVM ignores --
+	 * just write MSR[WE].
+	 *
+	 * We don't need _TLF_NAPPING, because under KVM we know
+	 * it will take effect right away.
+	 */
+	if (ppc_md.power_save = e500_idle)
+		ppc_md.power_save = kvm_msrwe_idle;
+#endif
 
 free_tmp:
 	kvm_free_tmp();
@@ -796,4 +815,4 @@ free_tmp:
 	return 0;
 }
 
-postcore_initcall(kvm_guest_init);
+early_initcall(kvm_guest_init);
-- 
1.7.6



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle
  2011-08-26 23:31 [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle Scott Wood
@ 2011-09-05 22:30 ` Alexander Graf
  2011-09-15 21:36 ` Scott Wood
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Alexander Graf @ 2011-09-05 22:30 UTC (permalink / raw)
  To: kvm-ppc


On 27.08.2011, at 01:31, Scott Wood wrote:

> KVM will respond to setting MSR[WE], no matter what is in HID0.
> 
> With KVM we also know it will take effect right away, so no need for
> a loop, TLF_NAPPING, etc.
> 
> KVM paravirt's initcall is moved before SMP init, to avoid any chance
> of a race when updating the idle loop handler.
> 
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> arch/powerpc/kernel/kvm.c |   21 ++++++++++++++++++++-
> 1 files changed, 20 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
> index eb95a03..5e13500 100644
> --- a/arch/powerpc/kernel/kvm.c
> +++ b/arch/powerpc/kernel/kvm.c
> @@ -30,6 +30,7 @@
> #include <asm/cacheflush.h>
> #include <asm/disassemble.h>
> #include <asm/ppc-opcode.h>
> +#include <asm/machdep.h>
> 
> #define KVM_MAGIC_PAGE		(-4096L)
> #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
> @@ -774,6 +775,13 @@ static __init void kvm_free_tmp(void)
> 	}
> }
> 
> +#ifdef CONFIG_E500
> +static void kvm_msrwe_idle(void)
> +{
> +	mtmsr(mfmsr() | MSR_WE | MSR_EE);
> +}
> +#endif
> +
> static int __init kvm_guest_init(void)
> {
> 	if (!kvm_para_available())
> @@ -789,6 +797,17 @@ static int __init kvm_guest_init(void)
> 	/* Enable napping */
> 	powersave_nap = 1;
> #endif
> +#ifdef CONFIG_E500
> +	/*
> +	 * Skip the overhead of HID0 accesses that KVM ignores --
> +	 * just write MSR[WE].
> +	 *
> +	 * We don't need _TLF_NAPPING, because under KVM we know
> +	 * it will take effect right away.
> +	 */
> +	if (ppc_md.power_save = e500_idle)
> +		ppc_md.power_save = kvm_msrwe_idle;

Why the if() here?


Alex


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle
  2011-08-26 23:31 [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle Scott Wood
  2011-09-05 22:30 ` Alexander Graf
@ 2011-09-15 21:36 ` Scott Wood
  2011-09-19  9:33 ` Alexander Graf
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Scott Wood @ 2011-09-15 21:36 UTC (permalink / raw)
  To: kvm-ppc

On 09/05/2011 05:30 PM, Alexander Graf wrote:
> 
> On 27.08.2011, at 01:31, Scott Wood wrote:
> 
>> +#ifdef CONFIG_E500
>> +	/*
>> +	 * Skip the overhead of HID0 accesses that KVM ignores --
>> +	 * just write MSR[WE].
>> +	 *
>> +	 * We don't need _TLF_NAPPING, because under KVM we know
>> +	 * it will take effect right away.
>> +	 */
>> +	if (ppc_md.power_save = e500_idle)
>> +		ppc_md.power_save = kvm_msrwe_idle;
> 
> Why the if() here?

To avoid replacing some other power_save() implementation.
kvm_msrwe_idle() is a paravirt-optimized version of e500_idle().

However, now that e500_idle has an ifdef for e500mc, we'll need that
ifdef here as well.  e500mc doesn't use MSR[WE] (and if it did, we
couldn't trap on it).  For e500mc we'll want to make an hcall for idle
(ePAPR EV_IDLE).

-Scott


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle
  2011-08-26 23:31 [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle Scott Wood
  2011-09-05 22:30 ` Alexander Graf
  2011-09-15 21:36 ` Scott Wood
@ 2011-09-19  9:33 ` Alexander Graf
  2011-09-19 18:46 ` Scott Wood
  2011-09-19 18:50 ` Alexander Graf
  4 siblings, 0 replies; 6+ messages in thread
From: Alexander Graf @ 2011-09-19  9:33 UTC (permalink / raw)
  To: kvm-ppc


On 15.09.2011, at 23:36, Scott Wood wrote:

> On 09/05/2011 05:30 PM, Alexander Graf wrote:
>> 
>> On 27.08.2011, at 01:31, Scott Wood wrote:
>> 
>>> +#ifdef CONFIG_E500
>>> +	/*
>>> +	 * Skip the overhead of HID0 accesses that KVM ignores --
>>> +	 * just write MSR[WE].
>>> +	 *
>>> +	 * We don't need _TLF_NAPPING, because under KVM we know
>>> +	 * it will take effect right away.
>>> +	 */
>>> +	if (ppc_md.power_save = e500_idle)
>>> +		ppc_md.power_save = kvm_msrwe_idle;
>> 
>> Why the if() here?
> 
> To avoid replacing some other power_save() implementation.
> kvm_msrwe_idle() is a paravirt-optimized version of e500_idle().
> 
> However, now that e500_idle has an ifdef for e500mc, we'll need that
> ifdef here as well.  e500mc doesn't use MSR[WE] (and if it did, we
> couldn't trap on it).  For e500mc we'll want to make an hcall for idle
> (ePAPR EV_IDLE).

Since we're PV'ing here either way, can't we simply define a generic power_save implementation that works across different CPU types? We could for example always use EV_IDLE.


Alex


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle
  2011-08-26 23:31 [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle Scott Wood
                   ` (2 preceding siblings ...)
  2011-09-19  9:33 ` Alexander Graf
@ 2011-09-19 18:46 ` Scott Wood
  2011-09-19 18:50 ` Alexander Graf
  4 siblings, 0 replies; 6+ messages in thread
From: Scott Wood @ 2011-09-19 18:46 UTC (permalink / raw)
  To: kvm-ppc

On 09/19/2011 04:33 AM, Alexander Graf wrote:
> 
> On 15.09.2011, at 23:36, Scott Wood wrote:
> 
>> On 09/05/2011 05:30 PM, Alexander Graf wrote:
>>>
>>> On 27.08.2011, at 01:31, Scott Wood wrote:
>>>
>>>> +#ifdef CONFIG_E500
>>>> +	/*
>>>> +	 * Skip the overhead of HID0 accesses that KVM ignores --
>>>> +	 * just write MSR[WE].
>>>> +	 *
>>>> +	 * We don't need _TLF_NAPPING, because under KVM we know
>>>> +	 * it will take effect right away.
>>>> +	 */
>>>> +	if (ppc_md.power_save = e500_idle)
>>>> +		ppc_md.power_save = kvm_msrwe_idle;
>>>
>>> Why the if() here?
>>
>> To avoid replacing some other power_save() implementation.
>> kvm_msrwe_idle() is a paravirt-optimized version of e500_idle().
>>
>> However, now that e500_idle has an ifdef for e500mc, we'll need that
>> ifdef here as well.  e500mc doesn't use MSR[WE] (and if it did, we
>> couldn't trap on it).  For e500mc we'll want to make an hcall for idle
>> (ePAPR EV_IDLE).
> 
> Since we're PV'ing here either way, can't we simply define a generic power_save implementation that works across different CPU types? We could for example always use EV_IDLE.

I suppose so, though I'd be more comfortable knowing what it is we're
substituting for.  E.g. maybe some no-op was put in power_save() because
it was determined that the wake latency was too high relative to the
workload.  Not something you'd usually expect with virtualization, but
you never know.  There might be one guest that is meant to be
prioritized and will be doing a lot of I/O, and the desire is to bunch
up the time when other things run.

OTOH, detecting this by what function was put into power_save won't help
if for similar reasons someone wants to run an actual wait instruction
rather than trap to KVM.  So if such a case does come up, it should be
via some explicit configuration that tells KVM to leave it alone.

In practice, with the current code, I think e500_idle is the only thing
that will show up in power_save if CONFIG_E500 is defined.

-Scott


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle
  2011-08-26 23:31 [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle Scott Wood
                   ` (3 preceding siblings ...)
  2011-09-19 18:46 ` Scott Wood
@ 2011-09-19 18:50 ` Alexander Graf
  4 siblings, 0 replies; 6+ messages in thread
From: Alexander Graf @ 2011-09-19 18:50 UTC (permalink / raw)
  To: kvm-ppc


On 19.09.2011, at 20:46, Scott Wood wrote:

> On 09/19/2011 04:33 AM, Alexander Graf wrote:
>> 
>> On 15.09.2011, at 23:36, Scott Wood wrote:
>> 
>>> On 09/05/2011 05:30 PM, Alexander Graf wrote:
>>>> 
>>>> On 27.08.2011, at 01:31, Scott Wood wrote:
>>>> 
>>>>> +#ifdef CONFIG_E500
>>>>> +	/*
>>>>> +	 * Skip the overhead of HID0 accesses that KVM ignores --
>>>>> +	 * just write MSR[WE].
>>>>> +	 *
>>>>> +	 * We don't need _TLF_NAPPING, because under KVM we know
>>>>> +	 * it will take effect right away.
>>>>> +	 */
>>>>> +	if (ppc_md.power_save = e500_idle)
>>>>> +		ppc_md.power_save = kvm_msrwe_idle;
>>>> 
>>>> Why the if() here?
>>> 
>>> To avoid replacing some other power_save() implementation.
>>> kvm_msrwe_idle() is a paravirt-optimized version of e500_idle().
>>> 
>>> However, now that e500_idle has an ifdef for e500mc, we'll need that
>>> ifdef here as well.  e500mc doesn't use MSR[WE] (and if it did, we
>>> couldn't trap on it).  For e500mc we'll want to make an hcall for idle
>>> (ePAPR EV_IDLE).
>> 
>> Since we're PV'ing here either way, can't we simply define a generic power_save implementation that works across different CPU types? We could for example always use EV_IDLE.
> 
> I suppose so, though I'd be more comfortable knowing what it is we're
> substituting for.  E.g. maybe some no-op was put in power_save() because
> it was determined that the wake latency was too high relative to the
> workload.  Not something you'd usually expect with virtualization, but
> you never know.  There might be one guest that is meant to be
> prioritized and will be doing a lot of I/O, and the desire is to bunch
> up the time when other things run.

But at that point the guest needs to be modified too, no? Or is there a command line parameter that specifies which idle function to use? If so, we maybe want to read that one out to see if it's set manually and only change the default behavior?

> OTOH, detecting this by what function was put into power_save won't help
> if for similar reasons someone wants to run an actual wait instruction
> rather than trap to KVM.  So if such a case does come up, it should be
> via some explicit configuration that tells KVM to leave it alone.
> 
> In practice, with the current code, I think e500_idle is the only thing
> that will show up in power_save if CONFIG_E500 is defined.

That's what I assumed, yes. If we need something different, we better make it solid on all edges :)


Alex


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-09-19 18:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-26 23:31 [PATCH 4/5] KVM: PPC: e500: eliminate a trap when entering idle Scott Wood
2011-09-05 22:30 ` Alexander Graf
2011-09-15 21:36 ` Scott Wood
2011-09-19  9:33 ` Alexander Graf
2011-09-19 18:46 ` Scott Wood
2011-09-19 18:50 ` Alexander Graf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.