From: Mario Smarduch <m.smarduch@samsung.com>
To: Marc Zyngier <marc.zyngier@arm.com>,
	Christoffer Dall <christoffer.dall@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: Re: [PATCH v3 07/22] arm64: KVM: Implement system register save/restore
Date: Sat, 12 Dec 2015 20:56:25 -0800
Message-ID: <566CFA79.6060409@samsung.com>
In-Reply-To: <566B15ED.5030906@arm.com>



On 12/11/2015 10:29 AM, Marc Zyngier wrote:
> Hi Mario,
> 
> On 11/12/15 03:24, Mario Smarduch wrote:
>> Hi Marc,
>>
>> On 12/7/2015 2:53 AM, Marc Zyngier wrote:
>>> Implement the system register save/restore as a direct translation of
>>> the assembly code version.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
>>> ---
>>>  arch/arm64/kvm/hyp/Makefile    |  1 +
>>>  arch/arm64/kvm/hyp/hyp.h       |  3 ++
>>>  arch/arm64/kvm/hyp/sysreg-sr.c | 90 ++++++++++++++++++++++++++++++++++++++++++
>>>  3 files changed, 94 insertions(+)
>>>  create mode 100644 arch/arm64/kvm/hyp/sysreg-sr.c
>>>
>>> diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
>>> index 455dc0a..ec94200 100644
>>> --- a/arch/arm64/kvm/hyp/Makefile
>>> +++ b/arch/arm64/kvm/hyp/Makefile
>>> @@ -5,3 +5,4 @@
>>>  obj-$(CONFIG_KVM_ARM_HOST) += vgic-v2-sr.o
>>>  obj-$(CONFIG_KVM_ARM_HOST) += vgic-v3-sr.o
>>>  obj-$(CONFIG_KVM_ARM_HOST) += timer-sr.o
>>> +obj-$(CONFIG_KVM_ARM_HOST) += sysreg-sr.o
>>> diff --git a/arch/arm64/kvm/hyp/hyp.h b/arch/arm64/kvm/hyp/hyp.h
>>> index f213e46..778d56d 100644
>>> --- a/arch/arm64/kvm/hyp/hyp.h
>>> +++ b/arch/arm64/kvm/hyp/hyp.h
>>> @@ -38,5 +38,8 @@ void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
>>>  void __timer_save_state(struct kvm_vcpu *vcpu);
>>>  void __timer_restore_state(struct kvm_vcpu *vcpu);
>>>  
>>> +void __sysreg_save_state(struct kvm_cpu_context *ctxt);
>>> +void __sysreg_restore_state(struct kvm_cpu_context *ctxt);
>>> +
>>>  #endif /* __ARM64_KVM_HYP_H__ */
>>>  
>>> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
>>> new file mode 100644
>>> index 0000000..add8fcb
>>> --- /dev/null
>>> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
>>> @@ -0,0 +1,90 @@
>>> +/*
>>> + * Copyright (C) 2012-2015 - ARM Ltd
>>> + * Author: Marc Zyngier <marc.zyngier@arm.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#include <linux/compiler.h>
>>> +#include <linux/kvm_host.h>
>>> +
>>> +#include <asm/kvm_mmu.h>
>>> +
>>> +#include "hyp.h"
>>> +
>>
>> I looked more closely at some other ways to get better performance out of
>> the compiler. This code sequence performs about 35% faster for
>> __sysreg_save_state(..): over 5000 exits you save about 500 us, or 100 ns
>> per exit. This is on Juno.
> 
> 35% faster? Really? That's pretty crazy. Was that on the A57 or the A53?

Good question. I bound kvmtool to cpu1, which I think is an A57.
> 
>>
>> register int volatile count asm("r2") = 0;

I meant x2, but this compiles with the aarch64 compiler and runs on Juno. It
looks like the compiler may have an issue.

> 
> Does this even work on arm64? We don't have an "r2" register...
> 
>>
>> do {
>> ....
>> } while(count);
>>
>> I didn't test the restore function (ran out of time) but I suspect it should be
>> the same. The assembler pretty much uses all the GPRs, (a little too many, using
>> stp to push 4 pairs on the stack and restore) looking at the assembler it all
>> should execute out of order.
> 
> Are you talking about the original implementation here? or the generated
> code out of the compiler? The original implementation didn't push
> anything on the stack (apart from the prologue, but we have the same
> thing in the C implementation).

This is the compiler-generated code for the do { ... } while version.
> 
> Looking at the compiler output, we have a bunch of mrs/str, one after
> the other - pretty basic. Maybe that gives the CPU some "breathing"
> time, but I have no idea if that's more or less efficient.
> 
> But the main thing is that we can now rely on the compiler to generate
> something that is more or less optimized for a given platform if there
> is such a requirement. We go from something that was cast in stone to
> something that has some degree of flexibility.

Yes, definitely. The do { ... } while version emits a bunch of mrs followed by
a bunch of str, which probably allows out-of-order execution by eliminating
the write-after-read dependency.
Right now I don't know which compiler option leads to this optimization.
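
A minimal userspace sketch of the two shapes being compared, for illustration
only. The names sim_sysregs, struct sim_ctxt, save_state_plain(),
save_state_loop() and saves_match() are hypothetical stand-ins: the real code
reads system registers with mrs and stores into struct kvm_cpu_context, and
this sketch uses a plain volatile int instead of pinning count to a register.
Whether a given compiler actually groups the reads ahead of the stores for the
loop shape depends on its version and flags; the sketch only shows that both
shapes save the same state.

```c
#include <assert.h>	/* for the caller-side check in the usage note */
#include <stdint.h>
#include <string.h>

#define NR_SIM_REGS 4

/* Hypothetical stand-in for the hardware system registers. */
static volatile uint64_t sim_sysregs[NR_SIM_REGS];

/* Hypothetical stand-in for the sys_regs array in struct kvm_cpu_context. */
struct sim_ctxt {
	uint64_t sys_regs[NR_SIM_REGS];
};

/* Straight-line save: each register read is written next to its store. */
static void save_state_plain(struct sim_ctxt *ctxt)
{
	ctxt->sys_regs[0] = sim_sysregs[0];
	ctxt->sys_regs[1] = sim_sysregs[1];
	ctxt->sys_regs[2] = sim_sysregs[2];
	ctxt->sys_regs[3] = sim_sysregs[3];
}

/* Same body wrapped in a do/while loop that runs exactly once. */
static void save_state_loop(struct sim_ctxt *ctxt)
{
	volatile int count = 0;	/* always zero, so the loop never repeats */

	do {
		ctxt->sys_regs[0] = sim_sysregs[0];
		ctxt->sys_regs[1] = sim_sysregs[1];
		ctxt->sys_regs[2] = sim_sysregs[2];
		ctxt->sys_regs[3] = sim_sysregs[3];
	} while (count);
}

/* Returns 1 if both variants capture identical state, 0 otherwise. */
static int saves_match(void)
{
	struct sim_ctxt a, b;
	int i;

	for (i = 0; i < NR_SIM_REGS; i++)
		sim_sysregs[i] = 0x1000u + (uint64_t)i;

	memset(&a, 0x00, sizeof(a));
	memset(&b, 0xff, sizeof(b));
	save_state_plain(&a);
	save_state_loop(&b);

	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Calling assert(saves_match()) only checks functional equivalence. To see
whether the instruction order really differs, compare the objdump -d output of
the two functions at the optimization level of interest.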
> 
>>
>> FWIW I gave this a try since compilers like to optimize loops. I used
>> 'cntpct_el0' counter register to measure the intervals.
> 
> It'd be nice to have a measure in terms of cycle, but that's a good
> first approximation.
Will do that in the future. This series performs no worse than the assembler
one, and the big change is the clean C code and its other advantages.
Optimizations could be deferred to future revisions.
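
A rough sketch of counter-based interval measurement, under stated
assumptions. The helpers read_ticks(), read_tick_freq() and ticks_to_ns() are
hypothetical names, not kernel APIs. Hyp code at EL2 can read cntpct_el0
directly; this userspace sketch reads cntvct_el0 instead, on the assumption
that Linux does not expose the physical counter to EL0, and falls back to
clock_gettime() on other architectures so it still builds there. It reports
time in counter ticks converted to nanoseconds, not CPU cycles.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() under strict -std */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a monotonically increasing tick counter. */
static inline uint64_t read_ticks(void)
{
#if defined(__aarch64__)
	uint64_t val;

	/* isb keeps the counter read from being hoisted above earlier code */
	__asm__ __volatile__("isb\n\tmrs %0, cntvct_el0"
			     : "=r" (val) : : "memory");
	return val;
#else
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Tick frequency in Hz for the counter used by read_ticks(). */
static inline uint64_t read_tick_freq(void)
{
#if defined(__aarch64__)
	uint64_t val;

	__asm__ __volatile__("mrs %0, cntfrq_el0" : "=r" (val));
	return val;
#else
	return 1000000000ull;	/* fallback counter already counts ns */
#endif
}

/* Convert a tick delta to nanoseconds without overflowing 64 bits. */
static inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq)
{
	assert(freq != 0);
	return (ticks / freq) * 1000000000ull +
	       (ticks % freq) * 1000000000ull / freq;
}
```

Usage would be along the lines of: t0 = read_ticks(); run the code under test;
t1 = read_ticks(); then ticks_to_ns(t1 - t0, read_tick_freq()). With a coarse
counter, a single save is only a few ticks long, so averaging over many exits
is needed before the numbers mean much.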

> 
> Thanks,
> 
> 	M.
> 
