* [kvm-unit-tests PATCH] arm64: microbench: Benchmark with virtual instead of physical timer
@ 2023-08-23 20:04 Colton Lewis
2023-08-24 7:29 ` Andrew Jones
0 siblings, 1 reply; 4+ messages in thread
From: Colton Lewis @ 2023-08-23 20:04 UTC (permalink / raw)
To: kvm
Cc: Andrew Jones, Alexandru Elisei, Eric Auger, Oliver Upton, kvmarm,
Colton Lewis
Use the virtual instead of the physical timer for measuring the time
taken to execute the microbenchmark.
Internal testing discovered a performance regression on this test
starting with Linux commit 680232a94c12 "KVM: arm64: timers: Allow
save/restoring of the physical timer". Oliver Upton speculates QEMU is
changing the guest physical counter to have a nonzero offset since it
gained the ability as of that commit. As a consequence KVM is
trap-and-emulating here on architectures without FEAT_ECV.
While this isn't a correctness issue, the trap-and-emulate overhead of
physical counter emulation on systems without ECV leads to surprising
microbenchmark results.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arm/micro-bench.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arm/micro-bench.c b/arm/micro-bench.c
index bfd181dc..fbe59d03 100644
--- a/arm/micro-bench.c
+++ b/arm/micro-bench.c
@@ -348,10 +348,10 @@ static void loop_test(struct exit_test *test)
while (ntimes < test->times && total_ns.ns < NS_5_SECONDS) {
isb();
- start = read_sysreg(cntpct_el0);
+ start = read_sysreg(cntvct_el0);
test->exec();
isb();
- end = read_sysreg(cntpct_el0);
+ end = read_sysreg(cntvct_el0);
ntimes++;
total_ticks += (end - start);
--
2.42.0.rc1.204.g551eb34607-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [kvm-unit-tests PATCH] arm64: microbench: Benchmark with virtual instead of physical timer
2023-08-23 20:04 [kvm-unit-tests PATCH] arm64: microbench: Benchmark with virtual instead of physical timer Colton Lewis
@ 2023-08-24 7:29 ` Andrew Jones
2023-08-24 7:47 ` Andrew Jones
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Jones @ 2023-08-24 7:29 UTC (permalink / raw)
To: Colton Lewis; +Cc: kvm, Alexandru Elisei, Eric Auger, Oliver Upton, kvmarm
On Wed, Aug 23, 2023 at 08:04:08PM +0000, Colton Lewis wrote:
> Use the virtual instead of the physical timer for measuring the time
> taken to execute the microbenchmark.
>
> Internal testing discovered a performance regression on this test
> starting with Linux commit 680232a94c12 "KVM: arm64: timers: Allow
> save/restoring of the physical timer". Oliver Upton speculates QEMU is
> changing the guest physical counter to have a nonzero offset since it
> gained the ability as of that commit. As a consequence KVM is
> trap-and-emulating here on architectures without FEAT_ECV.
No need to speculate, QEMU is open source :-) QEMU is setting on offset,
but not because it explicitly wants to. Simply reading the CNT register
and writing back the same value is enough to set an offset, since the
timer will have certainly moved past whatever value was read by the time
it's written. QEMU frequently saves and restores all registers in the
get-reg-list array, unless they've been explicitly filtered out (with
Linux commit 680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array).
So, to restore trapless ptimer accesses, we need a QEMU patch like below
to filter out the register.
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 94bbd9661fd3..f89ea31f170d 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -674,6 +674,7 @@ typedef struct CPRegStateLevel {
*/
static const CPRegStateLevel non_runtime_cpregs[] = {
{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
+ { KVM_REG_ARM_PTIMER_CNT, KVM_PUT_FULL_STATE },
};
int kvm_arm_cpreg_level(uint64_t regidx)
>
> While this isn't a correctness issue, the trap-and-emulate overhead of
> physical counter emulation on systems without ECV leads to surprising
> microbenchmark results.
>
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
> arm/micro-bench.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> index bfd181dc..fbe59d03 100644
> --- a/arm/micro-bench.c
> +++ b/arm/micro-bench.c
> @@ -348,10 +348,10 @@ static void loop_test(struct exit_test *test)
>
> while (ntimes < test->times && total_ns.ns < NS_5_SECONDS) {
> isb();
> - start = read_sysreg(cntpct_el0);
> + start = read_sysreg(cntvct_el0);
> test->exec();
> isb();
> - end = read_sysreg(cntpct_el0);
> + end = read_sysreg(cntvct_el0);
This patch should be merged too, though, to avoid hitting the issue on
certain versions of QEMU. And, it's pretty clear that the original
authors just picked the ptimer arbitrarily.
>
> ntimes++;
> total_ticks += (end - start);
> --
> 2.42.0.rc1.204.g551eb34607-goog
Queued.
Thanks,
drew
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [kvm-unit-tests PATCH] arm64: microbench: Benchmark with virtual instead of physical timer
2023-08-24 7:29 ` Andrew Jones
@ 2023-08-24 7:47 ` Andrew Jones
2023-08-24 18:46 ` Colton Lewis
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Jones @ 2023-08-24 7:47 UTC (permalink / raw)
To: Colton Lewis
Cc: kvm, Alexandru Elisei, Eric Auger, Oliver Upton, kvmarm, peter.maydell
On Thu, Aug 24, 2023 at 09:29:33AM +0200, Andrew Jones wrote:
> On Wed, Aug 23, 2023 at 08:04:08PM +0000, Colton Lewis wrote:
> > Use the virtual instead of the physical timer for measuring the time
> > taken to execute the microbenchmark.
> >
> > Internal testing discovered a performance regression on this test
> > starting with Linux commit 680232a94c12 "KVM: arm64: timers: Allow
> > save/restoring of the physical timer". Oliver Upton speculates QEMU is
> > changing the guest physical counter to have a nonzero offset since it
> > gained the ability as of that commit. As a consequence KVM is
> > trap-and-emulating here on architectures without FEAT_ECV.
>
> No need to speculate, QEMU is open source :-) QEMU is setting on offset,
> but not because it explicitly wants to. Simply reading the CNT register
> and writing back the same value is enough to set an offset, since the
> timer will have certainly moved past whatever value was read by the time
> it's written. QEMU frequently saves and restores all registers in the
> get-reg-list array, unless they've been explicitly filtered out (with
> Linux commit 680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array).
> So, to restore trapless ptimer accesses, we need a QEMU patch like below
> to filter out the register.
>
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 94bbd9661fd3..f89ea31f170d 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -674,6 +674,7 @@ typedef struct CPRegStateLevel {
> */
> static const CPRegStateLevel non_runtime_cpregs[] = {
> { KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
> + { KVM_REG_ARM_PTIMER_CNT, KVM_PUT_FULL_STATE },
> };
>
> int kvm_arm_cpreg_level(uint64_t regidx)
>
>
> >
> > While this isn't a correctness issue, the trap-and-emulate overhead of
Actually, the QEMU fix above is necessary for more than just trap
avoidance. The ptimer will lag more and more with respect to other time
sources, even with respect to the vtimer (QEMU only updates the vtimer
offset when the VM is "paused".)
I'm not sure when I'll have time to test and post the QEMU patch. It may
be better if somebody else picks it up.
Thanks,
drew
> > physical counter emulation on systems without ECV leads to surprising
> > microbenchmark results.
> >
> > Signed-off-by: Colton Lewis <coltonlewis@google.com>
> > ---
> > arm/micro-bench.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> > index bfd181dc..fbe59d03 100644
> > --- a/arm/micro-bench.c
> > +++ b/arm/micro-bench.c
> > @@ -348,10 +348,10 @@ static void loop_test(struct exit_test *test)
> >
> > while (ntimes < test->times && total_ns.ns < NS_5_SECONDS) {
> > isb();
> > - start = read_sysreg(cntpct_el0);
> > + start = read_sysreg(cntvct_el0);
> > test->exec();
> > isb();
> > - end = read_sysreg(cntpct_el0);
> > + end = read_sysreg(cntvct_el0);
>
> This patch should be merged too, though, to avoid hitting the issue on
> certain versions of QEMU. And, it's pretty clear that the original
> authors just picked the ptimer arbitrarily.
>
> >
> > ntimes++;
> > total_ticks += (end - start);
> > --
> > 2.42.0.rc1.204.g551eb34607-goog
>
> Queued.
>
> Thanks,
> drew
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [kvm-unit-tests PATCH] arm64: microbench: Benchmark with virtual instead of physical timer
2023-08-24 7:47 ` Andrew Jones
@ 2023-08-24 18:46 ` Colton Lewis
0 siblings, 0 replies; 4+ messages in thread
From: Colton Lewis @ 2023-08-24 18:46 UTC (permalink / raw)
To: Andrew Jones
Cc: kvm, alexandru.elisei, eric.auger, oliver.upton, kvmarm, peter.maydell
Andrew Jones <andrew.jones@linux.dev> writes:
> On Thu, Aug 24, 2023 at 09:29:33AM +0200, Andrew Jones wrote:
>> No need to speculate, QEMU is open source :-) QEMU is setting on offset,
>> but not because it explicitly wants to. Simply reading the CNT register
>> and writing back the same value is enough to set an offset, since the
>> timer will have certainly moved past whatever value was read by the time
>> it's written. QEMU frequently saves and restores all registers in the
>> get-reg-list array, unless they've been explicitly filtered out (with
>> Linux commit 680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array).
>> So, to restore trapless ptimer accesses, we need a QEMU patch like below
>> to filter out the register.
>> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
>> index 94bbd9661fd3..f89ea31f170d 100644
>> --- a/target/arm/kvm64.c
>> +++ b/target/arm/kvm64.c
>> @@ -674,6 +674,7 @@ typedef struct CPRegStateLevel {
>> */
>> static const CPRegStateLevel non_runtime_cpregs[] = {
>> { KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
>> + { KVM_REG_ARM_PTIMER_CNT, KVM_PUT_FULL_STATE },
>> };
>> int kvm_arm_cpreg_level(uint64_t regidx)
> Actually, the QEMU fix above is necessary for more than just trap
> avoidance. The ptimer will lag more and more with respect to other time
> sources, even with respect to the vtimer (QEMU only updates the vtimer
> offset when the VM is "paused".)
> I'm not sure when I'll have time to test and post the QEMU patch. It may
> be better if somebody else picks it up.
I haven't posted any QEMU patches before, but I think I can handle this.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-08-24 18:46 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-23 20:04 [kvm-unit-tests PATCH] arm64: microbench: Benchmark with virtual instead of physical timer Colton Lewis
2023-08-24 7:29 ` Andrew Jones
2023-08-24 7:47 ` Andrew Jones
2023-08-24 18:46 ` Colton Lewis
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).