* [Qemu-devel] [PATCH 0/3] target/arm: Reduce tlb_flush overhead
@ 2018-10-18 18:27 Richard Henderson
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 1/3] target/arm: Remove writefn from TTBR0_EL3 Richard Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Richard Henderson @ 2018-10-18 18:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

While installing AArch64 Ubuntu into a new VM,
I happened to notice that tlb_flush+memset was
consuming 25% of the total runtime.

This patch set reduces that overhead to 10%.
With the installation paused at the first menu,
full TLB flushes are down from 1.8M to 11k.


r~


Richard Henderson (3):
  target/arm: Remove writefn from TTBR0_EL3
  target/arm: Only flush tlb if ASID changes
  target/arm: Flush only the TLBs affected by TTBR*_EL1

 target/arm/helper.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

-- 
2.17.2

* [Qemu-devel] [PATCH 1/3] target/arm: Remove writefn from TTBR0_EL3
  2018-10-18 18:27 [Qemu-devel] [PATCH 0/3] target/arm: Reduce tlb_flush overhead Richard Henderson
@ 2018-10-18 18:27 ` Richard Henderson
  2018-10-18 20:28   ` Aaron Lindsay
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 2/3] target/arm: Only flush tlb if ASID changes Richard Henderson
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1 Richard Henderson
  2 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2018-10-18 18:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The EL3 version of this register does not include an ASID,
and so the tlb_flush performed by vmsa_ttbr_write is not needed.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index e3946562aa..24bbde4f76 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4214,7 +4214,7 @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.mvbar) },
     { .name = "TTBR0_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL3_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL3_RW, .resetvalue = 0,
       .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[3]) },
     { .name = "TCR_EL3", .state = ARM_CP_STATE_AA64,
       .opc0 = 3, .opc1 = 6, .crn = 2, .crm = 0, .opc2 = 2,
-- 
2.17.2

* [Qemu-devel] [PATCH 2/3] target/arm: Only flush tlb if ASID changes
  2018-10-18 18:27 [Qemu-devel] [PATCH 0/3] target/arm: Reduce tlb_flush overhead Richard Henderson
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 1/3] target/arm: Remove writefn from TTBR0_EL3 Richard Henderson
@ 2018-10-18 18:27 ` Richard Henderson
  2018-10-18 20:28   ` Aaron Lindsay
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1 Richard Henderson
  2 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2018-10-18 18:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Since QEMU does not implement ASIDs, changes to the ASID must flush the
tlb.  However, if the ASID does not change there is no reason to flush.

In testing a boot of the Ubuntu installer to the first menu, this reduces
the number of flushes by 30%, or nearly 600k instances.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 24bbde4f76..ed70ac645e 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2709,12 +2709,10 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
-    /* 64 bit accesses to the TTBRs can change the ASID and so we
-     * must flush the TLB.
-     */
-    if (cpreg_field_is_64bit(ri)) {
+    /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
+    if (cpreg_field_is_64bit(ri) &&
+        extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
         ARMCPU *cpu = arm_env_get_cpu(env);
-
         tlb_flush(CPU(cpu));
     }
     raw_write(env, ri, value);
-- 
2.17.2
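
As a standalone illustration of the check in the hunk above, the following C sketch XORs the old and new register values and tests whether any of bits [63:48] differ. It is a minimal model, not QEMU code: extract64_sketch() is a local stand-in written to behave like QEMU's extract64(), and ttbr_asid_changed() is a hypothetical helper name introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Local stand-in for QEMU's extract64(): return 'length' bits of
 * 'value', starting at bit 'start'.  Written here so the example
 * compiles on its own.
 */
static uint64_t extract64_sketch(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * Hypothetical helper: true if a 64-bit TTBR write would change the
 * ASID field, taken to be bits [63:48] as in the patch.  XORing old
 * and new leaves a 1 in every bit position that differs, so a nonzero
 * field in the XOR result means the ASID changed.
 */
static bool ttbr_asid_changed(uint64_t old_ttbr, uint64_t new_ttbr)
{
    return extract64_sketch(old_ttbr ^ new_ttbr, 48, 16) != 0;
}
```

With this model, a write that changes only the table base address (low bits) reports no ASID change and would skip the flush, while a write that changes any of the top 16 bits reports a change.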

* [Qemu-devel] [PATCH 3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1
  2018-10-18 18:27 [Qemu-devel] [PATCH 0/3] target/arm: Reduce tlb_flush overhead Richard Henderson
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 1/3] target/arm: Remove writefn from TTBR0_EL3 Richard Henderson
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 2/3] target/arm: Only flush tlb if ASID changes Richard Henderson
@ 2018-10-18 18:27 ` Richard Henderson
  2018-10-18 20:27   ` Aaron Lindsay
  2 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2018-10-18 18:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Only the EL0 and EL1 TLBs are affected by the EL1 register,
so flush only 2 of the 8 TLBs.

In testing a boot of the Ubuntu installer to the first menu, this
accounts for nearly all of the full tlb flushes: all but 11k of
the 1.2M instances without the patch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index ed70ac645e..a943e91666 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tcr->raw_tcr = value;
 }
 
-static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
+static void vmsa_ttbr1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                             uint64_t value)
 {
     /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
     if (cpreg_field_is_64bit(ri) &&
         extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
         ARMCPU *cpu = arm_env_get_cpu(env);
-        tlb_flush(CPU(cpu));
+        tlb_flush_by_mmuidx(CPU(cpu),
+                            ARMMMUIdxBit_S12NSE1 |
+                            ARMMMUIdxBit_S12NSE0);
     }
     raw_write(env, ri, value);
 }
@@ -2761,12 +2763,12 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
@@ -3018,12 +3020,12 @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr1_write, },
     { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr1_write, },
     REGINFO_SENTINEL
 };
 
-- 
2.17.2
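
To make the "2 of the 8 TLBs" point concrete, here is a toy C model of flushing by an MMU-index bitmask. Everything in it is an assumption for illustration: the ToyTLB type, the toy_* function names, the table size, and the use of bit positions 0 and 1 are invented here and are not the real values of ARMMMUIdxBit_S12NSE1 / ARMMMUIdxBit_S12NSE0 or the real layout of QEMU's TLB.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NB_MMU_MODES 8   /* the commit message speaks of 8 TLBs */
#define TOY_TLB_ENTRIES  4   /* arbitrary small size for the sketch */

/* Toy model: one small table per MMU index, plus a memset counter. */
typedef struct {
    uint64_t entry[TOY_NB_MMU_MODES][TOY_TLB_ENTRIES];
    unsigned long memset_count;
} ToyTLB;

/* Clear only the tables whose bit is set in idxmap. */
static void toy_tlb_flush_by_mmuidx(ToyTLB *tlb, uint16_t idxmap)
{
    assert(tlb != NULL);
    for (int i = 0; i < TOY_NB_MMU_MODES; i++) {
        if (idxmap & (1u << i)) {
            memset(tlb->entry[i], 0, sizeof(tlb->entry[i]));
            tlb->memset_count++;
        }
    }
}

/* A full flush is the same operation with every index bit set. */
static void toy_tlb_flush_all(ToyTLB *tlb)
{
    toy_tlb_flush_by_mmuidx(tlb, (1u << TOY_NB_MMU_MODES) - 1);
}
```

In this model a full flush performs 8 memsets while a flush of two indexes performs 2, which is the kind of saving the patch is after; the real cost per TLB in QEMU depends on its actual table sizes.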

* Re: [Qemu-devel] [PATCH 3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1 Richard Henderson
@ 2018-10-18 20:27   ` Aaron Lindsay
  2018-10-18 20:52     ` Richard Henderson
  0 siblings, 1 reply; 8+ messages in thread
From: Aaron Lindsay @ 2018-10-18 20:27 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell

On Oct 18 11:27, Richard Henderson wrote:
> @@ -2761,12 +2763,12 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
>        .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
>      { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
>        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
> -      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
> +      .access = PL1_RW, .writefn = vmsa_ttbr1_write, .resetvalue = 0,

It's a little confusing that vmsa_ttbr1_write is used for TTBR0_EL1. Is
the '1' indicating the EL instead of which TTBR is being used?

-Aaron

* Re: [Qemu-devel] [PATCH 1/3] target/arm: Remove writefn from TTBR0_EL3
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 1/3] target/arm: Remove writefn from TTBR0_EL3 Richard Henderson
@ 2018-10-18 20:28   ` Aaron Lindsay
  0 siblings, 0 replies; 8+ messages in thread
From: Aaron Lindsay @ 2018-10-18 20:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell

On Oct 18 11:27, Richard Henderson wrote:
> The EL3 version of this register does not include an ASID,
> and so the tlb_flush performed by vmsa_ttbr_write is not needed.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>

* Re: [Qemu-devel] [PATCH 2/3] target/arm: Only flush tlb if ASID changes
  2018-10-18 18:27 ` [Qemu-devel] [PATCH 2/3] target/arm: Only flush tlb if ASID changes Richard Henderson
@ 2018-10-18 20:28   ` Aaron Lindsay
  0 siblings, 0 replies; 8+ messages in thread
From: Aaron Lindsay @ 2018-10-18 20:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell

On Oct 18 11:27, Richard Henderson wrote:
> Since QEMU does not implement ASIDs, changes to the ASID must flush the
> tlb.  However, if the ASID does not change there is no reason to flush.
> 
> In testing a boot of the Ubuntu installer to the first menu, this reduces
> the number of flushes by 30%, or nearly 600k instances.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Aaron Lindsay <aaron@os.amperecomputing.com>

* Re: [Qemu-devel] [PATCH 3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1
  2018-10-18 20:27   ` Aaron Lindsay
@ 2018-10-18 20:52     ` Richard Henderson
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2018-10-18 20:52 UTC (permalink / raw)
  To: Aaron Lindsay; +Cc: qemu-devel, peter.maydell

On 10/18/18 1:27 PM, Aaron Lindsay wrote:
> On Oct 18 11:27, Richard Henderson wrote:
>> @@ -2761,12 +2763,12 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
>>        .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
>>      { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
>>        .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
>> -      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
>> +      .access = PL1_RW, .writefn = vmsa_ttbr1_write, .resetvalue = 0,
> 
> It's a little confusing that vmsa_ttbr1_write is used for TTBR0_EL1. Is
> the '1' indicating the EL instead of which TTBR is being used?

Yes.  Perhaps I should have included "_el" in the symbol for clarity.

I expect to add a different function (vmsa_ttbr_el2_write?) for TTBR{0,1}_EL2,
which will also check HCR_EL2.E2H, when I get around to implementing ARMv8.1-VHE.


r~
