* [PATCH 0/2] KVM powerpc tlbie scalability improvement @ 2018-04-05 17:56 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-05 17:56 UTC (permalink / raw) To: kvm-ppc; +Cc: Nicholas Piggin, linuxppc-dev, Paul Mackerras Any reason we still need to take the tlbie lock on modern processors? Nicholas Piggin (2): KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls arch/powerpc/kvm/book3s_hv_rm_mmu.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) -- 2.16.3 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 0/2] KVM powerpc tlbie scalability improvement @ 2018-04-05 17:56 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-05 17:56 UTC (permalink / raw) To: kvm-ppc; +Cc: Nicholas Piggin, linuxppc-dev, Paul Mackerras Any reason we still need to take the tlbie lock on modern processors? Nicholas Piggin (2): KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls arch/powerpc/kvm/book3s_hv_rm_mmu.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) -- 2.16.3 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-05 17:56 ` Nicholas Piggin @ 2018-04-05 17:56 ` Nicholas Piggin -1 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-05 17:56 UTC (permalink / raw) To: kvm-ppc; +Cc: Nicholas Piggin, linuxppc-dev, Paul Mackerras, Balbir Singh This crashes with a "Bad real address for load" attempting to load from the vmalloc region in realmode (faulting address is in DAR). Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 LR [c0000000000c2430] do_tlbies+0x230/0x2f0 I suspect the reason is the per-cpu data is not in the linear chunk. This could be restored if that was able to be fixed, but for now, just remove the tracepoints. Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index e1c083fbe434..78e6a392330f 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -470,8 +470,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, for (i = 0; i < npages; ++i) { asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : : "r" (rbvalues[i]), "r" (kvm->arch.lpid)); - trace_tlbie(kvm->arch.lpid, 0, rbvalues[i], - kvm->arch.lpid, 0, 0, 0); } if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { @@ -492,8 +490,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, for (i = 0; i < npages; ++i) { asm volatile(PPC_TLBIEL(%0,%1,0,0,0) : : "r" (rbvalues[i]), "r" (0)); - trace_tlbie(kvm->arch.lpid, 1, rbvalues[i], - 0, 0, 0, 0); } asm volatile("ptesync" : : : "memory"); } -- 2.16.3 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-05 17:56 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-05 17:56 UTC (permalink / raw) To: kvm-ppc; +Cc: Nicholas Piggin, linuxppc-dev, Paul Mackerras, Balbir Singh This crashes with a "Bad real address for load" attempting to load from the vmalloc region in realmode (faulting address is in DAR). Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] LE SMP NR_CPUS 48 NUMA PowerNV CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 LR [c0000000000c2430] do_tlbies+0x230/0x2f0 I suspect the reason is the per-cpu data is not in the linear chunk. This could be restored if that was able to be fixed, but for now, just remove the tracepoints. Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index e1c083fbe434..78e6a392330f 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -470,8 +470,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, for (i = 0; i < npages; ++i) { asm volatile(PPC_TLBIE_5(%0,%1,0,0,0) : : "r" (rbvalues[i]), "r" (kvm->arch.lpid)); - trace_tlbie(kvm->arch.lpid, 0, rbvalues[i], - kvm->arch.lpid, 0, 0, 0); } if (cpu_has_feature(CPU_FTR_P9_TLBIE_BUG)) { @@ -492,8 +490,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, for (i = 0; i < npages; ++i) { asm volatile(PPC_TLBIEL(%0,%1,0,0,0) : : "r" (rbvalues[i]), "r" (0)); - trace_tlbie(kvm->arch.lpid, 1, rbvalues[i], - 0, 0, 0, 0); } asm volatile("ptesync" : : : "memory"); } -- 2.16.3 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-05 17:56 ` Nicholas Piggin @ 2018-04-08 10:17 ` Balbir Singh -1 siblings, 0 replies; 28+ messages in thread From: Balbir Singh @ 2018-04-08 10:17 UTC (permalink / raw) To: Nicholas Piggin Cc: open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Paul Mackerras On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > This crashes with a "Bad real address for load" attempting to load > from the vmalloc region in realmode (faulting address is in DAR). > > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > LE SMP NR_CPUS=2048 NUMA PowerNV > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > > I suspect the reason is the per-cpu data is not in the linear chunk. > This could be restored if that was able to be fixed, but for now, > just remove the tracepoints. Could you share the stack trace as well? I've not observed this in my testing. May be I don't have as many cpus. I presume your talking about the per cpu data offsets for per cpu trace data? Balbir Singh. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-08 10:17 ` Balbir Singh 0 siblings, 0 replies; 28+ messages in thread From: Balbir Singh @ 2018-04-08 10:17 UTC (permalink / raw) To: Nicholas Piggin Cc: open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Paul Mackerras On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > This crashes with a "Bad real address for load" attempting to load > from the vmalloc region in realmode (faulting address is in DAR). > > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > LE SMP NR_CPUS 48 NUMA PowerNV > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > > I suspect the reason is the per-cpu data is not in the linear chunk. > This could be restored if that was able to be fixed, but for now, > just remove the tracepoints. Could you share the stack trace as well? I've not observed this in my testing. May be I don't have as many cpus. I presume your talking about the per cpu data offsets for per cpu trace data? Balbir Singh. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-08 10:17 ` Balbir Singh @ 2018-04-08 13:41 ` Nicholas Piggin -1 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-08 13:41 UTC (permalink / raw) To: Balbir Singh Cc: open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Paul Mackerras On Sun, 8 Apr 2018 20:17:47 +1000 Balbir Singh <bsingharora@gmail.com> wrote: > On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > > This crashes with a "Bad real address for load" attempting to load > > from the vmalloc region in realmode (faulting address is in DAR). > > > > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > > LE SMP NR_CPUS=2048 NUMA PowerNV > > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > > > > I suspect the reason is the per-cpu data is not in the linear chunk. > > This could be restored if that was able to be fixed, but for now, > > just remove the tracepoints. > > Could you share the stack trace as well? I've not observed this in my testing. I can't seem to find it, I can try reproduce tomorrow. It was coming from h_remove hcall from the guest. It's 176 logical CPUs. > May be I don't have as many cpus. I presume your talking about the per cpu > data offsets for per cpu trace data? It looked like it was dereferencing virtually mapped per-cpu data, yes. Probably the perf_events deref. Thanks, Nick ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-08 13:41 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-08 13:41 UTC (permalink / raw) To: Balbir Singh Cc: open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), Paul Mackerras On Sun, 8 Apr 2018 20:17:47 +1000 Balbir Singh <bsingharora@gmail.com> wrote: > On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > > This crashes with a "Bad real address for load" attempting to load > > from the vmalloc region in realmode (faulting address is in DAR). > > > > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > > LE SMP NR_CPUS 48 NUMA PowerNV > > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > > > > I suspect the reason is the per-cpu data is not in the linear chunk. > > This could be restored if that was able to be fixed, but for now, > > just remove the tracepoints. > > Could you share the stack trace as well? I've not observed this in my testing. I can't seem to find it, I can try reproduce tomorrow. It was coming from h_remove hcall from the guest. It's 176 logical CPUs. > May be I don't have as many cpus. I presume your talking about the per cpu > data offsets for per cpu trace data? It looked like it was dereferencing virtually mapped per-cpu data, yes. Probably the perf_events deref. Thanks, Nick ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-08 13:41 ` Nicholas Piggin @ 2018-04-10 3:21 ` Michael Ellerman -1 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-04-10 3:21 UTC (permalink / raw) To: Nicholas Piggin, Balbir Singh Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC Nicholas Piggin <npiggin@gmail.com> writes: > On Sun, 8 Apr 2018 20:17:47 +1000 > Balbir Singh <bsingharora@gmail.com> wrote: > >> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: >> > This crashes with a "Bad real address for load" attempting to load >> > from the vmalloc region in realmode (faulting address is in DAR). >> > >> > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] >> > LE SMP NR_CPUS=2048 NUMA PowerNV >> > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 >> > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 >> > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) >> > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 >> > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 >> > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 >> > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 >> > >> > I suspect the reason is the per-cpu data is not in the linear chunk. >> > This could be restored if that was able to be fixed, but for now, >> > just remove the tracepoints. >> >> Could you share the stack trace as well? I've not observed this in my testing. > > I can't seem to find it, I can try reproduce tomorrow. It was coming > from h_remove hcall from the guest. It's 176 logical CPUs. > >> May be I don't have as many cpus. I presume your talking about the per cpu >> data offsets for per cpu trace data? > > It looked like it was dereferencing virtually mapped per-cpu data, yes. > Probably the perf_events deref. Naveen has posted a series to (hopefully) fix this, which just missed the merge window: https://patchwork.ozlabs.org/patch/894757/ cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-10 3:21 ` Michael Ellerman 0 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-04-10 3:21 UTC (permalink / raw) To: Nicholas Piggin, Balbir Singh Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC Nicholas Piggin <npiggin@gmail.com> writes: > On Sun, 8 Apr 2018 20:17:47 +1000 > Balbir Singh <bsingharora@gmail.com> wrote: > >> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: >> > This crashes with a "Bad real address for load" attempting to load >> > from the vmalloc region in realmode (faulting address is in DAR). >> > >> > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] >> > LE SMP NR_CPUS 48 NUMA PowerNV >> > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 >> > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 >> > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) >> > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 >> > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 >> > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 >> > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 >> > >> > I suspect the reason is the per-cpu data is not in the linear chunk. >> > This could be restored if that was able to be fixed, but for now, >> > just remove the tracepoints. >> >> Could you share the stack trace as well? I've not observed this in my testing. > > I can't seem to find it, I can try reproduce tomorrow. It was coming > from h_remove hcall from the guest. It's 176 logical CPUs. > >> May be I don't have as many cpus. I presume your talking about the per cpu >> data offsets for per cpu trace data? > > It looked like it was dereferencing virtually mapped per-cpu data, yes. > Probably the perf_events deref. Naveen has posted a series to (hopefully) fix this, which just missed the merge window: https://patchwork.ozlabs.org/patch/894757/ cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-10 3:21 ` Michael Ellerman @ 2018-04-10 5:55 ` Naveen N. Rao -1 siblings, 0 replies; 28+ messages in thread From: Naveen N. Rao @ 2018-04-10 5:55 UTC (permalink / raw) To: Balbir Singh, Michael Ellerman, Nicholas Piggin Cc: open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT) Michael Ellerman wrote: > Nicholas Piggin <npiggin@gmail.com> writes: >=20 >> On Sun, 8 Apr 2018 20:17:47 +1000 >> Balbir Singh <bsingharora@gmail.com> wrote: >> >>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wro= te: >>> > This crashes with a "Bad real address for load" attempting to load >>> > from the vmalloc region in realmode (faulting address is in DAR). >>> > >>> > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] >>> > LE SMP NR_CPUS=3D2048 NUMA PowerNV >>> > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g4= 3d1859f0994 >>> > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 >>> > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d= 1859f0994) >>> > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 0000000= 0 >>> > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE:= 3 >>> > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 >>> > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 >>> > >>> > I suspect the reason is the per-cpu data is not in the linear chunk. >>> > This could be restored if that was able to be fixed, but for now, >>> > just remove the tracepoints. =20 >>>=20 >>> Could you share the stack trace as well? I've not observed this in my t= esting. >> >> I can't seem to find it, I can try reproduce tomorrow. It was coming >> from h_remove hcall from the guest. It's 176 logical CPUs. >> >>> May be I don't have as many cpus. I presume your talking about the per = cpu >>> data offsets for per cpu trace data? >> >> It looked like it was dereferencing virtually mapped per-cpu data, yes. >> Probably the perf_events deref. >=20 > Naveen has posted a series to (hopefully) fix this, which just missed > the merge window: >=20 > https://patchwork.ozlabs.org/patch/894757/ I'm afraid that won't actually help here :( That series is specific to the function tracer, while this is using=20 static tracepoints. We could convert trace_tlbie() to a TRACE_EVENT_CONDITION() and guard it=20 within a check for paca->ftrace_enabled, but that would only be useful=20 if the below callsites can ever be hit outside of KVM guest mode. - Naveen = ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-10 5:55 ` Naveen N. Rao 0 siblings, 0 replies; 28+ messages in thread From: Naveen N. Rao @ 2018-04-10 5:55 UTC (permalink / raw) To: Balbir Singh, Michael Ellerman, Nicholas Piggin Cc: open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT) Michael Ellerman wrote: > Nicholas Piggin <npiggin@gmail.com> writes: > >> On Sun, 8 Apr 2018 20:17:47 +1000 >> Balbir Singh <bsingharora@gmail.com> wrote: >> >>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: >>> > This crashes with a "Bad real address for load" attempting to load >>> > from the vmalloc region in realmode (faulting address is in DAR). >>> > >>> > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] >>> > LE SMP NR_CPUS 48 NUMA PowerNV >>> > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 >>> > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 >>> > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) >>> > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 >>> > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 >>> > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 >>> > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 >>> > >>> > I suspect the reason is the per-cpu data is not in the linear chunk. >>> > This could be restored if that was able to be fixed, but for now, >>> > just remove the tracepoints. >>> >>> Could you share the stack trace as well? I've not observed this in my testing. >> >> I can't seem to find it, I can try reproduce tomorrow. It was coming >> from h_remove hcall from the guest. It's 176 logical CPUs. >> >>> May be I don't have as many cpus. I presume your talking about the per cpu >>> data offsets for per cpu trace data? >> >> It looked like it was dereferencing virtually mapped per-cpu data, yes. >> Probably the perf_events deref. > > Naveen has posted a series to (hopefully) fix this, which just missed > the merge window: > > https://patchwork.ozlabs.org/patch/894757/ I'm afraid that won't actually help here :( That series is specific to the function tracer, while this is using static tracepoints. We could convert trace_tlbie() to a TRACE_EVENT_CONDITION() and guard it within a check for paca->ftrace_enabled, but that would only be useful if the below callsites can ever be hit outside of KVM guest mode. - Naveen ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-10 5:55 ` Naveen N. Rao @ 2018-04-10 6:10 ` Nicholas Piggin -1 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-10 6:10 UTC (permalink / raw) To: Naveen N. Rao Cc: Balbir Singh, Michael Ellerman, open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT) On Tue, 10 Apr 2018 11:25:02 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > Michael Ellerman wrote: > > Nicholas Piggin <npiggin@gmail.com> writes: > > > >> On Sun, 8 Apr 2018 20:17:47 +1000 > >> Balbir Singh <bsingharora@gmail.com> wrote: > >> > >>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > >>> > This crashes with a "Bad real address for load" attempting to load > >>> > from the vmalloc region in realmode (faulting address is in DAR). > >>> > > >>> > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > >>> > LE SMP NR_CPUS=2048 NUMA PowerNV > >>> > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > >>> > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > >>> > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > >>> > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > >>> > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > >>> > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > >>> > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > >>> > > >>> > I suspect the reason is the per-cpu data is not in the linear chunk. > >>> > This could be restored if that was able to be fixed, but for now, > >>> > just remove the tracepoints. > >>> > >>> Could you share the stack trace as well? I've not observed this in my testing. > >> > >> I can't seem to find it, I can try reproduce tomorrow. It was coming > >> from h_remove hcall from the guest. It's 176 logical CPUs. > >> > >>> May be I don't have as many cpus. I presume your talking about the per cpu > >>> data offsets for per cpu trace data? > >> > >> It looked like it was dereferencing virtually mapped per-cpu data, yes. > >> Probably the perf_events deref. > > > > Naveen has posted a series to (hopefully) fix this, which just missed > > the merge window: > > > > https://patchwork.ozlabs.org/patch/894757/ > > I'm afraid that won't actually help here :( > That series is specific to the function tracer, while this is using > static tracepoints. > > We could convert trace_tlbie() to a TRACE_EVENT_CONDITION() and guard it > within a check for paca->ftrace_enabled, but that would only be useful > if the below callsites can ever be hit outside of KVM guest mode. Right, removing the trace points is the right thing to do here. Doing tracing in real mode would be a whole effort itself, I'd expect. Or disabling realmode handling of HPT hcalls if trace points are active. Thanks, Nick ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-10 6:10 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-10 6:10 UTC (permalink / raw) To: Naveen N. Rao Cc: Balbir Singh, Michael Ellerman, open list:KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT) On Tue, 10 Apr 2018 11:25:02 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > Michael Ellerman wrote: > > Nicholas Piggin <npiggin@gmail.com> writes: > > > >> On Sun, 8 Apr 2018 20:17:47 +1000 > >> Balbir Singh <bsingharora@gmail.com> wrote: > >> > >>> On Fri, Apr 6, 2018 at 3:56 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > >>> > This crashes with a "Bad real address for load" attempting to load > >>> > from the vmalloc region in realmode (faulting address is in DAR). > >>> > > >>> > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > >>> > LE SMP NR_CPUS 48 NUMA PowerNV > >>> > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > >>> > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > >>> > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > >>> > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > >>> > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > >>> > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > >>> > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > >>> > > >>> > I suspect the reason is the per-cpu data is not in the linear chunk. > >>> > This could be restored if that was able to be fixed, but for now, > >>> > just remove the tracepoints. > >>> > >>> Could you share the stack trace as well? I've not observed this in my testing. > >> > >> I can't seem to find it, I can try reproduce tomorrow. It was coming > >> from h_remove hcall from the guest. It's 176 logical CPUs. > >> > >>> May be I don't have as many cpus. I presume your talking about the per cpu > >>> data offsets for per cpu trace data? > >> > >> It looked like it was dereferencing virtually mapped per-cpu data, yes. > >> Probably the perf_events deref. > > > > Naveen has posted a series to (hopefully) fix this, which just missed > > the merge window: > > > > https://patchwork.ozlabs.org/patch/894757/ > > I'm afraid that won't actually help here :( > That series is specific to the function tracer, while this is using > static tracepoints. > > We could convert trace_tlbie() to a TRACE_EVENT_CONDITION() and guard it > within a check for paca->ftrace_enabled, but that would only be useful > if the below callsites can ever be hit outside of KVM guest mode. Right, removing the trace points is the right thing to do here. Doing tracing in real mode would be a whole effort itself, I'd expect. Or disabling realmode handling of HPT hcalls if trace points are active. Thanks, Nick ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode 2018-04-05 17:56 ` Nicholas Piggin @ 2018-04-11 14:49 ` Michael Ellerman -1 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-04-11 14:49 UTC (permalink / raw) To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin On Thu, 2018-04-05 at 17:56:30 UTC, Nicholas Piggin wrote: > This crashes with a "Bad real address for load" attempting to load > from the vmalloc region in realmode (faulting address is in DAR). > > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > LE SMP NR_CPUS=2048 NUMA PowerNV > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > > I suspect the reason is the per-cpu data is not in the linear chunk. > This could be restored if that was able to be fixed, but for now, > just remove the tracepoints. > > Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions") > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Applied to powerpc fixes, thanks. https://git.kernel.org/powerpc/c/19ce7909ed11c49f7eddf59e7f49cd cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode @ 2018-04-11 14:49 ` Michael Ellerman 0 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-04-11 14:49 UTC (permalink / raw) To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin On Thu, 2018-04-05 at 17:56:30 UTC, Nicholas Piggin wrote: > This crashes with a "Bad real address for load" attempting to load > from the vmalloc region in realmode (faulting address is in DAR). > > Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] > LE SMP NR_CPUS 48 NUMA PowerNV > CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994 > NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 > REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994) > MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 > CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 > NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 > LR [c0000000000c2430] do_tlbies+0x230/0x2f0 > > I suspect the reason is the per-cpu data is not in the linear chunk. > This could be restored if that was able to be fixed, but for now, > just remove the tracepoints. > > Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions") > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Applied to powerpc fixes, thanks. https://git.kernel.org/powerpc/c/19ce7909ed11c49f7eddf59e7f49cd cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls 2018-04-05 17:56 ` Nicholas Piggin @ 2018-04-05 17:56 ` Nicholas Piggin -1 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-05 17:56 UTC (permalink / raw) To: kvm-ppc; +Cc: Nicholas Piggin, linuxppc-dev, Paul Mackerras tlbies to an LPAR do not have to be serialised since POWER4, MMU_FTR_LOCKLESS_TLBIE can be used to avoid the spin lock in do_tlbies. Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest in HPT mode. 32 instances of the powerpc fork benchmark from selftests were run with --fork, and the results measured. Without this patch, total throughput was about 13.5K/sec, and this is the top of the host profile: 74.52% [k] do_tlbies 2.95% [k] kvmppc_book3s_hv_page_fault 1.80% [k] calc_checksum 1.80% [k] kvmppc_vcpu_run_hv 1.49% [k] kvmppc_run_core After this patch, throughput was about 51K/sec, with this profile: 21.28% [k] do_tlbies 5.26% [k] kvmppc_run_core 4.88% [k] kvmppc_book3s_hv_page_fault 3.30% [k] _raw_spin_lock_irqsave 3.25% [k] gup_pgd_range Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 78e6a392330f..0221a0f74f07 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) unsigned int tmp, old; unsigned int token = LOCK_TOKEN; + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) + return 1; + asm volatile("1:lwarx %1,0,%2\n" " cmpwi cr0,%1,0\n" " bne 2f\n" @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) return old == 0; } +static inline void unlock_tlbie_after_sync(unsigned int *lock) +{ + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) + return; +} + static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, long npages, int global, bool need_sync) { @@ -483,7 +492,7 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, } asm volatile("eieio; tlbsync; ptesync" : : : "memory"); - kvm->arch.tlbie_lock = 0; + unlock_tlbie_after_sync(&kvm->arch.tlbie_lock); } else { if (need_sync) asm volatile("ptesync" : : : "memory"); -- 2.16.3 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls @ 2018-04-05 17:56 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-05 17:56 UTC (permalink / raw) To: kvm-ppc; +Cc: Nicholas Piggin, linuxppc-dev, Paul Mackerras tlbies to an LPAR do not have to be serialised since POWER4, MMU_FTR_LOCKLESS_TLBIE can be used to avoid the spin lock in do_tlbies. Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest in HPT mode. 32 instances of the powerpc fork benchmark from selftests were run with --fork, and the results measured. Without this patch, total throughput was about 13.5K/sec, and this is the top of the host profile: 74.52% [k] do_tlbies 2.95% [k] kvmppc_book3s_hv_page_fault 1.80% [k] calc_checksum 1.80% [k] kvmppc_vcpu_run_hv 1.49% [k] kvmppc_run_core After this patch, throughput was about 51K/sec, with this profile: 21.28% [k] do_tlbies 5.26% [k] kvmppc_run_core 4.88% [k] kvmppc_book3s_hv_page_fault 3.30% [k] _raw_spin_lock_irqsave 3.25% [k] gup_pgd_range Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 78e6a392330f..0221a0f74f07 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) unsigned int tmp, old; unsigned int token = LOCK_TOKEN; + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) + return 1; + asm volatile("1:lwarx %1,0,%2\n" " cmpwi cr0,%1,0\n" " bne 2f\n" @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) return old = 0; } +static inline void unlock_tlbie_after_sync(unsigned int *lock) +{ + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) + return; +} + static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, long npages, int global, bool need_sync) { @@ -483,7 +492,7 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, } asm volatile("eieio; tlbsync; ptesync" : : : "memory"); - kvm->arch.tlbie_lock = 0; + unlock_tlbie_after_sync(&kvm->arch.tlbie_lock); } else { if (need_sync) asm volatile("ptesync" : : : "memory"); -- 2.16.3 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls 2018-04-05 17:56 ` Nicholas Piggin @ 2018-04-06 5:39 ` Nicholas Piggin -1 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-06 5:39 UTC (permalink / raw) To: kvm-ppc; +Cc: linuxppc-dev, Paul Mackerras On Fri, 6 Apr 2018 03:56:31 +1000 Nicholas Piggin <npiggin@gmail.com> wrote: > tlbies to an LPAR do not have to be serialised since POWER4, > MMU_FTR_LOCKLESS_TLBIE can be used to avoid the spin lock in > do_tlbies. > > Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest > in HPT mode. 32 instances of the powerpc fork benchmark from selftests > were run with --fork, and the results measured. > > Without this patch, total throughput was about 13.5K/sec, and this is > the top of the host profile: > > 74.52% [k] do_tlbies > 2.95% [k] kvmppc_book3s_hv_page_fault > 1.80% [k] calc_checksum > 1.80% [k] kvmppc_vcpu_run_hv > 1.49% [k] kvmppc_run_core > > After this patch, throughput was about 51K/sec, with this profile: > > 21.28% [k] do_tlbies > 5.26% [k] kvmppc_run_core > 4.88% [k] kvmppc_book3s_hv_page_fault > 3.30% [k] _raw_spin_lock_irqsave > 3.25% [k] gup_pgd_range > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > --- > arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > index 78e6a392330f..0221a0f74f07 100644 > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) > unsigned int tmp, old; > unsigned int token = LOCK_TOKEN; > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return 1; > + > asm volatile("1:lwarx %1,0,%2\n" > " cmpwi cr0,%1,0\n" > " bne 2f\n" > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) > return old == 0; > } > > +static inline void unlock_tlbie_after_sync(unsigned int *lock) > +{ > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return; > +} > + > static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > long npages, int global, bool need_sync) > { > @@ -483,7 +492,7 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > } > > asm volatile("eieio; tlbsync; ptesync" : : : "memory"); > - kvm->arch.tlbie_lock = 0; > + unlock_tlbie_after_sync(&kvm->arch.tlbie_lock); Well that's a silly bug in the !LOCKLESS path, that was supposed to move to unlock, of course. Will fix it up after some time for comments. Thanks, Nick ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls @ 2018-04-06 5:39 ` Nicholas Piggin 0 siblings, 0 replies; 28+ messages in thread From: Nicholas Piggin @ 2018-04-06 5:39 UTC (permalink / raw) To: kvm-ppc; +Cc: linuxppc-dev, Paul Mackerras On Fri, 6 Apr 2018 03:56:31 +1000 Nicholas Piggin <npiggin@gmail.com> wrote: > tlbies to an LPAR do not have to be serialised since POWER4, > MMU_FTR_LOCKLESS_TLBIE can be used to avoid the spin lock in > do_tlbies. > > Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest > in HPT mode. 32 instances of the powerpc fork benchmark from selftests > were run with --fork, and the results measured. > > Without this patch, total throughput was about 13.5K/sec, and this is > the top of the host profile: > > 74.52% [k] do_tlbies > 2.95% [k] kvmppc_book3s_hv_page_fault > 1.80% [k] calc_checksum > 1.80% [k] kvmppc_vcpu_run_hv > 1.49% [k] kvmppc_run_core > > After this patch, throughput was about 51K/sec, with this profile: > > 21.28% [k] do_tlbies > 5.26% [k] kvmppc_run_core > 4.88% [k] kvmppc_book3s_hv_page_fault > 3.30% [k] _raw_spin_lock_irqsave > 3.25% [k] gup_pgd_range > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > --- > arch/powerpc/kvm/book3s_hv_rm_mmu.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > index 78e6a392330f..0221a0f74f07 100644 > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) > unsigned int tmp, old; > unsigned int token = LOCK_TOKEN; > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return 1; > + > asm volatile("1:lwarx %1,0,%2\n" > " cmpwi cr0,%1,0\n" > " bne 2f\n" > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) > return old = 0; > } > > +static inline void unlock_tlbie_after_sync(unsigned int *lock) > +{ > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return; > +} > + > static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > long npages, int global, bool need_sync) > { > @@ -483,7 +492,7 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > } > > asm volatile("eieio; tlbsync; ptesync" : : : "memory"); > - kvm->arch.tlbie_lock = 0; > + unlock_tlbie_after_sync(&kvm->arch.tlbie_lock); Well that's a silly bug in the !LOCKLESS path, that was supposed to move to unlock, of course. Will fix it up after some time for comments. Thanks, Nick ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls 2018-04-05 17:56 ` Nicholas Piggin @ 2018-04-06 6:12 ` Michael Ellerman -1 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-04-06 6:12 UTC (permalink / raw) To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin Nicholas Piggin <npiggin@gmail.com> writes: > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > index 78e6a392330f..0221a0f74f07 100644 > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) > unsigned int tmp, old; > unsigned int token = LOCK_TOKEN; > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return 1; > + > asm volatile("1:lwarx %1,0,%2\n" > " cmpwi cr0,%1,0\n" > " bne 2f\n" > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) > return old == 0; > } > > +static inline void unlock_tlbie_after_sync(unsigned int *lock) > +{ > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return; > +} So this is a bit hard to follow: #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2 #define MMU_FTRS_PPC970 MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA // does NOT #define MMU_FTRS_POWER5 MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE #define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA // includes lockless TLBIE #define MMU_FTRS_POWER7 MMU_FTRS_POWER6 // includes lockless TLBIE #define MMU_FTRS_POWER8 MMU_FTRS_POWER6 // includes lockless TLBIE #define MMU_FTRS_POWER9 MMU_FTRS_POWER6 // includes lockless TLBIE #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | // does NOT MMU_FTR_CI_LARGE_PAGE #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ // does NOT MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE. And KVM HV doesn't doesn't run on any of those. So we can just not check for the feature in the KVM HV code. Am I right? cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls @ 2018-04-06 6:12 ` Michael Ellerman 0 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-04-06 6:12 UTC (permalink / raw) To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin Nicholas Piggin <npiggin@gmail.com> writes: > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > index 78e6a392330f..0221a0f74f07 100644 > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) > unsigned int tmp, old; > unsigned int token = LOCK_TOKEN; > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return 1; > + > asm volatile("1:lwarx %1,0,%2\n" > " cmpwi cr0,%1,0\n" > " bne 2f\n" > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) > return old = 0; > } > > +static inline void unlock_tlbie_after_sync(unsigned int *lock) > +{ > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > + return; > +} So this is a bit hard to follow: #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2 #define MMU_FTRS_PPC970 MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA // does NOT #define MMU_FTRS_POWER5 MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE #define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA // includes lockless TLBIE #define MMU_FTRS_POWER7 MMU_FTRS_POWER6 // includes lockless TLBIE #define MMU_FTRS_POWER8 MMU_FTRS_POWER6 // includes lockless TLBIE #define MMU_FTRS_POWER9 MMU_FTRS_POWER6 // includes lockless TLBIE #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | // does NOT MMU_FTR_CI_LARGE_PAGE #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ // does NOT MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE. And KVM HV doesn't doesn't run on any of those. So we can just not check for the feature in the KVM HV code. Am I right? cheers ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls 2018-04-06 6:12 ` Michael Ellerman @ 2018-05-10 5:30 ` Paul Mackerras -1 siblings, 0 replies; 28+ messages in thread From: Paul Mackerras @ 2018-05-10 5:30 UTC (permalink / raw) To: Michael Ellerman; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev On Fri, Apr 06, 2018 at 04:12:32PM +1000, Michael Ellerman wrote: > Nicholas Piggin <npiggin@gmail.com> writes: > > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > > index 78e6a392330f..0221a0f74f07 100644 > > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) > > unsigned int tmp, old; > > unsigned int token = LOCK_TOKEN; > > > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > > + return 1; > > + > > asm volatile("1:lwarx %1,0,%2\n" > > " cmpwi cr0,%1,0\n" > > " bne 2f\n" > > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) > > return old == 0; > > } > > > > +static inline void unlock_tlbie_after_sync(unsigned int *lock) > > +{ > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > > + return; > > +} > > So this is a bit hard to follow: > > #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ > MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 > #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2 > #define MMU_FTRS_PPC970 MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA // does NOT > #define MMU_FTRS_POWER5 MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE > #define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA // includes lockless TLBIE > #define MMU_FTRS_POWER7 MMU_FTRS_POWER6 // includes lockless TLBIE > #define MMU_FTRS_POWER8 MMU_FTRS_POWER6 // includes lockless TLBIE > #define MMU_FTRS_POWER9 MMU_FTRS_POWER6 // includes lockless TLBIE > #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | // does NOT > MMU_FTR_CI_LARGE_PAGE > #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ // does NOT > MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B > > > So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE. > > And KVM HV doesn't doesn't run on any of those. > > So we can just not check for the feature in the KVM HV code. > > Am I right? Yes; that code was written when we still supported HV KVM on 970, but we ripped that out some time ago. Paul. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls @ 2018-05-10 5:30 ` Paul Mackerras 0 siblings, 0 replies; 28+ messages in thread From: Paul Mackerras @ 2018-05-10 5:30 UTC (permalink / raw) To: Michael Ellerman; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev On Fri, Apr 06, 2018 at 04:12:32PM +1000, Michael Ellerman wrote: > Nicholas Piggin <npiggin@gmail.com> writes: > > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > > index 78e6a392330f..0221a0f74f07 100644 > > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) > > unsigned int tmp, old; > > unsigned int token = LOCK_TOKEN; > > > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > > + return 1; > > + > > asm volatile("1:lwarx %1,0,%2\n" > > " cmpwi cr0,%1,0\n" > > " bne 2f\n" > > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) > > return old = 0; > > } > > > > +static inline void unlock_tlbie_after_sync(unsigned int *lock) > > +{ > > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) > > + return; > > +} > > So this is a bit hard to follow: > > #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ > MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 > #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2 > #define MMU_FTRS_PPC970 MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA // does NOT > #define MMU_FTRS_POWER5 MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE > #define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA // includes lockless TLBIE > #define MMU_FTRS_POWER7 MMU_FTRS_POWER6 // includes lockless TLBIE > #define MMU_FTRS_POWER8 MMU_FTRS_POWER6 // includes lockless TLBIE > #define MMU_FTRS_POWER9 MMU_FTRS_POWER6 // includes lockless TLBIE > #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | // does NOT > MMU_FTR_CI_LARGE_PAGE > #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ // does NOT > MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B > > > So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE. > > And KVM HV doesn't doesn't run on any of those. > > So we can just not check for the feature in the KVM HV code. > > Am I right? Yes; that code was written when we still supported HV KVM on 970, but we ripped that out some time ago. Paul. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls 2018-05-10 5:30 ` Paul Mackerras @ 2018-05-14 4:04 ` Michael Ellerman -1 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-05-14 4:04 UTC (permalink / raw) To: Paul Mackerras; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev Paul Mackerras <paulus@ozlabs.org> writes: > On Fri, Apr 06, 2018 at 04:12:32PM +1000, Michael Ellerman wrote: >> Nicholas Piggin <npiggin@gmail.com> writes: >> > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c >> > index 78e6a392330f..0221a0f74f07 100644 >> > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c >> > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c >> > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) >> > unsigned int tmp, old; >> > unsigned int token = LOCK_TOKEN; >> > >> > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) >> > + return 1; >> > + >> > asm volatile("1:lwarx %1,0,%2\n" >> > " cmpwi cr0,%1,0\n" >> > " bne 2f\n" >> > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) >> > return old == 0; >> > } >> > >> > +static inline void unlock_tlbie_after_sync(unsigned int *lock) >> > +{ >> > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) >> > + return; >> > +} >> >> So this is a bit hard to follow: >> >> #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ >> MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 >> #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2 >> #define MMU_FTRS_PPC970 MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA // does NOT >> #define MMU_FTRS_POWER5 MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE >> #define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA // includes lockless TLBIE >> #define MMU_FTRS_POWER7 MMU_FTRS_POWER6 // includes lockless TLBIE >> #define MMU_FTRS_POWER8 MMU_FTRS_POWER6 // includes lockless TLBIE >> #define MMU_FTRS_POWER9 MMU_FTRS_POWER6 // includes lockless TLBIE >> #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | // does NOT >> MMU_FTR_CI_LARGE_PAGE >> #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ // does NOT >> MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B >> >> >> So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE. >> >> And KVM HV doesn't doesn't run on any of those. >> >> So we can just not check for the feature in the KVM HV code. >> >> Am I right? > > Yes; that code was written when we still supported HV KVM on 970, > but we ripped that out some time ago. OK good, in commit: c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors") (Dec 2014) So we should be able to do the patch below. cheers diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 17498e9a26e4..7756b0c6da75 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -269,7 +269,6 @@ struct kvm_arch { unsigned long host_lpcr; unsigned long sdr1; unsigned long host_sdr1; - int tlbie_lock; unsigned long lpcr; unsigned long vrma_slb_v; int mmu_ready; diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 78e6a392330f..89d909b3b881 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -434,24 +434,6 @@ static inline int is_mmio_hpte(unsigned long v, unsigned long r) (HPTE_R_KEY_HI | HPTE_R_KEY_LO)); } -static inline int try_lock_tlbie(unsigned int *lock) -{ - unsigned int tmp, old; - unsigned int token = LOCK_TOKEN; - - asm volatile("1:lwarx %1,0,%2\n" - " cmpwi cr0,%1,0\n" - " bne 2f\n" - " stwcx. %3,0,%2\n" - " bne- 1b\n" - " isync\n" - "2:" - : "=&r" (tmp), "=&r" (old) - : "r" (lock), "r" (token) - : "cc", "memory"); - return old == 0; -} - static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, long npages, int global, bool need_sync) { @@ -463,8 +445,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, * the RS field, this is backwards-compatible with P7 and P8. */ if (global) { - while (!try_lock_tlbie(&kvm->arch.tlbie_lock)) - cpu_relax(); if (need_sync) asm volatile("ptesync" : : : "memory"); for (i = 0; i < npages; ++i) { @@ -483,7 +463,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, } asm volatile("eieio; tlbsync; ptesync" : : : "memory"); - kvm->arch.tlbie_lock = 0; } else { if (need_sync) asm volatile("ptesync" : : : "memory"); ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls @ 2018-05-14 4:04 ` Michael Ellerman 0 siblings, 0 replies; 28+ messages in thread From: Michael Ellerman @ 2018-05-14 4:04 UTC (permalink / raw) To: Paul Mackerras; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev Paul Mackerras <paulus@ozlabs.org> writes: > On Fri, Apr 06, 2018 at 04:12:32PM +1000, Michael Ellerman wrote: >> Nicholas Piggin <npiggin@gmail.com> writes: >> > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c >> > index 78e6a392330f..0221a0f74f07 100644 >> > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c >> > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c >> > @@ -439,6 +439,9 @@ static inline int try_lock_tlbie(unsigned int *lock) >> > unsigned int tmp, old; >> > unsigned int token = LOCK_TOKEN; >> > >> > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) >> > + return 1; >> > + >> > asm volatile("1:lwarx %1,0,%2\n" >> > " cmpwi cr0,%1,0\n" >> > " bne 2f\n" >> > @@ -452,6 +455,12 @@ static inline int try_lock_tlbie(unsigned int *lock) >> > return old = 0; >> > } >> > >> > +static inline void unlock_tlbie_after_sync(unsigned int *lock) >> > +{ >> > + if (mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE)) >> > + return; >> > +} >> >> So this is a bit hard to follow: >> >> #define MMU_FTRS_DEFAULT_HPTE_ARCH_V2 \ >> MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2 >> #define MMU_FTRS_POWER MMU_FTRS_DEFAULT_HPTE_ARCH_V2 >> #define MMU_FTRS_PPC970 MMU_FTRS_POWER | MMU_FTR_TLBIE_CROP_VA // does NOT >> #define MMU_FTRS_POWER5 MMU_FTRS_POWER | MMU_FTR_LOCKLESS_TLBIE >> #define MMU_FTRS_POWER6 MMU_FTRS_POWER5 | MMU_FTR_KERNEL_RO | MMU_FTR_68_BIT_VA // includes lockless TLBIE >> #define MMU_FTRS_POWER7 MMU_FTRS_POWER6 // includes lockless TLBIE >> #define MMU_FTRS_POWER8 MMU_FTRS_POWER6 // includes lockless TLBIE >> #define MMU_FTRS_POWER9 MMU_FTRS_POWER6 // includes lockless TLBIE >> #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | // does NOT >> MMU_FTR_CI_LARGE_PAGE >> #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \ // does NOT >> MMU_FTR_CI_LARGE_PAGE | MMU_FTR_NO_SLBIE_B >> >> >> So it's only 970, Cell and Pasemi that *don't* have lockless TLBIE. >> >> And KVM HV doesn't doesn't run on any of those. >> >> So we can just not check for the feature in the KVM HV code. >> >> Am I right? > > Yes; that code was written when we still supported HV KVM on 970, > but we ripped that out some time ago. OK good, in commit: c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors") (Dec 2014) So we should be able to do the patch below. cheers diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 17498e9a26e4..7756b0c6da75 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -269,7 +269,6 @@ struct kvm_arch { unsigned long host_lpcr; unsigned long sdr1; unsigned long host_sdr1; - int tlbie_lock; unsigned long lpcr; unsigned long vrma_slb_v; int mmu_ready; diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 78e6a392330f..89d909b3b881 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -434,24 +434,6 @@ static inline int is_mmio_hpte(unsigned long v, unsigned long r) (HPTE_R_KEY_HI | HPTE_R_KEY_LO)); } -static inline int try_lock_tlbie(unsigned int *lock) -{ - unsigned int tmp, old; - unsigned int token = LOCK_TOKEN; - - asm volatile("1:lwarx %1,0,%2\n" - " cmpwi cr0,%1,0\n" - " bne 2f\n" - " stwcx. %3,0,%2\n" - " bne- 1b\n" - " isync\n" - "2:" - : "=&r" (tmp), "=&r" (old) - : "r" (lock), "r" (token) - : "cc", "memory"); - return old = 0; -} - static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, long npages, int global, bool need_sync) { @@ -463,8 +445,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, * the RS field, this is backwards-compatible with P7 and P8. */ if (global) { - while (!try_lock_tlbie(&kvm->arch.tlbie_lock)) - cpu_relax(); if (need_sync) asm volatile("ptesync" : : : "memory"); for (i = 0; i < npages; ++i) { @@ -483,7 +463,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, } asm volatile("eieio; tlbsync; ptesync" : : : "memory"); - kvm->arch.tlbie_lock = 0; } else { if (need_sync) asm volatile("ptesync" : : : "memory"); ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls 2018-05-14 4:04 ` Michael Ellerman @ 2018-05-17 3:53 ` Paul Mackerras -1 siblings, 0 replies; 28+ messages in thread From: Paul Mackerras @ 2018-05-17 3:53 UTC (permalink / raw) To: Michael Ellerman; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev On Mon, May 14, 2018 at 02:04:10PM +1000, Michael Ellerman wrote: [snip] > OK good, in commit: > > c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors") (Dec 2014) > > So we should be able to do the patch below. > > cheers > > > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h > index 17498e9a26e4..7756b0c6da75 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -269,7 +269,6 @@ struct kvm_arch { > unsigned long host_lpcr; > unsigned long sdr1; > unsigned long host_sdr1; > - int tlbie_lock; > unsigned long lpcr; > unsigned long vrma_slb_v; > int mmu_ready; > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > index 78e6a392330f..89d909b3b881 100644 > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > @@ -434,24 +434,6 @@ static inline int is_mmio_hpte(unsigned long v, unsigned long r) > (HPTE_R_KEY_HI | HPTE_R_KEY_LO)); > } > > -static inline int try_lock_tlbie(unsigned int *lock) > -{ > - unsigned int tmp, old; > - unsigned int token = LOCK_TOKEN; > - > - asm volatile("1:lwarx %1,0,%2\n" > - " cmpwi cr0,%1,0\n" > - " bne 2f\n" > - " stwcx. %3,0,%2\n" > - " bne- 1b\n" > - " isync\n" > - "2:" > - : "=&r" (tmp), "=&r" (old) > - : "r" (lock), "r" (token) > - : "cc", "memory"); > - return old == 0; > -} > - > static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > long npages, int global, bool need_sync) > { > @@ -463,8 +445,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > * the RS field, this is backwards-compatible with P7 and P8. > */ > if (global) { > - while (!try_lock_tlbie(&kvm->arch.tlbie_lock)) > - cpu_relax(); > if (need_sync) > asm volatile("ptesync" : : : "memory"); > for (i = 0; i < npages; ++i) { > @@ -483,7 +463,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > } > > asm volatile("eieio; tlbsync; ptesync" : : : "memory"); > - kvm->arch.tlbie_lock = 0; > } else { > if (need_sync) > asm volatile("ptesync" : : : "memory"); Seems reasonable; is that a patch submission? :) Paul. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls @ 2018-05-17 3:53 ` Paul Mackerras 0 siblings, 0 replies; 28+ messages in thread From: Paul Mackerras @ 2018-05-17 3:53 UTC (permalink / raw) To: Michael Ellerman; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev On Mon, May 14, 2018 at 02:04:10PM +1000, Michael Ellerman wrote: [snip] > OK good, in commit: > > c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors") (Dec 2014) > > So we should be able to do the patch below. > > cheers > > > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h > index 17498e9a26e4..7756b0c6da75 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -269,7 +269,6 @@ struct kvm_arch { > unsigned long host_lpcr; > unsigned long sdr1; > unsigned long host_sdr1; > - int tlbie_lock; > unsigned long lpcr; > unsigned long vrma_slb_v; > int mmu_ready; > diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > index 78e6a392330f..89d909b3b881 100644 > --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c > +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c > @@ -434,24 +434,6 @@ static inline int is_mmio_hpte(unsigned long v, unsigned long r) > (HPTE_R_KEY_HI | HPTE_R_KEY_LO)); > } > > -static inline int try_lock_tlbie(unsigned int *lock) > -{ > - unsigned int tmp, old; > - unsigned int token = LOCK_TOKEN; > - > - asm volatile("1:lwarx %1,0,%2\n" > - " cmpwi cr0,%1,0\n" > - " bne 2f\n" > - " stwcx. %3,0,%2\n" > - " bne- 1b\n" > - " isync\n" > - "2:" > - : "=&r" (tmp), "=&r" (old) > - : "r" (lock), "r" (token) > - : "cc", "memory"); > - return old = 0; > -} > - > static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > long npages, int global, bool need_sync) > { > @@ -463,8 +445,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > * the RS field, this is backwards-compatible with P7 and P8. > */ > if (global) { > - while (!try_lock_tlbie(&kvm->arch.tlbie_lock)) > - cpu_relax(); > if (need_sync) > asm volatile("ptesync" : : : "memory"); > for (i = 0; i < npages; ++i) { > @@ -483,7 +463,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues, > } > > asm volatile("eieio; tlbsync; ptesync" : : : "memory"); > - kvm->arch.tlbie_lock = 0; > } else { > if (need_sync) > asm volatile("ptesync" : : : "memory"); Seems reasonable; is that a patch submission? :) Paul. ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2018-05-17 3:54 UTC | newest] Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-04-05 17:56 [PATCH 0/2] KVM powerpc tlbie scalability improvement Nicholas Piggin 2018-04-05 17:56 ` Nicholas Piggin 2018-04-05 17:56 ` [PATCH 1/2] KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode Nicholas Piggin 2018-04-05 17:56 ` Nicholas Piggin 2018-04-08 10:17 ` Balbir Singh 2018-04-08 10:17 ` Balbir Singh 2018-04-08 13:41 ` Nicholas Piggin 2018-04-08 13:41 ` Nicholas Piggin 2018-04-10 3:21 ` Michael Ellerman 2018-04-10 3:21 ` Michael Ellerman 2018-04-10 5:55 ` Naveen N. Rao 2018-04-10 5:55 ` Naveen N. Rao 2018-04-10 6:10 ` Nicholas Piggin 2018-04-10 6:10 ` Nicholas Piggin 2018-04-11 14:49 ` [1/2] " Michael Ellerman 2018-04-11 14:49 ` Michael Ellerman 2018-04-05 17:56 ` [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls Nicholas Piggin 2018-04-05 17:56 ` Nicholas Piggin 2018-04-06 5:39 ` Nicholas Piggin 2018-04-06 5:39 ` Nicholas Piggin 2018-04-06 6:12 ` Michael Ellerman 2018-04-06 6:12 ` Michael Ellerman 2018-05-10 5:30 ` Paul Mackerras 2018-05-10 5:30 ` Paul Mackerras 2018-05-14 4:04 ` Michael Ellerman 2018-05-14 4:04 ` Michael Ellerman 2018-05-17 3:53 ` Paul Mackerras 2018-05-17 3:53 ` Paul Mackerras
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.