* [PATCH] KVM: arm64: Fix memory leaks from stage2 pagetable
@ 2022-05-26 20:39 ` Qian Cai
  0 siblings, 0 replies; 15+ messages in thread
From: Qian Cai @ 2022-05-26 20:39 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
	Will Deacon, linux-arm-kernel, kvmarm, linux-kernel, Qian Cai

Running some SR-IOV workloads could trigger some leak reports from
kmemleak.

unreferenced object 0xffff080243cef500 (size 128):
  comm "qemu-system-aar", pid 179935, jiffies 4298359506 (age 1629.732s)
  hex dump (first 32 bytes):
    28 00 00 00 01 00 00 00 00 e0 4c 52 03 08 ff ff  (.........LR....
    e0 af a4 7f 7c d1 ff ff a8 3c b3 08 00 80 ff ff  ....|....<......
  backtrace:
     kmem_cache_alloc_trace
     kvm_init_stage2_mmu
     kvm_arch_init_vm
     kvm_create_vm
     kvm_dev_ioctl
     __arm64_sys_ioctl
     invoke_syscall
     el0_svc_common.constprop.0
     do_el0_svc
     el0_svc
     el0t_64_sync_handler
     el0t_64_sync

Since I have yet to find a way to reproduce this at will, I did a code
inspection and found one spot where a leak could happen. This is
unlikely to fix my issue, because I don't see my case going through the
error paths, but we should fix it regardless.

If hardware_enable_all() or kvm_init_mmu_notifier() fails in
kvm_create_vm(), we end up leaking the stage2 pagetable memory allocated
by kvm_init_stage2_mmu(), because kvm_arch_flush_shadow_all() is never
called on that error path.

It does not seem possible to simply move kvm_free_stage2_pgd() from
kvm_arch_flush_shadow_all() into kvm_arch_destroy_vm(), due to the issue
mentioned in the "Fixes" commit below. Thus, fix it by freeing the
memory from kvm_arch_destroy_vm() only if the MMU notifier was never
initialized.

Fixes: 293f293637b5 ("kvm-arm: Unmap shadow pagetables properly")
Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
---
 arch/arm64/kvm/arm.c | 3 +++
 arch/arm64/kvm/mmu.c | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 400bb0fe2745..7d12824f2034 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -180,6 +180,9 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
  */
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+	if (!kvm->mmu_notifier.ops)
+		kvm_free_stage2_pgd(&kvm->arch.mmu);
+
 	bitmap_free(kvm->arch.pmu_filter);
 	free_cpumask_var(kvm->arch.supported_cpus);
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f5651a05b6a8..13a527656ba7 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1739,7 +1739,8 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen)
 
 void kvm_arch_flush_shadow_all(struct kvm *kvm)
 {
-	kvm_free_stage2_pgd(&kvm->arch.mmu);
+	if (kvm->mmu_notifier.ops)
+		kvm_free_stage2_pgd(&kvm->arch.mmu);
 }
 
 void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
-- 
2.32.0
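
The error path described in the commit message can be illustrated with a
small userspace model. This is only a sketch: struct model_vm, the
model_* functions and the live_pgtables counter are hypothetical
stand-ins invented for illustration, and the teardown ordering (flush
first, then destroy) is an assumption taken from the description above,
not a copy of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct kvm; not kernel code. */
struct model_vm {
	bool pgt_allocated;	/* models the table set up by kvm_init_stage2_mmu() */
	bool notifier_ready;	/* models kvm->mmu_notifier.ops != NULL */
};

int live_pgtables;		/* page tables allocated and not yet freed */

/* Models kvm_free_stage2_pgd(): safe to call when already freed. */
void model_free_stage2_pgd(struct model_vm *vm)
{
	if (!vm->pgt_allocated)
		return;
	assert(live_pgtables > 0);
	vm->pgt_allocated = false;
	live_pgtables--;
}

/* Models kvm_arch_flush_shadow_all(), before and after the patch. */
void model_flush_shadow_all(struct model_vm *vm, bool patched)
{
	if (!patched || vm->notifier_ready)
		model_free_stage2_pgd(vm);
}

/* Models kvm_arch_destroy_vm(); the guard exists only when patched. */
void model_destroy_vm(struct model_vm *vm, bool patched)
{
	if (patched && !vm->notifier_ready)
		model_free_stage2_pgd(vm);
}

/*
 * Models one VM lifetime and returns the number of page tables still
 * allocated afterwards (0 means no leak).
 */
int model_create_and_destroy(bool notifier_fails, bool patched)
{
	struct model_vm vm = { .pgt_allocated = true, .notifier_ready = false };

	live_pgtables = 1;

	if (notifier_fails) {
		/* Error path: no notifier, so the flush hook never runs. */
		model_destroy_vm(&vm, patched);
		return live_pgtables;
	}

	/* Normal lifetime: the notifier release triggers the flush. */
	vm.notifier_ready = true;
	model_flush_shadow_all(&vm, patched);
	model_destroy_vm(&vm, patched);
	return live_pgtables;
}
```

In this model, model_create_and_destroy(true, false) leaves one table
behind, while the patched variant frees it, and the normal path frees
the table exactly once in both cases.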


* Re: [PATCH] KVM: arm64: Fix memory leaks from stage2 pagetable
  2022-05-26 20:39 ` Qian Cai
  (?)
@ 2022-05-31 16:57   ` Will Deacon
  -1 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2022-05-31 16:57 UTC (permalink / raw)
  To: Qian Cai
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, linux-arm-kernel, kvmarm, linux-kernel

On Thu, May 26, 2022 at 04:39:56PM -0400, Qian Cai wrote:
> Running some SR-IOV workloads could trigger some leak reports from
> kmemleak.
> 
> unreferenced object 0xffff080243cef500 (size 128):
>   comm "qemu-system-aar", pid 179935, jiffies 4298359506 (age 1629.732s)
>   hex dump (first 32 bytes):
>     28 00 00 00 01 00 00 00 00 e0 4c 52 03 08 ff ff  (.........LR....
>     e0 af a4 7f 7c d1 ff ff a8 3c b3 08 00 80 ff ff  ....|....<......
>   backtrace:
>      kmem_cache_alloc_trace
>      kvm_init_stage2_mmu

Hmm, I can't spot a 128-byte allocation in here so this is pretty cryptic.
I don't really like the idea of papering over the report; we'd be better off
trying to reproduce it.

Will

* Re: [PATCH] KVM: arm64: Fix memory leaks from stage2 pagetable
  2022-05-31 16:57   ` Will Deacon
  (?)
@ 2022-05-31 17:01     ` Will Deacon
  -1 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2022-05-31 17:01 UTC (permalink / raw)
  To: Qian Cai
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, linux-arm-kernel, kvmarm, linux-kernel

On Tue, May 31, 2022 at 05:57:11PM +0100, Will Deacon wrote:
> On Thu, May 26, 2022 at 04:39:56PM -0400, Qian Cai wrote:
> > Running some SR-IOV workloads could trigger some leak reports from
> > kmemleak.
> > 
> > unreferenced object 0xffff080243cef500 (size 128):
> >   comm "qemu-system-aar", pid 179935, jiffies 4298359506 (age 1629.732s)
> >   hex dump (first 32 bytes):
> >     28 00 00 00 01 00 00 00 00 e0 4c 52 03 08 ff ff  (.........LR....
> >     e0 af a4 7f 7c d1 ff ff a8 3c b3 08 00 80 ff ff  ....|....<......
> >   backtrace:
> >      kmem_cache_alloc_trace
> >      kvm_init_stage2_mmu
> 
> Hmm, I can't spot a 128-byte allocation in here so this is pretty cryptic.
> I don't really like the idea of papering over the report; we'd be better off
> trying to reproduce it.

... although the hexdump does look like {u32; u32; ptr; ptr; ptr}, which
would match 'struct kvm_pgtable'. I guess the allocation is aligned to
ARCH_DMA_MINALIGN, which could explain the size?

Have you spotted any pattern for when the leak occurs? How are you
terminating the guest?

Will
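
The layout guess above can be checked against the 32 bytes kmemleak
printed. The sketch below decodes them as little-endian
{u32; u32; ptr; ptr; ptr}. The struct and its field names are
placeholders rather than the real 'struct kvm_pgtable' definition, and
the 128-byte alignment passed to round_up_to() is an assumption chosen
to match the reported object size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder layout: two 32-bit words followed by three pointers. */
struct guessed_layout {
	uint32_t w0;
	uint32_t w1;
	uint64_t p0;
	uint64_t p1;
	uint64_t p2;
};

/* The "first 32 bytes" from the kmemleak report, copied verbatim. */
const uint8_t kmemleak_dump[32] = {
	0x28, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
	0x00, 0xe0, 0x4c, 0x52, 0x03, 0x08, 0xff, 0xff,
	0xe0, 0xaf, 0xa4, 0x7f, 0x7c, 0xd1, 0xff, 0xff,
	0xa8, 0x3c, 0xb3, 0x08, 0x00, 0x80, 0xff, 0xff,
};

/* Read an n-byte little-endian integer (n <= 8). */
uint64_t load_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	assert(n <= 8);
	while (n--)
		v = (v << 8) | p[n];
	return v;
}

/* Split the dump into the guessed fields. */
struct guessed_layout decode_dump(const uint8_t *bytes)
{
	struct guessed_layout g;

	g.w0 = (uint32_t)load_le(bytes, 4);
	g.w1 = (uint32_t)load_le(bytes + 4, 4);
	g.p0 = load_le(bytes + 8, 8);
	g.p1 = load_le(bytes + 16, 8);
	g.p2 = load_le(bytes + 24, 8);
	return g;
}

/* Round size up to a power-of-two alignment. */
size_t round_up_to(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}
```

Decoded this way, the two words are 0x28 and 1, and all three 64-bit
values start with 0xffff, which is at least consistent with them being
kernel pointers. A 32-byte object rounded up to a 128-byte minimum
alignment would also account for the "size 128" in the report.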

* Re: [PATCH] KVM: arm64: Fix memory leaks from stage2 pagetable
  2022-05-31 16:57   ` Will Deacon
  (?)
@ 2022-05-31 17:23     ` Qian Cai
  -1 siblings, 0 replies; 15+ messages in thread
From: Qian Cai @ 2022-05-31 17:23 UTC (permalink / raw)
  To: Will Deacon
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, linux-arm-kernel, kvmarm, linux-kernel

On Tue, May 31, 2022 at 05:57:11PM +0100, Will Deacon wrote:
> On Thu, May 26, 2022 at 04:39:56PM -0400, Qian Cai wrote:
> > Running some SR-IOV workloads could trigger some leak reports from
> > kmemleak.
> > 
> > unreferenced object 0xffff080243cef500 (size 128):
> >   comm "qemu-system-aar", pid 179935, jiffies 4298359506 (age 1629.732s)
> >   hex dump (first 32 bytes):
> >     28 00 00 00 01 00 00 00 00 e0 4c 52 03 08 ff ff  (.........LR....
> >     e0 af a4 7f 7c d1 ff ff a8 3c b3 08 00 80 ff ff  ....|....<......
> >   backtrace:
> >      kmem_cache_alloc_trace
> >      kvm_init_stage2_mmu
> 
> Hmm, I can't spot a 128-byte allocation in here so this is pretty cryptic.
> I don't really like the idea of papering over the report; we'd be better off
> trying to reproduce it.

As much as I would like to reproduce it, I have tried for the last few
weeks without luck. It still happens from time to time in our daily CI
though, so I was thinking of plugging the known leaks first.

* Re: [PATCH] KVM: arm64: Fix memory leaks from stage2 pagetable
  2022-05-31 17:01     ` Will Deacon
  (?)
@ 2022-05-31 17:41       ` Qian Cai
  -1 siblings, 0 replies; 15+ messages in thread
From: Qian Cai @ 2022-05-31 17:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: Marc Zyngier, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Catalin Marinas, linux-arm-kernel, kvmarm, linux-kernel

On Tue, May 31, 2022 at 06:01:58PM +0100, Will Deacon wrote:
> Have you spotted any pattern for when the leak occurs? How are you
> terminating the guest?

It just sends a SIGTERM to the qemu-system-aarch64 process. Originally,
right after sending the signal, it would remove_id/unbind from vfio-pci
and then bind to the original (ixgbe) driver. However, since the process
might take a while to clean up after itself, the bind might fail with
-EBUSY. I could reproduce it a few times one day, while being unable to
do so on other days.

Later, we changed the code to make sure the process has disappeared
first and then remove_id/bind/unbind. Apparently, that makes it harder
to reproduce, if it does not eliminate it entirely.
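
The "wait for the process to disappear first" step could be sketched as
below. This is not the actual CI code: terminate_and_wait(), the 10 ms
polling interval and the timeout parameter are hypothetical. Polling
with kill(pid, 0) can also be fooled by PID reuse, so treat it as an
illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGTERM to pid, then poll until the process
 * is gone or timeout_ms has elapsed. Returns 0 once the process has
 * disappeared, -1 on timeout or if the signal could not be sent.
 */
int terminate_and_wait(pid_t pid, int timeout_ms)
{
	const struct timespec step = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	int waited_ms = 0;

	assert(pid > 0 && timeout_ms >= 0);

	if (kill(pid, SIGTERM) == -1)
		return errno == ESRCH ? 0 : -1;

	for (;;) {
		/* Reaps the process if it is our own child; otherwise this fails. */
		if (waitpid(pid, NULL, WNOHANG) == pid)
			return 0;
		/* Signal 0 only probes for existence; ESRCH means the PID is gone. */
		if (kill(pid, 0) == -1 && errno == ESRCH)
			return 0;
		if (waited_ms >= timeout_ms)
			return -1;
		nanosleep(&step, NULL);
		waited_ms += 10;
	}
}
```

Only after this returns 0 would the script go on to the
remove_id/unbind/bind steps, so that the rebind does not race with the
exiting process.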
