* [PATCH 0/3]  kvm: arm/arm64: Fixes for use after free problems
@ 2017-03-14 14:52 ` Suzuki K Poulose
  0 siblings, 0 replies; 54+ messages in thread
From: Suzuki K Poulose @ 2017-03-14 14:52 UTC (permalink / raw)
  To: linux-arm-kernel, andreyknvl
  Cc: dvyukov, marc.zyngier, christoffer.dall, kvmarm, kvm,
	linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, suzuki.poulose, ard.biesheuvel, stable

This series contains potential fixes for problems reported by [0] & [1].

[0] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/492944.html
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/492943.html


Marc Zyngier (2):
  kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm
  kvm: arm/arm64: Take mmap_sem in kvm_arch_prepare_memory_region

Suzuki K Poulose (1):
  kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd

 arch/arm/kvm/mmu.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 1/3] kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm
  2017-03-14 14:52 ` Suzuki K Poulose
  (?)
@ 2017-03-14 14:52   ` Suzuki K Poulose
  -1 siblings, 0 replies; 54+ messages in thread
From: Suzuki K Poulose @ 2017-03-14 14:52 UTC (permalink / raw)
  To: linux-arm-kernel, andreyknvl
  Cc: dvyukov, marc.zyngier, christoffer.dall, kvmarm, kvm,
	linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, suzuki.poulose, ard.biesheuvel, stable

From: Marc Zyngier <marc.zyngier@arm.com>

We don't hold the mmap_sem while searching for the VMAs when
we try to unmap each memslot for a VM. Fix this properly to
avoid unexpected results.

Fixes: commit 957db105c997 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
Cc: stable@vger.kernel.org # v3.19+
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm/kvm/mmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 962616f..f2e2e0c 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -803,6 +803,7 @@ void stage2_unmap_vm(struct kvm *kvm)
 	int idx;
 
 	idx = srcu_read_lock(&kvm->srcu);
+	down_read(&current->mm->mmap_sem);
 	spin_lock(&kvm->mmu_lock);
 
 	slots = kvm_memslots(kvm);
@@ -810,6 +811,7 @@ void stage2_unmap_vm(struct kvm *kvm)
 		stage2_unmap_memslot(kvm, memslot);
 
 	spin_unlock(&kvm->mmu_lock);
+	up_read(&current->mm->mmap_sem);
 	srcu_read_unlock(&kvm->srcu, idx);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 2/3] kvm: arm/arm64: Take mmap_sem in kvm_arch_prepare_memory_region
  2017-03-14 14:52 ` Suzuki K Poulose
  (?)
@ 2017-03-14 14:52   ` Suzuki K Poulose
  -1 siblings, 0 replies; 54+ messages in thread
From: Suzuki K Poulose @ 2017-03-14 14:52 UTC (permalink / raw)
  To: linux-arm-kernel, andreyknvl
  Cc: dvyukov, marc.zyngier, christoffer.dall, kvmarm, kvm,
	linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, suzuki.poulose, ard.biesheuvel, stable

From: Marc Zyngier <marc.zyngier@arm.com>

We don't hold the mmap_sem while searching for VMAs (via find_vma) in
kvm_arch_prepare_memory_region, which can end up in unexpected failures.

Fixes: commit 8eef91239e57 ("arm/arm64: KVM: map MMIO regions at creation time")
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[ Handle dirty page logging failure case ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm/kvm/mmu.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index f2e2e0c..13b9c1f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1803,6 +1803,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	    (KVM_PHYS_SIZE >> PAGE_SHIFT))
 		return -EFAULT;
 
+	down_read(&current->mm->mmap_sem);
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes
 	 * between them, so iterate over all of them to find out if we can map
@@ -1846,8 +1847,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 			pa += vm_start - vma->vm_start;
 
 			/* IO region dirty page logging not allowed */
-			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES)
-				return -EINVAL;
+			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) {
+				ret = -EINVAL;
+				goto out;
+			}
 
 			ret = kvm_phys_addr_ioremap(kvm, gpa, pa,
 						    vm_end - vm_start,
@@ -1859,7 +1862,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	} while (hva < reg_end);
 
 	if (change == KVM_MR_FLAGS_ONLY)
-		return ret;
+		goto out;
 
 	spin_lock(&kvm->mmu_lock);
 	if (ret)
@@ -1867,6 +1870,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	else
 		stage2_flush_memslot(kvm, memslot);
 	spin_unlock(&kvm->mmu_lock);
+out:
+	up_read(&current->mm->mmap_sem);
 	return ret;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-14 14:52 ` Suzuki K Poulose
  (?)
@ 2017-03-14 14:52   ` Suzuki K Poulose
  -1 siblings, 0 replies; 54+ messages in thread
From: Suzuki K Poulose @ 2017-03-14 14:52 UTC (permalink / raw)
  To: linux-arm-kernel, andreyknvl
  Cc: dvyukov, marc.zyngier, christoffer.dall, kvmarm, kvm,
	linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, suzuki.poulose, ard.biesheuvel, stable

In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g., munmap on a memslot) trying to
unmap a range.

Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Cc: stable@vger.kernel.org # v3.10+
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
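To illustrate the race being closed here, the sketch below models two callers
that both tear down the same range of a shared table. It is a userspace
approximation under stated assumptions: a pthread mutex stands in for
kvm->mmu_lock, an int array stands in for the stage-2 entries, and
unmap_range(), unmap_all() and run_concurrent_unmap() are hypothetical names.
With the lock held around the whole walk, each entry is torn down exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NENTRIES 1024

static int table[NENTRIES];	/* 1 = mapped, 0 = unmapped */
static int freed;		/* total number of entries torn down */

/* Userspace stand-in for kvm->mmu_lock (assumption: a plain mutex). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Tear down [start, end). The caller must hold table_lock. */
static void unmap_range(int start, int end)
{
	for (int i = start; i < end; i++) {
		if (table[i]) {
			table[i] = 0;
			freed++;
		}
	}
}

/* Thread body: unmap the entire range with the lock held. */
static void *unmap_all(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&table_lock);
	unmap_range(0, NENTRIES);
	pthread_mutex_unlock(&table_lock);
	return NULL;
}

/* Map everything, race two unmappers, and report how many entries were freed. */
static int run_concurrent_unmap(void)
{
	pthread_t a, b;
	int err;

	for (int i = 0; i < NENTRIES; i++)
		table[i] = 1;
	freed = 0;

	err = pthread_create(&a, NULL, unmap_all, NULL);
	assert(err == 0);
	err = pthread_create(&b, NULL, unmap_all, NULL);
	assert(err == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return freed;
}
```

run_concurrent_unmap() returns NENTRIES because the check and the clear of each
entry happen under the same lock. Without the lock, both threads could see the
same entry as mapped and tear it down twice.
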
 arch/arm/kvm/mmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 13b9c1f..b361f71 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
 	if (kvm->arch.pgd == NULL)
 		return;
 
+	spin_lock(&kvm->mmu_lock);
 	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
+	spin_unlock(&kvm->mmu_lock);
+
 	/* Free the HW pgd, one page at a time */
 	free_pages_exact(kvm->arch.pgd, S2_PGD_SIZE);
 	kvm->arch.pgd = NULL;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm
  2017-03-14 14:52   ` Suzuki K Poulose
@ 2017-03-15  9:17     ` Christoffer Dall
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoffer Dall @ 2017-03-15  9:17 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, andreyknvl, dvyukov, marc.zyngier,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On Tue, Mar 14, 2017 at 02:52:32PM +0000, Suzuki K Poulose wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> We don't hold the mmap_sem while searching for the VMAs when
> we try to unmap each memslot for a VM. Fix this properly to
> avoid unexpected results.
> 
> Fixes: commit 957db105c997 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
> Cc: stable@vger.kernel.org # v3.19+
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm/kvm/mmu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616f..f2e2e0c 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -803,6 +803,7 @@ void stage2_unmap_vm(struct kvm *kvm)
>  	int idx;
>  
>  	idx = srcu_read_lock(&kvm->srcu);
> +	down_read(&current->mm->mmap_sem);
>  	spin_lock(&kvm->mmu_lock);
>  
>  	slots = kvm_memslots(kvm);
> @@ -810,6 +811,7 @@ void stage2_unmap_vm(struct kvm *kvm)
>  		stage2_unmap_memslot(kvm, memslot);
>  
>  	spin_unlock(&kvm->mmu_lock);
> +	up_read(&current->mm->mmap_sem);
>  	srcu_read_unlock(&kvm->srcu, idx);
>  }
>  
> -- 
> 2.7.4
> 

Are we sure that holding mmu_lock is valid while holding the mmap_sem?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-14 14:52   ` Suzuki K Poulose
  (?)
@ 2017-03-15  9:21     ` Christoffer Dall
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoffer Dall @ 2017-03-15  9:21 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, andreyknvl, dvyukov, marc.zyngier,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
> unmap_stage2_range() on the entire memory range for the guest. This could
> cause problems with other callers (e.g., munmap on a memslot) trying to
> unmap a range.
> 
> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
> Cc: stable@vger.kernel.org # v3.10+
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm/kvm/mmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 13b9c1f..b361f71 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>  	if (kvm->arch.pgd == NULL)
>  		return;
>  
> +	spin_lock(&kvm->mmu_lock);
>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> +	spin_unlock(&kvm->mmu_lock);
> +

This ends up holding the spin lock for potentially quite a while, where
we can do things like __flush_dcache_area(), which I think can fault.

Is that valid?

Thanks,
-Christoffer

>  	/* Free the HW pgd, one page at a time */
>  	free_pages_exact(kvm->arch.pgd, S2_PGD_SIZE);
>  	kvm->arch.pgd = NULL;
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm
  2017-03-15  9:17     ` Christoffer Dall
@ 2017-03-15  9:34       ` Marc Zyngier
  -1 siblings, 0 replies; 54+ messages in thread
From: Marc Zyngier @ 2017-03-15  9:34 UTC (permalink / raw)
  To: Christoffer Dall, Suzuki K Poulose
  Cc: linux-arm-kernel, andreyknvl, dvyukov, christoffer.dall, kvmarm,
	kvm, linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, ard.biesheuvel, stable

On 15/03/17 09:17, Christoffer Dall wrote:
> On Tue, Mar 14, 2017 at 02:52:32PM +0000, Suzuki K Poulose wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> We don't hold the mmap_sem while searching for the VMAs when
>> we try to unmap each memslot for a VM. Fix this properly to
>> avoid unexpected results.
>>
>> Fixes: commit 957db105c997 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
>> Cc: stable@vger.kernel.org # v3.19+
>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>  arch/arm/kvm/mmu.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 962616f..f2e2e0c 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -803,6 +803,7 @@ void stage2_unmap_vm(struct kvm *kvm)
>>  	int idx;
>>  
>>  	idx = srcu_read_lock(&kvm->srcu);
>> +	down_read(&current->mm->mmap_sem);
>>  	spin_lock(&kvm->mmu_lock);
>>  
>>  	slots = kvm_memslots(kvm);
>> @@ -810,6 +811,7 @@ void stage2_unmap_vm(struct kvm *kvm)
>>  		stage2_unmap_memslot(kvm, memslot);
>>  
>>  	spin_unlock(&kvm->mmu_lock);
>> +	up_read(&current->mm->mmap_sem);
>>  	srcu_read_unlock(&kvm->srcu, idx);
>>  }
>>  
>> -- 
>> 2.7.4
>>
> 
> Are we sure that holding mmu_lock is valid while holding the mmap_sem?

Maybe I'm just confused by the many levels of locking. Here's my rationale:

- kvm->srcu protects the memslot list
- mmap_sem protects the kernel VMA list
- mmu_lock protects the stage2 page tables (at least here)

I don't immediately see any issue with holding the mmap_sem mutex here
(unless there is a path that would retrigger a down operation on the
mmap_sem?).

Or am I missing something obvious?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15  9:21     ` Christoffer Dall
@ 2017-03-15  9:39       ` Marc Zyngier
  -1 siblings, 0 replies; 54+ messages in thread
From: Marc Zyngier @ 2017-03-15  9:39 UTC (permalink / raw)
  To: Christoffer Dall, Suzuki K Poulose
  Cc: linux-arm-kernel, andreyknvl, dvyukov, christoffer.dall, kvmarm,
	kvm, linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, ard.biesheuvel, stable

On 15/03/17 09:21, Christoffer Dall wrote:
> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>> unmap_stage2_range() on the entire memory range for the guest. This could
>> cause problems with other callers (e.g., munmap on a memslot) trying to
>> unmap a range.
>>
>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>> Cc: stable@vger.kernel.org # v3.10+
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>  arch/arm/kvm/mmu.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 13b9c1f..b361f71 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>>  	if (kvm->arch.pgd == NULL)
>>  		return;
>>  
>> +	spin_lock(&kvm->mmu_lock);
>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
>> +	spin_unlock(&kvm->mmu_lock);
>> +
> 
> This ends up holding the spin lock for potentially quite a while, where
> we can do things like __flush_dcache_area(), which I think can fault.

I believe we're always using the linear mapping (or kmap on 32bit) in
order not to fault.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15  9:39       ` Marc Zyngier
  (?)
@ 2017-03-15 10:56         ` Christoffer Dall
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoffer Dall @ 2017-03-15 10:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Suzuki K Poulose, linux-arm-kernel, andreyknvl, dvyukov,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
> On 15/03/17 09:21, Christoffer Dall wrote:
> > On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
> >> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
> >> unmap_stage2_range() on the entire memory range for the guest. This could
> >> cause problems with other callers (e.g., munmap on a memslot) trying to
> >> unmap a range.
> >>
> >> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
> >> Cc: stable@vger.kernel.org # v3.10+
> >> Cc: Marc Zyngier <marc.zyngier@arm.com>
> >> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> >> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >> ---
> >>  arch/arm/kvm/mmu.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >> index 13b9c1f..b361f71 100644
> >> --- a/arch/arm/kvm/mmu.c
> >> +++ b/arch/arm/kvm/mmu.c
> >> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> >>  	if (kvm->arch.pgd == NULL)
> >>  		return;
> >>  
> >> +	spin_lock(&kvm->mmu_lock);
> >>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> >> +	spin_unlock(&kvm->mmu_lock);
> >> +
> > 
> > This ends up holding the spin lock for potentially quite a while, where
> > we can do things like __flush_dcache_area(), which I think can fault.
> 
> I believe we're always using the linear mapping (or kmap on 32bit) in
> order not to fault.
> 

OK, then there's just the concern that we may be holding a spinlock for
a very long time.  I seem to recall Mario once added something where he
unlocked and gave a chance to schedule something else for each PUD or
something like that, because he ran into the issue during migration.  Am
I confusing this with something else?

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm
  2017-03-15  9:34       ` Marc Zyngier
  (?)
@ 2017-03-15 11:05         ` Christoffer Dall
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoffer Dall @ 2017-03-15 11:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Suzuki K Poulose, linux-arm-kernel, andreyknvl, dvyukov,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On Wed, Mar 15, 2017 at 09:34:53AM +0000, Marc Zyngier wrote:
> On 15/03/17 09:17, Christoffer Dall wrote:
> > On Tue, Mar 14, 2017 at 02:52:32PM +0000, Suzuki K Poulose wrote:
> >> From: Marc Zyngier <marc.zyngier@arm.com>
> >>
> >> We don't hold the mmap_sem while searching for the VMAs when
> >> we try to unmap each memslot for a VM. Fix this properly to
> >> avoid unexpected results.
> >>
> >> Fixes: commit 957db105c997 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
> >> Cc: stable@vger.kernel.org # v3.19+
> >> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >> ---
> >>  arch/arm/kvm/mmu.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >> index 962616f..f2e2e0c 100644
> >> --- a/arch/arm/kvm/mmu.c
> >> +++ b/arch/arm/kvm/mmu.c
> >> @@ -803,6 +803,7 @@ void stage2_unmap_vm(struct kvm *kvm)
> >>  	int idx;
> >>  
> >>  	idx = srcu_read_lock(&kvm->srcu);
> >> +	down_read(&current->mm->mmap_sem);
> >>  	spin_lock(&kvm->mmu_lock);
> >>  
> >>  	slots = kvm_memslots(kvm);
> >> @@ -810,6 +811,7 @@ void stage2_unmap_vm(struct kvm *kvm)
> >>  		stage2_unmap_memslot(kvm, memslot);
> >>  
> >>  	spin_unlock(&kvm->mmu_lock);
> >> +	up_read(&current->mm->mmap_sem);
> >>  	srcu_read_unlock(&kvm->srcu, idx);
> >>  }
> >>  
> >> -- 
> >> 2.7.4
> >>
> > 
> > Are we sure that holding mmu_lock is valid while holding the mmap_sem?
> 
> Maybe I'm just confused by the many levels of locking. Here's my rationale:
> 
> - kvm->srcu protects the memslot list
> - mmap_sem protects the kernel VMA list
> - mmu_lock protects the stage2 page tables (at least here)
> 
> I don't immediately see any issue with holding mmap_sem here
> (unless there is a path that would retrigger a down operation on the
> mmap_sem?).
> 
> Or am I missing something obvious?

I was worried that someone else could hold the mmu_lock and take the
mmap_sem, but that wouldn't be allowed of course, because the semaphore
can sleep, so I agree, you should be good.

I just needed this conversation to feel good about this patch ;)

Reviewed-by: Christoffer Dall <cdall@linaro.org>
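
The ordering argument above (a sleeping lock such as mmap_sem may be taken
outside a spinlock, but never inside one) can be illustrated with a small
userspace model. This is a sketch under stated assumptions, not kernel code:
struct lock_ctx and the model_*() helpers are hypothetical names invented
for the illustration, and the "locks" are plain counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: counters stand in for the real locks. */
struct lock_ctx {
	int spinlocks_held;	/* > 0 models atomic (non-sleepable) context */
	int rwsem_readers;	/* read-side holds of the sleeping lock */
	bool violation;		/* set when a sleeping lock is taken while atomic */
};

/* Models down_read(): may sleep, so it must not run under a spinlock. */
void model_down_read(struct lock_ctx *c)
{
	if (c->spinlocks_held > 0)
		c->violation = true;
	c->rwsem_readers++;
}

void model_up_read(struct lock_ctx *c)
{
	assert(c->rwsem_readers > 0);
	c->rwsem_readers--;
}

void model_spin_lock(struct lock_ctx *c)
{
	c->spinlocks_held++;
}

void model_spin_unlock(struct lock_ctx *c)
{
	assert(c->spinlocks_held > 0);
	c->spinlocks_held--;
}

/* Order used by the patch: sleeping lock outside, spinlock inside. */
bool patch_order_is_valid(void)
{
	struct lock_ctx c = { 0 };

	model_down_read(&c);
	model_spin_lock(&c);
	model_spin_unlock(&c);
	model_up_read(&c);
	return !c.violation && !c.spinlocks_held && !c.rwsem_readers;
}

/* Inverted order: sleeping lock requested while the spinlock is held. */
bool inverted_order_is_valid(void)
{
	struct lock_ctx c = { 0 };

	model_spin_lock(&c);
	model_down_read(&c);	/* flagged: would sleep in atomic context */
	model_up_read(&c);
	model_spin_unlock(&c);
	return !c.violation;
}
```

In this model the patch's order passes and the inverted order is flagged,
which is the same reason no other path can legitimately hold mmu_lock and
then wait on mmap_sem.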

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 2/3] kvm: arm/arm64: Take mmap_sem in kvm_arch_prepare_memory_region
  2017-03-14 14:52   ` Suzuki K Poulose
  (?)
@ 2017-03-15 11:05     ` Christoffer Dall
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoffer Dall @ 2017-03-15 11:05 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, andreyknvl, dvyukov, marc.zyngier,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On Tue, Mar 14, 2017 at 02:52:33PM +0000, Suzuki K Poulose wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> We don't hold the mmap_sem while searching for VMAs (via find_vma), in
> kvm_arch_prepare_memory_region, which can end up in unexpected failures.
> 
> Fixes: commit 8eef91239e57 ("arm/arm64: KVM: map MMIO regions at creation time")
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: stable@vger.kernel.org # v3.18+
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> [ Handle dirty page logging failure case ]
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Christoffer Dall <cdall@linaro.org>

> ---
>  arch/arm/kvm/mmu.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index f2e2e0c..13b9c1f 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1803,6 +1803,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>  	    (KVM_PHYS_SIZE >> PAGE_SHIFT))
>  		return -EFAULT;
>  
> +	down_read(&current->mm->mmap_sem);
>  	/*
>  	 * A memory region could potentially cover multiple VMAs, and any holes
>  	 * between them, so iterate over all of them to find out if we can map
> @@ -1846,8 +1847,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>  			pa += vm_start - vma->vm_start;
>  
>  			/* IO region dirty page logging not allowed */
> -			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES)
> -				return -EINVAL;
> +			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> +				ret = -EINVAL;
> +				goto out;
> +			}
>  
>  			ret = kvm_phys_addr_ioremap(kvm, gpa, pa,
>  						    vm_end - vm_start,
> @@ -1859,7 +1862,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>  	} while (hva < reg_end);
>  
>  	if (change == KVM_MR_FLAGS_ONLY)
> -		return ret;
> +		goto out;
>  
>  	spin_lock(&kvm->mmu_lock);
>  	if (ret)
> @@ -1867,6 +1870,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>  	else
>  		stage2_flush_memslot(kvm, memslot);
>  	spin_unlock(&kvm->mmu_lock);
> +out:
> +	up_read(&current->mm->mmap_sem);
>  	return ret;
>  }
>  
> -- 
> 2.7.4
> 
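
The error-path change in the quoted patch (turning early returns into
"goto out" once a lock is held) follows the usual single-exit pattern. A
minimal userspace sketch, assuming a toy counter in place of mmap_sem;
struct toy_rwsem, toy_down_read(), toy_up_read() and prepare_region() are
hypothetical names for illustration, not kernel APIs:

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for a read-side lock: it only counts holders. */
struct toy_rwsem {
	int readers;
};

void toy_down_read(struct toy_rwsem *sem)
{
	sem->readers++;
}

void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/*
 * Hypothetical region check: rejects dirty logging on an I/O region,
 * loosely mirroring the KVM_MEM_LOG_DIRTY_PAGES test in the patch.
 */
int prepare_region(struct toy_rwsem *sem, int is_io, int log_dirty)
{
	int ret = 0;

	toy_down_read(sem);

	if (is_io && log_dirty) {
		ret = -EINVAL;
		goto out;	/* a bare "return" here would leak the lock */
	}

	/* ... work that needs the lock would go here ... */

out:
	toy_up_read(sem);
	return ret;
}
```

Every path taken after toy_down_read() leaves through the "out" label, so
the lock count is balanced on both the success and the failure path.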

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 10:56         ` Christoffer Dall
  (?)
@ 2017-03-15 13:28           ` Marc Zyngier
  -1 siblings, 0 replies; 54+ messages in thread
From: Marc Zyngier @ 2017-03-15 13:28 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Suzuki K Poulose, linux-arm-kernel, andreyknvl, dvyukov,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On 15/03/17 10:56, Christoffer Dall wrote:
> On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
>> On 15/03/17 09:21, Christoffer Dall wrote:
>>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>>>> unmap_stage2_range() on the entire memory range for the guest. This could
>>>> cause problems with other callers (e.g., munmap on a memslot) trying to
>>>> unmap a range.
>>>>
>>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>>>> Cc: stable@vger.kernel.org # v3.10+
>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> ---
>>>>  arch/arm/kvm/mmu.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>> index 13b9c1f..b361f71 100644
>>>> --- a/arch/arm/kvm/mmu.c
>>>> +++ b/arch/arm/kvm/mmu.c
>>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>>>>  	if (kvm->arch.pgd == NULL)
>>>>  		return;
>>>>  
>>>> +	spin_lock(&kvm->mmu_lock);
>>>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
>>>> +	spin_unlock(&kvm->mmu_lock);
>>>> +
>>>
>>> This ends up holding the spin lock for potentially quite a while, where
>>> we can do things like __flush_dcache_area(), which I think can fault.
>>
>> I believe we're always using the linear mapping (or kmap on 32bit) in
>> order not to fault.
>>
> 
> ok, then there's just the concern that we may be holding a spinlock for
> a very long time.  I seem to recall Mario once added something where he
> unlocked and gave a chance to schedule something else for each PUD or
> something like that, because he ran into the issue during migration.  Am
> I confusing this with something else?

That definitely rings a bell: stage2_wp_range() uses that kind of trick
to give the system a chance to breathe. Maybe we could use a similar
trick in our S2 unmapping code? How about this (completely untested) patch:

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 962616fd4ddd..1786c24212d4 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
 	phys_addr_t addr = start, end = start + size;
 	phys_addr_t next;
 
+	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
+
 	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
 	do {
+		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
+			cond_resched_lock(&kvm->mmu_lock);
+
 		next = stage2_pgd_addr_end(addr, end);
 		if (!stage2_pgd_none(*pgd))
 			unmap_stage2_puds(kvm, pgd, addr, next);

The additional BUG_ON() is just for my own peace of mind - we seem to
have missed taking this lock in a couple of places lately, and the
"breathing" code makes it imperative that the lock is held on entry to
the function.
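
The "breathing" idea can be sketched in userspace as a loop that holds a
lock across a long walk but briefly drops and retakes it when a waiter
shows up. This is only an illustration under stated assumptions: struct
toy_lock, toy_cond_break() and unmap_chunks() are hypothetical names, and
the contended_every parameter merely simulates a waiter appearing; it is
not how the kernel detects contention.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock: records whether it is held and how often it was broken. */
struct toy_lock {
	int held;
	int breaks;
};

void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Drop and retake the lock when contended; returns 1 if it was dropped. */
int toy_cond_break(struct toy_lock *l, int contended)
{
	if (!contended)
		return 0;
	toy_lock_release(l);
	/* other contexts could run here */
	toy_lock_acquire(l);
	l->breaks++;
	return 1;
}

/*
 * Walks nr_chunks units of work. The lock must be held on entry and is
 * held again on return. contended_every simulates a waiter at every
 * n-th chunk (0 means never).
 */
size_t unmap_chunks(struct toy_lock *l, size_t nr_chunks,
		    size_t contended_every)
{
	size_t i, done = 0;

	assert(l->held);	/* same requirement the BUG_ON() enforces */

	for (i = 0; i < nr_chunks; i++) {
		int contended = contended_every && i != 0 &&
				(i % contended_every) == 0;

		toy_cond_break(l, contended);
		done++;		/* stands in for unmapping one entry */
	}
	return done;
}
```

Because the lock can be dropped in the middle of the walk, anything read
under the lock before a break has to be treated as possibly stale
afterwards; that is also why the function needs the lock to be held on
entry rather than taking it itself.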

Thoughts?

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/3] kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm
  2017-03-15  9:17     ` Christoffer Dall
  (?)
@ 2017-03-15 13:29       ` Paolo Bonzini
  -1 siblings, 0 replies; 54+ messages in thread
From: Paolo Bonzini @ 2017-03-15 13:29 UTC (permalink / raw)
  To: Christoffer Dall, Suzuki K Poulose
  Cc: linux-arm-kernel, andreyknvl, dvyukov, marc.zyngier,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, mark.rutland, ard.biesheuvel,
	stable



On 15/03/2017 10:17, Christoffer Dall wrote:
> On Tue, Mar 14, 2017 at 02:52:32PM +0000, Suzuki K Poulose wrote:
>> From: Marc Zyngier <marc.zyngier@arm.com>
>>
>> We don't hold the mmap_sem while searching for the VMAs when
>> we try to unmap each memslot for a VM. Fix this properly to
>> avoid unexpected results.
>>
>> Fixes: commit 957db105c997 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
>> Cc: stable@vger.kernel.org # v3.19+
>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>  arch/arm/kvm/mmu.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 962616f..f2e2e0c 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -803,6 +803,7 @@ void stage2_unmap_vm(struct kvm *kvm)
>>  	int idx;
>>  
>>  	idx = srcu_read_lock(&kvm->srcu);
>> +	down_read(&current->mm->mmap_sem);
>>  	spin_lock(&kvm->mmu_lock);
>>  
>>  	slots = kvm_memslots(kvm);
>> @@ -810,6 +811,7 @@ void stage2_unmap_vm(struct kvm *kvm)
>>  		stage2_unmap_memslot(kvm, memslot);
>>  
>>  	spin_unlock(&kvm->mmu_lock);
>> +	up_read(&current->mm->mmap_sem);
>>  	srcu_read_unlock(&kvm->srcu, idx);
>>  }
>>  
>> -- 
>> 2.7.4
>>
> 
> Are we sure that holding mmu_lock is valid while holding the mmap_sem?

Sure, spinlock-inside-semaphore and spinlock-inside-mutex are always okay.

Paolo

^ permalink raw reply	[flat|nested] 54+ messages in thread
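
Note: the nesting order Paolo describes can be sketched in userspace C. This is
an illustration under stated assumptions, not the KVM code. The names struct vm,
vm_init() and vm_unmap_all() are hypothetical; a pthread rwlock stands in for
mmap_sem and a pthread spinlock stands in for kvm->mmu_lock. The lock that may
sleep is taken first, while nothing that spins is held, and the two are
released in reverse order.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins: map_sem plays mmap_sem, table_lock plays mmu_lock. */
struct vm {
	pthread_rwlock_t map_sem;	/* sleeping lock, taken first */
	pthread_spinlock_t table_lock;	/* spinning lock, taken second */
	int mapped;			/* number of live mappings */
};

/* Initialise both locks; returns 0 on success, -1 on failure. */
int vm_init(struct vm *vm, int mapped)
{
	vm->mapped = mapped;
	if (pthread_rwlock_init(&vm->map_sem, NULL))
		return -1;
	if (pthread_spin_init(&vm->table_lock, PTHREAD_PROCESS_PRIVATE)) {
		pthread_rwlock_destroy(&vm->map_sem);
		return -1;
	}
	return 0;
}

/* Drop all mappings; returns how many were dropped. */
int vm_unmap_all(struct vm *vm)
{
	int dropped;

	/* May block, which is fine: no spinlock is held yet. */
	pthread_rwlock_rdlock(&vm->map_sem);
	/* Spins instead of sleeping, so it can nest inside the rwlock. */
	pthread_spin_lock(&vm->table_lock);

	dropped = vm->mapped;
	vm->mapped = 0;

	/* Release in the reverse order of acquisition. */
	pthread_spin_unlock(&vm->table_lock);
	pthread_rwlock_unlock(&vm->map_sem);
	return dropped;
}
```

The reverse nesting is the problematic one in the kernel: blocking on a
semaphore or mutex while a spinlock is held means sleeping in atomic context.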

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 13:28           ` Marc Zyngier
  (?)
@ 2017-03-15 13:35             ` Christoffer Dall
  -1 siblings, 0 replies; 54+ messages in thread
From: Christoffer Dall @ 2017-03-15 13:35 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Suzuki K Poulose, linux-arm-kernel, andreyknvl, dvyukov,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On Wed, Mar 15, 2017 at 01:28:07PM +0000, Marc Zyngier wrote:
> On 15/03/17 10:56, Christoffer Dall wrote:
> > On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
> >> On 15/03/17 09:21, Christoffer Dall wrote:
> >>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
> >>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
> >>>> unmap_stage2_range() on the entire memory range for the guest. This could
> >>>> cause problems with other callers (e.g, munmap on a memslot) trying to
> >>>> unmap a range.
> >>>>
> >>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
> >>>> Cc: stable@vger.kernel.org # v3.10+
> >>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
> >>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> >>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>>> ---
> >>>>  arch/arm/kvm/mmu.c | 3 +++
> >>>>  1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >>>> index 13b9c1f..b361f71 100644
> >>>> --- a/arch/arm/kvm/mmu.c
> >>>> +++ b/arch/arm/kvm/mmu.c
> >>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> >>>>  	if (kvm->arch.pgd == NULL)
> >>>>  		return;
> >>>>  
> >>>> +	spin_lock(&kvm->mmu_lock);
> >>>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> >>>> +	spin_unlock(&kvm->mmu_lock);
> >>>> +
> >>>
> >>> This ends up holding the spin lock for potentially quite a while, where
> >>> we can do things like __flush_dcache_area(), which I think can fault.
> >>
> >> I believe we're always using the linear mapping (or kmap on 32bit) in
> >> order not to fault.
> >>
> > 
> > ok, then there's just the concern that we may be holding a spinlock for
> > a very long time.  I seem to recall Mario once added something where he
> > unlocked and gave a chance to schedule something else for each PUD or
> > something like that, because he ran into the issue during migration.  Am
> > I confusing this with something else?
> 
> That definitely rings a bell: stage2_wp_range() uses that kind of trick
> to give the system a chance to breathe. Maybe we could use a similar
> trick in our S2 unmapping code? How about this (completely untested) patch:
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1786c24212d4 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>  	phys_addr_t addr = start, end = start + size;
>  	phys_addr_t next;
>  
> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
> +
>  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>  	do {
> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
> +			cond_resched_lock(&kvm->mmu_lock);
> +
>  		next = stage2_pgd_addr_end(addr, end);
>  		if (!stage2_pgd_none(*pgd))
>  			unmap_stage2_puds(kvm, pgd, addr, next);
> 
> The additional BUG_ON() is just for my own peace of mind - we seem to
> have missed a couple of these lately, and the "breathing" code makes
> it imperative that this lock is being taken prior to entering the
> function.
> 

Looks good to me!

-Christoffer

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 13:35             ` Christoffer Dall
  (?)
@ 2017-03-15 13:43               ` Marc Zyngier
  -1 siblings, 0 replies; 54+ messages in thread
From: Marc Zyngier @ 2017-03-15 13:43 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Suzuki K Poulose, linux-arm-kernel, andreyknvl, dvyukov,
	christoffer.dall, kvmarm, kvm, linux-kernel, kcc, syzkaller,
	will.deacon, catalin.marinas, pbonzini, mark.rutland,
	ard.biesheuvel, stable

On 15/03/17 13:35, Christoffer Dall wrote:
> On Wed, Mar 15, 2017 at 01:28:07PM +0000, Marc Zyngier wrote:
>> On 15/03/17 10:56, Christoffer Dall wrote:
>>> On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
>>>> On 15/03/17 09:21, Christoffer Dall wrote:
>>>>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>>>>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>>>>>> unmap_stage2_range() on the entire memory range for the guest. This could
>>>>>> cause problems with other callers (e.g, munmap on a memslot) trying to
>>>>>> unmap a range.
>>>>>>
>>>>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>>>>>> Cc: stable@vger.kernel.org # v3.10+
>>>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>> ---
>>>>>>  arch/arm/kvm/mmu.c | 3 +++
>>>>>>  1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>>>> index 13b9c1f..b361f71 100644
>>>>>> --- a/arch/arm/kvm/mmu.c
>>>>>> +++ b/arch/arm/kvm/mmu.c
>>>>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>>>>>>  	if (kvm->arch.pgd == NULL)
>>>>>>  		return;
>>>>>>  
>>>>>> +	spin_lock(&kvm->mmu_lock);
>>>>>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
>>>>>> +	spin_unlock(&kvm->mmu_lock);
>>>>>> +
>>>>>
>>>>> This ends up holding the spin lock for potentially quite a while, where
>>>>> we can do things like __flush_dcache_area(), which I think can fault.
>>>>
>>>> I believe we're always using the linear mapping (or kmap on 32bit) in
>>>> order not to fault.
>>>>
>>>
>>> ok, then there's just the concern that we may be holding a spinlock for
>>> a very long time.  I seem to recall Mario once added something where he
>>> unlocked and gave a chance to schedule something else for each PUD or
>>> something like that, because he ran into the issue during migration.  Am
>>> I confusing this with something else?
>>
>> That definitely rings a bell: stage2_wp_range() uses that kind of trick
>> to give the system a chance to breathe. Maybe we could use a similar
>> trick in our S2 unmapping code? How about this (completely untested) patch:
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 962616fd4ddd..1786c24212d4 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>>  	phys_addr_t addr = start, end = start + size;
>>  	phys_addr_t next;
>>  
>> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
>> +
>>  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>>  	do {
>> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>> +			cond_resched_lock(&kvm->mmu_lock);
>> +
>>  		next = stage2_pgd_addr_end(addr, end);
>>  		if (!stage2_pgd_none(*pgd))
>>  			unmap_stage2_puds(kvm, pgd, addr, next);
>>
>> The additional BUG_ON() is just for my own peace of mind - we seem to
>> have missed a couple of these lately, and the "breathing" code makes
>> it imperative that this lock is being taken prior to entering the
>> function.
>>
> 
> Looks good to me!

OK. I'll stash that on top of Suzuki's series, and start running some
actual tests... ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 13:43               ` Marc Zyngier
  (?)
@ 2017-03-15 13:50                 ` Robin Murphy
  -1 siblings, 0 replies; 54+ messages in thread
From: Robin Murphy @ 2017-03-15 13:50 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, mark.rutland, kvm, Suzuki K Poulose,
	andreyknvl, ard.biesheuvel, will.deacon, linux-kernel, stable,
	kcc, syzkaller, dvyukov, catalin.marinas, pbonzini, kvmarm,
	christoffer.dall

Hi Marc,

On 15/03/17 13:43, Marc Zyngier wrote:
> On 15/03/17 13:35, Christoffer Dall wrote:
>> On Wed, Mar 15, 2017 at 01:28:07PM +0000, Marc Zyngier wrote:
>>> On 15/03/17 10:56, Christoffer Dall wrote:
>>>> On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
>>>>> On 15/03/17 09:21, Christoffer Dall wrote:
>>>>>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>>>>>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>>>>>>> unmap_stage2_range() on the entire memory range for the guest. This could
>>>>>>> cause problems with other callers (e.g, munmap on a memslot) trying to
>>>>>>> unmap a range.
>>>>>>>
>>>>>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>>>>>>> Cc: stable@vger.kernel.org # v3.10+
>>>>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>>> ---
>>>>>>>  arch/arm/kvm/mmu.c | 3 +++
>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>>>>> index 13b9c1f..b361f71 100644
>>>>>>> --- a/arch/arm/kvm/mmu.c
>>>>>>> +++ b/arch/arm/kvm/mmu.c
>>>>>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>>>>>>>  	if (kvm->arch.pgd == NULL)
>>>>>>>  		return;
>>>>>>>  
>>>>>>> +	spin_lock(&kvm->mmu_lock);
>>>>>>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
>>>>>>> +	spin_unlock(&kvm->mmu_lock);
>>>>>>> +
>>>>>>
>>>>>> This ends up holding the spin lock for potentially quite a while, where
>>>>>> we can do things like __flush_dcache_area(), which I think can fault.
>>>>>
>>>>> I believe we're always using the linear mapping (or kmap on 32bit) in
>>>>> order not to fault.
>>>>>
>>>>
>>>> ok, then there's just the concern that we may be holding a spinlock for
>>>> a very long time.  I seem to recall Mario once added something where he
>>>> unlocked and gave a chance to schedule something else for each PUD or
>>>> something like that, because he ran into the issue during migration.  Am
>>>> I confusing this with something else?
>>>
>>> That definitely rings a bell: stage2_wp_range() uses that kind of trick
>>> to give the system a chance to breathe. Maybe we could use a similar
>>> trick in our S2 unmapping code? How about this (completely untested) patch:
>>>
>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>> index 962616fd4ddd..1786c24212d4 100644
>>> --- a/arch/arm/kvm/mmu.c
>>> +++ b/arch/arm/kvm/mmu.c
>>> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>>>  	phys_addr_t addr = start, end = start + size;
>>>  	phys_addr_t next;
>>>  
>>> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));

Nit: assert_spin_locked() is somewhat more pleasant (and currently looks
to expand to the exact same code).
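
As a rough illustration of this nit (a userspace sketch, not kernel code):
the toy_* names and TOY_BUG_ON below are hypothetical stand-ins invented for
this example, with assert() playing the role of BUG_ON(). It shows an
open-coded "is the lock held?" check and a named wrapper reducing to the same
test, which is the relationship suggested above between
BUG_ON(!spin_is_locked()) and assert_spin_locked().

```c
/* Userspace sketch only: toy stand-ins for the kernel's spinlock helpers. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy lock; this is NOT the kernel's spinlock_t. */
struct toy_spinlock {
	bool locked;
};

static void toy_spin_lock(struct toy_spinlock *l)   { l->locked = true; }
static void toy_spin_unlock(struct toy_spinlock *l) { l->locked = false; }
static bool toy_spin_is_locked(const struct toy_spinlock *l) { return l->locked; }

/* Stand-in for BUG_ON(): abort when the condition is true. */
#define TOY_BUG_ON(cond) assert(!(cond))

/* Named wrapper around the same check, in the spirit of assert_spin_locked(). */
#define toy_assert_spin_locked(l) TOY_BUG_ON(!toy_spin_is_locked(l))

/*
 * Toy range walker that requires the caller to hold the lock.
 * Returns the number of entries visited.
 */
static int toy_unmap_range(struct toy_spinlock *mmu_lock, int nr_entries)
{
	int visited = 0;

	/* Same test as TOY_BUG_ON(!toy_spin_is_locked(mmu_lock)), just named. */
	toy_assert_spin_locked(mmu_lock);

	for (int i = 0; i < nr_entries; i++)
		visited++;

	return visited;
}
```

In the patch itself the change would presumably amount to replacing the
BUG_ON() line with assert_spin_locked(&kvm->mmu_lock); that substitution is
not tested here.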

Robin.

>>> +
>>>  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>>>  	do {
>>> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>>> +			cond_resched_lock(&kvm->mmu_lock);
>>> +
>>>  		next = stage2_pgd_addr_end(addr, end);
>>>  		if (!stage2_pgd_none(*pgd))
>>>  			unmap_stage2_puds(kvm, pgd, addr, next);
>>>
>>> The additional BUG_ON() is just for my own peace of mind - we seem to
>>> have missed a couple of these lately, and the "breathing" code makes
>>> it imperative that this lock is being taken prior to entering the
>>> function.
>>>
>>
>> Looks good to me!
> 
> OK. I'll stash that on top of Suzuki's series, and start running some
> actual tests... ;-)
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 13:50                 ` Robin Murphy
  (?)
@ 2017-03-15 13:55                   ` Marc Zyngier
  -1 siblings, 0 replies; 54+ messages in thread
From: Marc Zyngier @ 2017-03-15 13:55 UTC (permalink / raw)
  To: Robin Murphy, Christoffer Dall
  Cc: linux-arm-kernel, mark.rutland, kvm, Suzuki K Poulose,
	andreyknvl, ard.biesheuvel, will.deacon, linux-kernel, stable,
	kcc, syzkaller, dvyukov, catalin.marinas, pbonzini, kvmarm,
	christoffer.dall

On 15/03/17 13:50, Robin Murphy wrote:
> Hi Marc,
> 
> On 15/03/17 13:43, Marc Zyngier wrote:
>> On 15/03/17 13:35, Christoffer Dall wrote:
>>> On Wed, Mar 15, 2017 at 01:28:07PM +0000, Marc Zyngier wrote:
>>>> On 15/03/17 10:56, Christoffer Dall wrote:
>>>>> On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
>>>>>> On 15/03/17 09:21, Christoffer Dall wrote:
>>>>>>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>>>>>>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>>>>>>>> unmap_stage2_range() on the entire memory range for the guest. This could
>>>>>>>> cause problems with other callers (e.g, munmap on a memslot) trying to
>>>>>>>> unmap a range.
>>>>>>>>
>>>>>>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>>>>>>>> Cc: stable@vger.kernel.org # v3.10+
>>>>>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>>>> ---
>>>>>>>>  arch/arm/kvm/mmu.c | 3 +++
>>>>>>>>  1 file changed, 3 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>>>>>> index 13b9c1f..b361f71 100644
>>>>>>>> --- a/arch/arm/kvm/mmu.c
>>>>>>>> +++ b/arch/arm/kvm/mmu.c
>>>>>>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>>>>>>>>  	if (kvm->arch.pgd == NULL)
>>>>>>>>  		return;
>>>>>>>>  
>>>>>>>> +	spin_lock(&kvm->mmu_lock);
>>>>>>>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
>>>>>>>> +	spin_unlock(&kvm->mmu_lock);
>>>>>>>> +
>>>>>>>
>>>>>>> This ends up holding the spin lock for potentially quite a while, where
>>>>>>> we can do things like __flush_dcache_area(), which I think can fault.
>>>>>>
>>>>>> I believe we're always using the linear mapping (or kmap on 32bit) in
>>>>>> order not to fault.
>>>>>>
>>>>>
>>>>> ok, then there's just the concern that we may be holding a spinlock for
>>>>> a very long time.  I seem to recall Mario once added something where he
>>>>> unlocked and gave a chance to schedule something else for each PUD or
>>>>> something like that, because he ran into the issue during migration.  Am
>>>>> I confusing this with something else?
>>>>
>>>> That definitely rings a bell: stage2_wp_range() uses that kind of trick
>>>> to give the system a chance to breathe. Maybe we could use a similar
>>>> trick in our S2 unmapping code? How about this (completely untested) patch:
>>>>
>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>> index 962616fd4ddd..1786c24212d4 100644
>>>> --- a/arch/arm/kvm/mmu.c
>>>> +++ b/arch/arm/kvm/mmu.c
>>>> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>>>>  	phys_addr_t addr = start, end = start + size;
>>>>  	phys_addr_t next;
>>>>  
>>>> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
> 
> Nit: assert_spin_locked() is somewhat more pleasant (and currently looks
> to expand to the exact same code).

Fancy!

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 13:28           ` Marc Zyngier
  (?)
@ 2017-03-15 14:33             ` Suzuki K Poulose
  -1 siblings, 0 replies; 54+ messages in thread
From: Suzuki K Poulose @ 2017-03-15 14:33 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, andreyknvl, dvyukov, christoffer.dall, kvmarm,
	kvm, linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, ard.biesheuvel, stable

On 15/03/17 13:28, Marc Zyngier wrote:
> On 15/03/17 10:56, Christoffer Dall wrote:
>> On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
>>> On 15/03/17 09:21, Christoffer Dall wrote:
>>>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>>>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>>>>> unmap_stage2_range() on the entire memory range for the guest. This could
>>>>> cause problems with other callers (e.g, munmap on a memslot) trying to
>>>>> unmap a range.
>>>>>
>>>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>>>>> Cc: stable@vger.kernel.org # v3.10+
>>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

...

>> ok, then there's just the concern that we may be holding a spinlock for
>> a very long time.  I seem to recall Mario once added something where he
>> unlocked and gave a chance to schedule something else for each PUD or
>> something like that, because he ran into the issue during migration.  Am
>> I confusing this with something else?
>
> That definitely rings a bell: stage2_wp_range() uses that kind of trick
> to give the system a chance to breathe. Maybe we could use a similar
> trick in our S2 unmapping code? How about this (completely untested) patch:
>
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1786c24212d4 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>  	phys_addr_t addr = start, end = start + size;
>  	phys_addr_t next;
>
> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
> +
>  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>  	do {
> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
> +			cond_resched_lock(&kvm->mmu_lock);

nit: I think we could call cond_resched_lock() unconditionally here, given
that __cond_resched_lock() already does all of the above checks:

kernel/sched/core.c:

int __cond_resched_lock(spinlock_t *lock)
{
         int resched = should_resched(PREEMPT_LOCK_OFFSET);

...

         if (spin_needbreak(lock) || resched) {


Suzuki
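
The point made in this message, that the helper already performs the checks
itself, can be sketched in userspace as follows. This is a single-threaded toy
under stated assumptions: struct toy_lock, its "contended" flag, and the
toy_* functions are hypothetical names invented for this example and do not
model the real scheduler or spinlock implementation. The only detail taken
from the thread is that the helper decides internally whether to break the
lock and returns an int.

```c
/* Userspace sketch only: a toy model of a "lock break" helper. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy lock; "contended" stands in for need_resched()/spin_needbreak(). */
struct toy_lock {
	bool held;
	bool contended;
	int breaks;	/* how many times the lock was dropped and retaken */
};

/*
 * Toy analogue of cond_resched_lock(): the helper checks for itself whether
 * a break is needed, so callers may invoke it unconditionally.
 * Returns 1 if the lock was dropped and retaken, 0 otherwise.
 */
static int toy_cond_resched_lock(struct toy_lock *l)
{
	assert(l->held);	/* must be entered with the lock held */

	if (!l->contended)
		return 0;

	l->held = false;	/* drop the lock ... */
	l->contended = false;	/* ... the waiter gets its turn ... */
	l->breaks++;
	l->held = true;		/* ... and retake it before returning */
	return 1;
}

/*
 * Toy range walker. Contention is simulated every request_every entries
 * (0 disables it). Returns the number of entries visited.
 */
static int toy_unmap_range(struct toy_lock *l, int nr_entries, int request_every)
{
	int visited = 0;

	for (int i = 0; i < nr_entries; i++) {
		if (request_every > 0 && i > 0 && i % request_every == 0)
			l->contended = true;

		/* No outer "if (contended)" check: the helper already does it. */
		toy_cond_resched_lock(l);
		visited++;
	}
	return visited;
}
```

With this shape the caller's loop stays the same whether or not a break is
ever needed, which is why the outer need_resched() || spin_needbreak() test in
the proposed patch looks redundant.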

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
  2017-03-15 14:33             ` Suzuki K Poulose
  (?)
@ 2017-03-15 15:07               ` Marc Zyngier
  -1 siblings, 0 replies; 54+ messages in thread
From: Marc Zyngier @ 2017-03-15 15:07 UTC (permalink / raw)
  To: Suzuki K Poulose, Christoffer Dall
  Cc: linux-arm-kernel, andreyknvl, dvyukov, christoffer.dall, kvmarm,
	kvm, linux-kernel, kcc, syzkaller, will.deacon, catalin.marinas,
	pbonzini, mark.rutland, ard.biesheuvel, stable

On 15/03/17 14:33, Suzuki K Poulose wrote:
> On 15/03/17 13:28, Marc Zyngier wrote:
>> On 15/03/17 10:56, Christoffer Dall wrote:
>>> On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
>>>> On 15/03/17 09:21, Christoffer Dall wrote:
>>>>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
>>>>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
>>>>>> unmap_stage2_range() on the entire memory range for the guest. This could
>>>>>> cause problems with other callers (e.g., munmap on a memslot) trying to
>>>>>> unmap a range.
>>>>>>
>>>>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
>>>>>> Cc: stable@vger.kernel.org # v3.10+
>>>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> 
> ...
> 
>>> ok, then there's just the concern that we may be holding a spinlock for
>>> a very long time.  I seem to recall Mario once added something where he
>>> unlocked and gave a chance to schedule something else for each PUD or
>>> something like that, because he ran into the issue during migration.  Am
>>> I confusing this with something else?
>>
>> That definitely rings a bell: stage2_wp_range() uses that kind of trick
>> to give the system a chance to breathe. Maybe we could use a similar
>> trick in our S2 unmapping code? How about this (completely untested) patch:
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 962616fd4ddd..1786c24212d4 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>>  	phys_addr_t addr = start, end = start + size;
>>  	phys_addr_t next;
>>
>> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
>> +
>>  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>>  	do {
>> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>> +			cond_resched_lock(&kvm->mmu_lock);
> 
> nit: I think we could call cond_resched_lock() unconditionally here,
> given that __cond_resched_lock() already does all the above checks:
> 
> kernel/sched/core.c:
> 
> int __cond_resched_lock(spinlock_t *lock)
> {
>          int resched = should_resched(PREEMPT_LOCK_OFFSET);
> 
> ...
> 
>          if (spin_needbreak(lock) || resched) {

Right. And should_resched() also contains a test for need_resched().

This means we can also simplify stage2_wp_range(). Awesome!
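
To illustrate the point, here is a minimal user-space sketch (not kernel
code, and not taken from the thread) of why the outer check is redundant.
All model_* names and the walk() helper are hypothetical stand-ins for
need_resched(), spin_needbreak() and cond_resched_lock(); the real
__cond_resched_lock() does more than this model, so treat it only as a
rough picture of the control flow:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a spinlock; not a kernel type. */
struct model_lock {
	bool held;      /* models spin_is_locked() */
	bool contended; /* models spin_needbreak(): someone is waiting */
};

static bool resched_pending; /* models need_resched() */
static int yields;           /* number of times the lock was dropped */

static bool model_need_resched(void)
{
	return resched_pending;
}

static bool model_spin_needbreak(const struct model_lock *l)
{
	return l->contended;
}

/*
 * Models cond_resched_lock(): it performs both checks itself and, if
 * either fires, drops the lock, "schedules", and retakes the lock.
 * Returns 1 if it yielded, 0 otherwise.
 */
static int model_cond_resched_lock(struct model_lock *l)
{
	assert(l->held); /* plays the role of the BUG_ON() in the patch */
	if (model_spin_needbreak(l) || model_need_resched()) {
		l->held = false;
		yields++;
		resched_pending = false;
		l->contended = false;
		l->held = true;
		return 1;
	}
	return 0;
}

/*
 * Walks 'entries' top-level entries under the lock. Every 'every'
 * entries (every > 0) it injects, alternately, scheduler pressure or
 * lock contention. With guarded == true the caller pre-checks as in
 * the proposed patch; otherwise it calls the helper unconditionally.
 * Returns the number of yields.
 */
static int walk(struct model_lock *l, int entries, int every, bool guarded)
{
	yields = 0;
	resched_pending = false;
	l->contended = false;
	l->held = true;

	for (int i = 0; i < entries; i++) {
		if (i % every == 0) {
			if ((i / every) % 2)
				l->contended = true;
			else
				resched_pending = true;
		}

		if (guarded) {
			if (model_need_resched() || model_spin_needbreak(l))
				model_cond_resched_lock(l);
		} else {
			model_cond_resched_lock(l);
		}
		/* ... one top-level entry would be unmapped here ... */
	}

	l->held = false;
	return yields;
}
```

In this model both variants drop the lock at exactly the same points, so
the explicit need_resched() || spin_needbreak() test only duplicates work
the helper already does.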

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2017-03-15 15:08 UTC | newest]

Thread overview: 54+ messages
2017-03-14 14:52 [PATCH 0/3] kvm: arm/arm64: Fixes for use after free problems Suzuki K Poulose
2017-03-14 14:52 ` [PATCH 1/3] kvm: arm/arm64: Take mmap_sem in stage2_unmap_vm Suzuki K Poulose
2017-03-15  9:17   ` Christoffer Dall
2017-03-15  9:34     ` Marc Zyngier
2017-03-15 11:05       ` Christoffer Dall
2017-03-15 13:29     ` Paolo Bonzini
2017-03-14 14:52 ` [PATCH 2/3] kvm: arm/arm64: Take mmap_sem in kvm_arch_prepare_memory_region Suzuki K Poulose
2017-03-15 11:05   ` Christoffer Dall
2017-03-14 14:52 ` [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd Suzuki K Poulose
2017-03-15  9:21   ` Christoffer Dall
2017-03-15  9:39     ` Marc Zyngier
2017-03-15 10:56       ` Christoffer Dall
2017-03-15 13:28         ` Marc Zyngier
2017-03-15 13:35           ` Christoffer Dall
2017-03-15 13:43             ` Marc Zyngier
2017-03-15 13:50               ` Robin Murphy
2017-03-15 13:55                 ` Marc Zyngier
2017-03-15 14:33           ` Suzuki K Poulose
2017-03-15 15:07             ` Marc Zyngier
