* [GIT PULL] KVM/ARM fixes for 5.1-rc3
@ 2019-03-28 13:36 ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

Paolo, Radim,

Here's a handful of KVM/ARM fixes for 5.1-rc3. We have a number of
stage-2 page table fixes, some srcu fixes for the ITS emulation, a
reset fix for the virtual PMU, a GICv4 regression fix, and some
cleanups.

Please pull,

	M.

The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:

  Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-for-5.1

for you to fetch changes up to 8324c3d518cfd69f2a17866b52c13bf56d3042d8:

  KVM: arm/arm64: Comments cleanup in mmu.c (2019-03-28 13:17:17 +0000)

----------------------------------------------------------------
KVM/ARM fixes for 5.1

- Fix THP handling in the presence of pre-existing PTEs
- Honor request for PTE mappings even when THPs are available
- GICv4 performance improvement
- Take the srcu lock when writing to guest-controlled ITS data structures
- Reset the virtual PMU in preemptible context
- Various cleanups

----------------------------------------------------------------
Marc Zyngier (4):
      KVM: arm64: Reset the PMU in preemptible context
      arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabled
      KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory
      KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslots

Suzuki K Poulose (2):
      KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
      KVM: arm/arm64: Fix handling of stage2 huge mappings

YueHaibing (1):
      KVM: arm/arm64: vgic-its: Make attribute accessors static

Zenghui Yu (1):
      KVM: arm/arm64: Comments cleanup in mmu.c

 arch/arm/include/asm/kvm_mmu.h        |  11 +++
 arch/arm/include/asm/stage2_pgtable.h |   2 +
 arch/arm64/include/asm/kvm_mmu.h      |  11 +++
 arch/arm64/kvm/reset.c                |   6 +-
 virt/kvm/arm/hyp/vgic-v3-sr.c         |   4 +-
 virt/kvm/arm/mmu.c                    | 125 ++++++++++++++++++++--------------
 virt/kvm/arm/vgic/vgic-its.c          |  31 +++++----
 virt/kvm/arm/vgic/vgic-v3.c           |   4 +-
 virt/kvm/arm/vgic/vgic.c              |  14 ++--
 9 files changed, 133 insertions(+), 75 deletions(-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/8] KVM: arm64: Reset the PMU in preemptible context
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

We've become very cautious and now always reset the vcpu while
nothing is loaded on the physical CPU. To do so, we disable preemption
and do a kvm_arch_vcpu_put() to make sure we have all the state
in memory (and that it won't be loaded behind our back).

This now causes issues when resetting the PMU, which calls into perf.
Perf itself uses mutexes, which clashes with preemption being disabled.
It is worth realizing that the PMU is fully emulated, and that
no PMU state is ever loaded on the physical CPU. This means we can
safely reset the PMU outside of the non-preemptible section.
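
To illustrate the constraint, here is a simplified sketch (not the
actual kvm_reset_vcpu() code, see the diff below): anything that may
sleep, such as the perf-backed PMU reset, has to run before the
preemption-disabled window is entered.

	kvm_pmu_vcpu_reset(vcpu);	/* may sleep: perf takes mutexes */

	preempt_disable();
	/* ... reset core vcpu state while nothing is loaded ... */
	preempt_enable();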

Fixes: e761a927bc9a ("KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded")
Reported-by: Julien Grall <julien.grall@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/reset.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index f16a5f8ff2b4..e2a0500cd7a2 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -123,6 +123,9 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	int ret = -EINVAL;
 	bool loaded;
 
+	/* Reset PMU outside of the non-preemptible section */
+	kvm_pmu_vcpu_reset(vcpu);
+
 	preempt_disable();
 	loaded = (vcpu->cpu != -1);
 	if (loaded)
@@ -170,9 +173,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 		vcpu->arch.reset_state.reset = false;
 	}
 
-	/* Reset PMU */
-	kvm_pmu_vcpu_reset(vcpu);
-
 	/* Default workaround setup is enabled (if supported) */
 	if (kvm_arm_have_ssbd() == KVM_SSBD_KERNEL)
 		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/8] arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabled
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

The normal interrupt flow is not to enable the vgic when no virtual
interrupt is to be injected (i.e. the LRs are empty). But when a guest
is likely to use GICv4 for LPIs, we absolutely need to switch it on
at all times. Otherwise, VLPIs only get delivered when there is something
in the LRs, which doesn't happen very often.

Reported-by: Nianyao Tang <tangnianyao@huawei.com>
Tested-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/hyp/vgic-v3-sr.c |  4 ++--
 virt/kvm/arm/vgic/vgic.c      | 14 ++++++++++----
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index 264d92da3240..370bd6c5e6cb 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -222,7 +222,7 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	if (used_lrs) {
+	if (used_lrs || cpu_if->its_vpe.its_vm) {
 		int i;
 		u32 elrsr;
 
@@ -247,7 +247,7 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu *vcpu)
 	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 	int i;
 
-	if (used_lrs) {
+	if (used_lrs || cpu_if->its_vpe.its_vm) {
 		write_gicreg(cpu_if->vgic_hcr, ICH_HCR_EL2);
 
 		for (i = 0; i < used_lrs; i++)
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index abd9c7352677..3af69f2a3866 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -867,15 +867,21 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 	 * either observe the new interrupt before or after doing this check,
 	 * and introducing additional synchronization mechanism doesn't change
 	 * this.
+	 *
+	 * Note that we still need to go through the whole thing if anything
+	 * can be directly injected (GICv4).
 	 */
-	if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head))
+	if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head) &&
+	    !vgic_supports_direct_msis(vcpu->kvm))
 		return;
 
 	DEBUG_SPINLOCK_BUG_ON(!irqs_disabled());
 
-	raw_spin_lock(&vcpu->arch.vgic_cpu.ap_list_lock);
-	vgic_flush_lr_state(vcpu);
-	raw_spin_unlock(&vcpu->arch.vgic_cpu.ap_list_lock);
+	if (!list_empty(&vcpu->arch.vgic_cpu.ap_list_head)) {
+		raw_spin_lock(&vcpu->arch.vgic_cpu.ap_list_lock);
+		vgic_flush_lr_state(vcpu);
+		raw_spin_unlock(&vcpu->arch.vgic_cpu.ap_list_lock);
+	}
 
 	if (can_access_vgic_from_kernel())
 		vgic_restore_state(vcpu);
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/8] KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

When halting a guest, QEMU flushes the virtual ITS caches, which
amounts to writing to the various tables that the guest has allocated.

When doing this, we fail to take the srcu lock, and the kernel
shouts loudly if running a lockdep kernel:

[   69.680416] =============================
[   69.680819] WARNING: suspicious RCU usage
[   69.681526] 5.1.0-rc1-00008-g600025238f51-dirty #18 Not tainted
[   69.682096] -----------------------------
[   69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage!
[   69.683225]
[   69.683225] other info that might help us debug this:
[   69.683225]
[   69.683975]
[   69.683975] rcu_scheduler_active = 2, debug_locks = 1
[   69.684598] 6 locks held by qemu-system-aar/4097:
[   69.685059]  #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0
[   69.686087]  #1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0
[   69.686919]  #2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[   69.687698]  #3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[   69.688475]  #4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[   69.689978]  #5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[   69.690729]
[   69.690729] stack backtrace:
[   69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty #18
[   69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019
[   69.692831] Call trace:
[   69.694072]  lockdep_rcu_suspicious+0xcc/0x110
[   69.694490]  gfn_to_memslot+0x174/0x190
[   69.694853]  kvm_write_guest+0x50/0xb0
[   69.695209]  vgic_its_save_tables_v0+0x248/0x330
[   69.695639]  vgic_its_set_attr+0x298/0x3a0
[   69.696024]  kvm_device_ioctl_attr+0x9c/0xd8
[   69.696424]  kvm_device_ioctl+0x8c/0xf8
[   69.696788]  do_vfs_ioctl+0xc8/0x960
[   69.697128]  ksys_ioctl+0x8c/0xa0
[   69.697445]  __arm64_sys_ioctl+0x28/0x38
[   69.697817]  el0_svc_common+0xd8/0x138
[   69.698173]  el0_svc_handler+0x38/0x78
[   69.698528]  el0_svc+0x8/0xc

The fix is, obviously, to take the srcu lock, just like we do on the
read side of things since bf308242ab98. One wonders why this wasn't
fixed at the same time, but hey...
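
The write side simply mirrors what the read side already does: wrap
the guest access in an SRCU read-side critical section. A minimal
sketch of the pattern (the actual kvm_write_guest_lock() helper is
added in the diff below):

	int srcu_idx = srcu_read_lock(&kvm->srcu);
	int ret = kvm_write_guest(kvm, gpa, data, len);

	srcu_read_unlock(&kvm->srcu, srcu_idx);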

Fixes: bf308242ab98 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   | 11 +++++++++++
 arch/arm64/include/asm/kvm_mmu.h | 11 +++++++++++
 virt/kvm/arm/vgic/vgic-its.c     |  8 ++++----
 virt/kvm/arm/vgic/vgic-v3.c      |  4 ++--
 4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 2de96a180166..31de4ab93005 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -381,6 +381,17 @@ static inline int kvm_read_guest_lock(struct kvm *kvm,
 	return ret;
 }
 
+static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa,
+				       const void *data, unsigned long len)
+{
+	int srcu_idx = srcu_read_lock(&kvm->srcu);
+	int ret = kvm_write_guest(kvm, gpa, data, len);
+
+	srcu_read_unlock(&kvm->srcu, srcu_idx);
+
+	return ret;
+}
+
 static inline void *kvm_get_hyp_vector(void)
 {
 	switch(read_cpuid_part()) {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index b0742a16c6c9..ebeefcf835e8 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -445,6 +445,17 @@ static inline int kvm_read_guest_lock(struct kvm *kvm,
 	return ret;
 }
 
+static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa,
+				       const void *data, unsigned long len)
+{
+	int srcu_idx = srcu_read_lock(&kvm->srcu);
+	int ret = kvm_write_guest(kvm, gpa, data, len);
+
+	srcu_read_unlock(&kvm->srcu, srcu_idx);
+
+	return ret;
+}
+
 #ifdef CONFIG_KVM_INDIRECT_VECTORS
 /*
  * EL2 vectors can be mapped and rerouted in a number of ways,
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index ab3f47745d9c..c41e11fd841c 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1919,7 +1919,7 @@ static int vgic_its_save_ite(struct vgic_its *its, struct its_device *dev,
 	       ((u64)ite->irq->intid << KVM_ITS_ITE_PINTID_SHIFT) |
 		ite->collection->collection_id;
 	val = cpu_to_le64(val);
-	return kvm_write_guest(kvm, gpa, &val, ite_esz);
+	return kvm_write_guest_lock(kvm, gpa, &val, ite_esz);
 }
 
 /**
@@ -2066,7 +2066,7 @@ static int vgic_its_save_dte(struct vgic_its *its, struct its_device *dev,
 	       (itt_addr_field << KVM_ITS_DTE_ITTADDR_SHIFT) |
 		(dev->num_eventid_bits - 1));
 	val = cpu_to_le64(val);
-	return kvm_write_guest(kvm, ptr, &val, dte_esz);
+	return kvm_write_guest_lock(kvm, ptr, &val, dte_esz);
 }
 
 /**
@@ -2246,7 +2246,7 @@ static int vgic_its_save_cte(struct vgic_its *its,
 	       ((u64)collection->target_addr << KVM_ITS_CTE_RDBASE_SHIFT) |
 	       collection->collection_id);
 	val = cpu_to_le64(val);
-	return kvm_write_guest(its->dev->kvm, gpa, &val, esz);
+	return kvm_write_guest_lock(its->dev->kvm, gpa, &val, esz);
 }
 
 static int vgic_its_restore_cte(struct vgic_its *its, gpa_t gpa, int esz)
@@ -2317,7 +2317,7 @@ static int vgic_its_save_collection_table(struct vgic_its *its)
 	 */
 	val = 0;
 	BUG_ON(cte_esz > sizeof(val));
-	ret = kvm_write_guest(its->dev->kvm, gpa, &val, cte_esz);
+	ret = kvm_write_guest_lock(its->dev->kvm, gpa, &val, cte_esz);
 	return ret;
 }
 
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 408a78eb6a97..9f87e58dbd4a 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -358,7 +358,7 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
 	if (status) {
 		/* clear consumed data */
 		val &= ~(1 << bit_nr);
-		ret = kvm_write_guest(kvm, ptr, &val, 1);
+		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
 		if (ret)
 			return ret;
 	}
@@ -409,7 +409,7 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
 		else
 			val &= ~(1 << bit_nr);
 
-		ret = kvm_write_guest(kvm, ptr, &val, 1);
+		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
 		if (ret)
 			return ret;
 	}
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/8] KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslots
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

Calling kvm_is_visible_gfn() implies that we're parsing the memslots,
and doing this without the srcu lock is frowned upon:

[12704.164532] =============================
[12704.164544] WARNING: suspicious RCU usage
[12704.164560] 5.1.0-rc1-00008-g600025238f51-dirty #16 Tainted: G        W
[12704.164573] -----------------------------
[12704.164589] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage!
[12704.164602] other info that might help us debug this:
[12704.164616] rcu_scheduler_active = 2, debug_locks = 1
[12704.164631] 6 locks held by qemu-system-aar/13968:
[12704.164644]  #0: 000000007ebdae4f (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0
[12704.164691]  #1: 000000007d751022 (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0
[12704.164726]  #2: 00000000219d2706 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[12704.164761]  #3: 00000000a760aecd (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[12704.164794]  #4: 000000000ef8e31d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[12704.164827]  #5: 000000007a872093 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0
[12704.164861] stack backtrace:
[12704.164878] CPU: 2 PID: 13968 Comm: qemu-system-aar Tainted: G        W         5.1.0-rc1-00008-g600025238f51-dirty #16
[12704.164887] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019
[12704.164896] Call trace:
[12704.164910]  dump_backtrace+0x0/0x138
[12704.164920]  show_stack+0x24/0x30
[12704.164934]  dump_stack+0xbc/0x104
[12704.164946]  lockdep_rcu_suspicious+0xcc/0x110
[12704.164958]  gfn_to_memslot+0x174/0x190
[12704.164969]  kvm_is_visible_gfn+0x28/0x70
[12704.164980]  vgic_its_check_id.isra.0+0xec/0x1e8
[12704.164991]  vgic_its_save_tables_v0+0x1ac/0x330
[12704.165001]  vgic_its_set_attr+0x298/0x3a0
[12704.165012]  kvm_device_ioctl_attr+0x9c/0xd8
[12704.165022]  kvm_device_ioctl+0x8c/0xf8
[12704.165035]  do_vfs_ioctl+0xc8/0x960
[12704.165045]  ksys_ioctl+0x8c/0xa0
[12704.165055]  __arm64_sys_ioctl+0x28/0x38
[12704.165067]  el0_svc_common+0xd8/0x138
[12704.165078]  el0_svc_handler+0x38/0x78
[12704.165089]  el0_svc+0x8/0xc

Make sure the lock is taken when doing this.

Fixes: bf308242ab98 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index c41e11fd841c..fcb2fceaa4a5 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -754,8 +754,9 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 baser, u32 id,
 	u64 indirect_ptr, type = GITS_BASER_TYPE(baser);
 	phys_addr_t base = GITS_BASER_ADDR_48_to_52(baser);
 	int esz = GITS_BASER_ENTRY_SIZE(baser);
-	int index;
+	int index, idx;
 	gfn_t gfn;
+	bool ret;
 
 	switch (type) {
 	case GITS_BASER_TYPE_DEVICE:
@@ -782,7 +783,8 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 baser, u32 id,
 
 		if (eaddr)
 			*eaddr = addr;
-		return kvm_is_visible_gfn(its->dev->kvm, gfn);
+
+		goto out;
 	}
 
 	/* calculate and check the index into the 1st level */
@@ -812,7 +814,12 @@ static bool vgic_its_check_id(struct vgic_its *its, u64 baser, u32 id,
 
 	if (eaddr)
 		*eaddr = indirect_ptr;
-	return kvm_is_visible_gfn(its->dev->kvm, gfn);
+
+out:
+	idx = srcu_read_lock(&its->dev->kvm->srcu);
+	ret = kvm_is_visible_gfn(its->dev->kvm, gfn);
+	srcu_read_unlock(&its->dev->kvm->srcu, idx);
+	return ret;
 }
 
 static int vgic_its_alloc_collection(struct vgic_its *its,
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
made the checks that skip huge mappings stricter. However, it introduced
a bug where we still use huge mappings, ignoring the flag to
use PTE mappings, by not resetting vma_pagesize to PAGE_SIZE.

Also, the checks do not cover PUD huge pages, which were
under review during the same period. This patch fixes both
issues.

Fixes: 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 43 +++++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ffd7acdceac7..bcdf978c0d1d 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1594,8 +1594,9 @@ static void kvm_send_hwpoison_signal(unsigned long address,
 	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
 }
 
-static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
-					       unsigned long hva)
+static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
+					       unsigned long hva,
+					       unsigned long map_size)
 {
 	gpa_t gpa_start;
 	hva_t uaddr_start, uaddr_end;
@@ -1610,34 +1611,34 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
 
 	/*
 	 * Pages belonging to memslots that don't have the same alignment
-	 * within a PMD for userspace and IPA cannot be mapped with stage-2
-	 * PMD entries, because we'll end up mapping the wrong pages.
+	 * within a PMD/PUD for userspace and IPA cannot be mapped with stage-2
+	 * PMD/PUD entries, because we'll end up mapping the wrong pages.
 	 *
 	 * Consider a layout like the following:
 	 *
 	 *    memslot->userspace_addr:
 	 *    +-----+--------------------+--------------------+---+
-	 *    |abcde|fgh  Stage-1 PMD    |    Stage-1 PMD   tv|xyz|
+	 *    |abcde|fgh  Stage-1 block  |    Stage-1 block tv|xyz|
 	 *    +-----+--------------------+--------------------+---+
 	 *
 	 *    memslot->base_gfn << PAGE_SIZE:
 	 *      +---+--------------------+--------------------+-----+
-	 *      |abc|def  Stage-2 PMD    |    Stage-2 PMD     |tvxyz|
+	 *      |abc|def  Stage-2 block  |    Stage-2 block   |tvxyz|
 	 *      +---+--------------------+--------------------+-----+
 	 *
-	 * If we create those stage-2 PMDs, we'll end up with this incorrect
+	 * If we create those stage-2 blocks, we'll end up with this incorrect
 	 * mapping:
 	 *   d -> f
 	 *   e -> g
 	 *   f -> h
 	 */
-	if ((gpa_start & ~S2_PMD_MASK) != (uaddr_start & ~S2_PMD_MASK))
+	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
 		return false;
 
 	/*
 	 * Next, let's make sure we're not trying to map anything not covered
-	 * by the memslot. This means we have to prohibit PMD size mappings
-	 * for the beginning and end of a non-PMD aligned and non-PMD sized
+	 * by the memslot. This means we have to prohibit block size mappings
+	 * for the beginning and end of a non-block aligned and non-block sized
 	 * memory slot (illustrated by the head and tail parts of the
 	 * userspace view above containing pages 'abcde' and 'xyz',
 	 * respectively).
@@ -1646,8 +1647,8 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
 	 * userspace_addr or the base_gfn, as both are equally aligned (per
 	 * the check above) and equally sized.
 	 */
-	return (hva & S2_PMD_MASK) >= uaddr_start &&
-	       (hva & S2_PMD_MASK) + S2_PMD_SIZE <= uaddr_end;
+	return (hva & ~(map_size - 1)) >= uaddr_start &&
+	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
 }
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
@@ -1676,12 +1677,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (!fault_supports_stage2_pmd_mappings(memslot, hva))
-		force_pte = true;
-
-	if (logging_active)
-		force_pte = true;
-
 	/* Let's check if we will get back a huge page backed by hugetlbfs */
 	down_read(&current->mm->mmap_sem);
 	vma = find_vma_intersection(current->mm, hva, hva + 1);
@@ -1692,6 +1687,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	}
 
 	vma_pagesize = vma_kernel_pagesize(vma);
+	if (logging_active ||
+	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
+		force_pte = true;
+		vma_pagesize = PAGE_SIZE;
+	}
+
 	/*
 	 * The stage2 has a minimum of 2 level table (For arm64 see
 	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
@@ -1699,11 +1700,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * As for PUD huge maps, we must make sure that we have at least
 	 * 3 levels, i.e, PMD is not folded.
 	 */
-	if ((vma_pagesize == PMD_SIZE ||
-	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
-	    !force_pte) {
+	if (vma_pagesize == PMD_SIZE ||
+	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
 		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
-	}
 	up_read(&current->mm->mmap_sem);
 
 	/* We need minimum second+third level pages */
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/8] KVM: arm/arm64: Fix handling of stage2 huge mappings
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

We rely on the mmu_notifier callbacks to handle the split/merge
of huge pages and thus we are guaranteed that, while creating a
block mapping, either the entire block is unmapped at stage2 or it
is missing permission.

However, we miss the case where a block mapping is split for the
dirty logging case and could later be made a block mapping again,
if we cancel the dirty logging. This not only creates inconsistent
TLB entries for the pages in the block, but also leaks the
PMD-level table pages.

Handle this corner case for the huge mappings at stage2 by
unmapping the non-huge mapping for the block. This could potentially
release the upper level table, so we need to restart the table walk
once we unmap the range.
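
In rough pseudo-code, the PMD path now looks like this (a simplified
sketch of the change below, with error handling and the huge-page
update path elided):

	retry:
		pmd = stage2_get_pmd(kvm, cache, addr);
		old_pmd = *pmd;
		if (pmd_present(old_pmd) && !pmd_thp_or_huge(old_pmd)) {
			/* Stale PTE-level table: unmap it and walk again */
			unmap_stage2_range(kvm, addr & S2_PMD_MASK, S2_PMD_SIZE);
			goto retry;
		}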

Fixes: ad361f093c1e31d ("KVM: ARM: Support hugetlbfs backed huge pages")
Reported-by: Zheng Xiang <zhengxiang9@huawei.com>
Cc: Zheng Xiang <zhengxiang9@huawei.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/stage2_pgtable.h |  2 +
 virt/kvm/arm/mmu.c                    | 59 +++++++++++++++++++--------
 2 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/stage2_pgtable.h b/arch/arm/include/asm/stage2_pgtable.h
index de2089501b8b..9e11dce55e06 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -75,6 +75,8 @@ static inline bool kvm_stage2_has_pud(struct kvm *kvm)
 
 #define S2_PMD_MASK				PMD_MASK
 #define S2_PMD_SIZE				PMD_SIZE
+#define S2_PUD_MASK				PUD_MASK
+#define S2_PUD_SIZE				PUD_SIZE
 
 static inline bool kvm_stage2_has_pmd(struct kvm *kvm)
 {
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index bcdf978c0d1d..f9da2fad9bd6 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1067,25 +1067,43 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 {
 	pmd_t *pmd, old_pmd;
 
+retry:
 	pmd = stage2_get_pmd(kvm, cache, addr);
 	VM_BUG_ON(!pmd);
 
 	old_pmd = *pmd;
+	/*
+	 * Multiple vcpus faulting on the same PMD entry, can
+	 * lead to them sequentially updating the PMD with the
+	 * same value. Following the break-before-make
+	 * (pmd_clear() followed by tlb_flush()) process can
+	 * hinder forward progress due to refaults generated
+	 * on missing translations.
+	 *
+	 * Skip updating the page table if the entry is
+	 * unchanged.
+	 */
+	if (pmd_val(old_pmd) == pmd_val(*new_pmd))
+		return 0;
+
 	if (pmd_present(old_pmd)) {
 		/*
-		 * Multiple vcpus faulting on the same PMD entry, can
-		 * lead to them sequentially updating the PMD with the
-		 * same value. Following the break-before-make
-		 * (pmd_clear() followed by tlb_flush()) process can
-		 * hinder forward progress due to refaults generated
-		 * on missing translations.
+		 * If we already have PTE level mapping for this block,
+		 * we must unmap it to avoid inconsistent TLB state and
+		 * leaking the table page. We could end up in this situation
+		 * if the memory slot was marked for dirty logging and was
+		 * reverted, leaving PTE level mappings for the pages accessed
+		 * during the period. So, unmap the PTE level mapping for this
+		 * block and retry, as we could have released the upper level
+		 * table in the process.
 		 *
-		 * Skip updating the page table if the entry is
-		 * unchanged.
+		 * Normal THP split/merge follows mmu_notifier callbacks and do
+		 * get handled accordingly.
 		 */
-		if (pmd_val(old_pmd) == pmd_val(*new_pmd))
-			return 0;
-
+		if (!pmd_thp_or_huge(old_pmd)) {
+			unmap_stage2_range(kvm, addr & S2_PMD_MASK, S2_PMD_SIZE);
+			goto retry;
+		}
 		/*
 		 * Mapping in huge pages should only happen through a
 		 * fault.  If a page is merged into a transparent huge
@@ -1097,8 +1115,7 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 		 * should become splitting first, unmapped, merged,
 		 * and mapped back in on-demand.
 		 */
-		VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
-
+		WARN_ON_ONCE(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
 		pmd_clear(pmd);
 		kvm_tlb_flush_vmid_ipa(kvm, addr);
 	} else {
@@ -1114,6 +1131,7 @@ static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cac
 {
 	pud_t *pudp, old_pud;
 
+retry:
 	pudp = stage2_get_pud(kvm, cache, addr);
 	VM_BUG_ON(!pudp);
 
@@ -1121,14 +1139,23 @@ static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cac
 
 	/*
 	 * A large number of vcpus faulting on the same stage 2 entry,
-	 * can lead to a refault due to the
-	 * stage2_pud_clear()/tlb_flush(). Skip updating the page
-	 * tables if there is no change.
+	 * can lead to a refault due to the stage2_pud_clear()/tlb_flush().
+	 * Skip updating the page tables if there is no change.
 	 */
 	if (pud_val(old_pud) == pud_val(*new_pudp))
 		return 0;
 
 	if (stage2_pud_present(kvm, old_pud)) {
+		/*
+		 * If we already have table level mapping for this block, unmap
+		 * the range for this block and retry.
+		 */
+		if (!stage2_pud_huge(kvm, old_pud)) {
+			unmap_stage2_range(kvm, addr & S2_PUD_MASK, S2_PUD_SIZE);
+			goto retry;
+		}
+
+		WARN_ON_ONCE(kvm_pud_pfn(old_pud) != kvm_pud_pfn(*new_pudp));
 		stage2_pud_clear(kvm, pudp);
 		kvm_tlb_flush_vmid_ipa(kvm, addr);
 	} else {
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/8] KVM: arm/arm64: Fix handling of stage2 huge mappings
@ 2019-03-28 13:36   ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, Suzuki K Poulose, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Eric Auger,
	Julien Grall, James Morse, Nianyao Tang, Zenghui Yu, kvmarm,
	linux-arm-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

We rely on the mmu_notifier call backs to handle the split/merge
of huge pages and thus we are guaranteed that, while creating a
block mapping, either the entire block is unmapped at stage2 or it
is missing permission.

However, we miss a case where the block mapping is split for the dirty
logging case and could later be turned back into a block mapping if we
cancel dirty logging. This not only creates inconsistent TLB entries for
the pages in the block, but also leaks the table pages at the
PMD level.

Handle this corner case for the huge mappings at stage2 by
unmapping the non-huge mapping for the block. This could potentially
release the upper level table. So we need to restart the table walk
once we unmap the range.
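
Condensed, the flow this patch introduces for the PMD case is the sketch
below (a simplified view of the diff that follows, with break-before-make
and the actual huge mapping install elided; the helper names are the ones
used in the patch itself):

retry:
	pmd = stage2_get_pmd(kvm, cache, addr);
	old_pmd = *pmd;
	if (pmd_val(old_pmd) == pmd_val(*new_pmd))
		return 0;			/* nothing changed, done */
	if (pmd_present(old_pmd) && !pmd_thp_or_huge(old_pmd)) {
		/* stale PTE table left behind by dirty logging */
		unmap_stage2_range(kvm, addr & S2_PMD_MASK, S2_PMD_SIZE);
		goto retry;	/* the upper level table may be gone now */
	}
	/* ... clear + TLBI, then install the new huge mapping ... */

The PUD path follows the same pattern using S2_PUD_MASK/S2_PUD_SIZE.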

Fixes: ad361f093c1e31d ("KVM: ARM: Support hugetlbfs backed huge pages")
Reported-by: Zheng Xiang <zhengxiang9@huawei.com>
Cc: Zheng Xiang <zhengxiang9@huawei.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/stage2_pgtable.h |  2 +
 virt/kvm/arm/mmu.c                    | 59 +++++++++++++++++++--------
 2 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/stage2_pgtable.h b/arch/arm/include/asm/stage2_pgtable.h
index de2089501b8b..9e11dce55e06 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -75,6 +75,8 @@ static inline bool kvm_stage2_has_pud(struct kvm *kvm)
 
 #define S2_PMD_MASK				PMD_MASK
 #define S2_PMD_SIZE				PMD_SIZE
+#define S2_PUD_MASK				PUD_MASK
+#define S2_PUD_SIZE				PUD_SIZE
 
 static inline bool kvm_stage2_has_pmd(struct kvm *kvm)
 {
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index bcdf978c0d1d..f9da2fad9bd6 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1067,25 +1067,43 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 {
 	pmd_t *pmd, old_pmd;
 
+retry:
 	pmd = stage2_get_pmd(kvm, cache, addr);
 	VM_BUG_ON(!pmd);
 
 	old_pmd = *pmd;
+	/*
+	 * Multiple vcpus faulting on the same PMD entry, can
+	 * lead to them sequentially updating the PMD with the
+	 * same value. Following the break-before-make
+	 * (pmd_clear() followed by tlb_flush()) process can
+	 * hinder forward progress due to refaults generated
+	 * on missing translations.
+	 *
+	 * Skip updating the page table if the entry is
+	 * unchanged.
+	 */
+	if (pmd_val(old_pmd) == pmd_val(*new_pmd))
+		return 0;
+
 	if (pmd_present(old_pmd)) {
 		/*
-		 * Multiple vcpus faulting on the same PMD entry, can
-		 * lead to them sequentially updating the PMD with the
-		 * same value. Following the break-before-make
-		 * (pmd_clear() followed by tlb_flush()) process can
-		 * hinder forward progress due to refaults generated
-		 * on missing translations.
+		 * If we already have PTE level mapping for this block,
+		 * we must unmap it to avoid inconsistent TLB state and
+		 * leaking the table page. We could end up in this situation
+		 * if the memory slot was marked for dirty logging and was
+		 * reverted, leaving PTE level mappings for the pages accessed
+		 * during the period. So, unmap the PTE level mapping for this
+		 * block and retry, as we could have released the upper level
+		 * table in the process.
 		 *
-		 * Skip updating the page table if the entry is
-		 * unchanged.
+		 * Normal THP split/merge follows mmu_notifier callbacks and do
+		 * get handled accordingly.
 		 */
-		if (pmd_val(old_pmd) == pmd_val(*new_pmd))
-			return 0;
-
+		if (!pmd_thp_or_huge(old_pmd)) {
+			unmap_stage2_range(kvm, addr & S2_PMD_MASK, S2_PMD_SIZE);
+			goto retry;
+		}
 		/*
 		 * Mapping in huge pages should only happen through a
 		 * fault.  If a page is merged into a transparent huge
@@ -1097,8 +1115,7 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 		 * should become splitting first, unmapped, merged,
 		 * and mapped back in on-demand.
 		 */
-		VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
-
+		WARN_ON_ONCE(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
 		pmd_clear(pmd);
 		kvm_tlb_flush_vmid_ipa(kvm, addr);
 	} else {
@@ -1114,6 +1131,7 @@ static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cac
 {
 	pud_t *pudp, old_pud;
 
+retry:
 	pudp = stage2_get_pud(kvm, cache, addr);
 	VM_BUG_ON(!pudp);
 
@@ -1121,14 +1139,23 @@ static int stage2_set_pud_huge(struct kvm *kvm, struct kvm_mmu_memory_cache *cac
 
 	/*
 	 * A large number of vcpus faulting on the same stage 2 entry,
-	 * can lead to a refault due to the
-	 * stage2_pud_clear()/tlb_flush(). Skip updating the page
-	 * tables if there is no change.
+	 * can lead to a refault due to the stage2_pud_clear()/tlb_flush().
+	 * Skip updating the page tables if there is no change.
 	 */
 	if (pud_val(old_pud) == pud_val(*new_pudp))
 		return 0;
 
 	if (stage2_pud_present(kvm, old_pud)) {
+		/*
+		 * If we already have table level mapping for this block, unmap
+		 * the range for this block and retry.
+		 */
+		if (!stage2_pud_huge(kvm, old_pud)) {
+			unmap_stage2_range(kvm, addr & S2_PUD_MASK, S2_PUD_SIZE);
+			goto retry;
+		}
+
+		WARN_ON_ONCE(kvm_pud_pfn(old_pud) != kvm_pud_pfn(*new_pudp));
 		stage2_pud_clear(kvm, pudp);
 		kvm_tlb_flush_vmid_ipa(kvm, addr);
 	} else {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/8] KVM: arm/arm64: vgic-its: Make attribute accessors static
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

From: YueHaibing <yuehaibing@huawei.com>

Fix sparse warnings:

arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1732:5: warning:
 symbol 'vgic_its_has_attr_regs' was not declared. Should it be static?
arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1753:5: warning:
 symbol 'vgic_its_attr_regs_access' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
[maz: fixed subject]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index fcb2fceaa4a5..44ceaccb18cf 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1736,8 +1736,8 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev)
 	kfree(its);
 }
 
-int vgic_its_has_attr_regs(struct kvm_device *dev,
-			   struct kvm_device_attr *attr)
+static int vgic_its_has_attr_regs(struct kvm_device *dev,
+				  struct kvm_device_attr *attr)
 {
 	const struct vgic_register_region *region;
 	gpa_t offset = attr->attr;
@@ -1757,9 +1757,9 @@ int vgic_its_has_attr_regs(struct kvm_device *dev,
 	return 0;
 }
 
-int vgic_its_attr_regs_access(struct kvm_device *dev,
-			      struct kvm_device_attr *attr,
-			      u64 *reg, bool is_write)
+static int vgic_its_attr_regs_access(struct kvm_device *dev,
+				     struct kvm_device_attr *attr,
+				     u64 *reg, bool is_write)
 {
 	const struct vgic_register_region *region;
 	struct vgic_its *its;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/8] KVM: arm/arm64: vgic-its: Make attribute accessors static
@ 2019-03-28 13:36   ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, Suzuki K Poulose, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Eric Auger,
	Julien Grall, James Morse, Nianyao Tang, Zenghui Yu, kvmarm,
	linux-arm-kernel

From: YueHaibing <yuehaibing@huawei.com>

Fix sparse warnings:

arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1732:5: warning:
 symbol 'vgic_its_has_attr_regs' was not declared. Should it be static?
arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1753:5: warning:
 symbol 'vgic_its_attr_regs_access' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
[maz: fixed subject]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index fcb2fceaa4a5..44ceaccb18cf 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1736,8 +1736,8 @@ static void vgic_its_destroy(struct kvm_device *kvm_dev)
 	kfree(its);
 }
 
-int vgic_its_has_attr_regs(struct kvm_device *dev,
-			   struct kvm_device_attr *attr)
+static int vgic_its_has_attr_regs(struct kvm_device *dev,
+				  struct kvm_device_attr *attr)
 {
 	const struct vgic_register_region *region;
 	gpa_t offset = attr->attr;
@@ -1757,9 +1757,9 @@ int vgic_its_has_attr_regs(struct kvm_device *dev,
 	return 0;
 }
 
-int vgic_its_attr_regs_access(struct kvm_device *dev,
-			      struct kvm_device_attr *attr,
-			      u64 *reg, bool is_write)
+static int vgic_its_attr_regs_access(struct kvm_device *dev,
+				     struct kvm_device_attr *attr,
+				     u64 *reg, bool is_write)
 {
 	const struct vgic_register_region *region;
 	struct vgic_its *its;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 8/8] KVM: arm/arm64: Comments cleanup in mmu.c
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 13:36   ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

From: Zenghui Yu <yuzenghui@huawei.com>

Some comments in virt/kvm/arm/mmu.c are outdated. Update them to
reflect the current state of the code.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index f9da2fad9bd6..27c958306449 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -102,8 +102,7 @@ static bool kvm_is_device_pfn(unsigned long pfn)
  * @addr:	IPA
  * @pmd:	pmd pointer for IPA
  *
- * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
- * pages in the range dirty.
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs.
  */
 static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 {
@@ -121,8 +120,7 @@ static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
  * @addr:	IPA
  * @pud:	pud pointer for IPA
  *
- * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
- * pages in the range dirty.
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs.
  */
 static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t *pudp)
 {
@@ -899,9 +897,8 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
  * kvm_alloc_stage2_pgd - allocate level-1 table for stage-2 translation.
  * @kvm:	The KVM struct pointer for the VM.
  *
- * Allocates only the stage-2 HW PGD level table(s) (can support either full
- * 40-bit input addresses or limited to 32-bit input addresses). Clears the
- * allocated pages.
+ * Allocates only the stage-2 HW PGD level table(s) of size defined by
+ * stage2_pgd_size(kvm).
  *
  * Note we don't need locking here as this is only called when the VM is
  * created, which can only be done once.
@@ -1478,13 +1475,11 @@ static void stage2_wp_pmds(struct kvm *kvm, pud_t *pud,
 }
 
 /**
-  * stage2_wp_puds - write protect PGD range
-  * @pgd:	pointer to pgd entry
-  * @addr:	range start address
-  * @end:	range end address
-  *
-  * Process PUD entries, for a huge PUD we cause a panic.
-  */
+ * stage2_wp_puds - write protect PGD range
+ * @pgd:	pointer to pgd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
 static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
 			    phys_addr_t addr, phys_addr_t end)
 {
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 8/8] KVM: arm/arm64: Comments cleanup in mmu.c
@ 2019-03-28 13:36   ` Marc Zyngier
  0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-03-28 13:36 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář
  Cc: kvm, Suzuki K Poulose, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Eric Auger,
	Julien Grall, James Morse, Nianyao Tang, Zenghui Yu, kvmarm,
	linux-arm-kernel

From: Zenghui Yu <yuzenghui@huawei.com>

Some comments in virt/kvm/arm/mmu.c are outdated. Update them to
reflect the current state of the code.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index f9da2fad9bd6..27c958306449 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -102,8 +102,7 @@ static bool kvm_is_device_pfn(unsigned long pfn)
  * @addr:	IPA
  * @pmd:	pmd pointer for IPA
  *
- * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
- * pages in the range dirty.
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs.
  */
 static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 {
@@ -121,8 +120,7 @@ static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
  * @addr:	IPA
  * @pud:	pud pointer for IPA
  *
- * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
- * pages in the range dirty.
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs.
  */
 static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t *pudp)
 {
@@ -899,9 +897,8 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
  * kvm_alloc_stage2_pgd - allocate level-1 table for stage-2 translation.
  * @kvm:	The KVM struct pointer for the VM.
  *
- * Allocates only the stage-2 HW PGD level table(s) (can support either full
- * 40-bit input addresses or limited to 32-bit input addresses). Clears the
- * allocated pages.
+ * Allocates only the stage-2 HW PGD level table(s) of size defined by
+ * stage2_pgd_size(kvm).
  *
  * Note we don't need locking here as this is only called when the VM is
  * created, which can only be done once.
@@ -1478,13 +1475,11 @@ static void stage2_wp_pmds(struct kvm *kvm, pud_t *pud,
 }
 
 /**
-  * stage2_wp_puds - write protect PGD range
-  * @pgd:	pointer to pgd entry
-  * @addr:	range start address
-  * @end:	range end address
-  *
-  * Process PUD entries, for a huge PUD we cause a panic.
-  */
+ * stage2_wp_puds - write protect PGD range
+ * @pgd:	pointer to pgd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
 static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
 			    phys_addr_t addr, phys_addr_t end)
 {
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [GIT PULL] KVM/ARM fixes for 5.1-rc3
  2019-03-28 13:36 ` Marc Zyngier
@ 2019-03-28 18:07   ` Paolo Bonzini
  -1 siblings, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2019-03-28 18:07 UTC (permalink / raw)
  To: Marc Zyngier, Radim Krčmář
  Cc: kvm, Suzuki K Poulose, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Eric Auger,
	Julien Grall, James Morse, Nianyao Tang, Zenghui Yu, kvmarm,
	linux-arm-kernel

On 28/03/19 14:36, Marc Zyngier wrote:
> Paolo, Radim,
> 
> Here's a handful of KVM/ARM fixes for 5.1-rc3. We have a number of
> stage-2 page table fixes, some srcu fixes the the ITS emulation, a
> reset fix for the virtual PMU, a GICv4 regression fix, and some
> cleanups.
> 
> Please pull,
> 
> 	M.
> 
> The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:
> 
>   Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-for-5.1
> 
> for you to fetch changes up to 8324c3d518cfd69f2a17866b52c13bf56d3042d8:
> 
>   KVM: arm/arm64: Comments cleanup in mmu.c (2019-03-28 13:17:17 +0000)
> 
> ----------------------------------------------------------------
> KVM/ARM fixes for 5.1
> 
> - Fix THP handling in the presence of pre-existing PTEs
> - Honor request for PTE mappings even when THPs are available
> - GICv4 performance improvement
> - Take the srcu lock when writing to guest-controlled ITS data structures
> - Reset the virtual PMU in preemptible context
> - Various cleanups
> 
> ----------------------------------------------------------------
> Marc Zyngier (4):
>       KVM: arm64: Reset the PMU in preemptible context
>       arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabled
>       KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory
>       KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslots
> 
> Suzuki K Poulose (2):
>       KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
>       KVM: arm/arm64: Fix handling of stage2 huge mappings
> 
> YueHaibing (1):
>       KVM: arm/arm64: vgic-its: Make attribute accessors static
> 
> Zenghui Yu (1):
>       KVM: arm/arm64: Comments cleanup in mmu.c
> 
>  arch/arm/include/asm/kvm_mmu.h        |  11 +++
>  arch/arm/include/asm/stage2_pgtable.h |   2 +
>  arch/arm64/include/asm/kvm_mmu.h      |  11 +++
>  arch/arm64/kvm/reset.c                |   6 +-
>  virt/kvm/arm/hyp/vgic-v3-sr.c         |   4 +-
>  virt/kvm/arm/mmu.c                    | 125 ++++++++++++++++++++--------------
>  virt/kvm/arm/vgic/vgic-its.c          |  31 +++++----
>  virt/kvm/arm/vgic/vgic-v3.c           |   4 +-
>  virt/kvm/arm/vgic/vgic.c              |  14 ++--
>  9 files changed, 133 insertions(+), 75 deletions(-)
> 

Pulled, thanks.  Will push together with the x86 stuff I'm testing now,
and send it to Linus tomorrow.

Paolo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [GIT PULL] KVM/ARM fixes for 5.1-rc3
@ 2019-03-28 18:07   ` Paolo Bonzini
  0 siblings, 0 replies; 28+ messages in thread
From: Paolo Bonzini @ 2019-03-28 18:07 UTC (permalink / raw)
  To: Marc Zyngier, Radim Krčmář
  Cc: kvm, Suzuki K Poulose, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Eric Auger,
	Julien Grall, James Morse, Nianyao Tang, Zenghui Yu, kvmarm,
	linux-arm-kernel

On 28/03/19 14:36, Marc Zyngier wrote:
> Paolo, Radim,
> 
> Here's a handful of KVM/ARM fixes for 5.1-rc3. We have a number of
> stage-2 page table fixes, some srcu fixes the the ITS emulation, a
> reset fix for the virtual PMU, a GICv4 regression fix, and some
> cleanups.
> 
> Please pull,
> 
> 	M.
> 
> The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:
> 
>   Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-for-5.1
> 
> for you to fetch changes up to 8324c3d518cfd69f2a17866b52c13bf56d3042d8:
> 
>   KVM: arm/arm64: Comments cleanup in mmu.c (2019-03-28 13:17:17 +0000)
> 
> ----------------------------------------------------------------
> KVM/ARM fixes for 5.1
> 
> - Fix THP handling in the presence of pre-existing PTEs
> - Honor request for PTE mappings even when THPs are available
> - GICv4 performance improvement
> - Take the srcu lock when writing to guest-controlled ITS data structures
> - Reset the virtual PMU in preemptible context
> - Various cleanups
> 
> ----------------------------------------------------------------
> Marc Zyngier (4):
>       KVM: arm64: Reset the PMU in preemptible context
>       arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabled
>       KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory
>       KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslots
> 
> Suzuki K Poulose (2):
>       KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
>       KVM: arm/arm64: Fix handling of stage2 huge mappings
> 
> YueHaibing (1):
>       KVM: arm/arm64: vgic-its: Make attribute accessors static
> 
> Zenghui Yu (1):
>       KVM: arm/arm64: Comments cleanup in mmu.c
> 
>  arch/arm/include/asm/kvm_mmu.h        |  11 +++
>  arch/arm/include/asm/stage2_pgtable.h |   2 +
>  arch/arm64/include/asm/kvm_mmu.h      |  11 +++
>  arch/arm64/kvm/reset.c                |   6 +-
>  virt/kvm/arm/hyp/vgic-v3-sr.c         |   4 +-
>  virt/kvm/arm/mmu.c                    | 125 ++++++++++++++++++++--------------
>  virt/kvm/arm/vgic/vgic-its.c          |  31 +++++----
>  virt/kvm/arm/vgic/vgic-v3.c           |   4 +-
>  virt/kvm/arm/vgic/vgic.c              |  14 ++--
>  9 files changed, 133 insertions(+), 75 deletions(-)
> 

Pulled, thanks.  Will push together with the x86 stuff I'm testing now,
and send it to Linus tomorrow.

Paolo


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
  2019-03-28 13:36   ` Marc Zyngier
@ 2019-04-01 17:10     ` Auger Eric
  -1 siblings, 0 replies; 28+ messages in thread
From: Auger Eric @ 2019-04-01 17:10 UTC (permalink / raw)
  To: Marc Zyngier, Paolo Bonzini, Radim Krčmář
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, kvmarm, linux-arm-kernel

Hi Suzuki,

On 3/28/19 2:36 PM, Marc Zyngier wrote:
> From: Suzuki K Poulose <suzuki.poulose@arm.com>
> 
> commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> made the checks to skip huge mappings, stricter. However it introduced
> a bug where we still use huge mappings, ignoring the flag to
> use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE.
> 
> Also, the checks do not cover the PUD huge pages, that was
> under review during the same period. This patch fixes both
> the issues.

I face a regression with this patch. My guest gets stuck. I am running
on AMD Seattle. Reverting the patch makes things work again for me. I
run with qemu. In this scenario I don't use hugepages. I use 64kB page
size for both the host and guest.
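
(For context, a setup along these lines matches the description above:
both host and guest kernels built with CONFIG_ARM64_64K_PAGES=y, no
hugetlbfs; the image names, memory size and exact QEMU options here are
only placeholders, and reproducing it presumably also depends on the
memslot layout QEMU ends up with:)

qemu-system-aarch64 -M virt,accel=kvm -cpu host -m 1024 -nographic \
	-kernel Image -append 'console=ttyAMA0 root=/dev/vda rw' \
	-drive if=virtio,file=rootfs.img,format=raw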

Thanks

Eric
> 
> Fixes : 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/mmu.c | 43 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ffd7acdceac7..bcdf978c0d1d 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1594,8 +1594,9 @@ static void kvm_send_hwpoison_signal(unsigned long address,
>  	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
>  }
>  
> -static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> -					       unsigned long hva)
> +static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
> +					       unsigned long hva,
> +					       unsigned long map_size)
>  {
>  	gpa_t gpa_start;
>  	hva_t uaddr_start, uaddr_end;
> @@ -1610,34 +1611,34 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
>  
>  	/*
>  	 * Pages belonging to memslots that don't have the same alignment
> -	 * within a PMD for userspace and IPA cannot be mapped with stage-2
> -	 * PMD entries, because we'll end up mapping the wrong pages.
> +	 * within a PMD/PUD for userspace and IPA cannot be mapped with stage-2
> +	 * PMD/PUD entries, because we'll end up mapping the wrong pages.
>  	 *
>  	 * Consider a layout like the following:
>  	 *
>  	 *    memslot->userspace_addr:
>  	 *    +-----+--------------------+--------------------+---+
> -	 *    |abcde|fgh  Stage-1 PMD    |    Stage-1 PMD   tv|xyz|
> +	 *    |abcde|fgh  Stage-1 block  |    Stage-1 block tv|xyz|
>  	 *    +-----+--------------------+--------------------+---+
>  	 *
>  	 *    memslot->base_gfn << PAGE_SIZE:
>  	 *      +---+--------------------+--------------------+-----+
> -	 *      |abc|def  Stage-2 PMD    |    Stage-2 PMD     |tvxyz|
> +	 *      |abc|def  Stage-2 block  |    Stage-2 block   |tvxyz|
>  	 *      +---+--------------------+--------------------+-----+
>  	 *
> -	 * If we create those stage-2 PMDs, we'll end up with this incorrect
> +	 * If we create those stage-2 blocks, we'll end up with this incorrect
>  	 * mapping:
>  	 *   d -> f
>  	 *   e -> g
>  	 *   f -> h
>  	 */
> -	if ((gpa_start & ~S2_PMD_MASK) != (uaddr_start & ~S2_PMD_MASK))
> +	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
>  		return false;
>  
>  	/*
>  	 * Next, let's make sure we're not trying to map anything not covered
> -	 * by the memslot. This means we have to prohibit PMD size mappings
> -	 * for the beginning and end of a non-PMD aligned and non-PMD sized
> +	 * by the memslot. This means we have to prohibit block size mappings
> +	 * for the beginning and end of a non-block aligned and non-block sized
>  	 * memory slot (illustrated by the head and tail parts of the
>  	 * userspace view above containing pages 'abcde' and 'xyz',
>  	 * respectively).
> @@ -1646,8 +1647,8 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
>  	 * userspace_addr or the base_gfn, as both are equally aligned (per
>  	 * the check above) and equally sized.
>  	 */
> -	return (hva & S2_PMD_MASK) >= uaddr_start &&
> -	       (hva & S2_PMD_MASK) + S2_PMD_SIZE <= uaddr_end;
> +	return (hva & ~(map_size - 1)) >= uaddr_start &&
> +	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>  }
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> @@ -1676,12 +1677,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> -	if (!fault_supports_stage2_pmd_mappings(memslot, hva))
> -		force_pte = true;
> -
> -	if (logging_active)
> -		force_pte = true;
> -
>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>  	down_read(&current->mm->mmap_sem);
>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> @@ -1692,6 +1687,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	}
>  
>  	vma_pagesize = vma_kernel_pagesize(vma);
> +	if (logging_active ||
> +	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
> +		force_pte = true;
> +		vma_pagesize = PAGE_SIZE;
> +	}
> +
>  	/*
>  	 * The stage2 has a minimum of 2 level table (For arm64 see
>  	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
> @@ -1699,11 +1700,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * As for PUD huge maps, we must make sure that we have at least
>  	 * 3 levels, i.e, PMD is not folded.
>  	 */
> -	if ((vma_pagesize == PMD_SIZE ||
> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
> -	    !force_pte) {
> +	if (vma_pagesize == PMD_SIZE ||
> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>  		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> -	}
>  	up_read(&current->mm->mmap_sem);
>  
>  	/* We need minimum second+third level pages */
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
@ 2019-04-01 17:10     ` Auger Eric
  0 siblings, 0 replies; 28+ messages in thread
From: Auger Eric @ 2019-04-01 17:10 UTC (permalink / raw)
  To: Marc Zyngier, Paolo Bonzini, Radim Krčmář
  Cc: kvm, Suzuki K Poulose, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Julien Grall,
	James Morse, Nianyao Tang, Zenghui Yu, kvmarm, linux-arm-kernel

Hi Suzuki,

On 3/28/19 2:36 PM, Marc Zyngier wrote:
> From: Suzuki K Poulose <suzuki.poulose@arm.com>
> 
> commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> made the checks to skip huge mappings, stricter. However it introduced
> a bug where we still use huge mappings, ignoring the flag to
> use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE.
> 
> Also, the checks do not cover the PUD huge pages, that was
> under review during the same period. This patch fixes both
> the issues.

I face a regression with this patch. My guest gets stuck. I am running
on AMD Seattle. Reverting the patch makes things work again for me. I
run with qemu. In this scenario I don't use hugepages. I use 64kB page
size for both the host and guest.

Thanks

Eric
> 
> Fixes : 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/mmu.c | 43 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ffd7acdceac7..bcdf978c0d1d 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1594,8 +1594,9 @@ static void kvm_send_hwpoison_signal(unsigned long address,
>  	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
>  }
>  
> -static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> -					       unsigned long hva)
> +static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
> +					       unsigned long hva,
> +					       unsigned long map_size)
>  {
>  	gpa_t gpa_start;
>  	hva_t uaddr_start, uaddr_end;
> @@ -1610,34 +1611,34 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
>  
>  	/*
>  	 * Pages belonging to memslots that don't have the same alignment
> -	 * within a PMD for userspace and IPA cannot be mapped with stage-2
> -	 * PMD entries, because we'll end up mapping the wrong pages.
> +	 * within a PMD/PUD for userspace and IPA cannot be mapped with stage-2
> +	 * PMD/PUD entries, because we'll end up mapping the wrong pages.
>  	 *
>  	 * Consider a layout like the following:
>  	 *
>  	 *    memslot->userspace_addr:
>  	 *    +-----+--------------------+--------------------+---+
> -	 *    |abcde|fgh  Stage-1 PMD    |    Stage-1 PMD   tv|xyz|
> +	 *    |abcde|fgh  Stage-1 block  |    Stage-1 block tv|xyz|
>  	 *    +-----+--------------------+--------------------+---+
>  	 *
>  	 *    memslot->base_gfn << PAGE_SIZE:
>  	 *      +---+--------------------+--------------------+-----+
> -	 *      |abc|def  Stage-2 PMD    |    Stage-2 PMD     |tvxyz|
> +	 *      |abc|def  Stage-2 block  |    Stage-2 block   |tvxyz|
>  	 *      +---+--------------------+--------------------+-----+
>  	 *
> -	 * If we create those stage-2 PMDs, we'll end up with this incorrect
> +	 * If we create those stage-2 blocks, we'll end up with this incorrect
>  	 * mapping:
>  	 *   d -> f
>  	 *   e -> g
>  	 *   f -> h
>  	 */
> -	if ((gpa_start & ~S2_PMD_MASK) != (uaddr_start & ~S2_PMD_MASK))
> +	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
>  		return false;
>  
>  	/*
>  	 * Next, let's make sure we're not trying to map anything not covered
> -	 * by the memslot. This means we have to prohibit PMD size mappings
> -	 * for the beginning and end of a non-PMD aligned and non-PMD sized
> +	 * by the memslot. This means we have to prohibit block size mappings
> +	 * for the beginning and end of a non-block aligned and non-block sized
>  	 * memory slot (illustrated by the head and tail parts of the
>  	 * userspace view above containing pages 'abcde' and 'xyz',
>  	 * respectively).
> @@ -1646,8 +1647,8 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
>  	 * userspace_addr or the base_gfn, as both are equally aligned (per
>  	 * the check above) and equally sized.
>  	 */
> -	return (hva & S2_PMD_MASK) >= uaddr_start &&
> -	       (hva & S2_PMD_MASK) + S2_PMD_SIZE <= uaddr_end;
> +	return (hva & ~(map_size - 1)) >= uaddr_start &&
> +	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>  }
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> @@ -1676,12 +1677,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> -	if (!fault_supports_stage2_pmd_mappings(memslot, hva))
> -		force_pte = true;
> -
> -	if (logging_active)
> -		force_pte = true;
> -
>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>  	down_read(&current->mm->mmap_sem);
>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> @@ -1692,6 +1687,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	}
>  
>  	vma_pagesize = vma_kernel_pagesize(vma);
> +	if (logging_active ||
> +	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
> +		force_pte = true;
> +		vma_pagesize = PAGE_SIZE;
> +	}
> +
>  	/*
>  	 * The stage2 has a minimum of 2 level table (For arm64 see
>  	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
> @@ -1699,11 +1700,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * As for PUD huge maps, we must make sure that we have at least
>  	 * 3 levels, i.e, PMD is not folded.
>  	 */
> -	if ((vma_pagesize == PMD_SIZE ||
> -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
> -	    !force_pte) {
> +	if (vma_pagesize == PMD_SIZE ||
> +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>  		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> -	}
>  	up_read(&current->mm->mmap_sem);
>  
>  	/* We need minimum second+third level pages */
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
  2019-04-01 17:10     ` Auger Eric
@ 2019-04-02  9:47       ` Suzuki K Poulose
  -1 siblings, 0 replies; 28+ messages in thread
From: Suzuki K Poulose @ 2019-04-02  9:47 UTC (permalink / raw)
  To: Auger Eric
  Cc: kvm, Marc Zyngier, YueHaibing, Julien Grall, Zenghui Yu,
	Paolo Bonzini, kvmarm, linux-arm-kernel

On Mon, Apr 01, 2019 at 07:10:37PM +0200, Auger Eric wrote:
> Hi Suzuki,
> 
> On 3/28/19 2:36 PM, Marc Zyngier wrote:
> > From: Suzuki K Poulose <suzuki.poulose@arm.com>
> > 
> > commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> > made the checks to skip huge mappings, stricter. However it introduced
> > a bug where we still use huge mappings, ignoring the flag to
> > use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE.
> > 
> > Also, the checks do not cover the PUD huge pages, that was
> > under review during the same period. This patch fixes both
> > the issues.
> 
> I face a regression with this patch. My guest gets stuck. I am running
> on AMD Seattle. Reverting the patch makes things work again for me. I
> run with qemu. In this scenario I don't use hugepages. I use 64kB page
> size for both the host and guest.

Hi Eric,

Thanks for the testing. Does the following patch fix the issue for you ?


---8>---
kvm: arm: Skip transparent huge pages in unaligned memslots

We silently create stage2 huge mappings for a memslot with
unaligned IPA and user address.
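
To illustrate the case being skipped, here is a stand-alone toy version of
the check added below; the numbers are made up and PTRS_PER_PMD is taken
as 512 (4kB pages) purely for illustration:

#include <stdio.h>

int main(void)
{
	unsigned long mask = 512 - 1;	/* PTRS_PER_PMD - 1 */
	unsigned long gfn  = 0x80203;	/* guest frame number of the fault */
	unsigned long pfn  = 0x40800;	/* host frame backing that gfn */

	/* IPA and userspace VA are not congruent within the PMD block */
	if ((gfn & mask) != (pfn & mask))
		printf("unaligned memslot: keep PTE mappings\n");
	else
		printf("aligned: safe to adjust to a PMD mapping\n");
	return 0;
}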

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 virt/kvm/arm/mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..4a22f5b 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1412,7 +1412,9 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
 		 * page accordingly.
 		 */
 		mask = PTRS_PER_PMD - 1;
-		VM_BUG_ON((gfn & mask) != (pfn & mask));
+		/* Skip memslots with unaligned IPA and user address */
+		if ((gfn & mask) != (pfn & mask))
+			return false;
 		if (pfn & mask) {
 			*ipap &= PMD_MASK;
 			kvm_release_pfn_clean(pfn);
-- 
2.7.4


Kind regards
Suzuki


> 
> Thanks
> 
> Eric
> > 
> > Fixes : 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> > Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> > Cc: Zenghui Yu <yuzenghui@huawei.com>
> > Cc: Christoffer Dall <christoffer.dall@arm.com>
> > Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  virt/kvm/arm/mmu.c | 43 +++++++++++++++++++++----------------------
> >  1 file changed, 21 insertions(+), 22 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> > index ffd7acdceac7..bcdf978c0d1d 100644
> > --- a/virt/kvm/arm/mmu.c
> > +++ b/virt/kvm/arm/mmu.c
> > @@ -1594,8 +1594,9 @@ static void kvm_send_hwpoison_signal(unsigned long address,
> >  	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
> >  }
> >  
> > -static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> > -					       unsigned long hva)
> > +static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
> > +					       unsigned long hva,
> > +					       unsigned long map_size)
> >  {
> >  	gpa_t gpa_start;
> >  	hva_t uaddr_start, uaddr_end;
> > @@ -1610,34 +1611,34 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> >  
> >  	/*
> >  	 * Pages belonging to memslots that don't have the same alignment
> > -	 * within a PMD for userspace and IPA cannot be mapped with stage-2
> > -	 * PMD entries, because we'll end up mapping the wrong pages.
> > +	 * within a PMD/PUD for userspace and IPA cannot be mapped with stage-2
> > +	 * PMD/PUD entries, because we'll end up mapping the wrong pages.
> >  	 *
> >  	 * Consider a layout like the following:
> >  	 *
> >  	 *    memslot->userspace_addr:
> >  	 *    +-----+--------------------+--------------------+---+
> > -	 *    |abcde|fgh  Stage-1 PMD    |    Stage-1 PMD   tv|xyz|
> > +	 *    |abcde|fgh  Stage-1 block  |    Stage-1 block tv|xyz|
> >  	 *    +-----+--------------------+--------------------+---+
> >  	 *
> >  	 *    memslot->base_gfn << PAGE_SIZE:
> >  	 *      +---+--------------------+--------------------+-----+
> > -	 *      |abc|def  Stage-2 PMD    |    Stage-2 PMD     |tvxyz|
> > +	 *      |abc|def  Stage-2 block  |    Stage-2 block   |tvxyz|
> >  	 *      +---+--------------------+--------------------+-----+
> >  	 *
> > -	 * If we create those stage-2 PMDs, we'll end up with this incorrect
> > +	 * If we create those stage-2 blocks, we'll end up with this incorrect
> >  	 * mapping:
> >  	 *   d -> f
> >  	 *   e -> g
> >  	 *   f -> h
> >  	 */
> > -	if ((gpa_start & ~S2_PMD_MASK) != (uaddr_start & ~S2_PMD_MASK))
> > +	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
> >  		return false;
> >  
> >  	/*
> >  	 * Next, let's make sure we're not trying to map anything not covered
> > -	 * by the memslot. This means we have to prohibit PMD size mappings
> > -	 * for the beginning and end of a non-PMD aligned and non-PMD sized
> > +	 * by the memslot. This means we have to prohibit block size mappings
> > +	 * for the beginning and end of a non-block aligned and non-block sized
> >  	 * memory slot (illustrated by the head and tail parts of the
> >  	 * userspace view above containing pages 'abcde' and 'xyz',
> >  	 * respectively).
> > @@ -1646,8 +1647,8 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> >  	 * userspace_addr or the base_gfn, as both are equally aligned (per
> >  	 * the check above) and equally sized.
> >  	 */
> > -	return (hva & S2_PMD_MASK) >= uaddr_start &&
> > -	       (hva & S2_PMD_MASK) + S2_PMD_SIZE <= uaddr_end;
> > +	return (hva & ~(map_size - 1)) >= uaddr_start &&
> > +	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
> >  }
> >  
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > @@ -1676,12 +1677,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >  
> > -	if (!fault_supports_stage2_pmd_mappings(memslot, hva))
> > -		force_pte = true;
> > -
> > -	if (logging_active)
> > -		force_pte = true;
> > -
> >  	/* Let's check if we will get back a huge page backed by hugetlbfs */
> >  	down_read(&current->mm->mmap_sem);
> >  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> > @@ -1692,6 +1687,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	}
> >  
> >  	vma_pagesize = vma_kernel_pagesize(vma);
> > +	if (logging_active ||
> > +	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
> > +		force_pte = true;
> > +		vma_pagesize = PAGE_SIZE;
> > +	}
> > +
> >  	/*
> >  	 * The stage2 has a minimum of 2 level table (For arm64 see
> >  	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
> > @@ -1699,11 +1700,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	 * As for PUD huge maps, we must make sure that we have at least
> >  	 * 3 levels, i.e, PMD is not folded.
> >  	 */
> > -	if ((vma_pagesize == PMD_SIZE ||
> > -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
> > -	    !force_pte) {
> > +	if (vma_pagesize == PMD_SIZE ||
> > +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
> >  		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> > -	}
> >  	up_read(&current->mm->mmap_sem);
> >  
> >  	/* We need minimum second+third level pages */
> > 

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
@ 2019-04-02  9:47       ` Suzuki K Poulose
  0 siblings, 0 replies; 28+ messages in thread
From: Suzuki K Poulose @ 2019-04-02  9:47 UTC (permalink / raw)
  To: Auger Eric
  Cc: kvm, Radim Krčmář,
	Marc Zyngier, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Julien Grall,
	James Morse, Nianyao Tang, Zenghui Yu, Paolo Bonzini,
	suzuki.poulose, kvmarm, linux-arm-kernel

On Mon, Apr 01, 2019 at 07:10:37PM +0200, Auger Eric wrote:
> Hi Suzuki,
> 
> On 3/28/19 2:36 PM, Marc Zyngier wrote:
> > From: Suzuki K Poulose <suzuki.poulose@arm.com>
> > 
> > commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> > made the checks to skip huge mappings, stricter. However it introduced
> > a bug where we still use huge mappings, ignoring the flag to
> > use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE.
> > 
> > Also, the checks do not cover the PUD huge pages, that was
> > under review during the same period. This patch fixes both
> > the issues.
> 
> I face a regression with this patch. My guest gets stuck. I am running
> on AMD Seattle. Reverting the patch makes things work again for me. I
> run with qemu. In this scenario I don't use hugepages. I use 64kB page
> size for both the host and guest.

Hi Eric,

Thanks for the testing. Does the following patch fix the issue for you ?


---8>---
kvm: arm: Skip transparent huge pages in unaligned memslots

We silently create stage2 huge mappings for a memslot with
unaligned IPA and user address.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 virt/kvm/arm/mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 27c9583..4a22f5b 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1412,7 +1412,9 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
 		 * page accordingly.
 		 */
 		mask = PTRS_PER_PMD - 1;
-		VM_BUG_ON((gfn & mask) != (pfn & mask));
+		/* Skip memslots with unaligned IPA and user address */
+		if ((gfn & mask) != (pfn & mask))
+			return false;
 		if (pfn & mask) {
 			*ipap &= PMD_MASK;
 			kvm_release_pfn_clean(pfn);
-- 
2.7.4


Kind regards
Suzuki


> 
> Thanks
> 
> Eric
> > 
> > Fixes : 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> > Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> > Cc: Zenghui Yu <yuzenghui@huawei.com>
> > Cc: Christoffer Dall <christoffer.dall@arm.com>
> > Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  virt/kvm/arm/mmu.c | 43 +++++++++++++++++++++----------------------
> >  1 file changed, 21 insertions(+), 22 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> > index ffd7acdceac7..bcdf978c0d1d 100644
> > --- a/virt/kvm/arm/mmu.c
> > +++ b/virt/kvm/arm/mmu.c
> > @@ -1594,8 +1594,9 @@ static void kvm_send_hwpoison_signal(unsigned long address,
> >  	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
> >  }
> >  
> > -static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> > -					       unsigned long hva)
> > +static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
> > +					       unsigned long hva,
> > +					       unsigned long map_size)
> >  {
> >  	gpa_t gpa_start;
> >  	hva_t uaddr_start, uaddr_end;
> > @@ -1610,34 +1611,34 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> >  
> >  	/*
> >  	 * Pages belonging to memslots that don't have the same alignment
> > -	 * within a PMD for userspace and IPA cannot be mapped with stage-2
> > -	 * PMD entries, because we'll end up mapping the wrong pages.
> > +	 * within a PMD/PUD for userspace and IPA cannot be mapped with stage-2
> > +	 * PMD/PUD entries, because we'll end up mapping the wrong pages.
> >  	 *
> >  	 * Consider a layout like the following:
> >  	 *
> >  	 *    memslot->userspace_addr:
> >  	 *    +-----+--------------------+--------------------+---+
> > -	 *    |abcde|fgh  Stage-1 PMD    |    Stage-1 PMD   tv|xyz|
> > +	 *    |abcde|fgh  Stage-1 block  |    Stage-1 block tv|xyz|
> >  	 *    +-----+--------------------+--------------------+---+
> >  	 *
> >  	 *    memslot->base_gfn << PAGE_SIZE:
> >  	 *      +---+--------------------+--------------------+-----+
> > -	 *      |abc|def  Stage-2 PMD    |    Stage-2 PMD     |tvxyz|
> > +	 *      |abc|def  Stage-2 block  |    Stage-2 block   |tvxyz|
> >  	 *      +---+--------------------+--------------------+-----+
> >  	 *
> > -	 * If we create those stage-2 PMDs, we'll end up with this incorrect
> > +	 * If we create those stage-2 blocks, we'll end up with this incorrect
> >  	 * mapping:
> >  	 *   d -> f
> >  	 *   e -> g
> >  	 *   f -> h
> >  	 */
> > -	if ((gpa_start & ~S2_PMD_MASK) != (uaddr_start & ~S2_PMD_MASK))
> > +	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
> >  		return false;
> >  
> >  	/*
> >  	 * Next, let's make sure we're not trying to map anything not covered
> > -	 * by the memslot. This means we have to prohibit PMD size mappings
> > -	 * for the beginning and end of a non-PMD aligned and non-PMD sized
> > +	 * by the memslot. This means we have to prohibit block size mappings
> > +	 * for the beginning and end of a non-block aligned and non-block sized
> >  	 * memory slot (illustrated by the head and tail parts of the
> >  	 * userspace view above containing pages 'abcde' and 'xyz',
> >  	 * respectively).
> > @@ -1646,8 +1647,8 @@ static bool fault_supports_stage2_pmd_mappings(struct kvm_memory_slot *memslot,
> >  	 * userspace_addr or the base_gfn, as both are equally aligned (per
> >  	 * the check above) and equally sized.
> >  	 */
> > -	return (hva & S2_PMD_MASK) >= uaddr_start &&
> > -	       (hva & S2_PMD_MASK) + S2_PMD_SIZE <= uaddr_end;
> > +	return (hva & ~(map_size - 1)) >= uaddr_start &&
> > +	       (hva & ~(map_size - 1)) + map_size <= uaddr_end;
> >  }
> >  
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > @@ -1676,12 +1677,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >  
> > -	if (!fault_supports_stage2_pmd_mappings(memslot, hva))
> > -		force_pte = true;
> > -
> > -	if (logging_active)
> > -		force_pte = true;
> > -
> >  	/* Let's check if we will get back a huge page backed by hugetlbfs */
> >  	down_read(&current->mm->mmap_sem);
> >  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> > @@ -1692,6 +1687,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	}
> >  
> >  	vma_pagesize = vma_kernel_pagesize(vma);
> > +	if (logging_active ||
> > +	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
> > +		force_pte = true;
> > +		vma_pagesize = PAGE_SIZE;
> > +	}
> > +
> >  	/*
> >  	 * The stage2 has a minimum of 2 level table (For arm64 see
> >  	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
> > @@ -1699,11 +1700,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	 * As for PUD huge maps, we must make sure that we have at least
> >  	 * 3 levels, i.e, PMD is not folded.
> >  	 */
> > -	if ((vma_pagesize == PMD_SIZE ||
> > -	     (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm))) &&
> > -	    !force_pte) {
> > +	if (vma_pagesize == PMD_SIZE ||
> > +	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
> >  		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
> > -	}
> >  	up_read(&current->mm->mmap_sem);
> >  
> >  	/* We need minimum second+third level pages */
> > 


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
  2019-04-02  9:47       ` Suzuki K Poulose
@ 2019-04-02 10:07         ` Auger Eric
  -1 siblings, 0 replies; 28+ messages in thread
From: Auger Eric @ 2019-04-02 10:07 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: kvm, Radim Krčmář,
	Marc Zyngier, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Julien Grall,
	James Morse, Nianyao Tang, Zenghui Yu, Paolo Bonzini, kvmarm,
	linux-arm-kernel

Hi Suzuki,

On 4/2/19 11:47 AM, Suzuki K Poulose wrote:
> On Mon, Apr 01, 2019 at 07:10:37PM +0200, Auger Eric wrote:
>> Hi Suzuki,
>>
>> On 3/28/19 2:36 PM, Marc Zyngier wrote:
>>> From: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>
>>> commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
>>> made the checks to skip huge mappings, stricter. However it introduced
>>> a bug where we still use huge mappings, ignoring the flag to
>>> use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE.
>>>
>>> Also, the checks do not cover the PUD huge pages, that was
>>> under review during the same period. This patch fixes both
>>> the issues.
>>
>> I face a regression with this patch. My guest gets stuck. I am running
>> on AMD Seattle. Reverting the patch makes things work again for me. I
>> run with qemu. In this scenario I don't use hugepages. I use 64kB page
>> size for both the host and guest.
> 
> Hi Eric,
> 
> Thanks for the testing. Does the following patch fix the issue for you ?

Yes it does.

Thanks

Eric
> 
> 
> ---8>---
> kvm: arm: Skip transparent huge pages in unaligned memslots
> 
> We silently create stage2 huge mappings for a memslot with
> unaligned IPA and user address.
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  virt/kvm/arm/mmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 27c9583..4a22f5b 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1412,7 +1412,9 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>  		 * page accordingly.
>  		 */
>  		mask = PTRS_PER_PMD - 1;
> -		VM_BUG_ON((gfn & mask) != (pfn & mask));
> +		/* Skip memslots with unaligned IPA and user address */
> +		if ((gfn & mask) != (pfn & mask))
> +			return false;
>  		if (pfn & mask) {
>  			*ipap &= PMD_MASK;
>  			kvm_release_pfn_clean(pfn);
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
@ 2019-04-02 10:07         ` Auger Eric
  0 siblings, 0 replies; 28+ messages in thread
From: Auger Eric @ 2019-04-02 10:07 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: kvm, Radim Krčmář,
	Marc Zyngier, Julien Thierry, YueHaibing, Zheng Xiang,
	Shameerali Kolothum Thodi, Christoffer Dall, Julien Grall,
	James Morse, Nianyao Tang, Zenghui Yu, Paolo Bonzini, kvmarm,
	linux-arm-kernel

Hi Suzuki,

On 4/2/19 11:47 AM, Suzuki K Poulose wrote:
> On Mon, Apr 01, 2019 at 07:10:37PM +0200, Auger Eric wrote:
>> Hi Suzuki,
>>
>> On 3/28/19 2:36 PM, Marc Zyngier wrote:
>>> From: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>
>>> commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
>>> made the checks to skip huge mappings, stricter. However it introduced
>>> a bug where we still use huge mappings, ignoring the flag to
>>> use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE.
>>>
>>> Also, the checks do not cover the PUD huge pages, that was
>>> under review during the same period. This patch fixes both
>>> the issues.
>>
>> I face a regression with this patch. My guest gets stuck. I am running
>> on AMD Seattle. Reverting the patch makes things work again for me. I
>> run with qemu. In this scenario I don't use hugepages. I use 64kB page
>> size for both the host and guest.
> 
> Hi Eric,
> 
> Thanks for the testing. Does the following patch fix the issue for you ?

Yes it does.

Thanks

Eric
> 
> 
> ---8>---
> kvm: arm: Skip transparent huge pages in unaligned memslots
> 
> We silently create stage2 huge mappings for a memslot with
> unaligned IPA and user address.
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  virt/kvm/arm/mmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 27c9583..4a22f5b 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1412,7 +1412,9 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>  		 * page accordingly.
>  		 */
>  		mask = PTRS_PER_PMD - 1;
> -		VM_BUG_ON((gfn & mask) != (pfn & mask));
> +		/* Skip memslots with unaligned IPA and user address */
> +		if ((gfn & mask) != (pfn & mask))
> +			return false;
>  		if (pfn & mask) {
>  			*ipap &= PMD_MASK;
>  			kvm_release_pfn_clean(pfn);
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed
  2019-04-02 10:07         ` Auger Eric
@ 2019-04-02 10:19           ` Marc Zyngier
  -1 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2019-04-02 10:19 UTC (permalink / raw)
  To: Auger Eric
  Cc: kvm, YueHaibing, Julien Grall, Zenghui Yu, Paolo Bonzini, kvmarm,
	linux-arm-kernel

On Tue, 02 Apr 2019 11:07:28 +0100,
Auger Eric <eric.auger@redhat.com> wrote:
> 
> Hi Suzuki,
> 
> On 4/2/19 11:47 AM, Suzuki K Poulose wrote:
> > On Mon, Apr 01, 2019 at 07:10:37PM +0200, Auger Eric wrote:
> >> Hi Suzuki,
> >>
> >> On 3/28/19 2:36 PM, Marc Zyngier wrote:
> >>> From: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>>
> >>> commit 6794ad5443a2118 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings")
> >>> made the checks to skip huge mappings stricter. However, it introduced
> >>> a bug where we still use huge mappings, ignoring the flag to use PTE
> >>> mappings, by not resetting vma_pagesize to PAGE_SIZE.
> >>>
> >>> Also, the checks do not cover PUD huge pages, which were under review
> >>> during the same period. This patch fixes both issues.
> >>
> >> I see a regression with this patch: my guest gets stuck. I am running
> >> qemu on AMD Seattle, and reverting the patch makes things work again
> >> for me. In this scenario I don't use hugepages, and I use a 64kB page
> >> size for both the host and the guest.
> > 
> > Hi Eric,
> > 
> > Thanks for testing. Does the following patch fix the issue for you?
> 
> Yes it does.

Thanks for testing this. Suzuki, can you please resend this with
Eric's Tested-by and a Fixes: tag? I'll queue it right away.

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.
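
For reference, the requested trailers on the resent patch would look
roughly like the lines below. The commit hash and subject for the
Fixes: tag are not given in this thread, so they are shown as
placeholders; Eric's address is taken from his reply above.

  Fixes: <sha> ("<subject of the commit that introduced the VM_BUG_ON>")
  Tested-by: Eric Auger <eric.auger@redhat.com>
  Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>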

^ permalink raw reply	[flat|nested] 28+ messages in thread


end of thread, other threads:[~2019-04-02 10:19 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-28 13:36 [GIT PULL] KVM/ARM fixes for 5.1-rc3 Marc Zyngier
2019-03-28 13:36 ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 1/8] KVM: arm64: Reset the PMU in preemptible context Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 2/8] arm64: KVM: Always set ICH_HCR_EL2.EN if GICv4 is enabled Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 3/8] KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 4/8] KVM: arm/arm64: vgic-its: Take the srcu lock when parsing the memslots Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 5/8] KVM: arm/arm64: Enforce PTE mappings at stage2 when needed Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-04-01 17:10   ` Auger Eric
2019-04-01 17:10     ` Auger Eric
2019-04-02  9:47     ` Suzuki K Poulose
2019-04-02  9:47       ` Suzuki K Poulose
2019-04-02 10:07       ` Auger Eric
2019-04-02 10:07         ` Auger Eric
2019-04-02 10:19         ` Marc Zyngier
2019-04-02 10:19           ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 6/8] KVM: arm/arm64: Fix handling of stage2 huge mappings Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 7/8] KVM: arm/arm64: vgic-its: Make attribute accessors static Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 13:36 ` [PATCH 8/8] KVM: arm/arm64: Comments cleanup in mmu.c Marc Zyngier
2019-03-28 13:36   ` Marc Zyngier
2019-03-28 18:07 ` [GIT PULL] KVM/ARM fixes for 5.1-rc3 Paolo Bonzini
2019-03-28 18:07   ` Paolo Bonzini
