* [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
@ 2021-11-22 12:18 Shameer Kolothum
  2021-11-22 12:18 ` [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM Shameer Kolothum
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Shameer Kolothum @ 2021-11-22 12:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, linux-kernel
  Cc: maz, will, catalin.marinas, james.morse, julien.thierry.kdev,
	suzuki.poulose, jean-philippe, Alexandru.Elisei, qperret,
	jonathan.cameron, linuxarm

Changes from v3:
- Main change is in patch #4, where the VMID is now set to an
  invalid one on vCPU schedule out. Introduced a
  VMID_ACTIVE_INVALID, which is basically VMID 0 with generation 1.
  Since the basic allocator algorithm reserves vmid #0, it is never
  used as an active VMID. This (hopefully) will fix the issue of
  unnecessarily reserving VMID space with active_vmids when those
  VMs are no longer active[0] and, at the same time, address the
  problem noted in v3 wherein everything ends up in the slow path[1].
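
  For reference, the pieces of patch #4 that implement this, condensed
  here (see the patch itself for the full context):

	/*
	 * VMID 0 with generation 1. The allocator never hands out
	 * VMID 0, so this can never match a valid active VMID.
	 */
	#define VMID_ACTIVE_INVALID	VMID_FIRST_VERSION

	/* Called from vCPU sched out with preemption disabled */
	void kvm_arm_vmid_clear_active(void)
	{
		atomic64_set(this_cpu_ptr(&active_vmids), VMID_ACTIVE_INVALID);
	}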

Testing:
 -Run with VMID bits set to 4 and maxcpus set to 8 on D06. The test
  involves running 50 guests with 4 vCPUs each, concurrently. Each
  guest then executes hackbench 5 times before exiting.
  No crash was observed over a 4-day continuous run.
  The latest branch is here,
   https://github.com/hisilicon/kernel-dev/tree/private-v5.16-rc1-vmid-v4

 -TLA+ model. Modified the asidalloc model to incorporate the new
  VMID algo. The main differences are,
  -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
  -Introduced INVALID_VMID and vCPU Sched Out logic.
  -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask invariants).
  -Removed the UniqueVMIDPerCPU invariant for now, as it looks like
   the speculative fetching combined with flush_tlb_all() opens a
   small window where it gets triggered. If I change the logic back
   to local_flush_tlb_all(), UniqueVMIDPerCPU seems to be fine. With
   my limited knowledge of TLA+ modelling, it is not clear to me
   whether this is a problem with the above logic or with the VMID
   model implementation. I would really appreciate any help with
   the model.
   The initial VMID TLA+ model is here,
   https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1

Please take a look and let me know.

Thanks,
Shameer

[0] https://lore.kernel.org/kvmarm/20210721160614.GC11003@willie-the-truck/
[1] https://lore.kernel.org/kvmarm/20210803114034.GB30853@willie-the-truck/

History:
--------
v2 --> v3
  -Dropped adding a new static key and cpufeature for retrieving
   supported VMID bits. Instead, we now make use of the
   kvm_arm_vmid_bits variable (patch #2).

  -Since we expect less frequent rollover in the case of VMIDs,
   the TLB invalidation is now broadcasted on rollover instead
   of keeping per CPU flush_pending info and issuing a local
   context flush.

  -Clear active_vmids on vCPU schedule out to avoid unnecessarily
   reserving the VMID space (patch #3).

  -I have kept struct kvm_vmid as it is for now (instead of a
   typedef as suggested), as we may soon add another variable to
   it when we introduce pinned KVM VMID support.

RFCv1 --> v2
   -Dropped "pinned VMID" support for now.
   -Dropped RFC tag.
RFCv1
   https://lore.kernel.org/kvmarm/20210506165232.1969-1-shameerali.kolothum.thodi@huawei.com/

Julien Grall (1):
  KVM: arm64: Align the VMID allocation with the arm64 ASID

Shameer Kolothum (3):
  KVM: arm64: Introduce a new VMID allocator for KVM
  KVM: arm64: Make VMID bits accessible outside of allocator
  KVM: arm64: Make active_vmids invalid on vCPU schedule out

 arch/arm64/include/asm/kvm_host.h     |  10 +-
 arch/arm64/include/asm/kvm_mmu.h      |   4 +-
 arch/arm64/kernel/image-vars.h        |   3 +
 arch/arm64/kvm/Makefile               |   2 +-
 arch/arm64/kvm/arm.c                  | 106 +++-----------
 arch/arm64/kvm/hyp/nvhe/mem_protect.c |   3 +-
 arch/arm64/kvm/mmu.c                  |   1 -
 arch/arm64/kvm/vmid.c                 | 196 ++++++++++++++++++++++++++
 8 files changed, 228 insertions(+), 97 deletions(-)
 create mode 100644 arch/arm64/kvm/vmid.c

-- 
2.17.1



* [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
@ 2021-11-22 12:18 ` Shameer Kolothum
  2022-01-21  7:35   ` Reiji Watanabe
  2021-11-22 12:18 ` [PATCH v4 2/4] KVM: arm64: Make VMID bits accessible outside of allocator Shameer Kolothum
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Shameer Kolothum @ 2021-11-22 12:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, linux-kernel
  Cc: maz, will, catalin.marinas, james.morse, julien.thierry.kdev,
	suzuki.poulose, jean-philippe, Alexandru.Elisei, qperret,
	jonathan.cameron, linuxarm

A new VMID allocator for arm64 KVM use. This is based on the
arm64 ASID allocator algorithm.

One major deviation from the ASID allocator is the way we
flush the context. Unlike the ASID allocator, we expect less
frequent rollover in the case of VMIDs. Hence, instead of
marking the CPU as flush_pending and issuing a local context
invalidation on the next context switch, we broadcast TLB
flush + I-cache invalidation over the inner shareable domain
on rollover.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h |   4 +
 arch/arm64/kvm/vmid.c             | 177 ++++++++++++++++++++++++++++++
 2 files changed, 181 insertions(+)
 create mode 100644 arch/arm64/kvm/vmid.c

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2a5f7f38006f..f4a86a79ea4a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -690,6 +690,10 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
 			    struct kvm_device_attr *attr);
 
+int kvm_arm_vmid_alloc_init(void);
+void kvm_arm_vmid_alloc_free(void);
+void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
+
 static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
 {
 	vcpu_arch->steal.base = GPA_INVALID;
diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
new file mode 100644
index 000000000000..aa01c97f7df0
--- /dev/null
+++ b/arch/arm64/kvm/vmid.c
@@ -0,0 +1,177 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * VMID allocator.
+ *
+ * Based on Arm64 ASID allocator algorithm.
+ * Please refer arch/arm64/mm/context.c for detailed
+ * comments on algorithm.
+ *
+ * Copyright (C) 2002-2003 Deep Blue Solutions Ltd, all rights reserved.
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+
+#include <asm/kvm_asm.h>
+#include <asm/kvm_mmu.h>
+
+static unsigned int kvm_arm_vmid_bits;
+static DEFINE_RAW_SPINLOCK(cpu_vmid_lock);
+
+static atomic64_t vmid_generation;
+static unsigned long *vmid_map;
+
+static DEFINE_PER_CPU(atomic64_t, active_vmids);
+static DEFINE_PER_CPU(u64, reserved_vmids);
+
+#define VMID_MASK		(~GENMASK(kvm_arm_vmid_bits - 1, 0))
+#define VMID_FIRST_VERSION	(1UL << kvm_arm_vmid_bits)
+
+#define NUM_USER_VMIDS		VMID_FIRST_VERSION
+#define vmid2idx(vmid)		((vmid) & ~VMID_MASK)
+#define idx2vmid(idx)		vmid2idx(idx)
+
+#define vmid_gen_match(vmid) \
+	(!(((vmid) ^ atomic64_read(&vmid_generation)) >> kvm_arm_vmid_bits))
+
+static void flush_context(void)
+{
+	int cpu;
+	u64 vmid;
+
+	bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
+
+	for_each_possible_cpu(cpu) {
+		vmid = atomic64_xchg_relaxed(&per_cpu(active_vmids, cpu), 0);
+
+		/* Preserve reserved VMID */
+		if (vmid == 0)
+			vmid = per_cpu(reserved_vmids, cpu);
+		__set_bit(vmid2idx(vmid), vmid_map);
+		per_cpu(reserved_vmids, cpu) = vmid;
+	}
+
+	/*
+	 * Unlike ASID allocator, we expect less frequent rollover in
+	 * case of VMIDs. Hence, instead of marking the CPU as
+	 * flush_pending and issuing a local context invalidation on
+	 * the next context-switch, we broadcast TLB flush + I-cache
+	 * invalidation over the inner shareable domain on rollover.
+	 */
+	 kvm_call_hyp(__kvm_flush_vm_context);
+}
+
+static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)
+{
+	int cpu;
+	bool hit = false;
+
+	/*
+	 * Iterate over the set of reserved VMIDs looking for a match
+	 * and update to use newvmid (i.e. the same VMID in the current
+	 * generation).
+	 */
+	for_each_possible_cpu(cpu) {
+		if (per_cpu(reserved_vmids, cpu) == vmid) {
+			hit = true;
+			per_cpu(reserved_vmids, cpu) = newvmid;
+		}
+	}
+
+	return hit;
+}
+
+static u64 new_vmid(struct kvm_vmid *kvm_vmid)
+{
+	static u32 cur_idx = 1;
+	u64 vmid = atomic64_read(&kvm_vmid->id);
+	u64 generation = atomic64_read(&vmid_generation);
+
+	if (vmid != 0) {
+		u64 newvmid = generation | (vmid & ~VMID_MASK);
+
+		if (check_update_reserved_vmid(vmid, newvmid)) {
+			atomic64_set(&kvm_vmid->id, newvmid);
+			return newvmid;
+		}
+
+		if (!__test_and_set_bit(vmid2idx(vmid), vmid_map)) {
+			atomic64_set(&kvm_vmid->id, newvmid);
+			return newvmid;
+		}
+	}
+
+	vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, cur_idx);
+	if (vmid != NUM_USER_VMIDS)
+		goto set_vmid;
+
+	/* We're out of VMIDs, so increment the global generation count */
+	generation = atomic64_add_return_relaxed(VMID_FIRST_VERSION,
+						 &vmid_generation);
+	flush_context();
+
+	/* We have more VMIDs than CPUs, so this will always succeed */
+	vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, 1);
+
+set_vmid:
+	__set_bit(vmid, vmid_map);
+	cur_idx = vmid;
+	vmid = idx2vmid(vmid) | generation;
+	atomic64_set(&kvm_vmid->id, vmid);
+	return vmid;
+}
+
+void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
+{
+	unsigned long flags;
+	u64 vmid, old_active_vmid;
+
+	vmid = atomic64_read(&kvm_vmid->id);
+
+	/*
+	 * Please refer comments in check_and_switch_context() in
+	 * arch/arm64/mm/context.c.
+	 */
+	old_active_vmid = atomic64_read(this_cpu_ptr(&active_vmids));
+	if (old_active_vmid && vmid_gen_match(vmid) &&
+	    atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
+				     old_active_vmid, vmid))
+		return;
+
+	raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
+
+	/* Check that our VMID belongs to the current generation. */
+	vmid = atomic64_read(&kvm_vmid->id);
+	if (!vmid_gen_match(vmid))
+		vmid = new_vmid(kvm_vmid);
+
+	atomic64_set(this_cpu_ptr(&active_vmids), vmid);
+	raw_spin_unlock_irqrestore(&cpu_vmid_lock, flags);
+}
+
+/*
+ * Initialize the VMID allocator
+ */
+int kvm_arm_vmid_alloc_init(void)
+{
+	kvm_arm_vmid_bits = kvm_get_vmid_bits();
+
+	/*
+	 * Expect allocation after rollover to fail if we don't have
+	 * at least one more VMID than CPUs. VMID #0 is always reserved.
+	 */
+	WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());
+	atomic64_set(&vmid_generation, VMID_FIRST_VERSION);
+	vmid_map = kcalloc(BITS_TO_LONGS(NUM_USER_VMIDS),
+			   sizeof(*vmid_map), GFP_KERNEL);
+	if (!vmid_map)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void kvm_arm_vmid_alloc_free(void)
+{
+	kfree(vmid_map);
+}
-- 
2.17.1



* [PATCH v4 2/4] KVM: arm64: Make VMID bits accessible outside of allocator
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
  2021-11-22 12:18 ` [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM Shameer Kolothum
@ 2021-11-22 12:18 ` Shameer Kolothum
  2021-11-22 12:18 ` [PATCH v4 3/4] KVM: arm64: Align the VMID allocation with the arm64 ASID Shameer Kolothum
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Shameer Kolothum @ 2021-11-22 12:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, linux-kernel
  Cc: maz, will, catalin.marinas, james.morse, julien.thierry.kdev,
	suzuki.poulose, jean-philippe, Alexandru.Elisei, qperret,
	jonathan.cameron, linuxarm

Since we already set kvm_arm_vmid_bits in the VMID allocator
init function, make it accessible outside the allocator as well
so that it can be used in the subsequent patch.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h | 1 +
 arch/arm64/kernel/image-vars.h    | 3 +++
 arch/arm64/kvm/vmid.c             | 2 +-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f4a86a79ea4a..51af17e16115 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -690,6 +690,7 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
 			    struct kvm_device_attr *attr);
 
+extern unsigned int kvm_arm_vmid_bits;
 int kvm_arm_vmid_alloc_init(void);
 void kvm_arm_vmid_alloc_free(void);
 void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index c96a9a0043bf..c12963c3a055 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -79,6 +79,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
 /* Kernel symbol used by icache_is_vpipt(). */
 KVM_NVHE_ALIAS(__icache_flags);
 
+/* VMID bits set by the KVM VMID allocator */
+KVM_NVHE_ALIAS(kvm_arm_vmid_bits);
+
 /* Kernel symbols needed for cpus_have_final/const_caps checks. */
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);
diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
index aa01c97f7df0..9aff692b6b7d 100644
--- a/arch/arm64/kvm/vmid.c
+++ b/arch/arm64/kvm/vmid.c
@@ -16,7 +16,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 
-static unsigned int kvm_arm_vmid_bits;
+unsigned int kvm_arm_vmid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_vmid_lock);
 
 static atomic64_t vmid_generation;
-- 
2.17.1



* [PATCH v4 3/4] KVM: arm64: Align the VMID allocation with the arm64 ASID
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
  2021-11-22 12:18 ` [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM Shameer Kolothum
  2021-11-22 12:18 ` [PATCH v4 2/4] KVM: arm64: Make VMID bits accessible outside of allocator Shameer Kolothum
@ 2021-11-22 12:18 ` Shameer Kolothum
  2021-11-22 12:18 ` [PATCH v4 4/4] KVM: arm64: Make active_vmids invalid on vCPU schedule out Shameer Kolothum
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Shameer Kolothum @ 2021-11-22 12:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, linux-kernel
  Cc: maz, will, catalin.marinas, james.morse, julien.thierry.kdev,
	suzuki.poulose, jean-philippe, Alexandru.Elisei, qperret,
	jonathan.cameron, linuxarm

From: Julien Grall <julien.grall@arm.com>

At the moment, the VMID algorithm will send an SGI to all the
CPUs to force an exit and then broadcast a full TLB flush and
I-Cache invalidation.

This patch uses the new VMID allocator. The benefits are:
   - Aligns with the arm64 ASID algorithm.
   - CPUs are not forced to exit at roll-over. Instead,
     the VMID is marked reserved and the context invalidation
     is broadcast. This reduces IPI traffic.
   - Makes it easier to add support for pinned KVM VMIDs in
     the future.

With the new algorithm, the code is adapted as follows:
    - The call to update_vmid() is now done with preemption
      disabled, as the new algorithm requires storing per-CPU
      information.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h     |   4 +-
 arch/arm64/include/asm/kvm_mmu.h      |   4 +-
 arch/arm64/kvm/Makefile               |   2 +-
 arch/arm64/kvm/arm.c                  | 105 ++++----------------------
 arch/arm64/kvm/hyp/nvhe/mem_protect.c |   3 +-
 arch/arm64/kvm/mmu.c                  |   1 -
 6 files changed, 22 insertions(+), 97 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 51af17e16115..752d4408e3d0 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -72,9 +72,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu);
 
 struct kvm_vmid {
-	/* The VMID generation used for the virt. memory system */
-	u64    vmid_gen;
-	u32    vmid;
+	atomic64_t id;
 };
 
 struct kvm_s2_mmu {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 02d378887743..d8e6266b8d46 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -115,6 +115,7 @@ alternative_cb_end
 #include <asm/cache.h>
 #include <asm/cacheflush.h>
 #include <asm/mmu_context.h>
+#include <asm/kvm_host.h>
 
 void kvm_update_va_mask(struct alt_instr *alt,
 			__le32 *origptr, __le32 *updptr, int nr_inst);
@@ -264,7 +265,8 @@ static __always_inline u64 kvm_get_vttbr(struct kvm_s2_mmu *mmu)
 	u64 cnp = system_supports_cnp() ? VTTBR_CNP_BIT : 0;
 
 	baddr = mmu->pgd_phys;
-	vmid_field = (u64)READ_ONCE(vmid->vmid) << VTTBR_VMID_SHIFT;
+	vmid_field = atomic64_read(&vmid->id) << VTTBR_VMID_SHIFT;
+	vmid_field &= VTTBR_VMID_MASK(kvm_arm_vmid_bits);
 	return kvm_phys_to_vttbr(baddr) | vmid_field | cnp;
 }
 
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 989bb5dad2c8..4a607d52356c 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -16,7 +16,7 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o \
 	 inject_fault.o va_layout.o handle_exit.o \
 	 guest.o debug.o reset.o sys_regs.o \
 	 vgic-sys-reg-v3.o fpsimd.o pmu.o \
-	 arch_timer.o trng.o\
+	 arch_timer.o trng.o vmid.o \
 	 vgic/vgic.o vgic/vgic-init.o \
 	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
 	 vgic/vgic-v3.o vgic/vgic-v4.o \
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 2f03cbfefe67..0544011b0fc6 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -53,11 +53,6 @@ static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
 unsigned long kvm_arm_hyp_percpu_base[NR_CPUS];
 DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
 
-/* The VMID used in the VTTBR */
-static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
-static u32 kvm_next_vmid;
-static DEFINE_SPINLOCK(kvm_vmid_lock);
-
 static bool vgic_present;
 
 static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
@@ -496,87 +491,6 @@ bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
 	return vcpu_mode_priv(vcpu);
 }
 
-/* Just ensure a guest exit from a particular CPU */
-static void exit_vm_noop(void *info)
-{
-}
-
-void force_vm_exit(const cpumask_t *mask)
-{
-	preempt_disable();
-	smp_call_function_many(mask, exit_vm_noop, NULL, true);
-	preempt_enable();
-}
-
-/**
- * need_new_vmid_gen - check that the VMID is still valid
- * @vmid: The VMID to check
- *
- * return true if there is a new generation of VMIDs being used
- *
- * The hardware supports a limited set of values with the value zero reserved
- * for the host, so we check if an assigned value belongs to a previous
- * generation, which requires us to assign a new value. If we're the first to
- * use a VMID for the new generation, we must flush necessary caches and TLBs
- * on all CPUs.
- */
-static bool need_new_vmid_gen(struct kvm_vmid *vmid)
-{
-	u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
-	smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
-	return unlikely(READ_ONCE(vmid->vmid_gen) != current_vmid_gen);
-}
-
-/**
- * update_vmid - Update the vmid with a valid VMID for the current generation
- * @vmid: The stage-2 VMID information struct
- */
-static void update_vmid(struct kvm_vmid *vmid)
-{
-	if (!need_new_vmid_gen(vmid))
-		return;
-
-	spin_lock(&kvm_vmid_lock);
-
-	/*
-	 * We need to re-check the vmid_gen here to ensure that if another vcpu
-	 * already allocated a valid vmid for this vm, then this vcpu should
-	 * use the same vmid.
-	 */
-	if (!need_new_vmid_gen(vmid)) {
-		spin_unlock(&kvm_vmid_lock);
-		return;
-	}
-
-	/* First user of a new VMID generation? */
-	if (unlikely(kvm_next_vmid == 0)) {
-		atomic64_inc(&kvm_vmid_gen);
-		kvm_next_vmid = 1;
-
-		/*
-		 * On SMP we know no other CPUs can use this CPU's or each
-		 * other's VMID after force_vm_exit returns since the
-		 * kvm_vmid_lock blocks them from reentry to the guest.
-		 */
-		force_vm_exit(cpu_all_mask);
-		/*
-		 * Now broadcast TLB + ICACHE invalidation over the inner
-		 * shareable domain to make sure all data structures are
-		 * clean.
-		 */
-		kvm_call_hyp(__kvm_flush_vm_context);
-	}
-
-	WRITE_ONCE(vmid->vmid, kvm_next_vmid);
-	kvm_next_vmid++;
-	kvm_next_vmid &= (1 << kvm_get_vmid_bits()) - 1;
-
-	smp_wmb();
-	WRITE_ONCE(vmid->vmid_gen, atomic64_read(&kvm_vmid_gen));
-
-	spin_unlock(&kvm_vmid_lock);
-}
-
 static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
@@ -753,7 +667,6 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
 	}
 
 	return kvm_request_pending(vcpu) ||
-			need_new_vmid_gen(&vcpu->arch.hw_mmu->vmid) ||
 			xfer_to_guest_mode_work_pending();
 }
 
@@ -804,8 +717,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		if (!ret)
 			ret = 1;
 
-		update_vmid(&vcpu->arch.hw_mmu->vmid);
-
 		check_vcpu_requests(vcpu);
 
 		/*
@@ -815,6 +726,15 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		 */
 		preempt_disable();
 
+		/*
+		 * The VMID allocator only tracks active VMIDs per
+		 * physical CPU, and therefore the VMID allocated may not be
+		 * preserved on VMID roll-over if the task was preempted,
+		 * making a thread's VMID inactive. So we need to call
+		 * kvm_arm_vmid_update() in non-premptible context.
+		 */
+		kvm_arm_vmid_update(&vcpu->arch.hw_mmu->vmid);
+
 		kvm_pmu_flush_hwstate(vcpu);
 
 		local_irq_disable();
@@ -2111,6 +2031,12 @@ int kvm_arch_init(void *opaque)
 	if (err)
 		return err;
 
+	err = kvm_arm_vmid_alloc_init();
+	if (err) {
+		kvm_err("Failed to initialize VMID allocator.\n");
+		return err;
+	}
+
 	if (!in_hyp_mode) {
 		err = init_hyp_mode();
 		if (err)
@@ -2150,6 +2076,7 @@ int kvm_arch_init(void *opaque)
 	if (!in_hyp_mode)
 		teardown_hyp_mode();
 out_err:
+	kvm_arm_vmid_alloc_free();
 	return err;
 }
 
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index c1a90dd022b8..136141f7d524 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -117,8 +117,7 @@ int kvm_host_prepare_stage2(void *pgt_pool_base)
 	mmu->pgd_phys = __hyp_pa(host_kvm.pgt.pgd);
 	mmu->arch = &host_kvm.arch;
 	mmu->pgt = &host_kvm.pgt;
-	WRITE_ONCE(mmu->vmid.vmid_gen, 0);
-	WRITE_ONCE(mmu->vmid.vmid, 0);
+	atomic64_set(&mmu->vmid.id, 0);
 
 	return 0;
 }
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 326cdfec74a1..f1f4bbe5adeb 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -532,7 +532,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
 	mmu->arch = &kvm->arch;
 	mmu->pgt = pgt;
 	mmu->pgd_phys = __pa(pgt->pgd);
-	WRITE_ONCE(mmu->vmid.vmid_gen, 0);
 	return 0;
 
 out_destroy_pgtable:
-- 
2.17.1



* [PATCH v4 4/4] KVM: arm64: Make active_vmids invalid on vCPU schedule out
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
                   ` (2 preceding siblings ...)
  2021-11-22 12:18 ` [PATCH v4 3/4] KVM: arm64: Align the VMID allocation with the arm64 ASID Shameer Kolothum
@ 2021-11-22 12:18 ` Shameer Kolothum
  2022-01-05 13:25 ` [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameerali Kolothum Thodi
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Shameer Kolothum @ 2021-11-22 12:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, linux-kernel
  Cc: maz, will, catalin.marinas, james.morse, julien.thierry.kdev,
	suzuki.poulose, jean-philippe, Alexandru.Elisei, qperret,
	jonathan.cameron, linuxarm

Like the ASID allocator, we copy the active_vmids into the
reserved_vmids on a rollover. But it's unlikely that
every CPU will have a vCPU as its current task, so we may
end up unnecessarily reserving the VMID space.

Hence, set active_vmids to an invalid VMID when scheduling
out a vCPU.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/arm.c              |  1 +
 arch/arm64/kvm/vmid.c             | 25 ++++++++++++++++++++++---
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 752d4408e3d0..22f952effd03 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -692,6 +692,7 @@ extern unsigned int kvm_arm_vmid_bits;
 int kvm_arm_vmid_alloc_init(void);
 void kvm_arm_vmid_alloc_free(void);
 void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
+void kvm_arm_vmid_clear_active(void);
 
 static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
 {
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 0544011b0fc6..bfe926805240 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -431,6 +431,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	kvm_timer_vcpu_put(vcpu);
 	kvm_vgic_put(vcpu);
 	kvm_vcpu_pmu_restore_host(vcpu);
+	kvm_arm_vmid_clear_active();
 
 	vcpu->cpu = -1;
 }
diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
index 9aff692b6b7d..966ebb2d12e5 100644
--- a/arch/arm64/kvm/vmid.c
+++ b/arch/arm64/kvm/vmid.c
@@ -32,6 +32,13 @@ static DEFINE_PER_CPU(u64, reserved_vmids);
 #define vmid2idx(vmid)		((vmid) & ~VMID_MASK)
 #define idx2vmid(idx)		vmid2idx(idx)
 
+/*
+ * As vmid #0 is always reserved, we will never allocate one
+ * as below and can be treated as invalid. This is used to
+ * set the active_vmids on vCPU schedule out.
+ */
+#define VMID_ACTIVE_INVALID		VMID_FIRST_VERSION
+
 #define vmid_gen_match(vmid) \
 	(!(((vmid) ^ atomic64_read(&vmid_generation)) >> kvm_arm_vmid_bits))
 
@@ -122,6 +129,12 @@ static u64 new_vmid(struct kvm_vmid *kvm_vmid)
 	return vmid;
 }
 
+/* Called from vCPU sched out with preemption disabled */
+void kvm_arm_vmid_clear_active(void)
+{
+	atomic64_set(this_cpu_ptr(&active_vmids), VMID_ACTIVE_INVALID);
+}
+
 void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
 {
 	unsigned long flags;
@@ -132,11 +145,17 @@ void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
 	/*
 	 * Please refer comments in check_and_switch_context() in
 	 * arch/arm64/mm/context.c.
+	 *
+	 * Unlike ASID allocator, we set the active_vmids to
+	 * VMID_ACTIVE_INVALID on vCPU schedule out to avoid
+	 * reserving the VMID space needlessly on rollover.
+	 * Hence explicitly check here for a "!= 0" to
+	 * handle the sync with a concurrent rollover.
 	 */
 	old_active_vmid = atomic64_read(this_cpu_ptr(&active_vmids));
-	if (old_active_vmid && vmid_gen_match(vmid) &&
-	    atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
-				     old_active_vmid, vmid))
+	if (old_active_vmid != 0 && vmid_gen_match(vmid) &&
+	    0 != atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
+					  old_active_vmid, vmid))
 		return;
 
 	raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
-- 
2.17.1



* RE: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
                   ` (3 preceding siblings ...)
  2021-11-22 12:18 ` [PATCH v4 4/4] KVM: arm64: Make active_vmids invalid on vCPU schedule out Shameer Kolothum
@ 2022-01-05 13:25 ` Shameerali Kolothum Thodi
  2022-01-18 12:32 ` Catalin Marinas
  2022-02-08 17:37 ` Marc Zyngier
  6 siblings, 0 replies; 13+ messages in thread
From: Shameerali Kolothum Thodi @ 2022-01-05 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, linux-kernel
  Cc: maz, will, catalin.marinas, james.morse, julien.thierry.kdev,
	suzuki.poulose, jean-philippe, Alexandru.Elisei, qperret,
	Jonathan Cameron, Linuxarm

Hi,

A gentle ping on this series. Please take a look and let me know whether the new
approach taken in this revision is good enough.

Appreciate your feedback.

Thanks,
Shameer



* Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
                   ` (4 preceding siblings ...)
  2022-01-05 13:25 ` [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameerali Kolothum Thodi
@ 2022-01-18 12:32 ` Catalin Marinas
  2022-01-19  9:23   ` Shameerali Kolothum Thodi
  2022-02-08 17:37 ` Marc Zyngier
  6 siblings, 1 reply; 13+ messages in thread
From: Catalin Marinas @ 2022-01-18 12:32 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: linux-arm-kernel, kvmarm, linux-kernel, maz, will, james.morse,
	julien.thierry.kdev, suzuki.poulose, jean-philippe,
	Alexandru.Elisei, qperret, jonathan.cameron, linuxarm

Hi Shameer,

On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote:
>  -TLA+ model. Modified the asidalloc model to incorporate the new
>   VMID algo. The main differences are,
>   -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
>   -Introduced INVALID_VMID and vCPU Sched Out logic.
>   -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask invariants).
>   -Removed  UniqueVMIDPerCPU invariant for now as it looks like
>    because of the speculative fetching with flush_tlb_all() there
>    is a small window where this gets triggered. If I change the
>    logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to
>    be fine. With my limited knowledge on TLA+ model, it is not
>    clear to me whether this is a problem with the above logic
>    or the VMID model implementation. Really appreciate any help
>    with the model.
>    The initial VMID TLA+ model is here,
>    https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1

I only had a brief look at the TLA+ model and I don't understand why you
have a separate 'shed_out' process. It would run in parallel with the
'sched' but AFAICT you can't really schedule a guest out while you are
in the middle of scheduling it in. I'd rather use the same 'sched'
process and either schedule in an inactive task or schedule out an
active one for a given CPU.

Also active_vmids[] for example is defined on the CPUS domain but you
call vcpu_sched_out() from a process that's not in the CPUS domain but
the SCHED_OUT one.

Regarding UniqueVMIDPerCPU, I think we need to figure out why it
happens. The fact that flush_tlb_all() was made to simulate the
speculative TLB loads is not relevant. In a different spec I have,
arm64kpti.tla, I just used another process that invokes an update_tlbs()
macro so that it can happen at any time. I didn't bother to update the
ASID spec in a similar way but it may be useful. The corresponding
UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if
they correspond to different tasks (pgd), they should have different
ASIDs. That's a strong requirement, otherwise we end up with the wrong
translation.

Why did you remove the CnP? Do we have this disabled for KVM guests?

-- 
Catalin


* RE: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
  2022-01-18 12:32 ` Catalin Marinas
@ 2022-01-19  9:23   ` Shameerali Kolothum Thodi
  2022-01-19 11:49     ` Catalin Marinas
  0 siblings, 1 reply; 13+ messages in thread
From: Shameerali Kolothum Thodi @ 2022-01-19  9:23 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, kvmarm, linux-kernel, maz, will, james.morse,
	julien.thierry.kdev, suzuki.poulose, jean-philippe,
	Alexandru.Elisei, qperret, Jonathan Cameron, Linuxarm

Hi Catalin,

> -----Original Message-----
> From: Catalin Marinas [mailto:catalin.marinas@arm.com]
> Sent: 18 January 2022 12:33
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: linux-arm-kernel@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-kernel@vger.kernel.org; maz@kernel.org; will@kernel.org;
> james.morse@arm.com; julien.thierry.kdev@gmail.com;
> suzuki.poulose@arm.com; jean-philippe@linaro.org;
> Alexandru.Elisei@arm.com; qperret@google.com; Jonathan Cameron
> <jonathan.cameron@huawei.com>; Linuxarm <linuxarm@huawei.com>
> Subject: Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
> 
> Hi Shameer,
> 
> On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote:
> >  -TLA+ model. Modified the asidalloc model to incorporate the new
> >   VMID algo. The main differences are,
> >   -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
> >   -Introduced INVALID_VMID and vCPU Sched Out logic.
> >   -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask
> invariants).
> >   -Removed  UniqueVMIDPerCPU invariant for now as it looks like
> >    because of the speculative fetching with flush_tlb_all() there
> >    is a small window where this gets triggered. If I change the
> >    logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to
> >    be fine. With my limited knowledge on TLA+ model, it is not
> >    clear to me whether this is a problem with the above logic
> >    or the VMID model implementation. Really appreciate any help
> >    with the model.
> >    The initial VMID TLA+ model is here,
> >    https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1
> 
> I only had a brief look at the TLA+ model and I don't understand why you
> have a separate 'shed_out' process. It would run in parallel with the
> 'sched' but AFAICT you can't really schedule a guest out while you are
> in the middle of scheduling it in. I'd rather use the same 'sched'
> process and either schedule in an inactive task or schedule out an
> active one for a given CPU.
> 
> Also active_vmids[] for example is defined on the CPUS domain but you
> call vcpu_sched_out() from a process that's not in the CPUS domain but
> the SCHED_OUT one.

Many thanks for taking a look. My bad! The 'sched_out' process would indeed run in
parallel and defeat the purpose. I must say I was really confused by the TLA+ syntax
and am still not very confident about it.

Based on the above suggestion, I have modified it as below,

\* vCPU is scheduled out by KVM
macro vcpu_sched_out() {
        active_kvm[self].task := 0;
        active_vmids[self] := INVALID_VMID;
}
....

\* About to run a Guest VM
process (sched \in CPUS)
{
sched_loop:
        while (TRUE) {
                with (t \in TASKS) {
                        if (t # ActiveTask(self))
                                call kvm_arm_vmid_update(t);
                        else
                                vcpu_sched_out();
                }
        }
}

Please let me know if this is ok.

> Regarding UniqueVMIDPerCPU, I think we need to figure out why it
> happens. The fact that flush_tlb_all() was made to simulate the
> speculative TLB loads is not relevant. In a different spec I have,
> arm64kpti.tla, I just used another process that invokes an update_tlbs()
> macro so that it can happen at any time. I didn't bother to update the
> ASID spec in a similar way but it may be useful.

Ok. I will take a look at that and try that.

 The corresponding
> UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if
> they correspond to different tasks (pgd), they should have different
> ASIDs. That's a strong requirement, otherwise we end up with the wrong
> translation.

Yes, I understand that it is a strong requirement. Also, I thought this was something
that would trigger easily with the test setup I had on real hardware. But the test ran
for days without hitting it, so I was not sure whether it is a problem with the TLA+
implementation or not.

> 
> Why did you remove the CnP? Do we have this disabled for KVM guests?

I removed the CnP-related invariants to simplify things for the first version. I am also
not sure what specific changes we need to make for CnP here. Do we still need the switch
to a global pg_dir to prevent any speculative reading? I didn't see that being done
anywhere in KVM at the moment. Maybe I am missing something.

On a side note, In my setup, the CnP=TRUE case for asidalloc.tla now fails with,
"Error: Invariant TLBEmptyInvalPTE is violated.". Please check.

Thanks,
Shameer 

> --
> Catalin


* Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
  2022-01-19  9:23   ` Shameerali Kolothum Thodi
@ 2022-01-19 11:49     ` Catalin Marinas
  2022-01-19 12:30       ` Shameerali Kolothum Thodi
  0 siblings, 1 reply; 13+ messages in thread
From: Catalin Marinas @ 2022-01-19 11:49 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: linux-arm-kernel, kvmarm, linux-kernel, maz, will, james.morse,
	julien.thierry.kdev, suzuki.poulose, jean-philippe,
	Alexandru.Elisei, qperret, Jonathan Cameron, Linuxarm

On Wed, Jan 19, 2022 at 09:23:31AM +0000, Shameerali Kolothum Thodi wrote:
> > On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote:
> > >  -TLA+ model. Modified the asidalloc model to incorporate the new
> > >   VMID algo. The main differences are,
> > >   -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
> > >   -Introduced INVALID_VMID and vCPU Sched Out logic.
> > >   -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask invariants).
> > >   -Removed  UniqueVMIDPerCPU invariant for now as it looks like
> > >    because of the speculative fetching with flush_tlb_all() there
> > >    is a small window where this gets triggered. If I change the
> > >    logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to
> > >    be fine. With my limited knowledge on TLA+ model, it is not
> > >    clear to me whether this is a problem with the above logic
> > >    or the VMID model implementation. Really appreciate any help
> > >    with the model.
> > >    The initial VMID TLA+ model is here,
> > >    https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1
> > 
> > I only had a brief look at the TLA+ model and I don't understand why you
> > have a separate 'shed_out' process. It would run in parallel with the
> > 'sched' but AFAICT you can't really schedule a guest out while you are
> > in the middle of scheduling it in. I'd rather use the same 'sched'
> > process and either schedule in an inactive task or schedule out an
> > active one for a given CPU.
> > 
> > Also active_vmids[] for example is defined on the CPUS domain but you
> > call vcpu_sched_out() from a process that's not in the CPUS domain but
> > the SCHED_OUT one.
> 
> Many thanks for taking a look. My bad!. The 'sched_out' would indeed run in parallel
> and defeat the purpose. I must say I was really confused by the TLA+ syntax and
> is still not very confident about it.

Yeah, it can be confusing. If you have time, you could give CBMC a try
and the 'spec' would be pretty close to your C version. Each CPU would
be modelled as a thread with an extra thread that simulates the
speculative TLB look-ups for all CPUS together with the asserts for the
invariants. The spinlocks would be pthread_mutexes.

> Based on the above suggestion, I have modified it as below,
> 
> \* vCPU is scheduled out by KVM
> macro vcpu_sched_out() {
>         active_kvm[self].task := 0;
>         active_vmids[self] := INVALID_VMID;
> }

Could you call cpu_switch_kvm(0, INVALID_VMID) instead? You could do
this directly below and avoid another macro. Well, whatever you find
clearer.

What confuses me is that your INVALID_VMID looks like a valid one: vmid
0, generation 1. Do you ensure that you never allocate VMID 0?

> \* About to run a Guest VM
> process (sched \in CPUS)
> {
> sched_loop:
>         while (TRUE) {
>                 with (t \in TASKS) {
>                         if (t # ActiveTask(self))
>                                 call kvm_arm_vmid_update(t);
>                         else
>                                 vcpu_sched_out();
>                 }
>         }
> }

Yes, that's what I meant.

> > The corresponding
> > UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if
> > they correspond to different tasks (pgd), they should have different
> > ASIDs. That's a strong requirement, otherwise we end up with the wrong
> > translation.
> 
> Yes, I understand that it is a strong requirement. Also, I thought this is something
> that will trigger easily with the test setup I had with the real hardware. But testing
> stayed on for days, so I was not sure it is a problem with the TLA+ implementation
> or not.  

Well, you'd have to check the TLA+ state trace and see how it got
there, whether the last state would be a valid one. It's either
something missing in the spec that the hardware enforces or the
algorithm is wrong and just hard to hit in practice. If this condition
is violated in hardware for a very brief period, e.g. due to some TLBI,
you'd not notice an issue under normal circumstances. But it's still
incorrect.

> > Why did you remove the CnP? Do we have this disabled for KVM guests?
> 
> I removed CnP related Invariants to simplify things for the first version. Also not sure
> what specific changes we need to do for CnP here. Do we still need that switching to 
> global pg_dir to prevent any speculative reading? I didn't see that being done in KVM 
> anywhere at the moment. Maybe I am missing something.

It make sense to ignore CnP for now. Maybe KVM doesn't even bother and
sets VTTBR_EL2.CnP to 0 (I haven't checked).

> On a side note, In my setup, the CnP=TRUE case for asidalloc.tla now fails with,
> "Error: Invariant TLBEmptyInvalPTE is violated.". Please check.

This was added later as part of try-to-unmap and I only checked this
with CnP = FALSE. I'll need to look into, it's possible that
flush_tlb_all() doesn't take into account that the pte is unmapped (as
cpu_switch_mm() does). I'll add a separate thread for speculative TLB
loads, it's easier to have them in one place. Thanks.

-- 
Catalin


* RE: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
  2022-01-19 11:49     ` Catalin Marinas
@ 2022-01-19 12:30       ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 13+ messages in thread
From: Shameerali Kolothum Thodi @ 2022-01-19 12:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, kvmarm, linux-kernel, maz, will, james.morse,
	julien.thierry.kdev, suzuki.poulose, jean-philippe,
	Alexandru.Elisei, qperret, Jonathan Cameron, Linuxarm



> -----Original Message-----
> From: Catalin Marinas [mailto:catalin.marinas@arm.com]
> Sent: 19 January 2022 11:50
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: linux-arm-kernel@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-kernel@vger.kernel.org; maz@kernel.org; will@kernel.org;
> james.morse@arm.com; julien.thierry.kdev@gmail.com;
> suzuki.poulose@arm.com; jean-philippe@linaro.org;
> Alexandru.Elisei@arm.com; qperret@google.com; Jonathan Cameron
> <jonathan.cameron@huawei.com>; Linuxarm <linuxarm@huawei.com>
> Subject: Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
> 
> On Wed, Jan 19, 2022 at 09:23:31AM +0000, Shameerali Kolothum Thodi
> wrote:
> > > On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote:
> > > >  -TLA+ model. Modified the asidalloc model to incorporate the new
> > > >   VMID algo. The main differences are,
> > > >   -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
> > > >   -Introduced INVALID_VMID and vCPU Sched Out logic.
> > > >   -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask
> invariants).
> > > >   -Removed  UniqueVMIDPerCPU invariant for now as it looks like
> > > >    because of the speculative fetching with flush_tlb_all() there
> > > >    is a small window where this gets triggered. If I change the
> > > >    logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to
> > > >    be fine. With my limited knowledge on TLA+ model, it is not
> > > >    clear to me whether this is a problem with the above logic
> > > >    or the VMID model implementation. Really appreciate any help
> > > >    with the model.
> > > >    The initial VMID TLA+ model is here,
> > > >
> https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1
> > >
> > > I only had a brief look at the TLA+ model and I don't understand why you
> > > have a separate 'shed_out' process. It would run in parallel with the
> > > 'sched' but AFAICT you can't really schedule a guest out while you are
> > > in the middle of scheduling it in. I'd rather use the same 'sched'
> > > process and either schedule in an inactive task or schedule out an
> > > active one for a given CPU.
> > >
> > > Also active_vmids[] for example is defined on the CPUS domain but you
> > > call vcpu_sched_out() from a process that's not in the CPUS domain but
> > > the SCHED_OUT one.
> >
> > Many thanks for taking a look. My bad!. The 'sched_out' would indeed run in
> parallel
> > and defeat the purpose. I must say I was really confused by the TLA+ syntax
> and
> > is still not very confident about it.
> 
> Yeah, it can be confusing. If you have time, you could give CBMC a try
> and the 'spec' would be pretty close to your C version. Each CPU would
> be modelled as a thread with an extra thread that simulates the
> speculative TLB look-ups for all CPUS together with the asserts for the
> invariants. The spinlocks would be pthread_mutexes.

Ok, will take a look.

> > Based on the above suggestion, I have modified it as below,
> >
> > \* vCPU is scheduled out by KVM
> > macro vcpu_sched_out() {
> >         active_kvm[self].task := 0;
> >         active_vmids[self] := INVALID_VMID;
> > }
> 
> Could you call cpu_switch_kvm(0, INVALID_VMID) instead? You could do
> this directly below and avoid another macro. Well, whatever you find
> clearer.

Sure, will change that.

> What confuses me is that your INVALID_VMID looks like a valid one: vmid
> 0, generation 1. Do you ensure that you never allocate VMID 0?

This is the same as in asidalloc, where cur_idx starts at 1 for any new
allocation. I think that guarantees VMID 0 is never allocated.
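
For reference, the relevant lines from new_vmid() in patch #1, trimmed and
annotated:

	static u32 cur_idx = 1;
	...
	vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, cur_idx);
	if (vmid != NUM_USER_VMIDS)
		goto set_vmid;
	...
	/* After a rollover, the search again starts from index 1 */
	vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, 1);

cur_idx is only ever updated to an index that was just allocated, so the
bitmap search never starts at index 0 and VMID 0 is never handed out.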

> 
> > \* About to run a Guest VM
> > process (sched \in CPUS)
> > {
> > sched_loop:
> >         while (TRUE) {
> >                 with (t \in TASKS) {
> >                         if (t # ActiveTask(self))
> >                                 call kvm_arm_vmid_update(t);
> >                         else
> >                                 vcpu_sched_out();
> >                 }
> >         }
> > }
> 
> Yes, that's what I meant.

Ok.

> 
> > > The corresponding
> > > UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if
> > > they correspond to different tasks (pgd), they should have different
> > > ASIDs. That's a strong requirement, otherwise we end up with the wrong
> > > translation.
> >
> > Yes, I understand that it is a strong requirement. Also, I thought this is
> something
> > that will trigger easily with the test setup I had with the real hardware. But
> testing
> > stayed on for days, so I was not sure it is a problem with the TLA+
> implementation
> > or not.
> 
> Well, you'd have to check the TLA+ state trace and see how it got
> there, whether the last state would be a valid one. It's either
> something missing in the spec that the hardware enforces or the
> algorithm is wrong and just hard to hit in practice. If this condition
> is violated in hardware for a very brief period, e.g. due to some TLBI,
> you'd not notice an issue under normal circumstances. But it's still
> incorrect.

Hmm.. I had a quick implementation with a separate thread for speculative
loads, as done in arm64kpti.tla, but I still get the UniqueVMIDPerCPU violation
error. I will go through the state trace and see whether I can interpret something :)
 
> > > Why did you remove the CnP? Do we have this disabled for KVM guests?
> >
> > I removed CnP related Invariants to simplify things for the first version. Also
> not sure
> > what specific changes we need to do for CnP here. Do we still need that
> switching to
> > global pg_dir to prevent any speculative reading? I didn't see that being
> done in KVM
> > anywhere at the moment. Maybe I am missing something.
> 
> It make sense to ignore CnP for now. Maybe KVM doesn't even bother and
> sets VTTBR_EL2.CnP to 0 (I haven't checked).

I think KVM sets CnP here,
https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/kvm_mmu.h#L264

But I haven't figured out what else we need to do in the new VMID allocator for
the CnP case.

Marc, Will?

> 
> > On a side note, In my setup, the CnP=TRUE case for asidalloc.tla now fails
> with,
> > "Error: Invariant TLBEmptyInvalPTE is violated.". Please check.
> 
> This was added later as part of try-to-unmap and I only checked this
> with CnP = FALSE. I'll need to look into, it's possible that
> flush_tlb_all() doesn't take into account that the pte is unmapped (as
> cpu_switch_mm() does). I'll add a separate thread for speculative TLB
> loads, it's easier to have them in one place. Thanks.

Ok.

Thanks,
Shameer


* Re: [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM
  2021-11-22 12:18 ` [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM Shameer Kolothum
@ 2022-01-21  7:35   ` Reiji Watanabe
  2022-01-21  8:44     ` Shameerali Kolothum Thodi
  0 siblings, 1 reply; 13+ messages in thread
From: Reiji Watanabe @ 2022-01-21  7:35 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: Linux ARM, kvmarm, linux-kernel, jean-philippe, Marc Zyngier,
	linuxarm, jonathan.cameron, Catalin Marinas, Will Deacon

On Mon, Nov 22, 2021 at 4:19 AM Shameer Kolothum
<shameerali.kolothum.thodi@huawei.com> wrote:
>
> A new VMID allocator for arm64 KVM use. This is based on
> arm64 ASID allocator algorithm.
>
> One major deviation from the ASID allocator is the way we
> flush the context. Unlike ASID allocator, we expect less
> frequent rollover in the case of VMIDs. Hence, instead of
> marking the CPU as flush_pending and issuing a local context
> invalidation on the next context switch, we  broadcast TLB
> flush + I-cache invalidation over the inner shareable domain
> on rollover.
>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |   4 +
>  arch/arm64/kvm/vmid.c             | 177 ++++++++++++++++++++++++++++++
>  2 files changed, 181 insertions(+)
>  create mode 100644 arch/arm64/kvm/vmid.c
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 2a5f7f38006f..f4a86a79ea4a 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -690,6 +690,10 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
>  int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
>                             struct kvm_device_attr *attr);
>
> +int kvm_arm_vmid_alloc_init(void);
> +void kvm_arm_vmid_alloc_free(void);
> +void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
> +
>  static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
>  {
>         vcpu_arch->steal.base = GPA_INVALID;
> diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
> new file mode 100644
> index 000000000000..aa01c97f7df0
> --- /dev/null
> +++ b/arch/arm64/kvm/vmid.c
> @@ -0,0 +1,177 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * VMID allocator.
> + *
> + * Based on Arm64 ASID allocator algorithm.
> + * Please refer arch/arm64/mm/context.c for detailed
> + * comments on algorithm.
> + *
> + * Copyright (C) 2002-2003 Deep Blue Solutions Ltd, all rights reserved.
> + * Copyright (C) 2012 ARM Ltd.
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/bitops.h>
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_mmu.h>
> +
> +static unsigned int kvm_arm_vmid_bits;
> +static DEFINE_RAW_SPINLOCK(cpu_vmid_lock);
> +
> +static atomic64_t vmid_generation;
> +static unsigned long *vmid_map;
> +
> +static DEFINE_PER_CPU(atomic64_t, active_vmids);
> +static DEFINE_PER_CPU(u64, reserved_vmids);
> +
> +#define VMID_MASK              (~GENMASK(kvm_arm_vmid_bits - 1, 0))
> +#define VMID_FIRST_VERSION     (1UL << kvm_arm_vmid_bits)
> +
> +#define NUM_USER_VMIDS         VMID_FIRST_VERSION
> +#define vmid2idx(vmid)         ((vmid) & ~VMID_MASK)
> +#define idx2vmid(idx)          vmid2idx(idx)
> +
> +#define vmid_gen_match(vmid) \
> +       (!(((vmid) ^ atomic64_read(&vmid_generation)) >> kvm_arm_vmid_bits))
> +
> +static void flush_context(void)
> +{
> +       int cpu;
> +       u64 vmid;
> +
> +       bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
> +
> +       for_each_possible_cpu(cpu) {
> +               vmid = atomic64_xchg_relaxed(&per_cpu(active_vmids, cpu), 0);
> +
> +               /* Preserve reserved VMID */
> +               if (vmid == 0)
> +                       vmid = per_cpu(reserved_vmids, cpu);
> +               __set_bit(vmid2idx(vmid), vmid_map);
> +               per_cpu(reserved_vmids, cpu) = vmid;
> +       }
> +
> +       /*
> +        * Unlike ASID allocator, we expect less frequent rollover in
> +        * case of VMIDs. Hence, instead of marking the CPU as
> +        * flush_pending and issuing a local context invalidation on
> +        * the next context-switch, we broadcast TLB flush + I-cache
> +        * invalidation over the inner shareable domain on rollover.
> +        */
> +        kvm_call_hyp(__kvm_flush_vm_context);
> +}
> +
> +static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)
> +{
> +       int cpu;
> +       bool hit = false;
> +
> +       /*
> +        * Iterate over the set of reserved VMIDs looking for a match
> +        * and update to use newvmid (i.e. the same VMID in the current
> +        * generation).
> +        */
> +       for_each_possible_cpu(cpu) {
> +               if (per_cpu(reserved_vmids, cpu) == vmid) {
> +                       hit = true;
> +                       per_cpu(reserved_vmids, cpu) = newvmid;
> +               }
> +       }

Once updating reserved_vmids gets done for all the CPUs, it appears
that the function doesn't need to iterate over the set of reserved
VMIDs (correct?). So, I'm wondering if KVM can manage the number of
CPUs for which reserved_vmids still need to get updated so that the
function can skip the loop when the number is zero. I'm not sure how
likely it is that this would help though.
(Since every VMID allocation for a non-new guest needs to iterate over
 reserved_vmids while holding cpu_vmid_lock, I'm a bit concerned about
 the performance impact on systems with a large number of CPUs.)
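
Something like the below is what I had in mind, just a rough, untested
sketch; nr_stale_reserved_vmids is a made-up name for a counter that
flush_context() would have to maintain under cpu_vmid_lock:

static unsigned int nr_stale_reserved_vmids;

static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)
{
	int cpu;
	bool hit = false;

	/* All reserved VMIDs already belong to the current generation. */
	if (!nr_stale_reserved_vmids)
		return false;

	for_each_possible_cpu(cpu) {
		if (per_cpu(reserved_vmids, cpu) == vmid) {
			hit = true;
			per_cpu(reserved_vmids, cpu) = newvmid;
			nr_stale_reserved_vmids--;
		}
	}

	return hit;
}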

Thanks,
Reiji

> +
> +       return hit;
> +}
> +
> +static u64 new_vmid(struct kvm_vmid *kvm_vmid)
> +{
> +       static u32 cur_idx = 1;
> +       u64 vmid = atomic64_read(&kvm_vmid->id);
> +       u64 generation = atomic64_read(&vmid_generation);
> +
> +       if (vmid != 0) {
> +               u64 newvmid = generation | (vmid & ~VMID_MASK);
> +
> +               if (check_update_reserved_vmid(vmid, newvmid)) {
> +                       atomic64_set(&kvm_vmid->id, newvmid);
> +                       return newvmid;
> +               }
> +
> +               if (!__test_and_set_bit(vmid2idx(vmid), vmid_map)) {
> +                       atomic64_set(&kvm_vmid->id, newvmid);
> +                       return newvmid;
> +               }
> +       }
> +
> +       vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, cur_idx);
> +       if (vmid != NUM_USER_VMIDS)
> +               goto set_vmid;
> +
> +       /* We're out of VMIDs, so increment the global generation count */
> +       generation = atomic64_add_return_relaxed(VMID_FIRST_VERSION,
> +                                                &vmid_generation);
> +       flush_context();
> +
> +       /* We have more VMIDs than CPUs, so this will always succeed */
> +       vmid = find_next_zero_bit(vmid_map, NUM_USER_VMIDS, 1);
> +
> +set_vmid:
> +       __set_bit(vmid, vmid_map);
> +       cur_idx = vmid;
> +       vmid = idx2vmid(vmid) | generation;
> +       atomic64_set(&kvm_vmid->id, vmid);
> +       return vmid;
> +}
> +
> +void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
> +{
> +       unsigned long flags;
> +       u64 vmid, old_active_vmid;
> +
> +       vmid = atomic64_read(&kvm_vmid->id);
> +
> +       /*
> +        * Please refer comments in check_and_switch_context() in
> +        * arch/arm64/mm/context.c.
> +        */
> +       old_active_vmid = atomic64_read(this_cpu_ptr(&active_vmids));
> +       if (old_active_vmid && vmid_gen_match(vmid) &&
> +           atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
> +                                    old_active_vmid, vmid))
> +               return;
> +
> +       raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
> +
> +       /* Check that our VMID belongs to the current generation. */
> +       vmid = atomic64_read(&kvm_vmid->id);
> +       if (!vmid_gen_match(vmid))
> +               vmid = new_vmid(kvm_vmid);
> +
> +       atomic64_set(this_cpu_ptr(&active_vmids), vmid);
> +       raw_spin_unlock_irqrestore(&cpu_vmid_lock, flags);
> +}
> +
> +/*
> + * Initialize the VMID allocator
> + */
> +int kvm_arm_vmid_alloc_init(void)
> +{
> +       kvm_arm_vmid_bits = kvm_get_vmid_bits();
> +
> +       /*
> +        * Expect allocation after rollover to fail if we don't have
> +        * at least one more VMID than CPUs. VMID #0 is always reserved.
> +        */
> +       WARN_ON(NUM_USER_VMIDS - 1 <= num_possible_cpus());
> +       atomic64_set(&vmid_generation, VMID_FIRST_VERSION);
> +       vmid_map = kcalloc(BITS_TO_LONGS(NUM_USER_VMIDS),
> +                          sizeof(*vmid_map), GFP_KERNEL);
> +       if (!vmid_map)
> +               return -ENOMEM;
> +
> +       return 0;
> +}
> +
> +void kvm_arm_vmid_alloc_free(void)
> +{
> +       kfree(vmid_map);
> +}
> --
> 2.17.1
>
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM
  2022-01-21  7:35   ` Reiji Watanabe
@ 2022-01-21  8:44     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 13+ messages in thread
From: Shameerali Kolothum Thodi @ 2022-01-21  8:44 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Linux ARM, kvmarm, linux-kernel, jean-philippe, Marc Zyngier,
	Linuxarm, Jonathan Cameron, Catalin Marinas, Will Deacon



> -----Original Message-----
> From: Reiji Watanabe [mailto:reijiw@google.com]
> Sent: 21 January 2022 07:36
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Linux ARM <linux-arm-kernel@lists.infradead.org>;
> kvmarm@lists.cs.columbia.edu; linux-kernel@vger.kernel.org;
> jean-philippe@linaro.org; Marc Zyngier <maz@kernel.org>; Linuxarm
> <linuxarm@huawei.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; Catalin Marinas
> <catalin.marinas@arm.com>; Will Deacon <will@kernel.org>
> Subject: Re: [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for
> KVM
> 
> On Mon, Nov 22, 2021 at 4:19 AM Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com> wrote:
> >
> > [...]
> >
> > +static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)
> > +{
> > +       int cpu;
> > +       bool hit = false;
> > +
> > +       /*
> > +        * Iterate over the set of reserved VMIDs looking for a match
> > +        * and update to use newvmid (i.e. the same VMID in the current
> > +        * generation).
> > +        */
> > +       for_each_possible_cpu(cpu) {
> > +               if (per_cpu(reserved_vmids, cpu) == vmid) {
> > +                       hit = true;
> > +                       per_cpu(reserved_vmids, cpu) = newvmid;
> > +               }
> > +       }
> 
> Once updating reserved_vmids gets done for all the CPUs, it appears
> that the function doesn't need to iterate over the set of reserved
> VMIDs (correct?). So, I'm wondering if KVM can manage the number of
> CPUs for which reserved_vmids still need to get updated so that the
> function can skip the loop when the number is zero. I'm not sure how
> likely it is that this would help though.

Ok. I think that is possible to do. In flush_context() we can update the
number of CPUs with a valid reserved_vmid and add a check here, along the
lines of the sketch below. I'm not sure how often that count would actually
be zero, though.
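
Something along these lines, perhaps (untested sketch, reusing the
hypothetical counter from your mail):

static void flush_context(void)
{
	int cpu;
	u64 vmid;

	bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
	nr_stale_reserved_vmids = 0;

	for_each_possible_cpu(cpu) {
		vmid = atomic64_xchg_relaxed(&per_cpu(active_vmids, cpu), 0);

		/* Preserve reserved VMID */
		if (vmid == 0)
			vmid = per_cpu(reserved_vmids, cpu);
		__set_bit(vmid2idx(vmid), vmid_map);
		per_cpu(reserved_vmids, cpu) = vmid;

		/* Anything reserved here is now from an old generation. */
		if (vmid)
			nr_stale_reserved_vmids++;
	}

	kvm_call_hyp(__kvm_flush_vm_context);
}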

> (Since every VMID allocation for a non-new guest needs to iterate over
>  reserved_vmids while holding cpu_vmid_lock, I'm a bit concerned about
>  the performance impact on systems with a large number of CPUs.)

But VMID allocation for a non-new guest will normally go through the fast
path unless there is a rollover. And with a 16-bit VMID space there are
65535 usable VMIDs per generation (VMID #0 is reserved), so rollovers should
be very rare, I guess.

Thanks,
Shameer


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
  2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
                   ` (5 preceding siblings ...)
  2022-01-18 12:32 ` Catalin Marinas
@ 2022-02-08 17:37 ` Marc Zyngier
  6 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2022-02-08 17:37 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, linux-kernel, Shameer Kolothum
  Cc: will, jonathan.cameron, Alexandru.Elisei, catalin.marinas,
	linuxarm, jean-philippe, qperret, julien.thierry.kdev,
	suzuki.poulose, james.morse

On Mon, 22 Nov 2021 12:18:40 +0000, Shameer Kolothum wrote:
> Changes from v3:
> - Main change is in patch #4, where the VMID is now set to an
>   invalid one on vCPU schedule out. Introduced an
>   INVALID_ACTIVE_VMID which is basically a VMID 0 with generation 1.
>   Since the basic allocator algorithm reserves vmid #0, it is never
>   used as an active VMID. This (hopefully) will fix the issue of
>   unnecessarily reserving VMID space with active_vmids when those
>   VMs are no longer active[0] and at the same time address the
>   problem noted in v3 wherein everything ends up in slow-path[1].
> 
> [...]

Applied to next, thanks!

[1/4] KVM: arm64: Introduce a new VMID allocator for KVM
      commit: 417838392f2e657ee25cc30e373ff4c35a0faa90
[2/4] KVM: arm64: Make VMID bits accessible outside of allocator
      commit: f8051e960922a9de8e42159103d5d9c697ef17ec
[3/4] KVM: arm64: Align the VMID allocation with the arm64 ASID
      commit: 3248136b3637e1671e4fa46e32e2122f9ec4bc3d
[4/4] KVM: arm64: Make active_vmids invalid on vCPU schedule out
      commit: 100b4f092f878dc379f1fcef9ce567c25dee3473

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-02-08 17:37 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-22 12:18 [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameer Kolothum
2021-11-22 12:18 ` [PATCH v4 1/4] KVM: arm64: Introduce a new VMID allocator for KVM Shameer Kolothum
2022-01-21  7:35   ` Reiji Watanabe
2022-01-21  8:44     ` Shameerali Kolothum Thodi
2021-11-22 12:18 ` [PATCH v4 2/4] KVM: arm64: Make VMID bits accessible outside of allocator Shameer Kolothum
2021-11-22 12:18 ` [PATCH v4 3/4] KVM: arm64: Align the VMID allocation with the arm64 ASID Shameer Kolothum
2021-11-22 12:18 ` [PATCH v4 4/4] KVM: arm64: Make active_vmids invalid on vCPU schedule out Shameer Kolothum
2022-01-05 13:25 ` [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Shameerali Kolothum Thodi
2022-01-18 12:32 ` Catalin Marinas
2022-01-19  9:23   ` Shameerali Kolothum Thodi
2022-01-19 11:49     ` Catalin Marinas
2022-01-19 12:30       ` Shameerali Kolothum Thodi
2022-02-08 17:37 ` Marc Zyngier
