* [PATCH v2 0/6] KVM/arm64: Cache maintenance relaxations
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Catalin Marinas

This small series makes use of features recently introduced in the
ARMv8 architecture to relax the cache maintenance operations on CPUs
that implement these features.

FWB is the most important one. It allows stage-2 to enforce the
cacheability of memory, no matter what the guest says. It also
mandates that the whole machine is cache coherent (no non-coherent
I/O), meaning we can drop a whole class of cache maintenance
operations.

FWB, when combined with CTR_EL0.{IDC,DIC}, allows the removal of all
cache maintenance, as well as of the tracking of pages being executed.

We also take the opportunity to drop a few CMOs that were applied to
the HYP page tables but were never necessary. This ended up requiring
quite a few changes in our page-table accessors so that they get
consistent barriers. These barriers are a bit on the heavy side and
could be further optimized, although that's probably material for a
different patch series.
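The capability-gated pattern the series relies on can be sketched in plain C; `has_stage2_fwb` is a hypothetical stand-in for the kernel's `cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)` check, and the enum merely names the two maintenance targets:

```c
#include <stdbool.h>

/* Hypothetical stand-in for cpus_have_const_cap(ARM64_HAS_STAGE2_FWB) */
static bool has_stage2_fwb;

enum cmo { CMO_TO_POC, CMO_TO_POU };

/*
 * Without FWB, guest pages must be cleaned to the Point of Coherency
 * in case the guest maps them non-cacheable. With FWB, stage-2
 * enforces cacheability, so cleaning to the Point of Unification
 * (for I-side coherency only) is enough.
 */
static enum cmo clean_dcache_guest_page(void)
{
	return has_stage2_fwb ? CMO_TO_POU : CMO_TO_POC;
}
```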

* From v1:
  - Reordered the series in a slightly more consistent order
  - Added patches refactoring the PT accessors before dropping
    the CMOs
  - Removed final dsb(ishst) from the last patch, as all PT accessors
    now have their own barriers
  - Collected Acks from Catalin

Marc Zyngier (6):
  arm64: KVM: Add support for Stage-2 control of memory types and
    cacheability
  arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
  arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set
  KVM: arm/arm64: Consolidate page-table accessors
  KVM: arm/arm64: Stop using {pmd,pud,pgd}_populate
  KVM: arm/arm64: Remove unnecessary CMOs when creating HYP page tables

 arch/arm/include/asm/kvm_mmu.h        | 14 ++-------
 arch/arm64/include/asm/cpucaps.h      |  3 +-
 arch/arm64/include/asm/kvm_arm.h      |  1 +
 arch/arm64/include/asm/kvm_emulate.h  |  2 ++
 arch/arm64/include/asm/kvm_mmu.h      | 32 +++++++++++++------
 arch/arm64/include/asm/memory.h       |  7 +++++
 arch/arm64/include/asm/pgtable-prot.h | 24 ++++++++++++--
 arch/arm64/include/asm/sysreg.h       |  1 +
 arch/arm64/kernel/cpufeature.c        | 20 ++++++++++++
 arch/arm64/kvm/sys_regs.c             |  8 ++++-
 virt/kvm/arm/mmu.c                    | 45 ++++++++++++++++++++++-----
 11 files changed, 125 insertions(+), 32 deletions(-)

-- 
2.17.1

* [PATCH v2 1/6] arm64: KVM: Add support for Stage-2 control of memory types and cacheability
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Catalin Marinas

Up to ARMv8.3, the combination of Stage-1 and Stage-2 attributes
results in the strongest attribute of the two stages. This means
that the hypervisor has to perform quite a lot of cache maintenance
just in case the guest has some non-cacheable mappings around.

ARMv8.4 solves this problem by offering a different mode (FWB) where
Stage-2 has total control over the memory attribute (this is limited
to systems where both I/O and instruction caches are coherent with
the dcache). This is achieved by having a different set of memory
attributes in the page tables, and a new bit set in HCR_EL2.

On such a system, we can then safely sidestep any form of dcache
management.
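The attribute selection can be sketched as self-contained C, using the encodings from this patch (MT_S2_NORMAL is 0xf, MT_S2_FWB_NORMAL is 6); the MemAttr bit position and the capability flag are simplified stand-ins:

```c
#include <stdbool.h>
#include <stdint.h>

#define MT_S2_NORMAL		0xfULL	/* encoding without FWB */
#define MT_S2_FWB_NORMAL	6ULL	/* encoding when FWB is enabled */
#define PTE_S2_MEMATTR(t)	((t) << 2)	/* MemAttr field, bits [5:2] */

static bool has_stage2_fwb;	/* stand-in for the FWB capability check */

/* Mirrors the PAGE_S2_MEMATTR() selection for Normal memory */
static uint64_t s2_memattr_normal(void)
{
	return has_stage2_fwb ? PTE_S2_MEMATTR(MT_S2_FWB_NORMAL)
			      : PTE_S2_MEMATTR(MT_S2_NORMAL);
}
```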

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/cpucaps.h      |  3 ++-
 arch/arm64/include/asm/kvm_arm.h      |  1 +
 arch/arm64/include/asm/kvm_emulate.h  |  2 ++
 arch/arm64/include/asm/kvm_mmu.h      | 24 +++++++++++++++++-------
 arch/arm64/include/asm/memory.h       |  7 +++++++
 arch/arm64/include/asm/pgtable-prot.h | 14 ++++++++++++--
 arch/arm64/include/asm/sysreg.h       |  1 +
 arch/arm64/kernel/cpufeature.c        | 20 ++++++++++++++++++++
 virt/kvm/arm/mmu.c                    |  4 ++++
 9 files changed, 66 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index bc51b72fafd4..90e06a49f3e1 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -48,7 +48,8 @@
 #define ARM64_HAS_CACHE_IDC			27
 #define ARM64_HAS_CACHE_DIC			28
 #define ARM64_HW_DBM				29
+#define ARM64_HAS_STAGE2_FWB			30
 
-#define ARM64_NCAPS				30
+#define ARM64_NCAPS				31
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 6dd285e979c9..aa45df752a16 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -23,6 +23,7 @@
 #include <asm/types.h>
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_FWB		(UL(1) << 46)
 #define HCR_TEA		(UL(1) << 37)
 #define HCR_TERR	(UL(1) << 36)
 #define HCR_TLOR	(UL(1) << 35)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 1dab3a984608..dd98fdf33d99 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -63,6 +63,8 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
 		/* trap error record accesses */
 		vcpu->arch.hcr_el2 |= HCR_TERR;
 	}
+	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
+		vcpu->arch.hcr_el2 |= HCR_FWB;
 
 	if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
 		vcpu->arch.hcr_el2 &= ~HCR_RW;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 082110993647..9dbca5355029 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -258,6 +258,7 @@ static inline bool kvm_page_empty(void *ptr)
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)	__flush_dcache_area((a), (l))
+#define kvm_flush_dcache_to_pou(a,l)	__clean_dcache_area_pou((a), (l))
 
 static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 {
@@ -268,7 +269,10 @@ static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
 {
 	void *va = page_address(pfn_to_page(pfn));
 
-	kvm_flush_dcache_to_poc(va, size);
+	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
+		kvm_flush_dcache_to_poc(va, size);
+	else
+		kvm_flush_dcache_to_pou(va, size);
 }
 
 static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
@@ -288,20 +292,26 @@ static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
 
 static inline void __kvm_flush_dcache_pte(pte_t pte)
 {
-	struct page *page = pte_page(pte);
-	kvm_flush_dcache_to_poc(page_address(page), PAGE_SIZE);
+	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
+		struct page *page = pte_page(pte);
+		kvm_flush_dcache_to_poc(page_address(page), PAGE_SIZE);
+	}
 }
 
 static inline void __kvm_flush_dcache_pmd(pmd_t pmd)
 {
-	struct page *page = pmd_page(pmd);
-	kvm_flush_dcache_to_poc(page_address(page), PMD_SIZE);
+	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
+		struct page *page = pmd_page(pmd);
+		kvm_flush_dcache_to_poc(page_address(page), PMD_SIZE);
+	}
 }
 
 static inline void __kvm_flush_dcache_pud(pud_t pud)
 {
-	struct page *page = pud_page(pud);
-	kvm_flush_dcache_to_poc(page_address(page), PUD_SIZE);
+	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
+		struct page *page = pud_page(pud);
+		kvm_flush_dcache_to_poc(page_address(page), PUD_SIZE);
+	}
 }
 
 #define kvm_virt_to_phys(x)		__pa_symbol(x)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 49d99214f43c..b96442960aea 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -155,6 +155,13 @@
 #define MT_S2_NORMAL		0xf
 #define MT_S2_DEVICE_nGnRE	0x1
 
+/*
+ * Memory types for Stage-2 translation when ID_AA64MMFR2_EL1.FWB is 0001
+ * Stage-2 enforces Normal-WB and Device-nGnRE
+ */
+#define MT_S2_FWB_NORMAL	6
+#define MT_S2_FWB_DEVICE_nGnRE	1
+
 #ifdef CONFIG_ARM64_4K_PAGES
 #define IOREMAP_MAX_ORDER	(PUD_SHIFT)
 #else
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 108ecad7acc5..c66c3047400e 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -67,8 +67,18 @@
 #define PAGE_HYP_RO		__pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
 #define PAGE_HYP_DEVICE		__pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
 
-#define PAGE_S2			__pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
-#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
+#define PAGE_S2_MEMATTR(attr)						\
+	({								\
+		u64 __val;						\
+		if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))		\
+			__val = PTE_S2_MEMATTR(MT_S2_FWB_ ## attr);	\
+		else							\
+			__val = PTE_S2_MEMATTR(MT_S2_ ## attr);		\
+		__val;							\
+	 })
+
+#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
+#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
 
 #define PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
 #define PAGE_SHARED		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 6171178075dc..385fb8c28981 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -576,6 +576,7 @@
 #define ID_AA64MMFR1_VMIDBITS_16	2
 
 /* id_aa64mmfr2 */
+#define ID_AA64MMFR2_FWB_SHIFT		40
 #define ID_AA64MMFR2_AT_SHIFT		32
 #define ID_AA64MMFR2_LVA_SHIFT		16
 #define ID_AA64MMFR2_IESB_SHIFT		12
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9d1b06d67c53..50185607026b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -192,6 +192,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_FWB_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_AT_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_LVA_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR2_IESB_SHIFT, 4, 0),
@@ -1026,6 +1027,14 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
 }
 #endif
 
+static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
+{
+	u64 val = read_sysreg_s(SYS_CLIDR_EL1);
+
+	/* Check that CLIDR_EL1.LOU{U,IS} are both 0 */
+	WARN_ON(val & (7 << 27 | 7 << 21));
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -1182,6 +1191,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
 		.matches = has_cache_dic,
 	},
+	{
+		.desc = "Stage-2 Force Write-Back",
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.capability = ARM64_HAS_STAGE2_FWB,
+		.sys_reg = SYS_ID_AA64MMFR2_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR2_FWB_SHIFT,
+		.min_field_value = 1,
+		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_has_fwb,
+	},
 #ifdef CONFIG_ARM64_HW_AFDBM
 	{
 		/*
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 7f6a944db23d..ba66bf7ae299 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -196,6 +196,10 @@ static void clear_stage2_pmd_entry(struct kvm *kvm, pmd_t *pmd, phys_addr_t addr
  * This is why right after unmapping a page/section and invalidating
  * the corresponding TLBs, we call kvm_flush_dcache_p*() to make sure
  * the IO subsystem will never hit in the cache.
+ *
+ * This is all avoided on systems that have ARM64_HAS_STAGE2_FWB, as
+ * we then fully enforce cacheability of RAM, no matter what the guest
+ * does.
  */
 static void unmap_stage2_ptes(struct kvm *kvm, pmd_t *pmd,
 		       phys_addr_t addr, phys_addr_t end)
-- 
2.17.1

* [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Mark Rutland, Catalin Marinas, Christoffer Dall

Set/Way handling is one of the ugliest corners of KVM. We shouldn't
have to handle that, but better safe than sorry.

Thankfully, FWB fixes this for us by not requiring any maintenance
whatsoever, which means we don't have to emulate S/W CMOs, and don't
have to track VM ops either.

We still have to trap S/W though, if only to prevent the guest from
doing something bad.
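The resulting handler logic amounts to the following sketch, with a plain flag standing in for the capability check and a counter standing in for kvm_set_way_flush():

```c
#include <stdbool.h>

static bool has_stage2_fwb;	/* stand-in for the FWB capability check */
static int  set_way_flushes;	/* counts stand-in kvm_set_way_flush() calls */

/* S/W ops still trap, but with FWB they degenerate into NOPs. */
static bool handle_dcsw_write(void)
{
	if (!has_stage2_fwb)
		set_way_flushes++;
	return true;	/* the access is always treated as handled */
}
```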

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6e3b969391fd..9a740f159245 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -195,7 +195,13 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
 	if (!p->is_write)
 		return read_from_write_only(vcpu, p, r);
 
-	kvm_set_way_flush(vcpu);
+	/*
+	 * Only track S/W ops if we don't have FWB. It still indicates
+	 * that the guest is a bit broken...
+	 */
+	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
+		kvm_set_way_flush(vcpu);
+
 	return true;
 }
 
-- 
2.17.1

* [PATCH v2 3/6] arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Catalin Marinas

On systems where CTR_EL0.DIC is set, we don't need to perform
icache invalidation to guarantee that we'll fetch the right
instruction stream.

This also means that taking a permission fault to invalidate the
icache is an unnecessary overhead.

On such systems, we can safely leave the page as being executable.
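The selection of the XN attribute reduces to this sketch; the bit position and the capability flag are simplified stand-ins for the real PTE_S2_XN definition and the CTR_EL0.DIC capability check:

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_S2_XN	(1ULL << 54)	/* stage-2 execute-never, sketch value */

static bool has_cache_dic;	/* stand-in for the CTR_EL0.DIC capability */

/* With DIC, pages can be mapped executable up front: no icache
 * invalidation (and hence no XN permission-fault dance) is needed. */
static uint64_t page_s2_xn(void)
{
	return has_cache_dic ? 0 : PTE_S2_XN;
}
```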

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/pgtable-prot.h | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index c66c3047400e..78b942c1bea4 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -77,8 +77,18 @@
 		__val;							\
 	 })
 
-#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
-#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
+#define PAGE_S2_XN							\
+	({								\
+		u64 __val;						\
+		if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))		\
+			__val = 0;					\
+		else							\
+			__val = PTE_S2_XN;				\
+		__val;							\
+	})
+
+#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PAGE_S2_XN)
+#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PAGE_S2_XN)
 
 #define PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
 #define PAGE_SHARED		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
-- 
2.17.1

* [PATCH v2 4/6] KVM: arm/arm64: Consolidate page-table accessors
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: Mark Rutland, Catalin Marinas, Christoffer Dall

The arm and arm64 KVM page-table accessors are pointlessly different
between the two architectures, and likely both wrong one way or another:
arm64 lacks a dsb(), and arm doesn't use WRITE_ONCE.

Let's unify them.
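The unified accessor boils down to a single-copy store followed by a barrier. Sketched in user-space C, with a volatile store standing in for WRITE_ONCE() and the dsb(ishst) elided (it has no user-space equivalent):

```c
#include <stdint.h>

typedef uint64_t pte_t;

/* Volatile store: forces one non-torn write, like WRITE_ONCE() */
#define write_once(ptr, val)	(*(volatile pte_t *)(ptr) = (val))
#define dsb_ishst()		do { } while (0)	/* barrier elided here */

static void kvm_set_pte_sketch(pte_t *ptep, pte_t new_pte)
{
	write_once(ptep, new_pte);
	dsb_ishst();	/* on arm64, orders the update before later walks */
}
```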

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   | 12 -----------
 arch/arm64/include/asm/kvm_mmu.h |  3 ---
 virt/kvm/arm/mmu.c               | 35 ++++++++++++++++++++++++++++----
 3 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 707a1f06dc5d..468ff945efa0 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,18 +75,6 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
-static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
-{
-	*pmd = new_pmd;
-	dsb(ishst);
-}
-
-static inline void kvm_set_pte(pte_t *pte, pte_t new_pte)
-{
-	*pte = new_pte;
-	dsb(ishst);
-}
-
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 9dbca5355029..26c89b63f604 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -170,9 +170,6 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
-#define	kvm_set_pte(ptep, pte)		set_pte(ptep, pte)
-#define	kvm_set_pmd(pmdp, pmd)		set_pmd(pmdp, pmd)
-
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ba66bf7ae299..c9ed239c0840 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -177,6 +177,33 @@ static void clear_stage2_pmd_entry(struct kvm *kvm, pmd_t *pmd, phys_addr_t addr
 	put_page(virt_to_page(pmd));
 }
 
+static inline void kvm_set_pte(pte_t *ptep, pte_t new_pte)
+{
+	WRITE_ONCE(*ptep, new_pte);
+	dsb(ishst);
+}
+
+static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
+{
+	WRITE_ONCE(*pmdp, new_pmd);
+	dsb(ishst);
+}
+
+static inline void kvm_pmd_populate(pmd_t *pmdp, pte_t *ptep)
+{
+	pmd_populate_kernel(NULL, pmdp, ptep);
+}
+
+static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
+{
+	pud_populate(NULL, pudp, pmdp);
+}
+
+static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
+{
+	pgd_populate(NULL, pgdp, pudp);
+}
+
 /*
  * Unmapping vs dcache management:
  *
@@ -603,7 +630,7 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
 				kvm_err("Cannot allocate Hyp pte\n");
 				return -ENOMEM;
 			}
-			pmd_populate_kernel(NULL, pmd, pte);
+			kvm_pmd_populate(pmd, pte);
 			get_page(virt_to_page(pmd));
 			kvm_flush_dcache_to_poc(pmd, sizeof(*pmd));
 		}
@@ -636,7 +663,7 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
 				kvm_err("Cannot allocate Hyp pmd\n");
 				return -ENOMEM;
 			}
-			pud_populate(NULL, pud, pmd);
+			kvm_pud_populate(pud, pmd);
 			get_page(virt_to_page(pud));
 			kvm_flush_dcache_to_poc(pud, sizeof(*pud));
 		}
@@ -673,7 +700,7 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 				err = -ENOMEM;
 				goto out;
 			}
-			pgd_populate(NULL, pgd, pud);
+			kvm_pgd_populate(pgd, pud);
 			get_page(virt_to_page(pgd));
 			kvm_flush_dcache_to_poc(pgd, sizeof(*pgd));
 		}
@@ -1092,7 +1119,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		if (!cache)
 			return 0; /* ignore calls from kvm_set_spte_hva */
 		pte = mmu_memory_cache_alloc(cache);
-		pmd_populate_kernel(NULL, pmd, pte);
+		kvm_pmd_populate(pmd, pte);
 		get_page(virt_to_page(pmd));
 	}
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 5/6] KVM: arm/arm64: Stop using {pmd,pud,pgd}_populate
  2018-05-30 12:47 ` Marc Zyngier
@ 2018-05-30 12:47   ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Catalin Marinas

KVM's use of the {pmd,pud,pgd}_populate accessors has always been
a bit weird. We don't have a struct mm to pass (and neither does
the kernel most of the time, but still...), and the 32bit code
performs all kinds of cache maintenance that don't make sense on
ARMv7+ once the MP extensions are mandatory (which is the case
whenever the Virtualization Extensions are present).

Let's bite the bullet and provide our own implementations. The
only bit of architectural code left has to do with building the table
entry itself (arm64 having up to 52bit PA, arm lacking PUD level).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   | 4 ++++
 arch/arm64/include/asm/kvm_mmu.h | 7 +++++++
 virt/kvm/arm/mmu.c               | 8 +++++---
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 468ff945efa0..a94ef9833bd3 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,6 +75,10 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
+#define kvm_mk_pmd(ptep)	__pmd(__pa(ptep) | PMD_TYPE_TABLE)
+#define kvm_mk_pud(pmdp)	__pud(__pa(pmdp) | PMD_TYPE_TABLE)
+#define kvm_mk_pgd(pudp)	({ BUILD_BUG(); 0; })
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 26c89b63f604..22c9f7cfdf93 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -170,6 +170,13 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
+#define kvm_mk_pmd(ptep)					\
+	__pmd(__phys_to_pmd_val(__pa(ptep) | PMD_TYPE_TABLE))
+#define kvm_mk_pud(pmdp)					\
+	__pud(__phys_to_pud_val(__pa(pmdp) | PMD_TYPE_TABLE))
+#define kvm_mk_pgd(pudp)					\
+	__pgd(__phys_to_pgd_val(__pa(pudp) | PUD_TYPE_TABLE))
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
 	pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index c9ed239c0840..ad1980d2118a 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -191,17 +191,19 @@ static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
 
 static inline void kvm_pmd_populate(pmd_t *pmdp, pte_t *ptep)
 {
-	pmd_populate_kernel(NULL, pmdp, ptep);
+	kvm_set_pmd(pmdp, kvm_mk_pmd(ptep));
 }
 
 static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
 {
-	pud_populate(NULL, pudp, pmdp);
+	WRITE_ONCE(*pudp, kvm_mk_pud(pmdp));
+	dsb(ishst);
 }
 
 static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
 {
-	pgd_populate(NULL, pgdp, pudp);
+	WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
+	dsb(ishst);
 }
 
 /*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 6/6] KVM: arm/arm64: Remove unnecessary CMOs when creating HYP page tables
  2018-05-30 12:47 ` Marc Zyngier
@ 2018-05-30 12:47   ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-05-30 12:47 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel; +Cc: Catalin Marinas

There is no need to perform cache maintenance operations when
creating the HYP page tables if we have the multiprocessing
extensions. ARMv7 mandates them together with the Virtualization
Extensions, and ARMv8 mandates them unconditionally.

Let's remove these operations.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ad1980d2118a..ccdf544d44c0 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -607,7 +607,6 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned long start,
 		pte = pte_offset_kernel(pmd, addr);
 		kvm_set_pte(pte, pfn_pte(pfn, prot));
 		get_page(virt_to_page(pte));
-		kvm_flush_dcache_to_poc(pte, sizeof(*pte));
 		pfn++;
 	} while (addr += PAGE_SIZE, addr != end);
 }
@@ -634,7 +633,6 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
 			}
 			kvm_pmd_populate(pmd, pte);
 			get_page(virt_to_page(pmd));
-			kvm_flush_dcache_to_poc(pmd, sizeof(*pmd));
 		}
 
 		next = pmd_addr_end(addr, end);
@@ -667,7 +665,6 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
 			}
 			kvm_pud_populate(pud, pmd);
 			get_page(virt_to_page(pud));
-			kvm_flush_dcache_to_poc(pud, sizeof(*pud));
 		}
 
 		next = pud_addr_end(addr, end);
@@ -704,7 +701,6 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 			}
 			kvm_pgd_populate(pgd, pud);
 			get_page(virt_to_page(pgd));
-			kvm_flush_dcache_to_poc(pgd, sizeof(*pgd));
 		}
 
 		next = pgd_addr_end(addr, end);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 1/6] arm64: KVM: Add support for Stage-2 control of memory types and cacheability
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-05-31 11:49     ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 11:49 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On Wed, May 30, 2018 at 01:47:01PM +0100, Marc Zyngier wrote:
> Up to ARMv8.3, the combination of Stage-1 and Stage-2 attributes
> results in the strongest attribute of the two stages.  This means
> that the hypervisor has to perform quite a lot of cache maintenance
> just in case the guest has some non-cacheable mappings around.
> 
> ARMv8.4 solves this problem by offering a different mode (FWB) where
> Stage-2 has total control over the memory attribute (this is limited
> to systems where both I/O and instruction caches are coherent with

s/caches/fetches/ -- the I-caches themselves aren't coherent with the
D-caches (or we could omit I-cache maintenance).

i.e. this implies IDC, but not DIC.

> the dcache). This is achieved by having a different set of memory
> attributes in the page tables, and a new bit set in HCR_EL2.
> 
> On such a system, we can then safely sidestep any form of dcache
> management.
> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---

>  static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>  {
> @@ -268,7 +269,10 @@ static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
>  {
>  	void *va = page_address(pfn_to_page(pfn));
>  
> -	kvm_flush_dcache_to_poc(va, size);
> +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> +		kvm_flush_dcache_to_poc(va, size);
> +	else
> +		kvm_flush_dcache_to_pou(va, size);
>  }

The commit message said instruction fetches were coherent, and that no
D-cache maintenance was necessary, so why do we need maintenance to the
PoU?

> +static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
> +{
> +	u64 val = read_sysreg_s(SYS_CLIDR_EL1);
> +
> +	/* Check that CLIDR_EL1.LOU{U,IS} are both 0 */
> +	WARN_ON(val & (7 << 27 | 7 << 21));
> +}

What about CTR_EL0.IDC?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-05-31 11:51     ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 11:51 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:02PM +0100, Marc Zyngier wrote:
> Set/Way handling is one of the ugliest corners of KVM. We shouldn't
> have to handle that, but better safe than sorry.
> 
> Thankfully, FWB fixes this for us by not requiring any maintenance
> whatsoever, which means we don't have to emulate S/W CMOs, and don't
> have to track VM ops either.
> 
> We still have to trap S/W though, if only to prevent the guest from
> doing something bad.

S/W ops *also* do I-cache maintenance, so we'd still need to emulate
that. Though it looks like we're missing that today...

> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 6e3b969391fd..9a740f159245 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -195,7 +195,13 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
>  	if (!p->is_write)
>  		return read_from_write_only(vcpu, p, r);
>  
> -	kvm_set_way_flush(vcpu);
> +	/*
> +	 * Only track S/W ops if we don't have FWB. It still indicates
> +	 * that the guest is a bit broken...
> +	 */
> +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> +		kvm_set_way_flush(vcpu);
> +

Assuming we implement I-cache maintenance, we can have something like:

	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
		kvm_set_way_flush_dcache(vcpu);

	kvm_set_way_flush_icache(vcpu);

Thanks,
Mark.

>  	return true;
>  }
>  
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 3/6] arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-05-31 11:52     ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 11:52 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:03PM +0100, Marc Zyngier wrote:
> On systems where CTR_EL0.DIC is set, we don't need to perform
> icache invalidation to guarantee that we'll fetch the right
> instruction stream.
> 
> This also means that taking a permission fault to invalidate the
> icache is an unnecessary overhead.
> 
> On such systems, we can safely leave the page as being executable.
> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/include/asm/pgtable-prot.h | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index c66c3047400e..78b942c1bea4 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -77,8 +77,18 @@
>  		__val;							\
>  	 })
>  
> -#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
> -#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
> +#define PAGE_S2_XN							\
> +	({								\
> +		u64 __val;						\
> +		if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))		\
> +			__val = 0;					\
> +		else							\
> +			__val = PTE_S2_XN;				\
> +		__val;							\
> +	})
> +
> +#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PAGE_S2_XN)
> +#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PAGE_S2_XN)
>  
>  #define PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
>  #define PAGE_SHARED		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 4/6] KVM: arm/arm64: Consolidate page-table accessors
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-05-31 11:55     ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 11:55 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:04PM +0100, Marc Zyngier wrote:
> The arm and arm64 KVM page tables accessors are pointlessly different
> between the two architectures, and likely both wrong one way or another:
> arm64 lacks a dsb(), and arm doesn't use WRITE_ONCE.
> 
> Let's unify them.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm/include/asm/kvm_mmu.h   | 12 -----------
>  arch/arm64/include/asm/kvm_mmu.h |  3 ---
>  virt/kvm/arm/mmu.c               | 35 ++++++++++++++++++++++++++++----
>  3 files changed, 31 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 707a1f06dc5d..468ff945efa0 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -75,18 +75,6 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> -static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
> -{
> -	*pmd = new_pmd;
> -	dsb(ishst);
> -}
> -
> -static inline void kvm_set_pte(pte_t *pte, pte_t new_pte)
> -{
> -	*pte = new_pte;
> -	dsb(ishst);
> -}
> -
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= L_PTE_S2_RDWR;
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 9dbca5355029..26c89b63f604 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -170,9 +170,6 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> -#define	kvm_set_pte(ptep, pte)		set_pte(ptep, pte)
> -#define	kvm_set_pmd(pmdp, pmd)		set_pmd(pmdp, pmd)
> -
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= PTE_S2_RDWR;
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ba66bf7ae299..c9ed239c0840 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -177,6 +177,33 @@ static void clear_stage2_pmd_entry(struct kvm *kvm, pmd_t *pmd, phys_addr_t addr
>  	put_page(virt_to_page(pmd));
>  }
>  
> +static inline void kvm_set_pte(pte_t *ptep, pte_t new_pte)
> +{
> +	WRITE_ONCE(*ptep, new_pte);
> +	dsb(ishst);
> +}
> +
> +static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
> +{
> +	WRITE_ONCE(*pmdp, new_pmd);
> +	dsb(ishst);
> +}
> +
> +static inline void kvm_pmd_populate(pmd_t *pmdp, pte_t *ptep)
> +{
> +	pmd_populate_kernel(NULL, pmdp, ptep);
> +}
> +
> +static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
> +{
> +	pud_populate(NULL, pudp, pmdp);
> +}
> +
> +static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
> +{
> +	pgd_populate(NULL, pgdp, pudp);
> +}
> +
>  /*
>   * Unmapping vs dcache management:
>   *
> @@ -603,7 +630,7 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
>  				kvm_err("Cannot allocate Hyp pte\n");
>  				return -ENOMEM;
>  			}
> -			pmd_populate_kernel(NULL, pmd, pte);
> +			kvm_pmd_populate(pmd, pte);
>  			get_page(virt_to_page(pmd));
>  			kvm_flush_dcache_to_poc(pmd, sizeof(*pmd));
>  		}
> @@ -636,7 +663,7 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
>  				kvm_err("Cannot allocate Hyp pmd\n");
>  				return -ENOMEM;
>  			}
> -			pud_populate(NULL, pud, pmd);
> +			kvm_pud_populate(pud, pmd);
>  			get_page(virt_to_page(pud));
>  			kvm_flush_dcache_to_poc(pud, sizeof(*pud));
>  		}
> @@ -673,7 +700,7 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
>  				err = -ENOMEM;
>  				goto out;
>  			}
> -			pgd_populate(NULL, pgd, pud);
> +			kvm_pgd_populate(pgd, pud);
>  			get_page(virt_to_page(pgd));
>  			kvm_flush_dcache_to_poc(pgd, sizeof(*pgd));
>  		}
> @@ -1092,7 +1119,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  		if (!cache)
>  			return 0; /* ignore calls from kvm_set_spte_hva */
>  		pte = mmu_memory_cache_alloc(cache);
> -		pmd_populate_kernel(NULL, pmd, pte);
> +		kvm_pmd_populate(pmd, pte);
>  		get_page(virt_to_page(pmd));
>  	}
>  
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 5/6] KVM: arm/arm64: Stop using {pmd,pud,pgd}_populate
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-05-31 12:01     ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 12:01 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On Wed, May 30, 2018 at 01:47:05PM +0100, Marc Zyngier wrote:
> The usage of the {pmd,pud,pgd}_populate accessors in KVM has always
> been a bit weird. We don't have a struct mm to pass (and
> neither does the kernel most of the time, but still...), and
> the 32bit code has all kinds of cache maintenance that doesn't make
> sense on ARMv7+ when MP extensions are mandatory (which is the
> case when the VEs are present).
> 
> Let's bite the bullet and provide our own implementations. The
> only bit of architectural code left has to do with building the table
> entry itself (arm64 having up to 52bit PA, arm lacking PUD level).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h   | 4 ++++
>  arch/arm64/include/asm/kvm_mmu.h | 7 +++++++
>  virt/kvm/arm/mmu.c               | 8 +++++---
>  3 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 468ff945efa0..a94ef9833bd3 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -75,6 +75,10 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> +#define kvm_mk_pmd(ptep)	__pmd(__pa(ptep) | PMD_TYPE_TABLE)
> +#define kvm_mk_pud(pmdp)	__pud(__pa(pmdp) | PMD_TYPE_TABLE)
> +#define kvm_mk_pgd(pudp)	({ BUILD_BUG(); 0; })

I can't remember how the folding logic works for ARM: is a pgd entry
the entire pud table?

Assuming so:

Acked-by: Mark Rutland <mark.rutland@arm.com>

> +
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= L_PTE_S2_RDWR;
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 26c89b63f604..22c9f7cfdf93 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -170,6 +170,13 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> +#define kvm_mk_pmd(ptep)					\
> +	__pmd(__phys_to_pmd_val(__pa(ptep) | PMD_TYPE_TABLE))
> +#define kvm_mk_pud(pmdp)					\
> +	__pud(__phys_to_pud_val(__pa(pmdp) | PMD_TYPE_TABLE))
> +#define kvm_mk_pgd(pudp)					\
> +	__pgd(__phys_to_pgd_val(__pa(pudp) | PUD_TYPE_TABLE))
> +
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= PTE_S2_RDWR;
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index c9ed239c0840..ad1980d2118a 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -191,17 +191,19 @@ static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
>  
>  static inline void kvm_pmd_populate(pmd_t *pmdp, pte_t *ptep)
>  {
> -	pmd_populate_kernel(NULL, pmdp, ptep);
> +	kvm_set_pmd(pmdp, kvm_mk_pmd(ptep));
>  }
>  
>  static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
>  {
> -	pud_populate(NULL, pudp, pmdp);
> +	WRITE_ONCE(*pudp, kvm_mk_pud(pmdp));
> +	dsb(ishst);
>  }
>  
>  static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
>  {
> -	pgd_populate(NULL, pgdp, pudp);
> +	WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
> +	dsb(ishst);
>  }
>  
>  /*
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 6/6] KVM: arm/arm64: Remove unnecessary CMOs when creating HYP page tables
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-05-31 12:01     ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 12:01 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On Wed, May 30, 2018 at 01:47:06PM +0100, Marc Zyngier wrote:
> There is no need to perform cache maintenance operations when
> creating the HYP page tables if we have the multiprocessing
> extensions. ARMv7 mandates them with the virtualization support,
> and ARMv8 just mandates them unconditionally.
> 
> Let's remove these operations.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  virt/kvm/arm/mmu.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ad1980d2118a..ccdf544d44c0 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -607,7 +607,6 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned long start,
>  		pte = pte_offset_kernel(pmd, addr);
>  		kvm_set_pte(pte, pfn_pte(pfn, prot));
>  		get_page(virt_to_page(pte));
> -		kvm_flush_dcache_to_poc(pte, sizeof(*pte));
>  		pfn++;
>  	} while (addr += PAGE_SIZE, addr != end);
>  }
> @@ -634,7 +633,6 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
>  			}
>  			kvm_pmd_populate(pmd, pte);
>  			get_page(virt_to_page(pmd));
> -			kvm_flush_dcache_to_poc(pmd, sizeof(*pmd));
>  		}
>  
>  		next = pmd_addr_end(addr, end);
> @@ -667,7 +665,6 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
>  			}
>  			kvm_pud_populate(pud, pmd);
>  			get_page(virt_to_page(pud));
> -			kvm_flush_dcache_to_poc(pud, sizeof(*pud));
>  		}
>  
>  		next = pud_addr_end(addr, end);
> @@ -704,7 +701,6 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
>  			}
>  			kvm_pgd_populate(pgd, pud);
>  			get_page(virt_to_page(pgd));
> -			kvm_flush_dcache_to_poc(pgd, sizeof(*pgd));
>  		}
>  
>  		next = pgd_addr_end(addr, end);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 1/6] arm64: KVM: Add support for Stage-2 control of memory types and cacheability
  2018-05-31 11:49     ` Mark Rutland
@ 2018-05-31 12:38       ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-05-31 12:38 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On 31/05/18 12:49, Mark Rutland wrote:
> On Wed, May 30, 2018 at 01:47:01PM +0100, Marc Zyngier wrote:
>> Up to ARMv8.3, the combination of Stage-1 and Stage-2 attributes
>> results in the strongest attribute of the two stages.  This means
>> that the hypervisor has to perform quite a lot of cache maintenance
>> just in case the guest has some non-cacheable mappings around.
>>
>> ARMv8.4 solves this problem by offering a different mode (FWB) where
>> Stage-2 has total control over the memory attribute (this is limited
>> to systems where both I/O and instruction caches are coherent with
> 
> s/caches/fetches/ -- the I-caches themselves aren't coherent with the
> D-caches (or we could omit I-cache maintenance).
> 
> i.e. this implies IDC, but not DIC.

It may imply IDC behaviour, but not quite IDC itself. I agree, this
looks dodgy. I've asked for clarification on the spec.

> 
>> the dcache). This is achieved by having a different set of memory
>> attributes in the page tables, and a new bit set in HCR_EL2.
>>
>> On such a system, we can then safely sidestep any form of dcache
>> management.
>>
>> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
> 
>>  static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>>  {
>> @@ -268,7 +269,10 @@ static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
>>  {
>>  	void *va = page_address(pfn_to_page(pfn));
>>  
>> -	kvm_flush_dcache_to_poc(va, size);
>> +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
>> +		kvm_flush_dcache_to_poc(va, size);
>> +	else
>> +		kvm_flush_dcache_to_pou(va, size);
>>  }
> 
> The commit message said instruction fetches were coherent, and that no
> D-cache maintenance was necessary, so why do we need maintenance to the
> PoU?

That maintenance will be elided if we actually have IDC set. I'm happy
to drop it once I have confirmation that we have an identical behaviour.

> 
>> +static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
>> +{
>> +	u64 val = read_sysreg_s(SYS_CLIDR_EL1);
>> +
>> +	/* Check that CLIDR_EL1.LOU{U,IS} are both 0 */
>> +	WARN_ON(val & (7 << 27 | 7 << 21));
>> +}
> 
> What about CTR_EL0.IDC?

Again, that depends on whether FWB implies IDC or not.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
  2018-05-31 11:51     ` Mark Rutland
@ 2018-05-31 13:00       ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-05-31 13:00 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On 31/05/18 12:51, Mark Rutland wrote:
> On Wed, May 30, 2018 at 01:47:02PM +0100, Marc Zyngier wrote:
>> Set/Way handling is one of the ugliest corners of KVM. We shouldn't
>> have to handle that, but better safe than sorry.
>>
>> Thankfully, FWB fixes this for us by not requiring any maintenance
>> whatsoever, which means we don't have to emulate S/W CMOs, and don't
>> have to track VM ops either.
>>
>> We still have to trap S/W though, if only to prevent the guest from
>> doing something bad.
> 
> S/W ops *also* do I-cache maintenance, so we'd still need to emulate
> that. Though it looks like we're missing that today...

This doesn't look right: CSSELR_EL1 does indeed have an InD bit, but
that's only for the purpose of reading CCSIDR_EL1. DC CSW and co
directly take a level *without* the InD bit, and seem to be limited to
"data and unified cache".

Am I missing something?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
  2018-05-31 13:00       ` Marc Zyngier
@ 2018-05-31 16:00         ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2018-05-31 16:00 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On Thu, May 31, 2018 at 02:00:11PM +0100, Marc Zyngier wrote:
> On 31/05/18 12:51, Mark Rutland wrote:
> > On Wed, May 30, 2018 at 01:47:02PM +0100, Marc Zyngier wrote:
> >> Set/Way handling is one of the ugliest corners of KVM. We shouldn't
> >> have to handle that, but better safe than sorry.
> >>
> >> Thankfully, FWB fixes this for us by not requiring any maintenance
> >> whatsoever, which means we don't have to emulate S/W CMOs, and don't
> >> have to track VM ops either.
> >>
> >> We still have to trap S/W though, if only to prevent the guest from
> >> doing something bad.
> > 
> > S/W ops *also* do I-cache maintenance, so we'd still need to emulate
> > that. Though it looks like we're missing that today...
> 
> This doesn't look right: CSSELR_EL1 does indeed have an InD bit, but
> that's only for the purpose of reading CCSIDR_EL1. DC CSW and co
> directly take a level *without* the InD bit, and seem to be limited to
> "data and unified cache".
> 
> Am I missing something?

No; I was mistaken.

Sorry for the noise!

Mark.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-06-09  9:26     ` Christoffer Dall
  -1 siblings, 0 replies; 48+ messages in thread
From: Christoffer Dall @ 2018-06-09  9:26 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:02PM +0100, Marc Zyngier wrote:
> Set/Way handling is one of the ugliest corners of KVM. We shouldn't
> have to handle that, but better safe than sorry.
> 
> Thankfully, FWB fixes this for us by not requiring any maintenance
> whatsoever, which means we don't have to emulate S/W CMOs, and don't
> have to track VM ops either.

A tiny bit of rationale here would have been nice.  As I understand it,
if we're presenting the guest with a fully coherent system, there should
never be a need to invalidate anything, because the guest will always
see the most recent value no matter how it sings and dances, right?

> 
> We still have to trap S/W though, if only to prevent the guest from
> doing something bad.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/sys_regs.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 6e3b969391fd..9a740f159245 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -195,7 +195,13 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
>  	if (!p->is_write)
>  		return read_from_write_only(vcpu, p, r);
>  
> -	kvm_set_way_flush(vcpu);
> +	/*
> +	 * Only track S/W ops if we don't have FWB. It still indicates
> +	 * that the guest is a bit broken...
> +	 */

Is it strictly true that the guest is broken if it does any form of S/W
ops?  Does the guest actually know that it's running on a fully coherent
system, or is the argument that no software, ever, should do S/W, even
for reboot etc.?

I think this should have slightly more info, or that part of the comment
should just be dropped, to avoid misleading future readers who don't
have the full picture.

> +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> +		kvm_set_way_flush(vcpu);
> +
>  	return true;
>  }
>  
> -- 
> 2.17.1
> 

Besides the usual nits on commentary:

Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 3/6] arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-06-09  9:29     ` Christoffer Dall
  -1 siblings, 0 replies; 48+ messages in thread
From: Christoffer Dall @ 2018-06-09  9:29 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:03PM +0100, Marc Zyngier wrote:
> On systems where CTR_EL0.DIC is set, we don't need to perform
> icache invalidation to guarantee that we'll fetch the right
> instruction stream.
> 
> This also means that taking a permission fault to invalidate the
> icache is an unnecessary overhead.
> 
> On such systems, we can safely leave the page as being executable.
> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@arm.com>

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/pgtable-prot.h | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index c66c3047400e..78b942c1bea4 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -77,8 +77,18 @@
>  		__val;							\
>  	 })
>  
> -#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
> -#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
> +#define PAGE_S2_XN							\
> +	({								\
> +		u64 __val;						\
> +		if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))		\
> +			__val = 0;					\
> +		else							\
> +			__val = PTE_S2_XN;				\
> +		__val;							\
> +	})
> +
> +#define PAGE_S2			__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(NORMAL) | PTE_S2_RDONLY | PAGE_S2_XN)
> +#define PAGE_S2_DEVICE		__pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PAGE_S2_XN)
>  
>  #define PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
>  #define PAGE_SHARED		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
> -- 
> 2.17.1
> 
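
The runtime-evaluated protection macro in the hunk above relies on a GCC/Clang statement expression. A minimal standalone sketch of the same technique, with `have_cache_dic` as a hypothetical stand-in for `cpus_have_const_cap(ARM64_HAS_CACHE_DIC)` and an arbitrary bit position standing in for `PTE_S2_XN`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real XN bit position and capability check
 * live in the arm64 headers. */
#define PTE_S2_XN_BIT	(UINT64_C(1) << 54)

static bool have_cache_dic;

/* Mirrors PAGE_S2_XN: a statement expression evaluated each time the
 * macro is used, so the same page-protection definition yields
 * executable stage-2 mappings on CTR_EL0.DIC systems. */
#define PAGE_S2_XN_SKETCH					\
	({							\
		uint64_t __val;					\
		if (have_cache_dic)				\
			__val = 0;				\
		else						\
			__val = PTE_S2_XN_BIT;			\
		__val;						\
	})

static uint64_t make_prot(uint64_t base)
{
	return base | PAGE_S2_XN_SKETCH;
}
```

The design choice worth noting is that `PAGE_S2` keeps a single definition site; the capability check is folded into the attribute computation rather than duplicated at every user.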

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 4/6] KVM: arm/arm64: Consolidate page-table accessors
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-06-09  9:31     ` Christoffer Dall
  -1 siblings, 0 replies; 48+ messages in thread
From: Christoffer Dall @ 2018-06-09  9:31 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:04PM +0100, Marc Zyngier wrote:
> The arm and arm64 KVM page table accessors are pointlessly different
> between the two architectures, and likely both wrong one way or another:
> arm64 lacks a dsb(), and arm doesn't use WRITE_ONCE.
> 
> Let's unify them.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h   | 12 -----------
>  arch/arm64/include/asm/kvm_mmu.h |  3 ---
>  virt/kvm/arm/mmu.c               | 35 ++++++++++++++++++++++++++++----
>  3 files changed, 31 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 707a1f06dc5d..468ff945efa0 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -75,18 +75,6 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> -static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
> -{
> -	*pmd = new_pmd;
> -	dsb(ishst);
> -}
> -
> -static inline void kvm_set_pte(pte_t *pte, pte_t new_pte)
> -{
> -	*pte = new_pte;
> -	dsb(ishst);
> -}
> -
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= L_PTE_S2_RDWR;
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 9dbca5355029..26c89b63f604 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -170,9 +170,6 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> -#define	kvm_set_pte(ptep, pte)		set_pte(ptep, pte)
> -#define	kvm_set_pmd(pmdp, pmd)		set_pmd(pmdp, pmd)
> -
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= PTE_S2_RDWR;
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ba66bf7ae299..c9ed239c0840 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -177,6 +177,33 @@ static void clear_stage2_pmd_entry(struct kvm *kvm, pmd_t *pmd, phys_addr_t addr
>  	put_page(virt_to_page(pmd));
>  }
>  
> +static inline void kvm_set_pte(pte_t *ptep, pte_t new_pte)
> +{
> +	WRITE_ONCE(*ptep, new_pte);
> +	dsb(ishst);
> +}
> +
> +static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
> +{
> +	WRITE_ONCE(*pmdp, new_pmd);
> +	dsb(ishst);
> +}
> +

arm64 set_pte and set_pmd have an isb() in addition to the dsb(), why
can we let go of that here?
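
The unified accessor under discussion can be modelled in plain C. This is a sketch, not the kernel code: `WRITE_ONCE_U64` and `dsb_ishst` are user-space stand-ins for the kernel's `WRITE_ONCE()` and `dsb(ishst)`, with the DSB reduced to a compiler barrier since user space has no equivalent instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t pte; } pte_t_sketch;

/* Single-copy-atomic store, as WRITE_ONCE() provides in the kernel:
 * the compiler may not tear, fuse or reorder the access. */
#define WRITE_ONCE_U64(x, val)	(*(volatile uint64_t *)&(x) = (val))

/* Stand-in for dsb(ishst), which orders the PTE store before any later
 * observation by the table walker; modelled as a compiler barrier only. */
#define dsb_ishst()	__asm__ __volatile__("" ::: "memory")

static void kvm_set_pte_sketch(pte_t_sketch *ptep, pte_t_sketch new_pte)
{
	WRITE_ONCE_U64(ptep->pte, new_pte.pte);
	dsb_ishst();
}
```

The shape combines what each architecture previously had only half of: arm's trailing barrier and arm64's single-copy-atomic write.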

> +static inline void kvm_pmd_populate(pmd_t *pmdp, pte_t *ptep)
> +{
> +	pmd_populate_kernel(NULL, pmdp, ptep);
> +}
> +
> +static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
> +{
> +	pud_populate(NULL, pudp, pmdp);
> +}
> +
> +static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
> +{
> +	pgd_populate(NULL, pgdp, pudp);
> +}
> +
>  /*
>   * Unmapping vs dcache management:
>   *
> @@ -603,7 +630,7 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
>  				kvm_err("Cannot allocate Hyp pte\n");
>  				return -ENOMEM;
>  			}
> -			pmd_populate_kernel(NULL, pmd, pte);
> +			kvm_pmd_populate(pmd, pte);
>  			get_page(virt_to_page(pmd));
>  			kvm_flush_dcache_to_poc(pmd, sizeof(*pmd));
>  		}
> @@ -636,7 +663,7 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
>  				kvm_err("Cannot allocate Hyp pmd\n");
>  				return -ENOMEM;
>  			}
> -			pud_populate(NULL, pud, pmd);
> +			kvm_pud_populate(pud, pmd);
>  			get_page(virt_to_page(pud));
>  			kvm_flush_dcache_to_poc(pud, sizeof(*pud));
>  		}
> @@ -673,7 +700,7 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
>  				err = -ENOMEM;
>  				goto out;
>  			}
> -			pgd_populate(NULL, pgd, pud);
> +			kvm_pgd_populate(pgd, pud);
>  			get_page(virt_to_page(pgd));
>  			kvm_flush_dcache_to_poc(pgd, sizeof(*pgd));
>  		}
> @@ -1092,7 +1119,7 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  		if (!cache)
>  			return 0; /* ignore calls from kvm_set_spte_hva */
>  		pte = mmu_memory_cache_alloc(cache);
> -		pmd_populate_kernel(NULL, pmd, pte);
> +		kvm_pmd_populate(pmd, pte);
>  		get_page(virt_to_page(pmd));
>  	}
>  
> -- 
> 2.17.1
> 

Otherwise:

Acked-by: Christoffer Dall <christoffer.dall@arm.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 5/6] KVM: arm/arm64: Stop using {pmd,pud,pgd}_populate
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-06-09  9:57     ` Christoffer Dall
  -1 siblings, 0 replies; 48+ messages in thread
From: Christoffer Dall @ 2018-06-09  9:57 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:05PM +0100, Marc Zyngier wrote:
> The {pmd,pud,pgd}_populate accessors' usage in the kernel has always
> been a bit weird in KVM. We don't have a struct mm to pass (and
> neither does the kernel most of the time, but still...), and
> the 32bit code has all kinds of cache maintenance that doesn't make
> sense on ARMv7+ when MP extensions are mandatory (which is the
> case when the VEs are present).
> 
> Let's bite the bullet and provide our own implementations. The
> only bit of architectural code left has to do with building the table
> entry itself (arm64 having up to 52bit PA, arm lacking PUD level).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@arm.com>

> ---
>  arch/arm/include/asm/kvm_mmu.h   | 4 ++++
>  arch/arm64/include/asm/kvm_mmu.h | 7 +++++++
>  virt/kvm/arm/mmu.c               | 8 +++++---
>  3 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 468ff945efa0..a94ef9833bd3 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -75,6 +75,10 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> +#define kvm_mk_pmd(ptep)	__pmd(__pa(ptep) | PMD_TYPE_TABLE)
> +#define kvm_mk_pud(pmdp)	__pud(__pa(pmdp) | PMD_TYPE_TABLE)
> +#define kvm_mk_pgd(pudp)	({ BUILD_BUG(); 0; })
> +
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= L_PTE_S2_RDWR;
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 26c89b63f604..22c9f7cfdf93 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -170,6 +170,13 @@ phys_addr_t kvm_get_idmap_vector(void);
>  int kvm_mmu_init(void);
>  void kvm_clear_hyp_idmap(void);
>  
> +#define kvm_mk_pmd(ptep)					\
> +	__pmd(__phys_to_pmd_val(__pa(ptep) | PMD_TYPE_TABLE))
> +#define kvm_mk_pud(pmdp)					\
> +	__pud(__phys_to_pud_val(__pa(pmdp) | PMD_TYPE_TABLE))
> +#define kvm_mk_pgd(pudp)					\
> +	__pgd(__phys_to_pgd_val(__pa(pudp) | PUD_TYPE_TABLE))
> +
>  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>  {
>  	pte_val(pte) |= PTE_S2_RDWR;
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index c9ed239c0840..ad1980d2118a 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -191,17 +191,19 @@ static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
>  
>  static inline void kvm_pmd_populate(pmd_t *pmdp, pte_t *ptep)
>  {
> -	pmd_populate_kernel(NULL, pmdp, ptep);
> +	kvm_set_pmd(pmdp, kvm_mk_pmd(ptep));
>  }
>  
>  static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
>  {
> -	pud_populate(NULL, pudp, pmdp);
> +	WRITE_ONCE(*pudp, kvm_mk_pud(pmdp));
> +	dsb(ishst);
>  }
>  
>  static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
>  {
> -	pgd_populate(NULL, pgdp, pudp);
> +	WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
> +	dsb(ishst);
>  }
>  
>  /*
> -- 
> 2.17.1
> 
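
The `kvm_mk_pmd()`-style entry builders in the hunk above are just physical-address-plus-type-bits composition. A hedged sketch, using an assumed value for the table-descriptor type bits (on arm64, bits [1:0] == 0b11 mark a table descriptor; the real code also routes the address through `__phys_to_pmd_val()` for 52-bit PA support, omitted here):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: a table descriptor has bits [1:0] set. */
#define PMD_TYPE_TABLE_SKETCH	UINT64_C(3)

typedef struct { uint64_t pmd; } pmd_t_sketch;

/* Mirrors kvm_mk_pmd(): a table entry is the next level's (page-aligned)
 * physical address with the TABLE type bits ORed in. */
static pmd_t_sketch mk_pmd_table(uint64_t next_level_pa)
{
	return (pmd_t_sketch){ next_level_pa | PMD_TYPE_TABLE_SKETCH };
}
```

Since the next-level table is page aligned, its low bits are guaranteed zero, so the OR never corrupts the address field.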

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 6/6] KVM: arm/arm64: Remove unnecessary CMOs when creating HYP page tables
  2018-05-30 12:47   ` Marc Zyngier
@ 2018-06-09  9:58     ` Christoffer Dall
  -1 siblings, 0 replies; 48+ messages in thread
From: Christoffer Dall @ 2018-06-09  9:58 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Wed, May 30, 2018 at 01:47:06PM +0100, Marc Zyngier wrote:
> There is no need to perform cache maintenance operations when
> creating the HYP page tables if we have the multiprocessing
> extensions. ARMv7 mandates them with the virtualization support,
> and ARMv8 just mandates them unconditionally.
> 
> Let's remove these operations.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@arm.com>

> ---
>  virt/kvm/arm/mmu.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ad1980d2118a..ccdf544d44c0 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -607,7 +607,6 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned long start,
>  		pte = pte_offset_kernel(pmd, addr);
>  		kvm_set_pte(pte, pfn_pte(pfn, prot));
>  		get_page(virt_to_page(pte));
> -		kvm_flush_dcache_to_poc(pte, sizeof(*pte));
>  		pfn++;
>  	} while (addr += PAGE_SIZE, addr != end);
>  }
> @@ -634,7 +633,6 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
>  			}
>  			kvm_pmd_populate(pmd, pte);
>  			get_page(virt_to_page(pmd));
> -			kvm_flush_dcache_to_poc(pmd, sizeof(*pmd));
>  		}
>  
>  		next = pmd_addr_end(addr, end);
> @@ -667,7 +665,6 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
>  			}
>  			kvm_pud_populate(pud, pmd);
>  			get_page(virt_to_page(pud));
> -			kvm_flush_dcache_to_poc(pud, sizeof(*pud));
>  		}
>  
>  		next = pud_addr_end(addr, end);
> @@ -704,7 +701,6 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
>  			}
>  			kvm_pgd_populate(pgd, pud);
>  			get_page(virt_to_page(pgd));
> -			kvm_flush_dcache_to_poc(pgd, sizeof(*pgd));
>  		}
>  
>  		next = pgd_addr_end(addr, end);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
  2018-06-09  9:26     ` Christoffer Dall
@ 2018-06-09 12:31       ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-06-09 12:31 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Sat, 09 Jun 2018 10:26:40 +0100,
Christoffer Dall wrote:
> 
> On Wed, May 30, 2018 at 01:47:02PM +0100, Marc Zyngier wrote:
> > Set/Way handling is one of the ugliest corners of KVM. We shouldn't
> > have to handle that, but better safe than sorry.
> > 
> > Thankfully, FWB fixes this for us by not requiring any maintenance
> > whatsoever, which means we don't have to emulate S/W CMOs, and don't
> > have to track VM ops either.
> 
> A tiny bit of rationale here would have been nice.  As I understand it,
> if we're presenting the guest with a fully coherent system, there should
> never be a need to invalidate anything, because the guest will always
> see the most recent value no matter how it sings and dances, right?

The guest may not even know about the "fully coherent system". It may
continue to issue its CMOs as before, not realising that they are not
required.

> > 
> > We still have to trap S/W though, if only to prevent the guest from
> > doing something bad.
> > 
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  arch/arm64/kvm/sys_regs.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 6e3b969391fd..9a740f159245 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -195,7 +195,13 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
> >  	if (!p->is_write)
> >  		return read_from_write_only(vcpu, p, r);
> >  
> > -	kvm_set_way_flush(vcpu);
> > +	/*
> > +	 * Only track S/W ops if we don't have FWB. It still indicates
> > +	 * that the guest is a bit broken...
> > +	 */
> 
> Is it strictly true that the guest is broken if it does any form of S/W
> ops?  Does the guest actually know that it's running on a fully coherent
> system, or is the argument that no software, ever, should do S/W, even
> for reboot etc.?

S/W should really only be used in a power-management scenario. I really
cannot think of a single valid (or even safe) reason to issue a S/W
operation outside of PM, when you're guaranteed that there is only a
single CPU up and running. A guest OS cannot enforce this requirement,
so that's really always broken.

> I think this should have slightly more info, or that part of the comment
> should just be dropped, to avoid misleading future readers who don't
> have the full picture.

Happy to add more details when I respin this series.

> 
> > +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> > +		kvm_set_way_flush(vcpu);
> > +
> >  	return true;
> >  }
> >  
> > -- 
> > 2.17.1
> > 
> 
> Besides the usual nits on commentary:
> 
> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present
@ 2018-06-09 12:31       ` Marc Zyngier
  0 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-06-09 12:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, 09 Jun 2018 10:26:40 +0100,
Christoffer Dall wrote:
> 
> On Wed, May 30, 2018 at 01:47:02PM +0100, Marc Zyngier wrote:
> > Set/Way handling is one of the ugliest corners of KVM. We shouldn't
> > have to handle that, but better safe than sorry.
> > 
> > Thankfully, FWB fixes this for us by not requiering any maintenance
> > whatsoever, which means we don't have to emulate S/W CMOs, and don't
> > have to track VM ops either.
> 
> I tiny bit of rationale here would have been nice.  As I understand it,
> if we're presenting the guest with a fully coherent system, there should
> never be a need to invalidate anything, because the guest will always
> see the most recent value no matter how it sings and dances, right?

The guest may not even know about the "fully coherent system". It may
continue to issue its CMOs as before, not realising that they are not
required.

> > 
> > We still have to trap S/W though, if only to prevent the guest from
> > doing something bad.
> > 
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  arch/arm64/kvm/sys_regs.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 6e3b969391fd..9a740f159245 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -195,7 +195,13 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
> >  	if (!p->is_write)
> >  		return read_from_write_only(vcpu, p, r);
> >  
> > -	kvm_set_way_flush(vcpu);
> > +	/*
> > +	 * Only track S/W ops if we don't have FWB. It still indicates
> > +	 * that the guest is a bit broken...
> > +	 */
> 
> Is it strictly true that the guest is broken if it does any form of S/W
> ops?  Does the guest actually know that it's running on a fully coherent
> system, or is the argument that no software, ever, should do S/W, even
> for reboot etc.?

S/W should really only be used in a power-management scenario. I really
cannot think of a single valid (or even safe) reason to issue a S/W
operation outside of PM, when you're guaranteed that there is only a
single CPU up and running. A guest OS cannot enforce this requirement,
so that's really always broken.

> I think this should have slightly more info, or that part of the comment
> should just be dropped, to avoid misleading future readers who don't
> have the full picture.

Happy to add more details when I respin this series.

> 
> > +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
> > +		kvm_set_way_flush(vcpu);
> > +
> >  	return true;
> >  }
> >  
> > -- 
> > 2.17.1
> > 
> 
> Besides the usual nits on commentary:
> 
> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 4/6] KVM: arm/arm64: Consolidate page-table accessors
  2018-06-09  9:31     ` Christoffer Dall
@ 2018-06-09 12:34       ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-06-09 12:34 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, Catalin Marinas, Mark Rutland, kvmarm, kvm

On Sat, 09 Jun 2018 10:31:48 +0100,
Christoffer Dall wrote:
> 
> On Wed, May 30, 2018 at 01:47:04PM +0100, Marc Zyngier wrote:
> > The arm and arm64 KVM page tables accessors are pointlessly different
> > between the two architectures, and likely both wrong one way or another:
> > arm64 lacks a dsb(), and arm doesn't use WRITE_ONCE.
> > 
> > Let's unify them.
> > 
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_mmu.h   | 12 -----------
> >  arch/arm64/include/asm/kvm_mmu.h |  3 ---
> >  virt/kvm/arm/mmu.c               | 35 ++++++++++++++++++++++++++++----
> >  3 files changed, 31 insertions(+), 19 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> > index 707a1f06dc5d..468ff945efa0 100644
> > --- a/arch/arm/include/asm/kvm_mmu.h
> > +++ b/arch/arm/include/asm/kvm_mmu.h
> > @@ -75,18 +75,6 @@ phys_addr_t kvm_get_idmap_vector(void);
> >  int kvm_mmu_init(void);
> >  void kvm_clear_hyp_idmap(void);
> >  
> > -static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
> > -{
> > -	*pmd = new_pmd;
> > -	dsb(ishst);
> > -}
> > -
> > -static inline void kvm_set_pte(pte_t *pte, pte_t new_pte)
> > -{
> > -	*pte = new_pte;
> > -	dsb(ishst);
> > -}
> > -
> >  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
> >  {
> >  	pte_val(pte) |= L_PTE_S2_RDWR;
> > diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> > index 9dbca5355029..26c89b63f604 100644
> > --- a/arch/arm64/include/asm/kvm_mmu.h
> > +++ b/arch/arm64/include/asm/kvm_mmu.h
> > @@ -170,9 +170,6 @@ phys_addr_t kvm_get_idmap_vector(void);
> >  int kvm_mmu_init(void);
> >  void kvm_clear_hyp_idmap(void);
> >  
> > -#define	kvm_set_pte(ptep, pte)		set_pte(ptep, pte)
> > -#define	kvm_set_pmd(pmdp, pmd)		set_pmd(pmdp, pmd)
> > -
> >  static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
> >  {
> >  	pte_val(pte) |= PTE_S2_RDWR;
> > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> > index ba66bf7ae299..c9ed239c0840 100644
> > --- a/virt/kvm/arm/mmu.c
> > +++ b/virt/kvm/arm/mmu.c
> > @@ -177,6 +177,33 @@ static void clear_stage2_pmd_entry(struct kvm *kvm, pmd_t *pmd, phys_addr_t addr
> >  	put_page(virt_to_page(pmd));
> >  }
> >  
> > +static inline void kvm_set_pte(pte_t *ptep, pte_t new_pte)
> > +{
> > +	WRITE_ONCE(*ptep, new_pte);
> > +	dsb(ishst);
> > +}
> > +
> > +static inline void kvm_set_pmd(pmd_t *pmdp, pmd_t new_pmd)
> > +{
> > +	WRITE_ONCE(*pmdp, new_pmd);
> > +	dsb(ishst);
> > +}
> > +
> 
> arm64 set_pte and set_pmd have an isb() in addition to the dsb(), why
> can we let go of that here?

Good point. There was an offline discussion with Will and Mark a
couple of weeks ago, where we agreed that this ISB wasn't
required. I've of course paged it out. Mark, do you remember the
rationale?

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

^ permalink raw reply	[flat|nested] 48+ messages in thread


* Re: [PATCH v2 1/6] arm64: KVM: Add support for Stage-2 control of memory types and cacheability
  2018-05-31 12:38       ` Marc Zyngier
@ 2018-06-12 12:55         ` Marc Zyngier
  -1 siblings, 0 replies; 48+ messages in thread
From: Marc Zyngier @ 2018-06-12 12:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Catalin Marinas, kvmarm, kvm, Christoffer Dall

On 31/05/18 13:38, Marc Zyngier wrote:
> On 31/05/18 12:49, Mark Rutland wrote:
>> On Wed, May 30, 2018 at 01:47:01PM +0100, Marc Zyngier wrote:
> >>> Up to ARMv8.3, the combination of Stage-1 and Stage-2 attributes
>>> results in the strongest attribute of the two stages.  This means
>>> that the hypervisor has to perform quite a lot of cache maintenance
>>> just in case the guest has some non-cacheable mappings around.
>>>
>>> ARMv8.4 solves this problem by offering a different mode (FWB) where
>>> Stage-2 has total control over the memory attribute (this is limited
>>> to systems where both I/O and instruction caches are coherent with
>>
>> s/caches/fetches/ -- the I-caches themselves aren't coherent with the
>> D-caches (or we could omit I-cache maintenance).
>>
>> i.e. this implies IDC, but not DIC.
> 
> It may imply IDC behaviour, but not quite IDC itself. I agree, this
> looks dodgy. I've asked for clarification on the spec.

I've now received confirmation that FWB implies the IDC behaviour. It
doesn't guarantee that CTR_EL0.IDC will be set though, only that
CLIDR_EL1.LOU{U,IS} are both 0.

> 
>>
>>> the dcache). This is achieved by having a different set of memory
>>> attributes in the page tables, and a new bit set in HCR_EL2.
>>>
>>> On such a system, we can then safely sidestep any form of dcache
>>> management.
>>>
>>> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>
>>>  static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>>>  {
>>> @@ -268,7 +269,10 @@ static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
>>>  {
>>>  	void *va = page_address(pfn_to_page(pfn));
>>>  
>>> -	kvm_flush_dcache_to_poc(va, size);
>>> +	if (!cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
>>> +		kvm_flush_dcache_to_poc(va, size);
>>> +	else
>>> +		kvm_flush_dcache_to_pou(va, size);
>>>  }
>>
> >> The commit message said instruction fetches were coherent, and that no
>> D-cache maintenance was necessary, so why do we need maintenance to the
>> PoU?
> 
> That maintenance will be elided if we actually have IDC set. I'm happy
> to drop it once I have confirmation that we have an identical behaviour.

Given the above, I'll drop the clean to PoU.

> 
>>
>>> +static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
>>> +{
>>> +	u64 val = read_sysreg_s(SYS_CLIDR_EL1);
>>> +
>>> +	/* Check that CLIDR_EL1.LOU{U,IS} are both 0 */
>>> +	WARN_ON(val & (7 << 27 | 7 << 21));
>>> +}
>>
>> What about CTR_EL0.IDC?
> 
> Again, that depends on whether FWB implies IDC or not.

The spec doesn't seem to guarantee IDC, but does mandate that
CLIDR_EL1.LOU{U,IS} are set to 0.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 48+ messages in thread


end of thread, other threads:[~2018-06-12 12:55 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-30 12:47 [PATCH v2 0/6] KVM/arm64: Cache maintenance relaxations Marc Zyngier
2018-05-30 12:47 ` [PATCH v2 1/6] arm64: KVM: Add support for Stage-2 control of memory types and cacheability Marc Zyngier
2018-05-31 11:49   ` Mark Rutland
2018-05-31 12:38     ` Marc Zyngier
2018-06-12 12:55       ` Marc Zyngier
2018-05-30 12:47 ` [PATCH v2 2/6] arm64: KVM: Handle Set/Way CMOs as NOPs if FWB is present Marc Zyngier
2018-05-31 11:51   ` Mark Rutland
2018-05-31 13:00     ` Marc Zyngier
2018-05-31 16:00       ` Mark Rutland
2018-06-09  9:26   ` Christoffer Dall
2018-06-09 12:31     ` Marc Zyngier
2018-05-30 12:47 ` [PATCH v2 3/6] arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set Marc Zyngier
2018-05-31 11:52   ` Mark Rutland
2018-06-09  9:29   ` Christoffer Dall
2018-05-30 12:47 ` [PATCH v2 4/6] KVM: arm/arm64: Consolidate page-table accessors Marc Zyngier
2018-05-31 11:55   ` Mark Rutland
2018-06-09  9:31   ` Christoffer Dall
2018-06-09 12:34     ` Marc Zyngier
2018-05-30 12:47 ` [PATCH v2 5/6] KVM: arm/arm64: Stop using {pmd,pud,pgd}_populate Marc Zyngier
2018-05-31 12:01   ` Mark Rutland
2018-06-09  9:57   ` Christoffer Dall
2018-05-30 12:47 ` [PATCH v2 6/6] KVM: arm/arm64: Remove unnecessary CMOs when creating HYP page tables Marc Zyngier
2018-05-31 12:01   ` Mark Rutland
2018-06-09  9:58   ` Christoffer Dall
