linux-s390.vger.kernel.org archive mirror
* [kvm-unit-tests PATCH v2 0/5] s390: Add support for large pages
@ 2021-02-23 14:07 Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 1/5] libcflat: add SZ_1M and SZ_2G Claudio Imbrenda
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 14:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Introduce support for large (1M) and huge (2G) pages.

Add a simple testcase for EDAT1 and EDAT2.

v1->v2

* split patch 2 -> new patch 2 and new patch 3
* patch 2: fixes pgtable.h, also fixes wrong usage of REGION_TABLE_LENGTH
  instead of SEGMENT_TABLE_LENGTH
* patch 3: introduces new macros and functions for large pages
* patch 4: remove erroneous double call to pte_alloc in get_pte
* patch 4: added comment in mmu.c to bridge the s390x architectural names
  with the Linux ones used in kvm-unit-tests
* patch 5: added and fixed lots of comments to explain what's going on
* patch 5: set FC for region 3 after writing the canary, like for segments
* patch 5: use uintptr_t instead of intptr_t for set_prefix
* patch 5: introduce new macro PGD_PAGE_SHIFT instead of using magic value 41
* patch 5: use VIRT(0) instead of mem to make it more clear what we are
  doing, even though VIRT(0) expands to mem

Claudio Imbrenda (5):
  libcflat: add SZ_1M and SZ_2G
  s390x: lib: fix pgtable.h
  s390x: lib: improve pgtable.h
  s390x: mmu: add support for large pages
  s390x: edat test

 s390x/Makefile          |   1 +
 lib/s390x/asm/pgtable.h |  40 +++++-
 lib/libcflat.h          |   2 +
 lib/s390x/mmu.h         |  73 +++++++++-
 lib/s390x/mmu.c         | 260 ++++++++++++++++++++++++++++++++----
 s390x/edat.c            | 285 ++++++++++++++++++++++++++++++++++++++++
 s390x/unittests.cfg     |   3 +
 7 files changed, 633 insertions(+), 31 deletions(-)
 create mode 100644 s390x/edat.c

-- 
2.26.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [kvm-unit-tests PATCH v2 1/5] libcflat: add SZ_1M and SZ_2G
  2021-02-23 14:07 [kvm-unit-tests PATCH v2 0/5] s390: Add support for large pages Claudio Imbrenda
@ 2021-02-23 14:07 ` Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h Claudio Imbrenda
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 14:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Add SZ_1M and SZ_2G to libcflat.h

s390x needs those for large/huge pages

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
---
 lib/libcflat.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index 460a1234..8dac0621 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -157,7 +157,9 @@ extern void setup_vm(void);
 #define SZ_8K			(1 << 13)
 #define SZ_16K			(1 << 14)
 #define SZ_64K			(1 << 16)
+#define SZ_1M			(1 << 20)
 #define SZ_2M			(1 << 21)
 #define SZ_1G			(1 << 30)
+#define SZ_2G			(1ul << 31)
 
 #endif
-- 
2.26.2



* [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h
  2021-02-23 14:07 [kvm-unit-tests PATCH v2 0/5] s390: Add support for large pages Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 1/5] libcflat: add SZ_1M and SZ_2G Claudio Imbrenda
@ 2021-02-23 14:07 ` Claudio Imbrenda
  2021-02-23 14:31   ` Janosch Frank
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 3/5] s390x: lib: improve pgtable.h Claudio Imbrenda
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 14:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Fix pgtable.h:

* SEGMENT_ENTRY_SFAA had one extra bit set
* pmd entries don't have a length
* ipte does not need to clear the lower bits
* pud entries should use SEGMENT_TABLE_LENGTH, as they point to segment tables

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 lib/s390x/asm/pgtable.h | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
index 277f3480..a2ff2d4e 100644
--- a/lib/s390x/asm/pgtable.h
+++ b/lib/s390x/asm/pgtable.h
@@ -60,7 +60,7 @@
 #define SEGMENT_SHIFT			20
 
 #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
-#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
+#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
 #define SEGMENT_ENTRY_AV		0x0000000000010000UL
 #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
 #define SEGMENT_ENTRY_F			0x0000000000000800UL
@@ -183,7 +183,7 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
 	if (pud_none(*pud)) {
 		pmd_t *pmd = pmd_alloc_one();
 		pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 |
-				REGION_TABLE_LENGTH;
+				SEGMENT_TABLE_LENGTH;
 	}
 	return pmd_offset(pud, addr);
 }
@@ -202,15 +202,14 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
 {
 	if (pmd_none(*pmd)) {
 		pte_t *pte = pte_alloc_one();
-		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
-				SEGMENT_TABLE_LENGTH;
+		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
 	}
 	return pte_offset(pmd, addr);
 }
 
 static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
 {
-	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
+	unsigned long table_origin = (unsigned long)p_pte;
 
 	asm volatile(
 		"	ipte %0,%1\n"
-- 
2.26.2



* [kvm-unit-tests PATCH v2 3/5] s390x: lib: improve pgtable.h
  2021-02-23 14:07 [kvm-unit-tests PATCH v2 0/5] s390: Add support for large pages Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 1/5] libcflat: add SZ_1M and SZ_2G Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h Claudio Imbrenda
@ 2021-02-23 14:07 ` Claudio Imbrenda
  2021-02-23 14:35   ` Janosch Frank
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 4/5] s390x: mmu: add support for large pages Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 5/5] s390x: edat test Claudio Imbrenda
  4 siblings, 1 reply; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 14:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Improve pgtable.h:

* add macros to check whether a pmd or a pud are large / huge
* add idte functions for pmd, pud, p4d and pgd

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 lib/s390x/asm/pgtable.h | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
index a2ff2d4e..70d4afde 100644
--- a/lib/s390x/asm/pgtable.h
+++ b/lib/s390x/asm/pgtable.h
@@ -100,6 +100,9 @@
 #define pmd_none(entry) (pmd_val(entry) & SEGMENT_ENTRY_I)
 #define pte_none(entry) (pte_val(entry) & PAGE_ENTRY_I)
 
+#define pud_huge(entry)  (pud_val(entry) & REGION3_ENTRY_FC)
+#define pmd_large(entry) (pmd_val(entry) & SEGMENT_ENTRY_FC)
+
 #define pgd_addr(entry) __va(pgd_val(entry) & REGION_ENTRY_ORIGIN)
 #define p4d_addr(entry) __va(p4d_val(entry) & REGION_ENTRY_ORIGIN)
 #define pud_addr(entry) __va(pud_val(entry) & REGION_ENTRY_ORIGIN)
@@ -216,6 +219,34 @@ static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
 		: : "a" (table_origin), "a" (vaddr) : "memory");
 }
 
+static inline void idte(unsigned long table_origin, unsigned long vaddr)
+{
+	vaddr &= SEGMENT_ENTRY_SFAA;
+	asm volatile(
+		"	idte %0,0,%1\n"
+		: : "a" (table_origin), "a" (vaddr) : "memory");
+}
+
+static inline void idte_pmdp(unsigned long vaddr, pmdval_t *pmdp)
+{
+	idte((unsigned long)(pmdp - pmd_index(vaddr)) | ASCE_DT_SEGMENT, vaddr);
+}
+
+static inline void idte_pudp(unsigned long vaddr, pudval_t *pudp)
+{
+	idte((unsigned long)(pudp - pud_index(vaddr)) | ASCE_DT_REGION3, vaddr);
+}
+
+static inline void idte_p4dp(unsigned long vaddr, p4dval_t *p4dp)
+{
+	idte((unsigned long)(p4dp - p4d_index(vaddr)) | ASCE_DT_REGION2, vaddr);
+}
+
+static inline void idte_pgdp(unsigned long vaddr, pgdval_t *pgdp)
+{
+	idte((unsigned long)(pgdp - pgd_index(vaddr)) | ASCE_DT_REGION1, vaddr);
+}
+
 void configure_dat(int enable);
 
 #endif /* _ASMS390X_PGTABLE_H_ */
-- 
2.26.2



* [kvm-unit-tests PATCH v2 4/5] s390x: mmu: add support for large pages
  2021-02-23 14:07 [kvm-unit-tests PATCH v2 0/5] s390: Add support for large pages Claudio Imbrenda
                   ` (2 preceding siblings ...)
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 3/5] s390x: lib: improve pgtable.h Claudio Imbrenda
@ 2021-02-23 14:07 ` Claudio Imbrenda
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 5/5] s390x: edat test Claudio Imbrenda
  4 siblings, 0 replies; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 14:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Add support for 1M and 2G pages.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 lib/s390x/mmu.h |  73 +++++++++++++-
 lib/s390x/mmu.c | 260 +++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 307 insertions(+), 26 deletions(-)

diff --git a/lib/s390x/mmu.h b/lib/s390x/mmu.h
index 603f289e..93208467 100644
--- a/lib/s390x/mmu.h
+++ b/lib/s390x/mmu.h
@@ -10,9 +10,78 @@
 #ifndef _ASMS390X_MMU_H_
 #define _ASMS390X_MMU_H_
 
-void protect_page(void *vaddr, unsigned long prot);
+/*
+ * Splits the pagetables down to the given DAT tables level.
+ * Returns a pointer to the DAT table entry of the given level.
+ * @pgtable root of the page table tree
+ * @vaddr address whose page tables are to be split
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4KB pages)
+ */
+void *split_page(pgd_t *pgtable, void *vaddr, unsigned int level);
+
+/*
+ * Applies the given protection bits to the given DAT tables level,
+ * splitting if necessary.
+ * @pgtable root of the page table tree
+ * @vaddr address whose protection bits are to be changed
+ * @prot the protection bits to set
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4KB pages)
+ */
+void protect_dat_entry(void *vaddr, unsigned long prot, unsigned int level);
+/*
+ * Clears the given protection bits from the given DAT tables level,
+ * splitting if necessary.
+ * @pgtable root of the page table tree
+ * @vaddr address whose protection bits are to be changed
+ * @prot the protection bits to clear
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4KB pages)
+ */
+void unprotect_dat_entry(void *vaddr, unsigned long prot, unsigned int level);
+
+/*
+ * Applies the given protection bits to the given 4kB pages range,
+ * splitting if necessary.
+ * @start starting address whose protection bits are to be changed
+ * @len size in bytes
+ * @prot the protection bits to set
+ */
 void protect_range(void *start, unsigned long len, unsigned long prot);
-void unprotect_page(void *vaddr, unsigned long prot);
+/*
+ * Clears the given protection bits from the given range of 4KB pages,
+ * splitting if necessary.
+ * @start starting address whose protection bits are to be changed
+ * @len size in bytes
+ * @prot the protection bits to clear
+ */
 void unprotect_range(void *start, unsigned long len, unsigned long prot);
 
+/* Similar to install_page, maps the virtual address to the physical address
+ * for the given page tables, using 1MB large pages.
+ * Returns a pointer to the DAT table entry.
+ * @pgtable root of the page table tree
+ * @phys physical address to map, must be 1MB aligned!
+ * @vaddr virtual address to map, must be 1MB aligned!
+ */
+pmdval_t *install_large_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr);
+
+/* Similar to install_page, maps the virtual address to the physical address
+ * for the given page tables, using 2GB huge pages.
+ * Returns a pointer to the DAT table entry.
+ * @pgtable root of the page table tree
+ * @phys physical address to map, must be 2GB aligned!
+ * @vaddr virtual address to map, must be 2GB aligned!
+ */
+pudval_t *install_huge_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr);
+
+static inline void protect_page(void *vaddr, unsigned long prot)
+{
+	protect_dat_entry(vaddr, prot, 5);
+}
+
+static inline void unprotect_page(void *vaddr, unsigned long prot)
+{
+	unprotect_dat_entry(vaddr, prot, 5);
+}
+void *get_dat_entry(pgd_t *pgtable, void *vaddr, unsigned int level);
+
 #endif /* _ASMS390X_MMU_H_ */
diff --git a/lib/s390x/mmu.c b/lib/s390x/mmu.c
index 5c517366..def91334 100644
--- a/lib/s390x/mmu.c
+++ b/lib/s390x/mmu.c
@@ -15,6 +15,18 @@
 #include <vmalloc.h>
 #include "mmu.h"
 
+/*
+ * The naming convention used here is the same as used in the Linux kernel,
+ * and this is the correspondence between the s390x architectural names and
+ * the Linux ones:
+ *
+ * pgd - region 1 table entry
+ * p4d - region 2 table entry
+ * pud - region 3 table entry
+ * pmd - segment table entry
+ * pte - page table entry
+ */
+
 static pgd_t *table_root;
 
 void configure_dat(int enable)
@@ -46,54 +58,254 @@ static void mmu_enable(pgd_t *pgtable)
 	lc->pgm_new_psw.mask |= PSW_MASK_DAT;
 }
 
-static pteval_t *get_pte(pgd_t *pgtable, uintptr_t vaddr)
+/*
+ * Get the pud (region 3) DAT table entry for the given address and root,
+ * allocating it if necessary
+ */
+static inline pud_t *get_pud(pgd_t *pgtable, uintptr_t vaddr)
 {
 	pgd_t *pgd = pgd_offset(pgtable, vaddr);
 	p4d_t *p4d = p4d_alloc(pgd, vaddr);
 	pud_t *pud = pud_alloc(p4d, vaddr);
-	pmd_t *pmd = pmd_alloc(pud, vaddr);
-	pte_t *pte = pte_alloc(pmd, vaddr);
 
-	return &pte_val(*pte);
+	return pud;
+}
+
+/*
+ * Get the pmd (segment) DAT table entry for the given address and pud,
+ * allocating it if necessary.
+ * The pud must not be huge.
+ */
+static inline pmd_t *get_pmd(pud_t *pud, uintptr_t vaddr)
+{
+	pmd_t *pmd;
+
+	assert(!pud_huge(*pud));
+	pmd = pmd_alloc(pud, vaddr);
+	return pmd;
+}
+
+/*
+ * Get the pte (page) DAT table entry for the given address and pmd,
+ * allocating it if necessary.
+ * The pmd must not be large.
+ */
+static inline pte_t *get_pte(pmd_t *pmd, uintptr_t vaddr)
+{
+	pte_t *pte;
+
+	assert(!pmd_large(*pmd));
+	pte = pte_alloc(pmd, vaddr);
+	return pte;
+}
+
+/*
+ * Splits a large pmd (segment) DAT table entry into equivalent 4KB small
+ * pages.
+ * @pmd The pmd to split, it must be large.
+ * @va the virtual address corresponding to this pmd.
+ */
+static void split_pmd(pmd_t *pmd, uintptr_t va)
+{
+	phys_addr_t pa = pmd_val(*pmd) & SEGMENT_ENTRY_SFAA;
+	unsigned long i;
+	pte_t *pte;
+
+	assert(pmd_large(*pmd));
+	pte = alloc_pages(PAGE_TABLE_ORDER);
+	for (i = 0; i < PAGE_TABLE_ENTRIES; i++)
+		pte_val(pte[i]) = pa | PAGE_SIZE * i;
+	idte_pmdp(va, &pmd_val(*pmd));
+	pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
+
+}
+
+/*
+ * Splits a huge pud (region 3) DAT table entry into equivalent 1MB large
+ * pages.
+ * @pud The pud to split, it must be huge.
+ * @va the virtual address corresponding to this pud.
+ */
+static void split_pud(pud_t *pud, uintptr_t va)
+{
+	phys_addr_t pa = pud_val(*pud) & REGION3_ENTRY_RFAA;
+	unsigned long i;
+	pmd_t *pmd;
+
+	assert(pud_huge(*pud));
+	pmd = alloc_pages(SEGMENT_TABLE_ORDER);
+	for (i = 0; i < SEGMENT_TABLE_ENTRIES; i++)
+		pmd_val(pmd[i]) = pa | SZ_1M * i | SEGMENT_ENTRY_FC | SEGMENT_ENTRY_TT_SEGMENT;
+	idte_pudp(va, &pud_val(*pud));
+	pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 | REGION_TABLE_LENGTH;
+}
+
+void *get_dat_entry(pgd_t *pgtable, void *vaddr, unsigned int level)
+{
+	uintptr_t va = (uintptr_t)vaddr;
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	assert(level && (level <= 5));
+	pgd = pgd_offset(pgtable, va);
+	if (level == 1)
+		return pgd;
+	p4d = p4d_alloc(pgd, va);
+	if (level == 2)
+		return p4d;
+	pud = pud_alloc(p4d, va);
+
+	if (level == 3)
+		return pud;
+	if (!pud_none(*pud) && pud_huge(*pud))
+		split_pud(pud, va);
+	pmd = get_pmd(pud, va);
+	if (level == 4)
+		return pmd;
+	if (!pmd_none(*pmd) && pmd_large(*pmd))
+		split_pmd(pmd, va);
+	return get_pte(pmd, va);
+}
+
+void *split_page(pgd_t *pgtable, void *vaddr, unsigned int level)
+{
+	assert((level >= 3) && (level <= 5));
+	return get_dat_entry(pgtable ? pgtable : table_root, vaddr, level);
 }
 
 phys_addr_t virt_to_pte_phys(pgd_t *pgtable, void *vaddr)
 {
-	return (*get_pte(pgtable, (uintptr_t)vaddr) & PAGE_MASK) +
-	       ((unsigned long)vaddr & ~PAGE_MASK);
+	uintptr_t va = (uintptr_t)vaddr;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pud = get_pud(pgtable, va);
+	if (pud_huge(*pud))
+		return (pud_val(*pud) & REGION3_ENTRY_RFAA) | (va & ~REGION3_ENTRY_RFAA);
+	pmd = get_pmd(pud, va);
+	if (pmd_large(*pmd))
+		return (pmd_val(*pmd) & SEGMENT_ENTRY_SFAA) | (va & ~SEGMENT_ENTRY_SFAA);
+	pte = get_pte(pmd, va);
+	return (pte_val(*pte) & PAGE_MASK) | (va & ~PAGE_MASK);
+}
+
+/*
+ * Get the DAT table entry of the given level for the given address,
+ * splitting if necessary. If the entry was not invalid, invalidate it, and
+ * return the pointer to the entry and, if requested, its old value.
+ * @pgtable root of the page tables
+ * @vaddr virtual address
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4KB pages)
+ * @old if not NULL, will be written with the old value of the DAT table
+ * entry before invalidation
+ */
+static void *dat_get_and_invalidate(pgd_t *pgtable, void *vaddr, unsigned int level, unsigned long *old)
+{
+	unsigned long va = (unsigned long)vaddr;
+	void *ptr;
+
+	ptr = get_dat_entry(pgtable, vaddr, level);
+	if (old)
+		*old = *(unsigned long *)ptr;
+	if ((level == 1) && !pgd_none(*(pgd_t *)ptr))
+		idte_pgdp(va, ptr);
+	else if ((level == 2) && !p4d_none(*(p4d_t *)ptr))
+		idte_p4dp(va, ptr);
+	else if ((level == 3) && !pud_none(*(pud_t *)ptr))
+		idte_pudp(va, ptr);
+	else if ((level == 4) && !pmd_none(*(pmd_t *)ptr))
+		idte_pmdp(va, ptr);
+	else if (!pte_none(*(pte_t *)ptr))
+		ipte(va, ptr);
+	return ptr;
 }
 
-static pteval_t *set_pte(pgd_t *pgtable, pteval_t val, void *vaddr)
+static void cleanup_pmd(pmd_t *pmd)
 {
-	pteval_t *p_pte = get_pte(pgtable, (uintptr_t)vaddr);
+	/* was invalid or large, nothing to do */
+	if (pmd_none(*pmd) || pmd_large(*pmd))
+		return;
+	/* was not large, free the corresponding page table */
+	free_pages((void *)(pmd_val(*pmd) & PAGE_MASK));
+}
 
-	/* first flush the old entry (if we're replacing anything) */
-	if (!(*p_pte & PAGE_ENTRY_I))
-		ipte((uintptr_t)vaddr, p_pte);
+static void cleanup_pud(pud_t *pud)
+{
+	unsigned long i;
+	pmd_t *pmd;
 
-	*p_pte = val;
-	return p_pte;
+	/* was invalid or large, nothing to do */
+	if (pud_none(*pud) || pud_huge(*pud))
+		return;
+	/* recursively clean up all pmds if needed */
+	pmd = (pmd_t *)(pud_val(*pud) & PAGE_MASK);
+	for (i = 0; i < SEGMENT_TABLE_ENTRIES; i++)
+		cleanup_pmd(pmd + i);
+	/* free the corresponding segment table */
+	free_pages(pmd);
+}
+
+/*
+ * Set the DAT entry for the given level of the given virtual address. If a
+ * mapping already existed, it is overwritten. If an existing mapping with
+ * smaller pages existed, all the lower tables are freed.
+ * Returns the pointer to the DAT table entry.
+ * @pgtable root of the page tables
+ * @val the new value for the DAT table entry
+ * @vaddr the virtual address
+ * @level 3 for pud (region 3), 4 for pmd (segment) and 5 for pte (pages)
+ */
+static void *set_dat_entry(pgd_t *pgtable, unsigned long val, void *vaddr, unsigned int level)
+{
+	unsigned long old, *res;
+
+	res = dat_get_and_invalidate(pgtable, vaddr, level, &old);
+	if (level == 4)
+		cleanup_pmd((pmd_t *)&old);
+	if (level == 3)
+		cleanup_pud((pud_t *)&old);
+	*res = val;
+	return res;
 }
 
 pteval_t *install_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr)
 {
-	return set_pte(pgtable, __pa(phys), vaddr);
+	assert(IS_ALIGNED(phys, PAGE_SIZE));
+	assert(IS_ALIGNED((uintptr_t)vaddr, PAGE_SIZE));
+	return set_dat_entry(pgtable, phys, vaddr, 5);
+}
+
+pmdval_t *install_large_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr)
+{
+	assert(IS_ALIGNED(phys, SZ_1M));
+	assert(IS_ALIGNED((uintptr_t)vaddr, SZ_1M));
+	return set_dat_entry(pgtable, phys | SEGMENT_ENTRY_FC, vaddr, 4);
+}
+
+pudval_t *install_huge_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr)
+{
+	assert(IS_ALIGNED(phys, SZ_2G));
+	assert(IS_ALIGNED((uintptr_t)vaddr, SZ_2G));
+	return set_dat_entry(pgtable, phys | REGION3_ENTRY_FC | REGION_ENTRY_TT_REGION3, vaddr, 3);
 }
 
-void protect_page(void *vaddr, unsigned long prot)
+void protect_dat_entry(void *vaddr, unsigned long prot, unsigned int level)
 {
-	pteval_t *p_pte = get_pte(table_root, (uintptr_t)vaddr);
-	pteval_t n_pte = *p_pte | prot;
+	unsigned long old, *ptr;
 
-	set_pte(table_root, n_pte, vaddr);
+	ptr = dat_get_and_invalidate(table_root, vaddr, level, &old);
+	*ptr = old | prot;
 }
 
-void unprotect_page(void *vaddr, unsigned long prot)
+void unprotect_dat_entry(void *vaddr, unsigned long prot, unsigned int level)
 {
-	pteval_t *p_pte = get_pte(table_root, (uintptr_t)vaddr);
-	pteval_t n_pte = *p_pte & ~prot;
+	unsigned long old, *ptr;
 
-	set_pte(table_root, n_pte, vaddr);
+	ptr = dat_get_and_invalidate(table_root, vaddr, level, &old);
+	*ptr = old & ~prot;
 }
 
 void protect_range(void *start, unsigned long len, unsigned long prot)
@@ -102,7 +314,7 @@ void protect_range(void *start, unsigned long len, unsigned long prot)
 
 	len &= PAGE_MASK;
 	for (; len; len -= PAGE_SIZE, curr += PAGE_SIZE)
-		protect_page((void *)curr, prot);
+		protect_dat_entry((void *)curr, prot, 5);
 }
 
 void unprotect_range(void *start, unsigned long len, unsigned long prot)
@@ -111,7 +323,7 @@ void unprotect_range(void *start, unsigned long len, unsigned long prot)
 
 	len &= PAGE_MASK;
 	for (; len; len -= PAGE_SIZE, curr += PAGE_SIZE)
-		unprotect_page((void *)curr, prot);
+		unprotect_dat_entry((void *)curr, prot, 5);
 }
 
 static void setup_identity(pgd_t *pgtable, phys_addr_t start_addr,
-- 
2.26.2



* [kvm-unit-tests PATCH v2 5/5] s390x: edat test
  2021-02-23 14:07 [kvm-unit-tests PATCH v2 0/5] s390: Add support for large pages Claudio Imbrenda
                   ` (3 preceding siblings ...)
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 4/5] s390x: mmu: add support for large pages Claudio Imbrenda
@ 2021-02-23 14:07 ` Claudio Imbrenda
  2021-02-23 14:57   ` Janosch Frank
  4 siblings, 1 reply; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 14:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Simple EDAT test.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 s390x/Makefile      |   1 +
 s390x/edat.c        | 285 ++++++++++++++++++++++++++++++++++++++++++++
 s390x/unittests.cfg |   3 +
 3 files changed, 289 insertions(+)
 create mode 100644 s390x/edat.c

diff --git a/s390x/Makefile b/s390x/Makefile
index 08d85c9f..fc885150 100644
--- a/s390x/Makefile
+++ b/s390x/Makefile
@@ -20,6 +20,7 @@ tests += $(TEST_DIR)/sclp.elf
 tests += $(TEST_DIR)/css.elf
 tests += $(TEST_DIR)/uv-guest.elf
 tests += $(TEST_DIR)/sie.elf
+tests += $(TEST_DIR)/edat.elf
 
 tests_binary = $(patsubst %.elf,%.bin,$(tests))
 ifneq ($(HOST_KEY_DOCUMENT),)
diff --git a/s390x/edat.c b/s390x/edat.c
new file mode 100644
index 00000000..0bd16efc
--- /dev/null
+++ b/s390x/edat.c
@@ -0,0 +1,285 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * EDAT test.
+ *
+ * Copyright (c) 2021 IBM Corp
+ *
+ * Authors:
+ *	Claudio Imbrenda <imbrenda@linux.ibm.com>
+ */
+#include <libcflat.h>
+#include <vmalloc.h>
+#include <asm/facility.h>
+#include <asm/interrupt.h>
+#include <mmu.h>
+#include <asm/pgtable.h>
+#include <asm-generic/barrier.h>
+
+#define PGD_PAGE_SHIFT (REGION1_SHIFT - PAGE_SHIFT)
+
+#define TEID_ADDR	PAGE_MASK
+#define TEID_AI		0x003
+#define TEID_M		0x004
+#define TEID_A		0x008
+#define TEID_FS		0xc00
+
+#define LC_SIZE	(2 * PAGE_SIZE)
+#define VIRT(x)	((void *)((unsigned long)(x) + (unsigned long)mem))
+
+static uint8_t prefix_buf[LC_SIZE] __attribute__((aligned(LC_SIZE)));
+static unsigned int tmp[1024] __attribute__((aligned(PAGE_SIZE)));
+static void *root, *mem, *m;
+static struct lowcore *lc;
+volatile unsigned int *p;
+
+/* Expect a program interrupt, and clear the TEID */
+static void expect_dat_fault(void)
+{
+	expect_pgm_int();
+	lc->trans_exc_id = 0;
+}
+
+/*
+ * Check if a non-access-list protection exception happened for the given
+ * address, in the primary address space.
+ */
+static bool check_pgm_prot(void *ptr)
+{
+	unsigned long teid = lc->trans_exc_id;
+
+	if (lc->pgm_int_code != PGM_INT_CODE_PROTECTION)
+		return false;
+	/*
+	 * if for any reason TEID_M is not present, the rest of the field is
+	 * not meaningful, so no point in checking it
+	 */
+	if (~teid & TEID_M)
+		return true;
+	return (~teid & TEID_A) &&
+		((teid & TEID_ADDR) == ((uint64_t)ptr & PAGE_MASK)) &&
+		!(teid & TEID_AI);
+}
+
+static void test_dat(void)
+{
+	report_prefix_push("edat off");
+	/* disable EDAT */
+	ctl_clear_bit(0, 23);
+
+	/* Check some basics */
+	p[0] = 42;
+	report(p[0] == 42, "pte, r/w");
+	p[0] = 0;
+
+	/* Write protect the page and try to write, expect a fault */
+	protect_page(m, PAGE_ENTRY_P);
+	expect_dat_fault();
+	p[0] = 42;
+	unprotect_page(m, PAGE_ENTRY_P);
+	report(!p[0] && check_pgm_prot(m), "pte, ro");
+
+	/*
+	 * The FC bit (for large pages) should be ignored because EDAT is
+	 * off. We set a value and then we try to read it back again after
+	 * setting the FC bit. This way we can check if large pages were
+	 * erroneously enabled despite EDAT being off.
+	 */
+	p[0] = 42;
+	protect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
+	report(p[0] == 42, "pmd, fc=1, r/w");
+	unprotect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
+	p[0] = 0;
+
+	/*
+	 * Segment protection should work even with EDAT off, try to write
+	 * anyway and expect a fault
+	 */
+	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+	expect_dat_fault();
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pmd, ro");
+	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+
+	/* The FC bit should be ignored because EDAT is off, like above */
+	p[0] = 42;
+	protect_dat_entry(m, REGION3_ENTRY_FC, 3);
+	report(p[0] == 42, "pud, fc=1, r/w");
+	unprotect_dat_entry(m, REGION3_ENTRY_FC, 3);
+	p[0] = 0;
+
+	/*
+	 * Region1/2/3 protection should not work, because EDAT is off.
+	 * Protect the various region1/2/3 entries and write, expect the
+	 * write to be successful.
+	 */
+	protect_dat_entry(m, REGION_ENTRY_P, 3);
+	p[0] = 42;
+	report(p[0] == 42, "pud, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
+	p[0] = 0;
+
+	protect_dat_entry(m, REGION_ENTRY_P, 2);
+	p[0] = 42;
+	report(p[0] == 42, "p4d, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
+	p[0] = 0;
+
+	protect_dat_entry(m, REGION_ENTRY_P, 1);
+	p[0] = 42;
+	report(p[0] == 42, "pgd, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
+	p[0] = 0;
+
+	report_prefix_pop();
+}
+
+static void test_edat1(void)
+{
+	report_prefix_push("edat1");
+	/* Enable EDAT */
+	ctl_set_bit(0, 23);
+	p[0] = 0;
+
+	/*
+	 * Segment protection should work normally, try to write and expect
+	 * a fault.
+	 */
+	expect_dat_fault();
+	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pmd, ro");
+	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+
+	/*
+	 * Region1/2/3 protection should work now, because EDAT is on. Try
+	 * to write anyway and expect a fault.
+	 */
+	expect_dat_fault();
+	protect_dat_entry(m, REGION_ENTRY_P, 3);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pud, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
+
+	expect_dat_fault();
+	protect_dat_entry(m, REGION_ENTRY_P, 2);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "p4d, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
+
+	expect_dat_fault();
+	protect_dat_entry(m, REGION_ENTRY_P, 1);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pgd, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
+
+	/* Large pages should work */
+	p[0] = 42;
+	install_large_page(root, 0, mem);
+	report(p[0] == 42, "pmd, large");
+
+	/*
+	 * Prefixing should not work with large pages. Since the lower
+	 * addresses are mapped with small pages, which are subject to
+	 * prefixing, and the pages mapped with large pages are not subject
+	 * to prefixing, this is the resulting scenario:
+	 *
+	 * virtual 0 = real 0 -> absolute prefix_buf
+	 * virtual prefix_buf = real prefix_buf -> absolute 0
+	 * VIRT(0) -> absolute 0
+	 * VIRT(prefix_buf) -> absolute prefix_buf
+	 *
+	 * The testcase checks if the memory at virtual 0 has the same
+	 * content as the memory at VIRT(prefix_buf) and the memory at
+	 * VIRT(0) has the same content as the memory at virtual prefix_buf.
+	 * If prefixing is erroneously applied for large pages, the testcase
+	 * will therefore fail.
+	 */
+	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
+		!memcmp(prefix_buf, VIRT(0), LC_SIZE),
+		"pmd, large, prefixing");
+
+	report_prefix_pop();
+}
+
+static void test_edat2(void)
+{
+	report_prefix_push("edat2");
+	p[0] = 42;
+
+	/* Huge pages should work */
+	install_huge_page(root, 0, mem);
+	report(p[0] == 42, "pud, huge");
+
+	/* Prefixing should not work with huge pages, just like large pages */
+	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
+		!memcmp(prefix_buf, VIRT(0), LC_SIZE),
+		"pud, huge, prefixing");
+
+	report_prefix_pop();
+}
+
+static unsigned int setup(void)
+{
+	bool has_edat1 = test_facility(8);
+	bool has_edat2 = test_facility(78);
+	unsigned long pa, va;
+
+	if (has_edat2 && !has_edat1)
+		report_abort("EDAT2 available, but EDAT1 not available");
+
+	/* Setup DAT 1:1 mapping and memory management */
+	setup_vm();
+	root = (void *)(stctg(1) & PAGE_MASK);
+
+	/*
+	 * Get a pgd worth of virtual memory, so we can test things later
+	 * without interfering with the test code or the interrupt handler
+	 */
+	mem = alloc_vpages_aligned(BIT_ULL(PGD_PAGE_SHIFT), PGD_PAGE_SHIFT);
+	assert(mem);
+	va = (unsigned long)mem;
+
+	/* Map the first 1GB of real memory */
+	for (pa = 0; pa < SZ_1G; pa += PAGE_SIZE, va += PAGE_SIZE)
+		install_page(root, pa, (void *)va);
+
+	/*
+	 * Move the lowcore to a known non-zero location. This is needed
+	 * later to check whether prefixing is working with large pages.
+	 */
+	assert((unsigned long)&prefix_buf < SZ_2G);
+	memcpy(prefix_buf, 0, LC_SIZE);
+	set_prefix((uint32_t)(uintptr_t)prefix_buf);
+	/* Clear the old copy */
+	memset(prefix_buf, 0, LC_SIZE);
+
+	/* m will point to tmp through the new virtual mapping */
+	m = VIRT(&tmp);
+	/* p is the same as m but volatile */
+	p = (volatile unsigned int *)m;
+
+	return has_edat1 + has_edat2;
+}
+
+int main(void)
+{
+	unsigned int edat;
+
+	report_prefix_push("edat");
+	edat = setup();
+
+	test_dat();
+
+	if (edat)
+		test_edat1();
+	else
+		report_skip("EDAT not available");
+
+	if (edat >= 2)
+		test_edat2();
+	else
+		report_skip("EDAT2 not available");
+
+	report_prefix_pop();
+	return report_summary();
+}
diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
index 2298be6c..8da4a0a3 100644
--- a/s390x/unittests.cfg
+++ b/s390x/unittests.cfg
@@ -99,3 +99,6 @@ file = uv-guest.elf
 
 [sie]
 file = sie.elf
+
+[edat]
+file = edat.elf
-- 
2.26.2



* Re: [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h Claudio Imbrenda
@ 2021-02-23 14:31   ` Janosch Frank
  2021-02-23 15:21     ` Claudio Imbrenda
  0 siblings, 1 reply; 14+ messages in thread
From: Janosch Frank @ 2021-02-23 14:31 UTC (permalink / raw)
  To: Claudio Imbrenda, kvm; +Cc: linux-s390, david, thuth, cohuck, pmorel

On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
> Fix pgtable.h:
> 
> * SEGMENT_ENTRY_SFAA had one extra bit set
> * pmd entries don't have a length
> * ipte does not need to clear the lower bits
> * pud entries should use SEGMENT_TABLE_LENGTH, as they point to segment tables
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  lib/s390x/asm/pgtable.h | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
> index 277f3480..a2ff2d4e 100644
> --- a/lib/s390x/asm/pgtable.h
> +++ b/lib/s390x/asm/pgtable.h
> @@ -60,7 +60,7 @@
>  #define SEGMENT_SHIFT			20
>  
>  #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
> -#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
> +#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
>  #define SEGMENT_ENTRY_AV		0x0000000000010000UL
>  #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
>  #define SEGMENT_ENTRY_F			0x0000000000000800UL
> @@ -183,7 +183,7 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
>  	if (pud_none(*pud)) {
>  		pmd_t *pmd = pmd_alloc_one();
>  		pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 |
> -				REGION_TABLE_LENGTH;
> +				SEGMENT_TABLE_LENGTH;

@David: I'd much rather have REGION_ENTRY_LENGTH instead of
REGION_TABLE_LENGTH and SEGMENT_TABLE_LENGTH.

My argument is that this is not really an attribute of the table and
very much specific to the format of the (region table) entry. We already
have the table order as a length anyway...

Could you tell me what you had in mind when splitting this?

>  	}
>  	return pmd_offset(pud, addr);
>  }
> @@ -202,15 +202,14 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
>  {
>  	if (pmd_none(*pmd)) {
>  		pte_t *pte = pte_alloc_one();
> -		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
> -				SEGMENT_TABLE_LENGTH;
> +		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;

Uhhhh good catch!

>  	}
>  	return pte_offset(pmd, addr);
>  }
>  
>  static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
>  {
> -	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
> +	unsigned long table_origin = (unsigned long)p_pte;
>  
>  	asm volatile(
>  		"	ipte %0,%1\n"
> 

IPTE ignores that data, but having the page mask also doesn't hurt, so
generally this is not a fix, right?



* Re: [kvm-unit-tests PATCH v2 3/5] s390x: lib: improve pgtable.h
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 3/5] s390x: lib: improve pgtable.h Claudio Imbrenda
@ 2021-02-23 14:35   ` Janosch Frank
  2021-02-23 15:21     ` Claudio Imbrenda
  0 siblings, 1 reply; 14+ messages in thread
From: Janosch Frank @ 2021-02-23 14:35 UTC (permalink / raw)
  To: Claudio Imbrenda, kvm; +Cc: linux-s390, david, thuth, cohuck, pmorel

On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
> Improve pgtable.h:
> 
> * add macros to check whether a pmd or a pud are large / huge
> * add idte functions for pmd, pud, p4d and pgd
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

Could you please make the subject more specific?
"s390x: lib: Add idte and huge entry check functions"

Acked-by: Janosch Frank <frankja@linux.ibm.com>

> ---
>  lib/s390x/asm/pgtable.h | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
> index a2ff2d4e..70d4afde 100644
> --- a/lib/s390x/asm/pgtable.h
> +++ b/lib/s390x/asm/pgtable.h
> @@ -100,6 +100,9 @@
>  #define pmd_none(entry) (pmd_val(entry) & SEGMENT_ENTRY_I)
>  #define pte_none(entry) (pte_val(entry) & PAGE_ENTRY_I)
>  
> +#define pud_huge(entry)  (pud_val(entry) & REGION3_ENTRY_FC)
> +#define pmd_large(entry) (pmd_val(entry) & SEGMENT_ENTRY_FC)
> +
>  #define pgd_addr(entry) __va(pgd_val(entry) & REGION_ENTRY_ORIGIN)
>  #define p4d_addr(entry) __va(p4d_val(entry) & REGION_ENTRY_ORIGIN)
>  #define pud_addr(entry) __va(pud_val(entry) & REGION_ENTRY_ORIGIN)
> @@ -216,6 +219,34 @@ static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
>  		: : "a" (table_origin), "a" (vaddr) : "memory");
>  }
>  
> +static inline void idte(unsigned long table_origin, unsigned long vaddr)
> +{
> +	vaddr &= SEGMENT_ENTRY_SFAA;
> +	asm volatile(
> +		"	idte %0,0,%1\n"
> +		: : "a" (table_origin), "a" (vaddr) : "memory");
> +}
> +
> +static inline void idte_pmdp(unsigned long vaddr, pmdval_t *pmdp)
> +{
> +	idte((unsigned long)(pmdp - pmd_index(vaddr)) | ASCE_DT_SEGMENT, vaddr);
> +}
> +
> +static inline void idte_pudp(unsigned long vaddr, pudval_t *pudp)
> +{
> +	idte((unsigned long)(pudp - pud_index(vaddr)) | ASCE_DT_REGION3, vaddr);
> +}
> +
> +static inline void idte_p4dp(unsigned long vaddr, p4dval_t *p4dp)
> +{
> +	idte((unsigned long)(p4dp - p4d_index(vaddr)) | ASCE_DT_REGION2, vaddr);
> +}
> +
> +static inline void idte_pgdp(unsigned long vaddr, pgdval_t *pgdp)
> +{
> +	idte((unsigned long)(pgdp - pgd_index(vaddr)) | ASCE_DT_REGION1, vaddr);
> +}
> +
>  void configure_dat(int enable);
>  
>  #endif /* _ASMS390X_PGTABLE_H_ */
> 



* Re: [kvm-unit-tests PATCH v2 5/5] s390x: edat test
  2021-02-23 14:07 ` [kvm-unit-tests PATCH v2 5/5] s390x: edat test Claudio Imbrenda
@ 2021-02-23 14:57   ` Janosch Frank
  2021-02-23 15:22     ` Claudio Imbrenda
  0 siblings, 1 reply; 14+ messages in thread
From: Janosch Frank @ 2021-02-23 14:57 UTC (permalink / raw)
  To: Claudio Imbrenda, kvm; +Cc: linux-s390, david, thuth, cohuck, pmorel

On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
> Simple EDAT test.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  s390x/Makefile      |   1 +
>  s390x/edat.c        | 285 ++++++++++++++++++++++++++++++++++++++++++++
>  s390x/unittests.cfg |   3 +
>  3 files changed, 289 insertions(+)
>  create mode 100644 s390x/edat.c
> 
> diff --git a/s390x/Makefile b/s390x/Makefile
> index 08d85c9f..fc885150 100644
> --- a/s390x/Makefile
> +++ b/s390x/Makefile
> @@ -20,6 +20,7 @@ tests += $(TEST_DIR)/sclp.elf
>  tests += $(TEST_DIR)/css.elf
>  tests += $(TEST_DIR)/uv-guest.elf
>  tests += $(TEST_DIR)/sie.elf
> +tests += $(TEST_DIR)/edat.elf
>  
>  tests_binary = $(patsubst %.elf,%.bin,$(tests))
>  ifneq ($(HOST_KEY_DOCUMENT),)
> diff --git a/s390x/edat.c b/s390x/edat.c
> new file mode 100644
> index 00000000..0bd16efc
> --- /dev/null
> +++ b/s390x/edat.c
> @@ -0,0 +1,285 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * EDAT test.
> + *
> + * Copyright (c) 2021 IBM Corp
> + *
> + * Authors:
> + *	Claudio Imbrenda <imbrenda@linux.ibm.com>
> + */
> +#include <libcflat.h>
> +#include <vmalloc.h>
> +#include <asm/facility.h>
> +#include <asm/interrupt.h>
> +#include <mmu.h>
> +#include <asm/pgtable.h>
> +#include <asm-generic/barrier.h>
> +
> +#define PGD_PAGE_SHIFT (REGION1_SHIFT - PAGE_SHIFT)
> +
> +#define TEID_ADDR	PAGE_MASK
> +#define TEID_AI		0x003
> +#define TEID_M		0x004
> +#define TEID_A		0x008
> +#define TEID_FS		0xc00


I'm thinking about presenting faults in more detail in the future i.e.
printing the TEID or at least the important parts of it. I'd guess this
should go into the library.
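
A minimal sketch of what such a decode helper might look like (hypothetical function names; the TEID_* masks are the ones defined in the patch, with TEID_ADDR being PAGE_MASK):

```c
#include <stdint.h>
#include <stdio.h>

/* TEID bits as defined in the patch above; TEID_ADDR is PAGE_MASK */
#define TEID_ADDR	0xfffffffffffff000UL
#define TEID_AI		0x003UL
#define TEID_M		0x004UL
#define TEID_A		0x008UL
#define TEID_FS		0xc00UL

/* Hypothetical helper: extract the faulting page address */
static uint64_t teid_fault_addr(uint64_t teid)
{
	return teid & TEID_ADDR;
}

/* Hypothetical helper: print the interesting parts of the TEID */
static void print_teid(uint64_t teid)
{
	printf("TEID: addr=0x%lx", (unsigned long)teid_fault_addr(teid));
	/* the remaining fields are only meaningful when the M bit is set */
	if (teid & TEID_M)
		printf(" AI=%lu A=%d FS=%lu",
		       (unsigned long)(teid & TEID_AI),
		       !!(teid & TEID_A),
		       (unsigned long)((teid & TEID_FS) >> 10));
	printf("\n");
}
```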


> +
> +#define LC_SIZE	(2 * PAGE_SIZE)
> +#define VIRT(x)	((void *)((unsigned long)(x) + (unsigned long)mem))
> +
> +static uint8_t prefix_buf[LC_SIZE] __attribute__((aligned(LC_SIZE)));
> +static unsigned int tmp[1024] __attribute__((aligned(PAGE_SIZE)));
> +static void *root, *mem, *m;
> +static struct lowcore *lc;
> +volatile unsigned int *p;
> +
> +/* Expect a program interrupt, and clear the TEID */
> +static void expect_dat_fault(void)
> +{
> +	expect_pgm_int();
> +	lc->trans_exc_id = 0;
> +}
> +
> +/*
> + * Check if a non-access-list protection exception happened for the given
> + * address, in the primary address space.
> + */
> +static bool check_pgm_prot(void *ptr)
> +{
> +	unsigned long teid = lc->trans_exc_id;
> +
> +	if (lc->pgm_int_code != PGM_INT_CODE_PROTECTION)
> +		return false;
> +	/*
> +	 * if for any reason TEID_M is not present, the rest of the field is
> +	 * not meaningful, so no point in checking it
> +	 */
> +	if (~teid & TEID_M)
> +		return true;
> +	return (~teid & TEID_A) &&
> +		((teid & TEID_ADDR) == ((uint64_t)ptr & PAGE_MASK)) &&
> +		!(teid & TEID_AI);
> +}
> +
> +static void test_dat(void)
> +{
> +	report_prefix_push("edat off");
> +	/* disable EDAT */
> +	ctl_clear_bit(0, 23);

It's about time we properly define CTL0 bits.
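
A sketch of what such defines could look like (hypothetical names; in the POP, control register bits are numbered from the MSB, and CR0 bit 40 is the enhanced-DAT enablement control, i.e. the literal 23 used above when counted from the LSB):

```c
/*
 * Hypothetical CTL0 bit defines.  The POP numbers control register bits
 * from the MSB (bit 0), while ctl_set_bit()/ctl_clear_bit() take the
 * position counted from the LSB, hence the 63 - n conversion.
 */
#define CTL0_EDAT	(63 - 40)	/* enhanced-DAT enablement control */

/* usage would then read: ctl_clear_bit(0, CTL0_EDAT); */
```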

> +
> +	/* Check some basics */
> +	p[0] = 42;
> +	report(p[0] == 42, "pte, r/w");
> +	p[0] = 0;
> +
> +	/* Write protect the page and try to write, expect a fault */
> +	protect_page(m, PAGE_ENTRY_P);
> +	expect_dat_fault();
> +	p[0] = 42;
> +	unprotect_page(m, PAGE_ENTRY_P);
> +	report(!p[0] && check_pgm_prot(m), "pte, ro");
> +
> +	/*
> +	 * The FC bit (for large pages) should be ignored because EDAT is
> +	 * off. We set a value and then we try to read it back again after
> +	 * setting the FC bit. This way we can check if large pages were
> +	 * erroneously enabled despite EDAT being off.
> +	 */
> +	p[0] = 42;
> +	protect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> +	report(p[0] == 42, "pmd, fc=1, r/w");
> +	unprotect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> +	p[0] = 0;
> +
> +	/*
> +	 * Segment protection should work even with EDAT off, try to write
> +	 * anyway and expect a fault
> +	 */
> +	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> +	expect_dat_fault();
> +	p[0] = 42;
> +	report(!p[0] && check_pgm_prot(m), "pmd, ro");
> +	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> +
> +	/* The FC bit should be ignored because EDAT is off, like above */
> +	p[0] = 42;
> +	protect_dat_entry(m, REGION3_ENTRY_FC, 3);
> +	report(p[0] == 42, "pud, fc=1, r/w");
> +	unprotect_dat_entry(m, REGION3_ENTRY_FC, 3);
> +	p[0] = 0;
> +
> +	/*
> +	 * Region1/2/3 protection should not work, because EDAT is off.
> +	 * Protect the various region1/2/3 entries and write, expect the
> +	 * write to be successful.
> +	 */
> +	protect_dat_entry(m, REGION_ENTRY_P, 3);
> +	p[0] = 42;
> +	report(p[0] == 42, "pud, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
> +	p[0] = 0;
> +
> +	protect_dat_entry(m, REGION_ENTRY_P, 2);
> +	p[0] = 42;
> +	report(p[0] == 42, "p4d, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
> +	p[0] = 0;
> +
> +	protect_dat_entry(m, REGION_ENTRY_P, 1);
> +	p[0] = 42;
> +	report(p[0] == 42, "pgd, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
> +	p[0] = 0;
> +
> +	report_prefix_pop();
> +}
> +
> +static void test_edat1(void)
> +{
> +	report_prefix_push("edat1");
> +	/* Enable EDAT */
> +	ctl_set_bit(0, 23);
> +	p[0] = 0;
> +
> +	/*
> +	 * Segment protection should work normally, try to write and expect
> +	 * a fault.
> +	 */
> +	expect_dat_fault();
> +	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> +	p[0] = 42;
> +	report(!p[0] && check_pgm_prot(m), "pmd, ro");
> +	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> +
> +	/*
> +	 * Region1/2/3 protection should work now, because EDAT is on. Try
> +	 * to write anyway and expect a fault.
> +	 */
> +	expect_dat_fault();
> +	protect_dat_entry(m, REGION_ENTRY_P, 3);
> +	p[0] = 42;
> +	report(!p[0] && check_pgm_prot(m), "pud, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
> +
> +	expect_dat_fault();
> +	protect_dat_entry(m, REGION_ENTRY_P, 2);
> +	p[0] = 42;
> +	report(!p[0] && check_pgm_prot(m), "p4d, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
> +
> +	expect_dat_fault();
> +	protect_dat_entry(m, REGION_ENTRY_P, 1);
> +	p[0] = 42;
> +	report(!p[0] && check_pgm_prot(m), "pgd, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
> +
> +	/* Large pages should work */
> +	p[0] = 42;
> +	install_large_page(root, 0, mem);
> +	report(p[0] == 42, "pmd, large");
> +
> +	/*
> +	 * Prefixing should not work with large pages. Since the lower
> +	 * addresses are mapped with small pages, which are subject to
> +	 * prefixing, and the pages mapped with large pages are not subject
> +	 * to prefixing, this is the resulting scenario:
> +	 *
> +	 * virtual 0 = real 0 -> absolute prefix_buf
> +	 * virtual prefix_buf = real prefix_buf -> absolute 0
> +	 * VIRT(0) -> absolute 0
> +	 * VIRT(prefix_buf) -> absolute prefix_buf
> +	 *
> +	 * The testcase checks if the memory at virtual 0 has the same
> +	 * content as the memory at VIRT(prefix_buf) and the memory at
> +	 * VIRT(0) has the same content as the memory at virtual prefix_buf.
> +	 * If prefixing is erroneously applied for large pages, the testcase
> +	 * will therefore fail.
> +	 */
> +	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
> +		!memcmp(prefix_buf, VIRT(0), LC_SIZE),
> +		"pmd, large, prefixing");
> +
> +	report_prefix_pop();
> +}
> +
> +static void test_edat2(void)
> +{
> +	report_prefix_push("edat2");
> +	p[0] = 42;
> +
> +	/* Huge pages should work */
> +	install_huge_page(root, 0, mem);
> +	report(p[0] == 42, "pud, huge");
> +
> +	/* Prefixing should not work with huge pages, just like large pages */
> +	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
> +		!memcmp(prefix_buf, VIRT(0), LC_SIZE),
> +		"pmd, large, prefixing");
> +
> +	report_prefix_pop();
> +}
> +
> +static unsigned int setup(void)
> +{
> +	bool has_edat1 = test_facility(8);
> +	bool has_edat2 = test_facility(78);
> +	unsigned long pa, va;
> +
> +	if (has_edat2 && !has_edat1)
> +		report_abort("EDAT2 available, but EDAT1 not available");
> +
> +	/* Setup DAT 1:1 mapping and memory management */
> +	setup_vm();
> +	root = (void *)(stctg(1) & PAGE_MASK);
> +
> +	/*
> +	 * Get a pgd worth of virtual memory, so we can test things later
> +	 * without interfering with the test code or the interrupt handler
> +	 */
> +	mem = alloc_vpages_aligned(BIT_ULL(PGD_PAGE_SHIFT), PGD_PAGE_SHIFT);
> +	assert(mem);
> +	va = (unsigned long)mem;
> +
> +	/* Map the first 1GB of real memory */
> +	for (pa = 0; pa < SZ_1G; pa += PAGE_SIZE, va += PAGE_SIZE)
> +		install_page(root, pa, (void *)va);
> +
> +	/*
> +	 * Move the lowcore to a known non-zero location. This is needed
> +	 * later to check whether prefixing is working with large pages.
> +	 */
> +	assert((unsigned long)&prefix_buf < SZ_2G);
> +	memcpy(prefix_buf, 0, LC_SIZE);
> +	set_prefix((uint32_t)(uintptr_t)prefix_buf);
> +	/* Clear the old copy */
> +	memset(prefix_buf, 0, LC_SIZE);
> +
> +	/* m will point to tmp through the new virtual mapping */
> +	m = VIRT(&tmp);
> +	/* p is the same as m but volatile */
> +	p = (volatile unsigned int *)m;
> +
> +	return has_edat1 + has_edat2;
> +}
> +
> +int main(void)
> +{
> +	unsigned int edat;
> +
> +	report_prefix_push("edat");
> +	edat = setup();
> +
> +	test_dat();
> +
> +	if (edat)
> +		test_edat1();
> +	else
> +		report_skip("EDAT not available");
> +
> +	if (edat >= 2)
> +		test_edat2();
> +	else
> +		report_skip("EDAT2 not available");
> +
> +	report_prefix_pop();
> +	return report_summary();
> +}
> diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
> index 2298be6c..8da4a0a3 100644
> --- a/s390x/unittests.cfg
> +++ b/s390x/unittests.cfg
> @@ -99,3 +99,6 @@ file = uv-guest.elf
>  
>  [sie]
>  file = sie.elf
> +
> +[edat]
> +file = edat.elf
> 



* Re: [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h
  2021-02-23 14:31   ` Janosch Frank
@ 2021-02-23 15:21     ` Claudio Imbrenda
  2021-02-23 15:44       ` Janosch Frank
  0 siblings, 1 reply; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 15:21 UTC (permalink / raw)
  To: Janosch Frank; +Cc: kvm, linux-s390, david, thuth, cohuck, pmorel

On Tue, 23 Feb 2021 15:31:06 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
> > Fix pgtable.h:
> > 
> > * SEGMENT_ENTRY_SFAA had one extra bit set
> > * pmd entries don't have a length
> > * ipte does not need to clear the lower bits
> > * pud entries should use SEGMENT_TABLE_LENGTH, as they point to
> > segment tables
> > 
> > Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > ---
> >  lib/s390x/asm/pgtable.h | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
> > index 277f3480..a2ff2d4e 100644
> > --- a/lib/s390x/asm/pgtable.h
> > +++ b/lib/s390x/asm/pgtable.h
> > @@ -60,7 +60,7 @@
> >  #define SEGMENT_SHIFT			20
> >  
> >  #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
> > -#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
> > +#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
> >  #define SEGMENT_ENTRY_AV		0x0000000000010000UL
> >  #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
> >  #define SEGMENT_ENTRY_F			0x0000000000000800UL
> > @@ -183,7 +183,7 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
> >  	if (pud_none(*pud)) {
> >  		pmd_t *pmd = pmd_alloc_one();
> >  		pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 |
> > -				REGION_TABLE_LENGTH;
> > +				SEGMENT_TABLE_LENGTH;
> 
> @David: I'd much rather have REGION_ENTRY_LENGTH instead of
> REGION_TABLE_LENGTH and SEGMENT_TABLE_LENGTH.

I'm weakly against it

> My argument is that this is not really an attribute of the table and

it actually is an attribute of the table. the *_TABLE_LENGTH fields
tell how long the _table pointed to_ is. we always allocate the full 4
pages, so in our case it will always be 0x3.

segment table entries don't have a length because they point to page
tables. region3 table entries point to segment tables, so they have
SEGMENT_TABLE_LENGTH in their length field.
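
As a rough illustration of that reading (constant values assumed from the s390x entry layout, not taken verbatim from pgtable.h): a region-3 entry combines the 4KB-aligned origin of the segment table it points to with its TT bits and the length of that table, encoded as number of 4KB blocks minus one.

```c
#include <stdint.h>

/* assumed values, following the s390x entry layout discussed here */
#define REGION_ENTRY_TT_REGION3	0x0000000000000004UL	/* TT bits = 01 */
#define SEGMENT_TABLE_LENGTH	0x0000000000000003UL	/* 4 pages -> 4 - 1 = 3 */

/* sketch: compose a region-3 entry pointing to a full segment table */
static uint64_t make_region3_entry(uint64_t segment_table_origin)
{
	/* the origin is 4KB aligned; TT and length live in the low bits */
	return (segment_table_origin & ~0xfffUL) |
	       REGION_ENTRY_TT_REGION3 | SEGMENT_TABLE_LENGTH;
}
```

Since all tables are allocated as the full four pages, the length field is always 0x3; the point is that it describes the table the entry points to, not the table the entry lives in.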

> very much specific to the format of the (region table) entry. We
> already have the table order as a length anyway...
> 
> Could you tell me what you had in mind when splitting this?
> 
> >  	}
> >  	return pmd_offset(pud, addr);
> >  }
> > @@ -202,15 +202,14 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
> >  {
> >  	if (pmd_none(*pmd)) {
> >  		pte_t *pte = pte_alloc_one();
> > -		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
> > -				SEGMENT_TABLE_LENGTH;
> > +		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
> 
> Uhhhh good catch!
> 
> >  	}
> >  	return pte_offset(pmd, addr);
> >  }
> >  
> >  static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
> >  {
> > -	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
> > +	unsigned long table_origin = (unsigned long)p_pte;
> >  
> >  	asm volatile(
> >  		"	ipte %0,%1\n"
> >   
> 
> IPTE ignores that data but having the page mask also doesn't hurt so
> generally this is not a fix, right?

apart from avoiding an unnecessary operation, there is a small nit:
IPTE wants the address of the page table, ignoring the rightmost _11_
bits. with PAGE_MASK we ignore the rightmost _12_ bits instead.
it's not an issue in practice, because we allocate one page table per
page anyway, wasting the second half of the page, so in our case that
stray bit will always be 0. but in case we decide to allocate the page
tables more tightly, or in case some testcase wants to manually play
tricks with the page tables, there might be the risk that IPTE would
target the wrong page table.

by not clearing the lower 12 bits we not only save an unnecessary
operation, but we are actually more architecturally correct.
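
The difference between the two masks is easy to demonstrate (hypothetical addresses; PAGE_MASK clears the low 12 bits, while IPTE ignores only the low 11, since a page table occupies 2KB):

```c
#include <stdint.h>

#define PAGE_MASK	0xfffffffffffff000UL	/* clears the low 12 bits */
#define PTO_MASK	0xfffffffffffff800UL	/* IPTE ignores only the low 11 bits */

/* what the old code computed before handing the register to IPTE */
static uint64_t origin_with_page_mask(uint64_t pte_addr)
{
	return pte_addr & PAGE_MASK;
}

/* what IPTE itself uses as the page-table origin */
static uint64_t origin_seen_by_ipte(uint64_t pte_addr)
{
	return pte_addr & PTO_MASK;
}
```

For a page table placed in the second half of a page, applying PAGE_MASK first would point IPTE at the wrong (lower) page table.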




* Re: [kvm-unit-tests PATCH v2 3/5] s390x: lib: improve pgtable.h
  2021-02-23 14:35   ` Janosch Frank
@ 2021-02-23 15:21     ` Claudio Imbrenda
  0 siblings, 0 replies; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 15:21 UTC (permalink / raw)
  To: Janosch Frank; +Cc: kvm, linux-s390, david, thuth, cohuck, pmorel

On Tue, 23 Feb 2021 15:35:28 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
> > Improve pgtable.h:
> > 
> > * add macros to check whether a pmd or a pud are large / huge
> > * add idte functions for pmd, pud, p4d and pgd
> > 
> > Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>  
> 
> Could you please make the subject more specific?
> "s390x: lib: Add idte and huge entry check functions"

will do

> Acked-by: Janosch Frank <frankja@linux.ibm.com>
> 
> > ---
> >  lib/s390x/asm/pgtable.h | 31 +++++++++++++++++++++++++++++++
> >  1 file changed, 31 insertions(+)
> > 
> > diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
> > index a2ff2d4e..70d4afde 100644
> > --- a/lib/s390x/asm/pgtable.h
> > +++ b/lib/s390x/asm/pgtable.h
> > @@ -100,6 +100,9 @@
> >  #define pmd_none(entry) (pmd_val(entry) & SEGMENT_ENTRY_I)
> >  #define pte_none(entry) (pte_val(entry) & PAGE_ENTRY_I)
> >  
> > +#define pud_huge(entry)  (pud_val(entry) & REGION3_ENTRY_FC)
> > +#define pmd_large(entry) (pmd_val(entry) & SEGMENT_ENTRY_FC)
> > +
> >  #define pgd_addr(entry) __va(pgd_val(entry) & REGION_ENTRY_ORIGIN)
> >  #define p4d_addr(entry) __va(p4d_val(entry) & REGION_ENTRY_ORIGIN)
> >  #define pud_addr(entry) __va(pud_val(entry) & REGION_ENTRY_ORIGIN)
> > @@ -216,6 +219,34 @@ static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
> >  		: : "a" (table_origin), "a" (vaddr) : "memory");
> >  }
> >  
> > +static inline void idte(unsigned long table_origin, unsigned long vaddr)
> > +{
> > +	vaddr &= SEGMENT_ENTRY_SFAA;
> > +	asm volatile(
> > +		"	idte %0,0,%1\n"
> > +		: : "a" (table_origin), "a" (vaddr) : "memory");
> > +}
> > +
> > +static inline void idte_pmdp(unsigned long vaddr, pmdval_t *pmdp)
> > +{
> > +	idte((unsigned long)(pmdp - pmd_index(vaddr)) | ASCE_DT_SEGMENT, vaddr);
> > +}
> > +
> > +static inline void idte_pudp(unsigned long vaddr, pudval_t *pudp)
> > +{
> > +	idte((unsigned long)(pudp - pud_index(vaddr)) | ASCE_DT_REGION3, vaddr);
> > +}
> > +
> > +static inline void idte_p4dp(unsigned long vaddr, p4dval_t *p4dp)
> > +{
> > +	idte((unsigned long)(p4dp - p4d_index(vaddr)) | ASCE_DT_REGION2, vaddr);
> > +}
> > +
> > +static inline void idte_pgdp(unsigned long vaddr, pgdval_t *pgdp)
> > +{
> > +	idte((unsigned long)(pgdp - pgd_index(vaddr)) | ASCE_DT_REGION1, vaddr);
> > +}
> > +
> >  void configure_dat(int enable);
> >  
> >  #endif /* _ASMS390X_PGTABLE_H_ */
> >   
> 



* Re: [kvm-unit-tests PATCH v2 5/5] s390x: edat test
  2021-02-23 14:57   ` Janosch Frank
@ 2021-02-23 15:22     ` Claudio Imbrenda
  0 siblings, 0 replies; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 15:22 UTC (permalink / raw)
  To: Janosch Frank; +Cc: kvm, linux-s390, david, thuth, cohuck, pmorel

On Tue, 23 Feb 2021 15:57:30 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
> > Simple EDAT test.
> > 
> > Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > ---
> >  s390x/Makefile      |   1 +
> >  s390x/edat.c        | 285 ++++++++++++++++++++++++++++++++++++++++++++
> >  s390x/unittests.cfg |   3 +
> >  3 files changed, 289 insertions(+)
> >  create mode 100644 s390x/edat.c
> > 
> > diff --git a/s390x/Makefile b/s390x/Makefile
> > index 08d85c9f..fc885150 100644
> > --- a/s390x/Makefile
> > +++ b/s390x/Makefile
> > @@ -20,6 +20,7 @@ tests += $(TEST_DIR)/sclp.elf
> >  tests += $(TEST_DIR)/css.elf
> >  tests += $(TEST_DIR)/uv-guest.elf
> >  tests += $(TEST_DIR)/sie.elf
> > +tests += $(TEST_DIR)/edat.elf
> >  
> >  tests_binary = $(patsubst %.elf,%.bin,$(tests))
> >  ifneq ($(HOST_KEY_DOCUMENT),)
> > diff --git a/s390x/edat.c b/s390x/edat.c
> > new file mode 100644
> > index 00000000..0bd16efc
> > --- /dev/null
> > +++ b/s390x/edat.c
> > @@ -0,0 +1,285 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * EDAT test.
> > + *
> > + * Copyright (c) 2021 IBM Corp
> > + *
> > + * Authors:
> > + *	Claudio Imbrenda <imbrenda@linux.ibm.com>
> > + */
> > +#include <libcflat.h>
> > +#include <vmalloc.h>
> > +#include <asm/facility.h>
> > +#include <asm/interrupt.h>
> > +#include <mmu.h>
> > +#include <asm/pgtable.h>
> > +#include <asm-generic/barrier.h>
> > +
> > +#define PGD_PAGE_SHIFT (REGION1_SHIFT - PAGE_SHIFT)
> > +
> > +#define TEID_ADDR	PAGE_MASK
> > +#define TEID_AI		0x003
> > +#define TEID_M		0x004
> > +#define TEID_A		0x008
> > +#define TEID_FS		0xc00  
> 
> 
> I'm thinking about presenting faults in more detail in the future i.e.
> printing the TEID or at least the important parts of it. I'd guess
> this should go into the library.

will do

> 
> > +
> > +#define LC_SIZE	(2 * PAGE_SIZE)
> > +#define VIRT(x)	((void *)((unsigned long)(x) + (unsigned long)mem))
> > +
> > +static uint8_t prefix_buf[LC_SIZE] __attribute__((aligned(LC_SIZE)));
> > +static unsigned int tmp[1024] __attribute__((aligned(PAGE_SIZE)));
> > +static void *root, *mem, *m;
> > +static struct lowcore *lc;
> > +volatile unsigned int *p;
> > +
> > +/* Expect a program interrupt, and clear the TEID */
> > +static void expect_dat_fault(void)
> > +{
> > +	expect_pgm_int();
> > +	lc->trans_exc_id = 0;
> > +}
> > +
> > +/*
> > + * Check if a non-access-list protection exception happened for the given
> > + * address, in the primary address space.
> > + */
> > +static bool check_pgm_prot(void *ptr)
> > +{
> > +	unsigned long teid = lc->trans_exc_id;
> > +
> > +	if (lc->pgm_int_code != PGM_INT_CODE_PROTECTION)
> > +		return false;
> > +	/*
> > +	 * if for any reason TEID_M is not present, the rest of the field is
> > +	 * not meaningful, so no point in checking it
> > +	 */
> > +	if (~teid & TEID_M)
> > +		return true;
> > +	return (~teid & TEID_A) &&
> > +		((teid & TEID_ADDR) == ((uint64_t)ptr & PAGE_MASK)) &&
> > +		!(teid & TEID_AI);
> > +}
> > +
> > +static void test_dat(void)
> > +{
> > +	report_prefix_push("edat off");
> > +	/* disable EDAT */
> > +	ctl_clear_bit(0, 23);  
> 
> It's about time we properly define CTL0 bits.

will do

> > +
> > +	/* Check some basics */
> > +	p[0] = 42;
> > +	report(p[0] == 42, "pte, r/w");
> > +	p[0] = 0;
> > +
> > +	/* Write protect the page and try to write, expect a fault */
> > +	protect_page(m, PAGE_ENTRY_P);
> > +	expect_dat_fault();
> > +	p[0] = 42;
> > +	unprotect_page(m, PAGE_ENTRY_P);
> > +	report(!p[0] && check_pgm_prot(m), "pte, ro");
> > +
> > +	/*
> > +	 * The FC bit (for large pages) should be ignored because EDAT is
> > +	 * off. We set a value and then we try to read it back again after
> > +	 * setting the FC bit. This way we can check if large pages were
> > +	 * erroneously enabled despite EDAT being off.
> > +	 */
> > +	p[0] = 42;
> > +	protect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> > +	report(p[0] == 42, "pmd, fc=1, r/w");
> > +	unprotect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> > +	p[0] = 0;
> > +
> > +	/*
> > +	 * Segment protection should work even with EDAT off, try to write
> > +	 * anyway and expect a fault
> > +	 */
> > +	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> > +	expect_dat_fault();
> > +	p[0] = 42;
> > +	report(!p[0] && check_pgm_prot(m), "pmd, ro");
> > +	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> > +
> > +	/* The FC bit should be ignored because EDAT is off, like above */
> > +	p[0] = 42;
> > +	protect_dat_entry(m, REGION3_ENTRY_FC, 3);
> > +	report(p[0] == 42, "pud, fc=1, r/w");
> > +	unprotect_dat_entry(m, REGION3_ENTRY_FC, 3);
> > +	p[0] = 0;
> > +
> > +	/*
> > +	 * Region1/2/3 protection should not work, because EDAT is off.
> > +	 * Protect the various region1/2/3 entries and write, expect the
> > +	 * write to be successful.
> > +	 */
> > +	protect_dat_entry(m, REGION_ENTRY_P, 3);
> > +	p[0] = 42;
> > +	report(p[0] == 42, "pud, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
> > +	p[0] = 0;
> > +
> > +	protect_dat_entry(m, REGION_ENTRY_P, 2);
> > +	p[0] = 42;
> > +	report(p[0] == 42, "p4d, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
> > +	p[0] = 0;
> > +
> > +	protect_dat_entry(m, REGION_ENTRY_P, 1);
> > +	p[0] = 42;
> > +	report(p[0] == 42, "pgd, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
> > +	p[0] = 0;
> > +
> > +	report_prefix_pop();
> > +}
> > +
> > +static void test_edat1(void)
> > +{
> > +	report_prefix_push("edat1");
> > +	/* Enable EDAT */
> > +	ctl_set_bit(0, 23);
> > +	p[0] = 0;
> > +
> > +	/*
> > +	 * Segment protection should work normally, try to write and expect
> > +	 * a fault.
> > +	 */
> > +	expect_dat_fault();
> > +	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> > +	p[0] = 42;
> > +	report(!p[0] && check_pgm_prot(m), "pmd, ro");
> > +	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> > +
> > +	/*
> > +	 * Region1/2/3 protection should work now, because EDAT is on. Try
> > +	 * to write anyway and expect a fault.
> > +	 */
> > +	expect_dat_fault();
> > +	protect_dat_entry(m, REGION_ENTRY_P, 3);
> > +	p[0] = 42;
> > +	report(!p[0] && check_pgm_prot(m), "pud, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
> > +
> > +	expect_dat_fault();
> > +	protect_dat_entry(m, REGION_ENTRY_P, 2);
> > +	p[0] = 42;
> > +	report(!p[0] && check_pgm_prot(m), "p4d, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
> > +
> > +	expect_dat_fault();
> > +	protect_dat_entry(m, REGION_ENTRY_P, 1);
> > +	p[0] = 42;
> > +	report(!p[0] && check_pgm_prot(m), "pgd, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
> > +
> > +	/* Large pages should work */
> > +	p[0] = 42;
> > +	install_large_page(root, 0, mem);
> > +	report(p[0] == 42, "pmd, large");
> > +
> > +	/*
> > +	 * Prefixing should not work with large pages. Since the lower
> > +	 * addresses are mapped with small pages, which are subject to
> > +	 * prefixing, and the pages mapped with large pages are not subject
> > +	 * to prefixing, this is the resulting scenario:
> > +	 *
> > +	 * virtual 0 = real 0 -> absolute prefix_buf
> > +	 * virtual prefix_buf = real prefix_buf -> absolute 0
> > +	 * VIRT(0) -> absolute 0
> > +	 * VIRT(prefix_buf) -> absolute prefix_buf
> > +	 *
> > +	 * The testcase checks if the memory at virtual 0 has the same
> > +	 * content as the memory at VIRT(prefix_buf) and the memory at
> > +	 * VIRT(0) has the same content as the memory at virtual prefix_buf.
> > +	 * If prefixing is erroneously applied for large pages, the testcase
> > +	 * will therefore fail.
> > +	 */
> > +	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
> > +		!memcmp(prefix_buf, VIRT(0), LC_SIZE),
> > +		"pmd, large, prefixing");
> > +
> > +	report_prefix_pop();
> > +}
> > +
> > +static void test_edat2(void)
> > +{
> > +	report_prefix_push("edat2");
> > +	p[0] = 42;
> > +
> > +	/* Huge pages should work */
> > +	install_huge_page(root, 0, mem);
> > +	report(p[0] == 42, "pud, huge");
> > +
> > +	/* Prefixing should not work with huge pages, just like large pages */
> > +	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
> > +		!memcmp(prefix_buf, VIRT(0), LC_SIZE),
> > +		"pmd, large, prefixing");
> > +
> > +	report_prefix_pop();
> > +}
> > +
> > +static unsigned int setup(void)
> > +{
> > +	bool has_edat1 = test_facility(8);
> > +	bool has_edat2 = test_facility(78);
> > +	unsigned long pa, va;
> > +
> > +	if (has_edat2 && !has_edat1)
> > +		report_abort("EDAT2 available, but EDAT1 not available");
> > +
> > +	/* Setup DAT 1:1 mapping and memory management */
> > +	setup_vm();
> > +	root = (void *)(stctg(1) & PAGE_MASK);
> > +
> > +	/*
> > +	 * Get a pgd worth of virtual memory, so we can test things later
> > +	 * without interfering with the test code or the interrupt handler
> > +	 */
> > +	mem = alloc_vpages_aligned(BIT_ULL(PGD_PAGE_SHIFT), PGD_PAGE_SHIFT);
> > +	assert(mem);
> > +	va = (unsigned long)mem;
> > +
> > +	/* Map the first 1GB of real memory */
> > +	for (pa = 0; pa < SZ_1G; pa += PAGE_SIZE, va += PAGE_SIZE)
> > +		install_page(root, pa, (void *)va);
> > +
> > +	/*
> > +	 * Move the lowcore to a known non-zero location. This is needed
> > +	 * later to check whether prefixing is working with large pages.
> > +	 */
> > +	assert((unsigned long)&prefix_buf < SZ_2G);
> > +	memcpy(prefix_buf, 0, LC_SIZE);
> > +	set_prefix((uint32_t)(uintptr_t)prefix_buf);
> > +	/* Clear the old copy */
> > +	memset(prefix_buf, 0, LC_SIZE);
> > +
> > +	/* m will point to tmp through the new virtual mapping */
> > +	m = VIRT(&tmp);
> > +	/* p is the same as m but volatile */
> > +	p = (volatile unsigned int *)m;
> > +
> > +	return has_edat1 + has_edat2;
> > +}
> > +
> > +int main(void)
> > +{
> > +	unsigned int edat;
> > +
> > +	report_prefix_push("edat");
> > +	edat = setup();
> > +
> > +	test_dat();
> > +
> > +	if (edat)
> > +		test_edat1();
> > +	else
> > +		report_skip("EDAT not available");
> > +
> > +	if (edat >= 2)
> > +		test_edat2();
> > +	else
> > +		report_skip("EDAT2 not available");
> > +
> > +	report_prefix_pop();
> > +	return report_summary();
> > +}
> > diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
> > index 2298be6c..8da4a0a3 100644
> > --- a/s390x/unittests.cfg
> > +++ b/s390x/unittests.cfg
> > @@ -99,3 +99,6 @@ file = uv-guest.elf
> >  
> >  [sie]
> >  file = sie.elf
> > +
> > +[edat]
> > +file = edat.elf
> >   
> 


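As an aside, the prefixing checks in the quoted test above rely on the usual s390 real-to-absolute address translation. A toy model of it — a sketch only, with LC_SIZE assumed to be two 4k pages as in the kvm-unit-tests lowcore code, and the function name invented for illustration — looks like:

```c
#include <assert.h>
#include <stdint.h>

#define LC_SIZE	(2 * 4096UL)	/* assumed lowcore size (two 4k pages) */

/*
 * Toy model of prefixing: real addresses below LC_SIZE map into the
 * prefix area, the prefix area maps back to absolute 0..LC_SIZE, and
 * everything else is identity-mapped.
 */
static uint64_t real_to_abs(uint64_t real, uint64_t prefix)
{
	if (real < LC_SIZE)
		return real + prefix;
	if (real >= prefix && real < prefix + LC_SIZE)
		return real - prefix;
	return real;
}
```

This is why the test compares real address 0 against VIRT(prefix_buf) and vice versa: with a working prefix, the two ranges are swapped, and a mapping that bypasses prefixing (such as one through a large or huge page, per the test's expectation) would make the comparison fail.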

* Re: [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h
  2021-02-23 15:21     ` Claudio Imbrenda
@ 2021-02-23 15:44       ` Janosch Frank
  2021-02-23 15:53         ` Claudio Imbrenda
  0 siblings, 1 reply; 14+ messages in thread
From: Janosch Frank @ 2021-02-23 15:44 UTC (permalink / raw)
  To: Claudio Imbrenda; +Cc: kvm, linux-s390, david, thuth, cohuck, pmorel

On 2/23/21 4:21 PM, Claudio Imbrenda wrote:
> On Tue, 23 Feb 2021 15:31:06 +0100
> Janosch Frank <frankja@linux.ibm.com> wrote:
> 
>> On 2/23/21 3:07 PM, Claudio Imbrenda wrote:
>>> Fix pgtable.h:
>>>
>>> * SEGMENT_ENTRY_SFAA had one extra bit set
>>> * pmd entries don't have a length
>>> * ipte does not need to clear the lower bits
>>> * pud entries should use SEGMENT_TABLE_LENGTH, as they point to
>>> segment tables
>>>
>>> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
>>> ---
>>>  lib/s390x/asm/pgtable.h | 9 ++++-----
>>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
>>> index 277f3480..a2ff2d4e 100644
>>> --- a/lib/s390x/asm/pgtable.h
>>> +++ b/lib/s390x/asm/pgtable.h
>>> @@ -60,7 +60,7 @@
>>>  #define SEGMENT_SHIFT			20
>>>  
>>>  #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
>>> -#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
>>> +#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
>>>  #define SEGMENT_ENTRY_AV		0x0000000000010000UL
>>>  #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
>>>  #define SEGMENT_ENTRY_F			0x0000000000000800UL
>>> @@ -183,7 +183,7 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
>>>  	if (pud_none(*pud)) {
>>>  		pmd_t *pmd = pmd_alloc_one();
>>>  		pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 |
>>> -				REGION_TABLE_LENGTH;
>>> +				SEGMENT_TABLE_LENGTH;
>>
>> @David: I'd much rather have REGION_ENTRY_LENGTH instead of
>> REGION_TABLE_LENGTH and SEGMENT_TABLE_LENGTH.
> 
> I'm weakly against it
> 
>> My argument is that this is not really an attribute of the table and
> 
> it actually is an attribute of the table. the *_TABLE_LENGTH fields
> tell how long the _table pointed to_ is. we always allocate the full 4
> pages, so in our case it will always be 0x3.
> 

It's part of the entry nevertheless and should not be a *_TABLE_*
constant. It's used in entries and has a very specific format that only
makes sense if it's being used in a region entry.

Every other thing that you OR into the entry, apart from the address, is
named *_ENTRY_*, so this should be too.

> segment table entries don't have a length because they point to page
> tables. region3 table entries point to segment tables, so they have
> SEGMENT_TABLE_LENGTH in their length field.
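The length convention under debate can be sketched in a few lines of C (the constant values are copied from the quoted pgtable.h; the helper names here are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Values as in the quoted pgtable.h */
#define REGION_ENTRY_TT_REGION3	0x0000000000000004UL
#define SEGMENT_TABLE_LENGTH	0x0000000000000003UL

/*
 * The TL field of a region entry encodes the length of the table the
 * entry points to, as (number of 4k pages) - 1. A fully allocated
 * segment table is 4 pages, hence the constant value 0x3.
 */
static inline uint64_t table_length(uint64_t pages)
{
	return pages - 1;
}

/* Compose a region-third entry pointing at a segment table at pa */
static inline uint64_t region3_entry(uint64_t pa)
{
	return pa | REGION_ENTRY_TT_REGION3 | SEGMENT_TABLE_LENGTH;
}
```

Under this reading, the name describes the table being pointed to, which is Claudio's argument; Janosch's counterpoint is that the bits still live inside the entry, so an *_ENTRY_* name would match the rest of the header.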

Btw. I'd guess the SEGMENT_TABLE_LENGTH name is the reason that you had
to fix the problem in the next hunk...

>> very much specific to the format of the (region table) entry. We
>> already have the table order as a length anyway...
>>
>> Could you tell me what you had in mind when splitting this?
>>
>>>  	}
>>>  	return pmd_offset(pud, addr);
>>>  }
>>> @@ -202,15 +202,14 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
>>>  {
>>>  	if (pmd_none(*pmd)) {
>>>  		pte_t *pte = pte_alloc_one();
>>> -		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
>>> -				SEGMENT_TABLE_LENGTH;
>>> +		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
>>
>> Uhhhh good catch!
>>
>>>  	}
>>>  	return pte_offset(pmd, addr);
>>>  }
>>>  
>>>  static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
>>>  {
>>> -	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
>>> +	unsigned long table_origin = (unsigned long)p_pte;
>>>  
>>>  	asm volatile(
>>>  		"	ipte %0,%1\n"
>>>   
>>
>> IPTE ignores that data but having the page mask also doesn't hurt so
>> generally this is not a fix, right?
> 
> apart from avoiding an unnecessary operation, there is a small nit:
> IPTE wants the address of the page table, ignoring the rightmost _11_
> bits. with PAGE_MASK we ignore the rightmost _12_ bits instead.
> it's not an issue in practice, because we allocate one page table per
> page anyway, wasting the second half of the page, so in our case that
> stray bit will always be 0. but in case we decide to allocate the page
> tables more tightly, or in case some testcase wants to manually play
> tricks with the page tables, there might be the risk that IPTE would
> target the wrong page table.
> 
> by not clearing the lower 12 bits we not only save an unnecessary
> operation, but we are actually more architecturally correct.
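The 11-vs-12-bit difference described above can be shown with plain masks (the 0x9800 address is a hypothetical page table sitting in the second 2k half of a page; the mask names are for illustration only):

```c
#include <assert.h>

/* PAGE_MASK clears the rightmost 12 bits; IPTE ignores only 11 */
#define PAGE_MASK		(~0xfffUL)
#define IPTE_ORIGIN_MASK	(~0x7ffUL)

/* Applying PAGE_MASK drops bit 11, pointing at the wrong 2k table */
static unsigned long origin_page_mask(unsigned long p_pte)
{
	return p_pte & PAGE_MASK;
}

/* The architectural behaviour keeps bit 11 intact */
static unsigned long origin_ipte(unsigned long p_pte)
{
	return p_pte & IPTE_ORIGIN_MASK;
}
```

For a page table at 0x9800, the PAGE_MASK variant yields 0x9000 while the architectural 11-bit mask keeps 0x9800 — which is exactly the stray-bit hazard Claudio describes for tightly packed 2k page tables.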
> 

Right, the old 2k pgtable thing...
Would you mind putting that into the patch description?



* Re: [kvm-unit-tests PATCH v2 2/5] s390x: lib: fix pgtable.h
  2021-02-23 15:44       ` Janosch Frank
@ 2021-02-23 15:53         ` Claudio Imbrenda
  0 siblings, 0 replies; 14+ messages in thread
From: Claudio Imbrenda @ 2021-02-23 15:53 UTC (permalink / raw)
  To: Janosch Frank; +Cc: kvm, linux-s390, david, thuth, cohuck, pmorel

On Tue, 23 Feb 2021 16:44:43 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 2/23/21 4:21 PM, Claudio Imbrenda wrote:
> > On Tue, 23 Feb 2021 15:31:06 +0100
> > Janosch Frank <frankja@linux.ibm.com> wrote:
> >   
> >> On 2/23/21 3:07 PM, Claudio Imbrenda wrote:  
> >>> Fix pgtable.h:
> >>>
> >>> * SEGMENT_ENTRY_SFAA had one extra bit set
> >>> * pmd entries don't have a length
> >>> * ipte does not need to clear the lower bits
> >>> * pud entries should use SEGMENT_TABLE_LENGTH, as they point to
> >>> segment tables
> >>>
> >>> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> >>> ---
> >>>  lib/s390x/asm/pgtable.h | 9 ++++-----
> >>>  1 file changed, 4 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
> >>> index 277f3480..a2ff2d4e 100644
> >>> --- a/lib/s390x/asm/pgtable.h
> >>> +++ b/lib/s390x/asm/pgtable.h
> >>> @@ -60,7 +60,7 @@
> >>>  #define SEGMENT_SHIFT			20
> >>>  
> >>>  #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
> >>> -#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
> >>> +#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
> >>>  #define SEGMENT_ENTRY_AV		0x0000000000010000UL
> >>>  #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
> >>>  #define SEGMENT_ENTRY_F			0x0000000000000800UL
> >>> @@ -183,7 +183,7 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
> >>>  	if (pud_none(*pud)) {
> >>>  		pmd_t *pmd = pmd_alloc_one();
> >>>  		pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 |
> >>> -				REGION_TABLE_LENGTH;
> >>> +				SEGMENT_TABLE_LENGTH;
> >>
> >> @David: I'd much rather have REGION_ENTRY_LENGTH instead of
> >> REGION_TABLE_LENGTH and SEGMENT_TABLE_LENGTH.  
> > 
> > I'm weakly against it
> >   
> >> My argument is that this is not really an attribute of the table
> >> and  
> > 
> > it actually is an attribute of the table. the *_TABLE_LENGTH fields
> > tell how long the _table pointed to_ is. we always allocate the
> > full 4 pages, so in our case it will always be 0x3.
> >   
> 
> It's part of the entry nevertheless and should not be a *_TABLE_*
> constant. It's used in entries and has a very specific format that
> only makes sense if it's being used in a region entry.

or in the ASCE, yes

> Every other thing that you OR into the entry, apart from the address, is
> named *_ENTRY_*, so this should be too.

fair enough

> > segment table entries don't have a length because they point to page
> > tables. region3 table entries point to segment tables, so they have
> > SEGMENT_TABLE_LENGTH in their length field.  
> 
> Btw. I'd guess the SEGMENT_TABLE_LENGTH name is the reason that you
> had to fix the problem in the next hunk...

yes, I am quite sure of this too :)

I'll figure out a better name and respin

> >> very much specific to the format of the (region table) entry. We
> >> already have the table order as a length anyway...
> >>
> >> Could you tell me what you had in mind when splitting this?
> >>  
> >>>  	}
> >>>  	return pmd_offset(pud, addr);
> >>>  }
> >>> @@ -202,15 +202,14 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
> >>>  {
> >>>  	if (pmd_none(*pmd)) {
> >>>  		pte_t *pte = pte_alloc_one();
> >>> -		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
> >>> -				SEGMENT_TABLE_LENGTH;
> >>> +		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
> >>
> >> Uhhhh good catch!
> >>  
> >>>  	}
> >>>  	return pte_offset(pmd, addr);
> >>>  }
> >>>  
> >>>  static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
> >>>  {
> >>> -	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
> >>> +	unsigned long table_origin = (unsigned long)p_pte;
> >>>  
> >>>  	asm volatile(
> >>>  		"	ipte %0,%1\n"
> >>>     
> >>
> >> IPTE ignores that data but having the page mask also doesn't hurt
> >> so generally this is not a fix, right?  
> > 
> > apart from avoiding an unnecessary operation, there is a small nit:
> > IPTE wants the address of the page table, ignoring the rightmost
> > _11_ bits. with PAGE_MASK we ignore the rightmost _12_ bits instead.
> > it's not an issue in practice, because we allocate one page table
> > per page anyway, wasting the second half of the page, so in our
> > case that stray bit will always be 0. but in case we decide to
> > allocate the page tables more tightly, or in case some testcase
> > wants to manually play tricks with the page tables, there might be
> > the risk that IPTE would target the wrong page table.
> > 
> > by not clearing the lower 12 bits we not only save an unnecessary
> > operation, but we are actually more architecturally correct.
> >   
> 
> Right, the old 2k pgtable thing...
> Would you mind putting that into the patch description?

will do



