From: Claudio Imbrenda <imbrenda@linux.ibm.com>
Date: 2021-02-09 14:38 UTC
Subject: [kvm-unit-tests PATCH v1 0/4] s390: Add support for large pages
To: kvm
Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Introduce support for large (1M) and huge (2G) pages.

Add a simple testcase for EDAT1 and EDAT2.

Claudio Imbrenda (4):
  libcflat: add SZ_1M and SZ_2G
  s390x: lib: fix and improve pgtable.h
  s390x: mmu: add support for large pages
  s390x: edat test

 s390x/Makefile          |   1 +
 lib/s390x/asm/pgtable.h |  38 ++++++-
 lib/libcflat.h          |   2 +
 lib/s390x/mmu.h         |  73 +++++++++++-
 lib/s390x/mmu.c         | 246 ++++++++++++++++++++++++++++++++++++----
 s390x/edat.c            | 238 ++++++++++++++++++++++++++++++++++++++
 s390x/unittests.cfg     |   3 +
 7 files changed, 572 insertions(+), 29 deletions(-)
 create mode 100644 s390x/edat.c

-- 
2.26.2
From: Claudio Imbrenda <imbrenda@linux.ibm.com>
Date: 2021-02-09 14:38 UTC
Subject: [kvm-unit-tests PATCH v1 1/4] libcflat: add SZ_1M and SZ_2G
To: kvm
Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Add SZ_1M and SZ_2G to libcflat.h

s390x needs those for large/huge pages

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 lib/libcflat.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index 460a1234..8dac0621 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -157,7 +157,9 @@ extern void setup_vm(void);
 #define SZ_8K			(1 << 13)
 #define SZ_16K			(1 << 14)
 #define SZ_64K			(1 << 16)
+#define SZ_1M			(1 << 20)
 #define SZ_2M			(1 << 21)
 #define SZ_1G			(1 << 30)
+#define SZ_2G			(1ul << 31)
 
 #endif
-- 
2.26.2
From: Thomas Huth <thuth@redhat.com>
Date: 2021-02-09 15:21 UTC
Subject: Re: [kvm-unit-tests PATCH v1 1/4] libcflat: add SZ_1M and SZ_2G
To: Claudio Imbrenda, kvm
Cc: linux-s390, david, frankja, cohuck, pmorel

On 09/02/2021 15.38, Claudio Imbrenda wrote:
> Add SZ_1M and SZ_2G to libcflat.h
> 
> s390x needs those for large/huge pages
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  lib/libcflat.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/libcflat.h b/lib/libcflat.h
> index 460a1234..8dac0621 100644
> --- a/lib/libcflat.h
> +++ b/lib/libcflat.h
> @@ -157,7 +157,9 @@ extern void setup_vm(void);
>  #define SZ_8K			(1 << 13)
>  #define SZ_16K			(1 << 14)
>  #define SZ_64K			(1 << 16)
> +#define SZ_1M			(1 << 20)
>  #define SZ_2M			(1 << 21)
>  #define SZ_1G			(1 << 30)
> +#define SZ_2G			(1ul << 31)
> 
>  #endif

Reviewed-by: Thomas Huth <thuth@redhat.com>
From: Claudio Imbrenda <imbrenda@linux.ibm.com>
Date: 2021-02-09 14:38 UTC
Subject: [kvm-unit-tests PATCH v1 2/4] s390x: lib: fix and improve pgtable.h
To: kvm
Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Fix and improve pgtable.h:

* SEGMENT_ENTRY_SFAA had one extra bit set
* pmd entries don't have a length
* ipte does not need to clear the lower bits
* add macros to check whether a pmd or a pud are large / huge
* add idte functions for pmd, pud, p4d and pgd

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 lib/s390x/asm/pgtable.h | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
index 277f3480..4269ab62 100644
--- a/lib/s390x/asm/pgtable.h
+++ b/lib/s390x/asm/pgtable.h
@@ -60,7 +60,7 @@
 #define SEGMENT_SHIFT			20
 
 #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
-#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
+#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
 #define SEGMENT_ENTRY_AV		0x0000000000010000UL
 #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
 #define SEGMENT_ENTRY_F			0x0000000000000800UL
@@ -100,6 +100,9 @@
 #define pmd_none(entry) (pmd_val(entry) & SEGMENT_ENTRY_I)
 #define pte_none(entry) (pte_val(entry) & PAGE_ENTRY_I)
 
+#define pud_huge(entry)  (pud_val(entry) & REGION3_ENTRY_FC)
+#define pmd_large(entry) (pmd_val(entry) & SEGMENT_ENTRY_FC)
+
 #define pgd_addr(entry) __va(pgd_val(entry) & REGION_ENTRY_ORIGIN)
 #define p4d_addr(entry) __va(p4d_val(entry) & REGION_ENTRY_ORIGIN)
 #define pud_addr(entry) __va(pud_val(entry) & REGION_ENTRY_ORIGIN)
@@ -202,21 +205,48 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
 {
 	if (pmd_none(*pmd)) {
 		pte_t *pte = pte_alloc_one();
-		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
-				SEGMENT_TABLE_LENGTH;
+		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
 	}
 	return pte_offset(pmd, addr);
 }
 
 static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
 {
-	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
+	unsigned long table_origin = (unsigned long)p_pte;
 
 	asm volatile(
 		"	ipte %0,%1\n"
 		: : "a" (table_origin), "a" (vaddr) : "memory");
 }
 
+static inline void idte(unsigned long table_origin, unsigned long vaddr)
+{
+	vaddr &= SEGMENT_ENTRY_SFAA;
+	asm volatile(
+		"	idte %0,0,%1\n"
+		: : "a" (table_origin), "a" (vaddr) : "memory");
+}
+
+static inline void idte_pmdp(unsigned long vaddr, pmdval_t *pmdp)
+{
+	idte((unsigned long)(pmdp - pmd_index(vaddr)) | ASCE_DT_SEGMENT, vaddr);
+}
+
+static inline void idte_pudp(unsigned long vaddr, pudval_t *pudp)
+{
+	idte((unsigned long)(pudp - pud_index(vaddr)) | ASCE_DT_REGION3, vaddr);
+}
+
+static inline void idte_p4dp(unsigned long vaddr, p4dval_t *p4dp)
+{
+	idte((unsigned long)(p4dp - p4d_index(vaddr)) | ASCE_DT_REGION2, vaddr);
+}
+
+static inline void idte_pgdp(unsigned long vaddr, pgdval_t *pgdp)
+{
+	idte((unsigned long)(pgdp - pgd_index(vaddr)) | ASCE_DT_REGION1, vaddr);
+}
+
 void configure_dat(int enable);
 
 #endif /* _ASMS390X_PGTABLE_H_ */
-- 
2.26.2
From: Thomas Huth <thuth@redhat.com>
Date: 2021-02-11 9:09 UTC
Subject: Re: [kvm-unit-tests PATCH v1 2/4] s390x: lib: fix and improve pgtable.h
To: Claudio Imbrenda, kvm
Cc: linux-s390, david, frankja, cohuck, pmorel

On 09/02/2021 15.38, Claudio Imbrenda wrote:
> Fix and improve pgtable.h:
> 
> * SEGMENT_ENTRY_SFAA had one extra bit set
> * pmd entries don't have a length
> * ipte does not need to clear the lower bits
> * add macros to check whether a pmd or a pud are large / huge
> * add idte functions for pmd, pud, p4d and pgd
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  lib/s390x/asm/pgtable.h | 38 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 34 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/s390x/asm/pgtable.h b/lib/s390x/asm/pgtable.h
> index 277f3480..4269ab62 100644
> --- a/lib/s390x/asm/pgtable.h
> +++ b/lib/s390x/asm/pgtable.h
> @@ -60,7 +60,7 @@
>  #define SEGMENT_SHIFT			20
>  
>  #define SEGMENT_ENTRY_ORIGIN		0xfffffffffffff800UL
> -#define SEGMENT_ENTRY_SFAA		0xfffffffffff80000UL
> +#define SEGMENT_ENTRY_SFAA		0xfffffffffff00000UL
>  #define SEGMENT_ENTRY_AV		0x0000000000010000UL
>  #define SEGMENT_ENTRY_ACC		0x000000000000f000UL
>  #define SEGMENT_ENTRY_F			0x0000000000000800UL
> @@ -100,6 +100,9 @@
>  #define pmd_none(entry) (pmd_val(entry) & SEGMENT_ENTRY_I)
>  #define pte_none(entry) (pte_val(entry) & PAGE_ENTRY_I)
>  
> +#define pud_huge(entry)  (pud_val(entry) & REGION3_ENTRY_FC)
> +#define pmd_large(entry) (pmd_val(entry) & SEGMENT_ENTRY_FC)
> +
>  #define pgd_addr(entry) __va(pgd_val(entry) & REGION_ENTRY_ORIGIN)
>  #define p4d_addr(entry) __va(p4d_val(entry) & REGION_ENTRY_ORIGIN)
>  #define pud_addr(entry) __va(pud_val(entry) & REGION_ENTRY_ORIGIN)
> @@ -202,21 +205,48 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
>  {
>  	if (pmd_none(*pmd)) {
>  		pte_t *pte = pte_alloc_one();
> -		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT |
> -				SEGMENT_TABLE_LENGTH;
> +		pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;

I think you could even remove the #define SEGMENT_TABLE_LENGTH now, since
this define does not make much sense, does it?

>  	}
>  	return pte_offset(pmd, addr);
>  }
>  
>  static inline void ipte(unsigned long vaddr, pteval_t *p_pte)
>  {
> -	unsigned long table_origin = (unsigned long)p_pte & PAGE_MASK;
> +	unsigned long table_origin = (unsigned long)p_pte;
>  
>  	asm volatile(
>  		"	ipte %0,%1\n"
>  		: : "a" (table_origin), "a" (vaddr) : "memory");
>  }
>  
> +static inline void idte(unsigned long table_origin, unsigned long vaddr)
> +{
> +	vaddr &= SEGMENT_ENTRY_SFAA;
> +	asm volatile(
> +		"	idte %0,0,%1\n"
> +		: : "a" (table_origin), "a" (vaddr) : "memory");
> +}
> +
> +static inline void idte_pmdp(unsigned long vaddr, pmdval_t *pmdp)
> +{
> +	idte((unsigned long)(pmdp - pmd_index(vaddr)) | ASCE_DT_SEGMENT, vaddr);
> +}
> +
> +static inline void idte_pudp(unsigned long vaddr, pudval_t *pudp)
> +{
> +	idte((unsigned long)(pudp - pud_index(vaddr)) | ASCE_DT_REGION3, vaddr);
> +}
> +
> +static inline void idte_p4dp(unsigned long vaddr, p4dval_t *p4dp)
> +{
> +	idte((unsigned long)(p4dp - p4d_index(vaddr)) | ASCE_DT_REGION2, vaddr);
> +}
> +
> +static inline void idte_pgdp(unsigned long vaddr, pgdval_t *pgdp)
> +{
> +	idte((unsigned long)(pgdp - pgd_index(vaddr)) | ASCE_DT_REGION1, vaddr);
> +}

I think it would be cleaner to separate the fixes from the new functions,
i.e. put the new functions into a separate patch.

 Thomas
From: Claudio Imbrenda <imbrenda@linux.ibm.com>
Date: 2021-02-09 14:38 UTC
Subject: [kvm-unit-tests PATCH v1 3/4] s390x: mmu: add support for large pages
To: kvm
Cc: linux-s390, david, thuth, frankja, cohuck, pmorel

Add support for 1M and 2G pages.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
---
 lib/s390x/mmu.h |  73 +++++++++++++-
 lib/s390x/mmu.c | 246 +++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 294 insertions(+), 25 deletions(-)

diff --git a/lib/s390x/mmu.h b/lib/s390x/mmu.h
index 603f289e..93208467 100644
--- a/lib/s390x/mmu.h
+++ b/lib/s390x/mmu.h
@@ -10,9 +10,78 @@
 #ifndef _ASMS390X_MMU_H_
 #define _ASMS390X_MMU_H_
 
-void protect_page(void *vaddr, unsigned long prot);
+/*
+ * Splits the pagetables down to the given DAT tables level.
+ * Returns a pointer to the DAT table entry of the given level.
+ * @pgtable root of the page table tree
+ * @vaddr address whose page tables are to split
+ * @level 3 (for 2GB pud), 4 (for 1 MB pmd) or 5 (for 4KB pages)
+ */
+void *split_page(pgd_t *pgtable, void *vaddr, unsigned int level);
+
+/*
+ * Applies the given protection bits to the given DAT tables level,
+ * splitting if necessary.
+ * @pgtable root of the page table tree
+ * @vaddr address whose protection bits are to be changed
+ * @prot the protection bits to set
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4KB pages)
+ */
+void protect_dat_entry(void *vaddr, unsigned long prot, unsigned int level);
+
+/*
+ * Clears the given protection bits from the given DAT tables level,
+ * splitting if necessary.
+ * @pgtable root of the page table tree
+ * @vaddr address whose protection bits are to be changed
+ * @prot the protection bits to clear
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4kB pages)
+ */
+void unprotect_dat_entry(void *vaddr, unsigned long prot, unsigned int level);
+
+/*
+ * Applies the given protection bits to the given 4kB pages range,
+ * splitting if necessary.
+ * @start starting address whose protection bits are to be changed
+ * @len size in bytes
+ * @prot the protection bits to set
+ */
 void protect_range(void *start, unsigned long len, unsigned long prot);
-void unprotect_page(void *vaddr, unsigned long prot);
+
+/*
+ * Clears the given protection bits from the given 4kB pages range,
+ * splitting if necessary.
+ * @start starting address whose protection bits are to be changed
+ * @len size in bytes
+ * @prot the protection bits to set
+ */
 void unprotect_range(void *start, unsigned long len, unsigned long prot);
 
+/* Similar to install_page, maps the virtual address to the physical address
+ * for the given page tables, using 1MB large pages.
+ * Returns a pointer to the DAT table entry.
+ * @pgtable root of the page table tree
+ * @phys physical address to map, must be 1MB aligned!
+ * @vaddr virtual address to map, must be 1MB aligned!
+ */
+pmdval_t *install_large_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr);
+
+/* Similar to install_page, maps the virtual address to the physical address
+ * for the given page tables, using 2GB huge pages.
+ * Returns a pointer to the DAT table entry.
+ * @pgtable root of the page table tree
+ * @phys physical address to map, must be 2GB aligned!
+ * @vaddr virtual address to map, must be 2GB aligned!
+ */
+pudval_t *install_huge_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr);
+
+static inline void protect_page(void *vaddr, unsigned long prot)
+{
+	protect_dat_entry(vaddr, prot, 5);
+}
+
+static inline void unprotect_page(void *vaddr, unsigned long prot)
+{
+	unprotect_dat_entry(vaddr, prot, 5);
+}
+
+void *get_dat_entry(pgd_t *pgtable, void *vaddr, unsigned int level);
+
 #endif /* _ASMS390X_MMU_H_ */
diff --git a/lib/s390x/mmu.c b/lib/s390x/mmu.c
index 5c517366..3b7cd4b1 100644
--- a/lib/s390x/mmu.c
+++ b/lib/s390x/mmu.c
@@ -46,54 +46,254 @@ static void mmu_enable(pgd_t *pgtable)
 	lc->pgm_new_psw.mask |= PSW_MASK_DAT;
 }
 
-static pteval_t *get_pte(pgd_t *pgtable, uintptr_t vaddr)
+/*
+ * Get the pud (region 3) DAT table entry for the given address and root,
+ * allocating it if necessary
+ */
+static inline pud_t *get_pud(pgd_t *pgtable, uintptr_t vaddr)
 {
 	pgd_t *pgd = pgd_offset(pgtable, vaddr);
 	p4d_t *p4d = p4d_alloc(pgd, vaddr);
 	pud_t *pud = pud_alloc(p4d, vaddr);
-	pmd_t *pmd = pmd_alloc(pud, vaddr);
+
+	return pud;
+}
+
+/*
+ * Get the pmd (segment) DAT table entry for the given address and pud,
+ * allocating it if necessary.
+ * The pud must not be huge.
+ */
+static inline pmd_t *get_pmd(pud_t *pud, uintptr_t vaddr)
+{
+	pmd_t *pmd;
+
+	assert(!pud_huge(*pud));
+	pmd = pmd_alloc(pud, vaddr);
+	return pmd;
+}
+
+/*
+ * Get the pte (page) DAT table entry for the given address and pmd,
+ * allocating it if necessary.
+ * The pmd must not be large.
+ */
+static inline pte_t *get_pte(pmd_t *pmd, uintptr_t vaddr)
+{
 	pte_t *pte = pte_alloc(pmd, vaddr);
 
-	return &pte_val(*pte);
+	assert(!pmd_large(*pmd));
+	pte = pte_alloc(pmd, vaddr);
+	return pte;
+}
+
+/*
+ * Splits a large pmd (segment) DAT table entry into equivalent 4kB small
+ * pages.
+ * @pmd The pmd to split, it must be large.
+ * @va the virtual address corresponding to this pmd.
+ */
+static void split_pmd(pmd_t *pmd, uintptr_t va)
+{
+	phys_addr_t pa = pmd_val(*pmd) & SEGMENT_ENTRY_SFAA;
+	unsigned long i;
+	pte_t *pte;
+
+	assert(pmd_large(*pmd));
+	pte = alloc_pages(PAGE_TABLE_ORDER);
+	for (i = 0; i < PAGE_TABLE_ENTRIES; i++)
+		pte_val(pte[i]) = pa | PAGE_SIZE * i;
+	idte_pmdp(va, &pmd_val(*pmd));
+	pmd_val(*pmd) = __pa(pte) | SEGMENT_ENTRY_TT_SEGMENT;
+
+}
+
+/*
+ * Splits a huge pud (region 3) DAT table entry into equivalent 1MB large
+ * pages.
+ * @pud The pud to split, it must be huge.
+ * @va the virtual address corresponding to this pud.
+ */
+static void split_pud(pud_t *pud, uintptr_t va)
+{
+	phys_addr_t pa = pud_val(*pud) & REGION3_ENTRY_RFAA;
+	unsigned long i;
+	pmd_t *pmd;
+
+	assert(pud_huge(*pud));
+	pmd = alloc_pages(SEGMENT_TABLE_ORDER);
+	for (i = 0; i < SEGMENT_TABLE_ENTRIES; i++)
+		pmd_val(pmd[i]) = pa | SZ_1M * i | SEGMENT_ENTRY_FC | SEGMENT_ENTRY_TT_SEGMENT;
+	idte_pudp(va, &pud_val(*pud));
+	pud_val(*pud) = __pa(pmd) | REGION_ENTRY_TT_REGION3 | REGION_TABLE_LENGTH;
+}
+
+void *get_dat_entry(pgd_t *pgtable, void *vaddr, unsigned int level)
+{
+	uintptr_t va = (uintptr_t)vaddr;
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	assert(level && (level <= 5));
+	pgd = pgd_offset(pgtable, va);
+	if (level == 1)
+		return pgd;
+	p4d = p4d_alloc(pgd, va);
+	if (level == 2)
+		return p4d;
+	pud = pud_alloc(p4d, va);
+
+	if (level == 3)
+		return pud;
+	if (!pud_none(*pud) && pud_huge(*pud))
+		split_pud(pud, va);
+	pmd = get_pmd(pud, va);
+	if (level == 4)
+		return pmd;
+	if (!pmd_none(*pmd) && pmd_large(*pmd))
+		split_pmd(pmd, va);
+	return get_pte(pmd, va);
+}
+
+void *split_page(pgd_t *pgtable, void *vaddr, unsigned int level)
+{
+	assert((level >= 3) && (level <= 5));
+	return get_dat_entry(pgtable ? pgtable : table_root, vaddr, level);
 }
 
 phys_addr_t virt_to_pte_phys(pgd_t *pgtable, void *vaddr)
 {
-	return (*get_pte(pgtable, (uintptr_t)vaddr) & PAGE_MASK) +
-	       ((unsigned long)vaddr & ~PAGE_MASK);
+	uintptr_t va = (uintptr_t)vaddr;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pud = get_pud(pgtable, va);
+	if (pud_huge(*pud))
+		return (pud_val(*pud) & REGION3_ENTRY_RFAA) | (va & ~REGION3_ENTRY_RFAA);
+	pmd = get_pmd(pud, va);
+	if (pmd_large(*pmd))
+		return (pmd_val(*pmd) & SEGMENT_ENTRY_SFAA) | (va & ~SEGMENT_ENTRY_SFAA);
+	pte = get_pte(pmd, va);
+	return (pte_val(*pte) & PAGE_MASK) | (va & ~PAGE_MASK);
+}
+
+/*
+ * Get the DAT table entry of the given level for the given address,
+ * splitting if necessary. If the entry was not invalid, invalidate it, and
+ * return the pointer to the entry and, if requested, its old value.
+ * @pgtable root of the page tables
+ * @vaddr virtual address
+ * @level 3 (for 2GB pud), 4 (for 1MB pmd) or 5 (for 4kB pages)
+ * @old if not NULL, will be written with the old value of the DAT table
+ * entry before invalidation
+ */
+static void *dat_get_and_invalidate(pgd_t *pgtable, void *vaddr, unsigned int level, unsigned long *old)
+{
+	unsigned long va = (unsigned long)vaddr;
+	void *ptr;
+
+	ptr = get_dat_entry(pgtable, vaddr, level);
+	if (old)
+		*old = *(unsigned long *)ptr;
+	if ((level == 1) && !pgd_none(*(pgd_t *)ptr))
+		idte_pgdp(va, ptr);
+	else if ((level == 2) && !p4d_none(*(p4d_t *)ptr))
+		idte_p4dp(va, ptr);
+	else if ((level == 3) && !pud_none(*(pud_t *)ptr))
+		idte_pudp(va, ptr);
+	else if ((level == 4) && !pmd_none(*(pmd_t *)ptr))
+		idte_pmdp(va, ptr);
+	else if (!pte_none(*(pte_t *)ptr))
+		ipte(va, ptr);
+	return ptr;
+}
+
+static void cleanup_pmd(pmd_t *pmd)
+{
+	/* was invalid or large, nothing to do */
+	if (pmd_none(*pmd) || pmd_large(*pmd))
+		return;
+	/* was not large, free the corresponding page table */
+	free_pages((void *)(pmd_val(*pmd) & PAGE_MASK));
 }
 
-static pteval_t *set_pte(pgd_t *pgtable, pteval_t val, void *vaddr)
+static void cleanup_pud(pud_t *pud)
 {
-	pteval_t *p_pte = get_pte(pgtable, (uintptr_t)vaddr);
+	unsigned long i;
+	pmd_t *pmd;
 
-	/* first flush the old entry (if we're replacing anything) */
-	if (!(*p_pte & PAGE_ENTRY_I))
-		ipte((uintptr_t)vaddr, p_pte);
+	/* was invalid or large, nothing to do */
+	if (pud_none(*pud) || pud_huge(*pud))
+		return;
+	/* recursively clean up all pmds if needed */
+	pmd = (pmd_t *)(pud_val(*pud) & PAGE_MASK);
+	for (i = 0; i < SEGMENT_TABLE_ENTRIES; i++)
+		cleanup_pmd(pmd + i);
+	/* free the corresponding segment table */
+	free_pages(pmd);
+}
+
+/*
+ * Set the DAT entry for the given level of the given virtual address. If a
+ * mapping already existed, it is overwritten. If an existing mapping with
+ * smaller pages existed, all the lower tables are freed.
+ * Returns the pointer to the DAT table entry.
+ * @pgtable root of the page tables
+ * @val the new value for the DAT table entry
+ * @vaddr the virtual address
+ * @level 3 for pud (region 3), 4 for pmd (segment) and 5 for pte (pages)
+ */
+static void *set_dat_entry(pgd_t *pgtable, unsigned long val, void *vaddr, unsigned int level)
+{
+	unsigned long old, *res;
 
-	*p_pte = val;
-	return p_pte;
+	res = dat_get_and_invalidate(pgtable, vaddr, level, &old);
+	if (level == 4)
+		cleanup_pmd((pmd_t *)&old);
+	if (level == 3)
+		cleanup_pud((pud_t *)&old);
+	*res = val;
+	return res;
 }
 
 pteval_t *install_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr)
 {
-	return set_pte(pgtable, __pa(phys), vaddr);
+	assert(IS_ALIGNED(phys, PAGE_SIZE));
+	assert(IS_ALIGNED((uintptr_t)vaddr, PAGE_SIZE));
+	return set_dat_entry(pgtable, phys, vaddr, 5);
+}
+
+pmdval_t *install_large_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr)
+{
+	assert(IS_ALIGNED(phys, SZ_1M));
+	assert(IS_ALIGNED((uintptr_t)vaddr, SZ_1M));
+	return set_dat_entry(pgtable, phys | SEGMENT_ENTRY_FC, vaddr, 4);
+}
+
+pudval_t *install_huge_page(pgd_t *pgtable, phys_addr_t phys, void *vaddr)
+{
+	assert(IS_ALIGNED(phys, SZ_2G));
+	assert(IS_ALIGNED((uintptr_t)vaddr, SZ_2G));
+	return set_dat_entry(pgtable, phys | REGION3_ENTRY_FC | REGION_ENTRY_TT_REGION3, vaddr, 3);
 }
 
-void protect_page(void *vaddr, unsigned long prot)
+void protect_dat_entry(void *vaddr, unsigned long prot, unsigned int level)
 {
-	pteval_t *p_pte = get_pte(table_root, (uintptr_t)vaddr);
-	pteval_t n_pte = *p_pte | prot;
+	unsigned long old, *ptr;
 
-	set_pte(table_root, n_pte, vaddr);
+	ptr = dat_get_and_invalidate(table_root, vaddr, level, &old);
+	*ptr = old | prot;
 }
 
-void unprotect_page(void *vaddr, unsigned long prot)
+void unprotect_dat_entry(void *vaddr, unsigned long prot, unsigned int level)
 {
-	pteval_t *p_pte = get_pte(table_root, (uintptr_t)vaddr);
-	pteval_t n_pte = *p_pte & ~prot;
+	unsigned long old, *ptr;
 
-	set_pte(table_root, n_pte, vaddr);
+	ptr = dat_get_and_invalidate(table_root, vaddr, level, &old);
+	*ptr = old & ~prot;
 }
 
 void protect_range(void *start, unsigned long len, unsigned long prot)
@@ -102,7 +302,7 @@ void protect_range(void *start, unsigned long len, unsigned long prot)
 	len &= PAGE_MASK;
 
 	for (; len; len -= PAGE_SIZE, curr += PAGE_SIZE)
-		protect_page((void *)curr, prot);
+		protect_dat_entry((void *)curr, prot, 5);
 }
 
 void unprotect_range(void *start, unsigned long len, unsigned long prot)
@@ -111,7 +311,7 @@ void unprotect_range(void *start, unsigned long len, unsigned long prot)
 	len &= PAGE_MASK;
 
 	for (; len; len -= PAGE_SIZE, curr += PAGE_SIZE)
-		unprotect_page((void *)curr, prot);
+		unprotect_dat_entry((void *)curr, prot, 5);
 }
 
 static void setup_identity(pgd_t *pgtable, phys_addr_t start_addr,
-- 
2.26.2
From: Thomas Huth <thuth@redhat.com>
Date: 2021-02-11 10:06 UTC
Subject: Re: [kvm-unit-tests PATCH v1 3/4] s390x: mmu: add support for large pages
To: Claudio Imbrenda, kvm
Cc: linux-s390, david, frankja, cohuck, pmorel

On 09/02/2021 15.38, Claudio Imbrenda wrote:
> Add support for 1M and 2G pages.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  lib/s390x/mmu.h |  73 +++++++++++++-
>  lib/s390x/mmu.c | 246 +++++++++++++++++++++++++++++++++++++++++++-----
>  2 files changed, 294 insertions(+), 25 deletions(-)
[...]
> +/*
> + * Get the pte (page) DAT table entry for the given address and pmd,
> + * allocating it if necessary.
> + * The pmd must not be large.
> + */
> +static inline pte_t *get_pte(pmd_t *pmd, uintptr_t vaddr)
> +{
>  	pte_t *pte = pte_alloc(pmd, vaddr);
>  
> -	return &pte_val(*pte);
> +	assert(!pmd_large(*pmd));
> +	pte = pte_alloc(pmd, vaddr);

Why is this function doing "pte = pte_alloc(pmd, vaddr)" twice now?

> +	return pte;
> +}
[...]
> +	if ((level == 1) && !pgd_none(*(pgd_t *)ptr))
> +		idte_pgdp(va, ptr);
> +	else if ((level == 2) && !p4d_none(*(p4d_t *)ptr))
> +		idte_p4dp(va, ptr);
> +	else if ((level == 3) && !pud_none(*(pud_t *)ptr))
> +		idte_pudp(va, ptr);
> +	else if ((level == 4) && !pmd_none(*(pmd_t *)ptr))
> +		idte_pmdp(va, ptr);
> +	else if (!pte_none(*(pte_t *)ptr))
> +		ipte(va, ptr);

Meta-comment: Being someone who worked quite a bit with the page tables on
s390x, but never really got in touch with the way it is handled in the
Linux kernel, I'm always having a hard time to match all these TLAs to the
PoP: pmd, pud, p4d ...
Can we please have a proper place in the kvm-unit-tests sources somewhere
(maybe at the beginning of mmu.c), where the TLAs are explained and how
they map to the region and segment tables of the Z architecture?
(I personally would prefer to completely switch to the Z arch naming
instead, but I guess that's too much of a change right now)

 Thomas
From: Claudio Imbrenda <imbrenda@linux.ibm.com>
Date: 2021-02-11 10:30 UTC
Subject: Re: [kvm-unit-tests PATCH v1 3/4] s390x: mmu: add support for large pages
To: Thomas Huth
Cc: kvm, linux-s390, david, frankja, cohuck, pmorel

On Thu, 11 Feb 2021 11:06:06 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 09/02/2021 15.38, Claudio Imbrenda wrote:
> > Add support for 1M and 2G pages.
> > 
> > Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > ---
> >  lib/s390x/mmu.h |  73 +++++++++++++-
> >  lib/s390x/mmu.c | 246 +++++++++++++++++++++++++++++++++++++++++++-----
> >  2 files changed, 294 insertions(+), 25 deletions(-)
> [...]
> > +/*
> > + * Get the pte (page) DAT table entry for the given address and
> > pmd,
> > + * allocating it if necessary.
> > + * The pmd must not be large.
> > + */
> > +static inline pte_t *get_pte(pmd_t *pmd, uintptr_t vaddr)
> > +{
> >  	pte_t *pte = pte_alloc(pmd, vaddr);
> >  
> > -	return &pte_val(*pte);
> > +	assert(!pmd_large(*pmd));
> > +	pte = pte_alloc(pmd, vaddr);
> 
> Why is this function doing "pte = pte_alloc(pmd, vaddr)" twice now?

ooops! the first pte_alloc is not supposed to be there!

good catch - will fix

> > +	return pte;
> > +}
> [...]
> > +	if ((level == 1) && !pgd_none(*(pgd_t *)ptr))
> > +		idte_pgdp(va, ptr);
> > +	else if ((level == 2) && !p4d_none(*(p4d_t *)ptr))
> > +		idte_p4dp(va, ptr);
> > +	else if ((level == 3) && !pud_none(*(pud_t *)ptr))
> > +		idte_pudp(va, ptr);
> > +	else if ((level == 4) && !pmd_none(*(pmd_t *)ptr))
> > +		idte_pmdp(va, ptr);
> > +	else if (!pte_none(*(pte_t *)ptr))
> > +		ipte(va, ptr);
> 
> Meta-comment: Being someone who worked quite a bit with the page
> tables on s390x, but never really got in touch with the way it is
> handled in the Linux kernel, I'm always having a hard time to match
> all these TLAs to the PoP: pmd, pud, p4d ...
> Can we please have a proper place in the kvm-unit-tests sources
> somewhere (maybe at the beginning of mmu.c), where the TLAs are
> explained and how they map to the region and segment tables of the Z
> architecture? (I personally would prefer to completely switch to the
> Z arch naming instead, but I guess that's too much of a change right
> now)

makes sense, I can add that

> Thomas
* [kvm-unit-tests PATCH v1 4/4] s390x: edat test 2021-02-09 14:38 [kvm-unit-tests PATCH v1 0/4] s390: Add support for large pages Claudio Imbrenda ` (2 preceding siblings ...) 2021-02-09 14:38 ` [kvm-unit-tests PATCH v1 3/4] s390x: mmu: add support for large pages Claudio Imbrenda @ 2021-02-09 14:38 ` Claudio Imbrenda 2021-02-11 11:35 ` Thomas Huth 3 siblings, 1 reply; 11+ messages in thread From: Claudio Imbrenda @ 2021-02-09 14:38 UTC (permalink / raw) To: kvm; +Cc: linux-s390, david, thuth, frankja, cohuck, pmorel Simple EDAT test. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> --- s390x/Makefile | 1 + s390x/edat.c | 238 ++++++++++++++++++++++++++++++++++++++++++++ s390x/unittests.cfg | 3 + 3 files changed, 242 insertions(+) create mode 100644 s390x/edat.c diff --git a/s390x/Makefile b/s390x/Makefile index 08d85c9f..fc885150 100644 --- a/s390x/Makefile +++ b/s390x/Makefile @@ -20,6 +20,7 @@ tests += $(TEST_DIR)/sclp.elf tests += $(TEST_DIR)/css.elf tests += $(TEST_DIR)/uv-guest.elf tests += $(TEST_DIR)/sie.elf +tests += $(TEST_DIR)/edat.elf tests_binary = $(patsubst %.elf,%.bin,$(tests)) ifneq ($(HOST_KEY_DOCUMENT),) diff --git a/s390x/edat.c b/s390x/edat.c new file mode 100644 index 00000000..504a1501 --- /dev/null +++ b/s390x/edat.c @@ -0,0 +1,238 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * EDAT test. 
+ *
+ * Copyright (c) 2021 IBM Corp
+ *
+ * Authors:
+ *  Claudio Imbrenda <imbrenda@linux.ibm.com>
+ */
+#include <libcflat.h>
+#include <vmalloc.h>
+#include <asm/facility.h>
+#include <asm/interrupt.h>
+#include <mmu.h>
+#include <asm/pgtable.h>
+#include <asm-generic/barrier.h>
+
+#define TEID_ADDR	PAGE_MASK
+#define TEID_AI	0x003
+#define TEID_M	0x004
+#define TEID_A	0x008
+#define TEID_FS	0xc00
+
+#define LC_SIZE	(2 * PAGE_SIZE)
+#define VIRT(x)	((void *)((unsigned long)(x) + (unsigned long)mem))
+
+static uint8_t prefix_buf[LC_SIZE] __attribute__((aligned(LC_SIZE)));
+static unsigned int tmp[1024] __attribute__((aligned(PAGE_SIZE)));
+static void *root, *mem, *m;
+static struct lowcore *lc;
+volatile unsigned int *p;
+
+/* Expect a program interrupt, and clear the TEID */
+static void expect_dat_fault(void)
+{
+	expect_pgm_int();
+	lc->trans_exc_id = 0;
+}
+
+/* Check if a protection exception happened for the given address */
+static bool check_pgm_prot(void *ptr)
+{
+	unsigned long teid = lc->trans_exc_id;
+
+	if (lc->pgm_int_code != PGM_INT_CODE_PROTECTION)
+		return 0;
+	if (~teid & TEID_M)
+		return 1;
+	return (~teid & TEID_A) &&
+	       ((teid & TEID_ADDR) == ((uint64_t)ptr & PAGE_MASK)) &&
+	       !(teid & TEID_AI);
+}
+
+static void test_dat(void)
+{
+	report_prefix_push("edat off");
+	/* disable EDAT */
+	ctl_clear_bit(0, 23);
+
+	/* Check some basics */
+	p[0] = 42;
+	report(p[0] == 42, "pte, r/w");
+	p[0] = 0;
+
+	protect_page(m, PAGE_ENTRY_P);
+	expect_dat_fault();
+	p[0] = 42;
+	unprotect_page(m, PAGE_ENTRY_P);
+	report(!p[0] && check_pgm_prot(m), "pte, ro");
+
+	/* The FC bit should be ignored because EDAT is off */
+	p[0] = 42;
+	protect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
+	report(p[0] == 42, "pmd, fc=1, r/w");
+	unprotect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
+	p[0] = 0;
+
+	/* Segment protection should work even with EDAT off */
+	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+	expect_dat_fault();
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pmd, ro");
+	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+
+	/* The FC bit should be ignored because EDAT is off */
+	protect_dat_entry(m, REGION3_ENTRY_FC, 3);
+	p[0] = 42;
+	report(p[0] == 42, "pud, fc=1, r/w");
+	unprotect_dat_entry(m, REGION3_ENTRY_FC, 3);
+	p[0] = 0;
+
+	/* Region1/2/3 protection should not work, because EDAT is off */
+	protect_dat_entry(m, REGION_ENTRY_P, 3);
+	p[0] = 42;
+	report(p[0] == 42, "pud, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
+	p[0] = 0;
+
+	protect_dat_entry(m, REGION_ENTRY_P, 2);
+	p[0] = 42;
+	report(p[0] == 42, "p4d, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
+	p[0] = 0;
+
+	protect_dat_entry(m, REGION_ENTRY_P, 1);
+	p[0] = 42;
+	report(p[0] == 42, "pgd, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
+	p[0] = 0;
+
+	report_prefix_pop();
+}
+
+static void test_edat1(void)
+{
+	report_prefix_push("edat1");
+	/* Enable EDAT */
+	ctl_set_bit(0, 23);
+	p[0] = 0;
+
+	/* Segment protection should work */
+	expect_dat_fault();
+	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pmd, ro");
+	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
+
+	/* Region1/2/3 protection should work now */
+	expect_dat_fault();
+	protect_dat_entry(m, REGION_ENTRY_P, 3);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pud, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
+
+	expect_dat_fault();
+	protect_dat_entry(m, REGION_ENTRY_P, 2);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "p4d, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
+
+	expect_dat_fault();
+	protect_dat_entry(m, REGION_ENTRY_P, 1);
+	p[0] = 42;
+	report(!p[0] && check_pgm_prot(m), "pgd, ro");
+	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
+
+	/* Large pages should work */
+	p[0] = 42;
+	install_large_page(root, 0, mem);
+	report(p[0] == 42, "pmd, large");
+
+	/* Prefixing should not work with large pages */
+	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
+	       !memcmp(prefix_buf, mem, LC_SIZE),
+	       "pmd, large, prefixing");
+
+	report_prefix_pop();
+}
+
+static void test_edat2(void)
+{
+	report_prefix_push("edat2");
+	p[0] = 42;
+
+	/* Huge pages should work */
+	install_huge_page(root, 0, mem);
+	report(p[0] == 42, "pud, huge");
+
+	/* Prefixing should not work with huge pages */
+	report(!memcmp(0, VIRT(prefix_buf), LC_SIZE) &&
+	       !memcmp(prefix_buf, mem, LC_SIZE),
+	       "pud, huge, prefixing");
+
+	report_prefix_pop();
+}
+
+static unsigned int setup(void)
+{
+	bool has_edat1 = test_facility(8);
+	bool has_edat2 = test_facility(78);
+	unsigned long pa, va;
+
+	if (has_edat2 && !has_edat1)
+		report_abort("EDAT2 available, but EDAT1 not available");
+
+	/* Setup DAT 1:1 mapping and memory management */
+	setup_vm();
+	root = (void *)(stctg(1) & PAGE_MASK);
+
+	/*
+	 * Get a pgd worth of virtual memory, so we can test things later
+	 * without interfering with the test code or the interrupt handler
+	 */
+	mem = alloc_vpages_aligned(BIT_ULL(41), 41);
+	assert(mem);
+	va = (unsigned long)mem;
+
+	/* Map the first 1GB of real memory */
+	for (pa = 0; pa < SZ_1G; pa += PAGE_SIZE, va += PAGE_SIZE)
+		install_page(root, pa, (void *)va);
+
+	/* Move the lowcore to a known non-zero location */
+	assert((unsigned long)&prefix_buf < SZ_2G);
+	memcpy(prefix_buf, 0, LC_SIZE);
+	set_prefix((uint32_t)(intptr_t)prefix_buf);
+	/* Clear the old copy */
+	memset(prefix_buf, 0, LC_SIZE);
+
+	/* m will point to tmp through the new virtual mapping */
+	m = VIRT(&tmp);
+	/* p is the same as m but volatile */
+	p = (volatile unsigned int *)m;
+
+	return has_edat1 + has_edat2;
+}
+
+int main(void)
+{
+	unsigned int edat;
+
+	report_prefix_push("edat");
+	edat = setup();
+
+	test_dat();
+
+	if (edat)
+		test_edat1();
+	else
+		report_skip("EDAT not available");
+
+	if (edat >= 2)
+		test_edat2();
+	else
+		report_skip("EDAT2 not available");
+
+	report_prefix_pop();
+	return report_summary();
+}
diff --git a/s390x/unittests.cfg b/s390x/unittests.cfg
index 2298be6c..8da4a0a3 100644
--- a/s390x/unittests.cfg
+++ b/s390x/unittests.cfg
@@ -99,3 +99,6 @@ file = uv-guest.elf
 
 [sie]
 file = sie.elf
+
+[edat]
+file = edat.elf
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 11+ messages in thread
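The patch above exercises three mapping granularities: 4k pages, 1M "large" pages (EDAT1, segment level) and 2G "huge" pages (EDAT2, region-third level), with the sizes taken from the cover letter. As a rough aid to the hierarchy being tested, the frame address implied by each level can be sketched host-side like this (the `frame` helper and bit counts are illustrative, not part of the test):

```c
#include <stdint.h>

#define PAGE_BITS    12	/* 4k page */
#define SEGMENT_BITS 20	/* 1M "large" page (EDAT1) */
#define REGION3_BITS 31	/* 2G "huge" page (EDAT2) */

/* Frame address for a given mapping size: clear the low `shift` bits. */
static uint64_t frame(uint64_t va, unsigned int shift)
{
	return va & ~((1UL << shift) - 1);
}
```

The larger the mapping, the more low bits become a byte offset inside the frame, which is why prefixing (which operates on 4k/8k granularity) cannot be honored inside a large or huge mapping, as the test checks.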
* Re: [kvm-unit-tests PATCH v1 4/4] s390x: edat test
  2021-02-09 14:38 ` [kvm-unit-tests PATCH v1 4/4] s390x: edat test Claudio Imbrenda
@ 2021-02-11 11:35   ` Thomas Huth
  2021-02-11 12:18     ` Claudio Imbrenda
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Huth @ 2021-02-11 11:35 UTC (permalink / raw)
  To: Claudio Imbrenda, kvm; +Cc: linux-s390, david, frankja, cohuck, pmorel

On 09/02/2021 15.38, Claudio Imbrenda wrote:
> Simple EDAT test.
> 
> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> ---
>  s390x/Makefile      |   1 +
>  s390x/edat.c        | 238 ++++++++++++++++++++++++++++++++++++++++++++
>  s390x/unittests.cfg |   3 +
>  3 files changed, 242 insertions(+)
>  create mode 100644 s390x/edat.c
> 
> diff --git a/s390x/Makefile b/s390x/Makefile
> index 08d85c9f..fc885150 100644
> --- a/s390x/Makefile
> +++ b/s390x/Makefile
> @@ -20,6 +20,7 @@ tests += $(TEST_DIR)/sclp.elf
>  tests += $(TEST_DIR)/css.elf
>  tests += $(TEST_DIR)/uv-guest.elf
>  tests += $(TEST_DIR)/sie.elf
> +tests += $(TEST_DIR)/edat.elf
>  
>  tests_binary = $(patsubst %.elf,%.bin,$(tests))
>  ifneq ($(HOST_KEY_DOCUMENT),)
> diff --git a/s390x/edat.c b/s390x/edat.c
> new file mode 100644
> index 00000000..504a1501
> --- /dev/null
> +++ b/s390x/edat.c
> @@ -0,0 +1,238 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * EDAT test.
> + *
> + * Copyright (c) 2021 IBM Corp
> + *
> + * Authors:
> + *  Claudio Imbrenda <imbrenda@linux.ibm.com>
> + */
> +#include <libcflat.h>
> +#include <vmalloc.h>
> +#include <asm/facility.h>
> +#include <asm/interrupt.h>
> +#include <mmu.h>
> +#include <asm/pgtable.h>
> +#include <asm-generic/barrier.h>
> +
> +#define TEID_ADDR	PAGE_MASK
> +#define TEID_AI	0x003
> +#define TEID_M	0x004
> +#define TEID_A	0x008
> +#define TEID_FS	0xc00
> +
> +#define LC_SIZE	(2 * PAGE_SIZE)
> +#define VIRT(x)	((void *)((unsigned long)(x) + (unsigned long)mem))
> +
> +static uint8_t prefix_buf[LC_SIZE] __attribute__((aligned(LC_SIZE)));
> +static unsigned int tmp[1024] __attribute__((aligned(PAGE_SIZE)));
> +static void *root, *mem, *m;
> +static struct lowcore *lc;
> +volatile unsigned int *p;
> +
> +/* Expect a program interrupt, and clear the TEID */
> +static void expect_dat_fault(void)
> +{
> +	expect_pgm_int();
> +	lc->trans_exc_id = 0;
> +}
> +
> +/* Check if a protection exception happened for the given address */
> +static bool check_pgm_prot(void *ptr)
> +{
> +	unsigned long teid = lc->trans_exc_id;
> +
> +	if (lc->pgm_int_code != PGM_INT_CODE_PROTECTION)
> +		return 0;

return false.
It's a bool return type.

> +	if (~teid & TEID_M)

I'd maybe rather write this as:

	if (!(teid & TEID_M))

... but it's just a matter of taste.

> +		return 1;

return true;

So this is for backward compatibility with older Z systems that do not
have the corresponding facility? Should there be a corresponding
facility check somewhere? Or maybe add at least a comment?

> +	return (~teid & TEID_A) &&
> +	       ((teid & TEID_ADDR) == ((uint64_t)ptr & PAGE_MASK)) &&
> +	       !(teid & TEID_AI);

So you're checking for one specific type of protection exception here
only ... please add an appropriate comment.

> +}
> +
> +static void test_dat(void)
> +{
> +	report_prefix_push("edat off");
> +	/* disable EDAT */
> +	ctl_clear_bit(0, 23);
> +
> +	/* Check some basics */
> +	p[0] = 42;
> +	report(p[0] == 42, "pte, r/w");
> +	p[0] = 0;
> +
> +	protect_page(m, PAGE_ENTRY_P);
> +	expect_dat_fault();
> +	p[0] = 42;
> +	unprotect_page(m, PAGE_ENTRY_P);
> +	report(!p[0] && check_pgm_prot(m), "pte, ro");
> +
> +	/* The FC bit should be ignored because EDAT is off */
> +	p[0] = 42;

I'd suggest to set p[0] = 0 here...

> +	protect_dat_entry(m, SEGMENT_ENTRY_FC, 4);

... and change the value to 42 after enabling the protection ...
otherwise you don't really test the non-working write protection here,
do you?

> +	report(p[0] == 42, "pmd, fc=1, r/w");
> +	unprotect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> +	p[0] = 0;
> +
> +	/* Segment protection should work even with EDAT off */
> +	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> +	expect_dat_fault();
> +	p[0] = 42;
> +	report(!p[0] && check_pgm_prot(m), "pmd, ro");
> +	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> +
> +	/* The FC bit should be ignored because EDAT is off */

Set p[0] to 0 again before enabling the protection? Or maybe use a
different value than 42 below...?

> +	protect_dat_entry(m, REGION3_ENTRY_FC, 3);
> +	p[0] = 42;
> +	report(p[0] == 42, "pud, fc=1, r/w");
> +	unprotect_dat_entry(m, REGION3_ENTRY_FC, 3);
> +	p[0] = 0;
> +
> +	/* Region1/2/3 protection should not work, because EDAT is off */
> +	protect_dat_entry(m, REGION_ENTRY_P, 3);
> +	p[0] = 42;
> +	report(p[0] == 42, "pud, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
> +	p[0] = 0;
> +
> +	protect_dat_entry(m, REGION_ENTRY_P, 2);
> +	p[0] = 42;
> +	report(p[0] == 42, "p4d, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
> +	p[0] = 0;
> +
> +	protect_dat_entry(m, REGION_ENTRY_P, 1);
> +	p[0] = 42;
> +	report(p[0] == 42, "pgd, ro");
> +	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
> +	p[0] = 0;
> +
> +	report_prefix_pop();
> +}

 Thomas

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [kvm-unit-tests PATCH v1 4/4] s390x: edat test
  2021-02-11 11:35   ` Thomas Huth
@ 2021-02-11 12:18     ` Claudio Imbrenda
  0 siblings, 0 replies; 11+ messages in thread
From: Claudio Imbrenda @ 2021-02-11 12:18 UTC (permalink / raw)
  To: Thomas Huth; +Cc: kvm, linux-s390, david, frankja, cohuck, pmorel

On Thu, 11 Feb 2021 12:35:49 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 09/02/2021 15.38, Claudio Imbrenda wrote:
> > Simple EDAT test.
> > 
> > Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > ---
> >  s390x/Makefile      |   1 +
> >  s390x/edat.c        | 238 ++++++++++++++++++++++++++++++++++++++++++++
> >  s390x/unittests.cfg |   3 +
> >  3 files changed, 242 insertions(+)
> >  create mode 100644 s390x/edat.c
> > 
> > diff --git a/s390x/Makefile b/s390x/Makefile
> > index 08d85c9f..fc885150 100644
> > --- a/s390x/Makefile
> > +++ b/s390x/Makefile
> > @@ -20,6 +20,7 @@ tests += $(TEST_DIR)/sclp.elf
> >  tests += $(TEST_DIR)/css.elf
> >  tests += $(TEST_DIR)/uv-guest.elf
> >  tests += $(TEST_DIR)/sie.elf
> > +tests += $(TEST_DIR)/edat.elf
> >  
> >  tests_binary = $(patsubst %.elf,%.bin,$(tests))
> >  ifneq ($(HOST_KEY_DOCUMENT),)
> > diff --git a/s390x/edat.c b/s390x/edat.c
> > new file mode 100644
> > index 00000000..504a1501
> > --- /dev/null
> > +++ b/s390x/edat.c
> > @@ -0,0 +1,238 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * EDAT test.
> > + *
> > + * Copyright (c) 2021 IBM Corp
> > + *
> > + * Authors:
> > + *  Claudio Imbrenda <imbrenda@linux.ibm.com>
> > + */
> > +#include <libcflat.h>
> > +#include <vmalloc.h>
> > +#include <asm/facility.h>
> > +#include <asm/interrupt.h>
> > +#include <mmu.h>
> > +#include <asm/pgtable.h>
> > +#include <asm-generic/barrier.h>
> > +
> > +#define TEID_ADDR	PAGE_MASK
> > +#define TEID_AI	0x003
> > +#define TEID_M	0x004
> > +#define TEID_A	0x008
> > +#define TEID_FS	0xc00
> > +
> > +#define LC_SIZE	(2 * PAGE_SIZE)
> > +#define VIRT(x)	((void *)((unsigned long)(x) + (unsigned long)mem))
> > +
> > +static uint8_t prefix_buf[LC_SIZE] __attribute__((aligned(LC_SIZE)));
> > +static unsigned int tmp[1024] __attribute__((aligned(PAGE_SIZE)));
> > +static void *root, *mem, *m;
> > +static struct lowcore *lc;
> > +volatile unsigned int *p;
> > +
> > +/* Expect a program interrupt, and clear the TEID */
> > +static void expect_dat_fault(void)
> > +{
> > +	expect_pgm_int();
> > +	lc->trans_exc_id = 0;
> > +}
> > +
> > +/* Check if a protection exception happened for the given address */
> > +static bool check_pgm_prot(void *ptr)
> > +{
> > +	unsigned long teid = lc->trans_exc_id;
> > +
> > +	if (lc->pgm_int_code != PGM_INT_CODE_PROTECTION)
> > +		return 0;
> 
> return false.
> It's a bool return type.

yeah, that looks cleaner, I'll fix it

> > +	if (~teid & TEID_M)
> 
> I'd maybe rather write this as:
> 
> 	if (!(teid & TEID_M))
> 
> ... but it's just a matter of taste.

yes, I actually had it that way in the beginning, but using ~ is
shorter and does not need parentheses

> > +		return 1;
> 
> return true;
> 
> So this is for backward compatibility with older Z systems that do not
> have the corresponding facility? Should there be a corresponding
> facility check somewhere? Or maybe add at least a comment?

no, it's not for backwards compatibility as far as I know.
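As an aside, the equivalence Claudio appeals to here holds only because `TEID_M` is a single-bit mask: `(~teid & TEID_M)` is nonzero exactly when `(teid & TEID_M)` is zero. That can be checked in isolation (a hedged sketch; the helper names are made up and not part of the patch):

```c
#include <stdint.h>

/* Single-bit mask, as in the patch. */
#define TEID_M 0x004UL

/* The two spellings debated in the review. For a one-bit mask they agree,
 * since (~teid & TEID_M) != 0 exactly when the M bit of teid is 0. */
static int m_unset_tilde(uint64_t teid) { return (~teid & TEID_M) != 0; }
static int m_unset_bang(uint64_t teid)  { return !(teid & TEID_M); }
```

For a multi-bit mask the two would differ: `~x & M` is nonzero when *any* masked bit is clear, while `!(x & M)` requires *all* of them to be clear.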
If I read the documentation correctly, that bit might be zero under
some circumstances, and here I will just give up instead of checking if
the circumstances were actually correct.

> > +	return (~teid & TEID_A) &&
> > +	       ((teid & TEID_ADDR) == ((uint64_t)ptr & PAGE_MASK)) &&
> > +	       !(teid & TEID_AI);
> 
> So you're checking for one specific type of protection exception here
> only ... please add an appropriate comment.

more or less, but I'll add a comment to explain what's going on

> > +}
> > +
> > +static void test_dat(void)
> > +{
> > +	report_prefix_push("edat off");
> > +	/* disable EDAT */
> > +	ctl_clear_bit(0, 23);
> > +
> > +	/* Check some basics */
> > +	p[0] = 42;
> > +	report(p[0] == 42, "pte, r/w");
> > +	p[0] = 0;
> > +
> > +	protect_page(m, PAGE_ENTRY_P);
> > +	expect_dat_fault();
> > +	p[0] = 42;
> > +	unprotect_page(m, PAGE_ENTRY_P);
> > +	report(!p[0] && check_pgm_prot(m), "pte, ro");
> > +
> > +	/* The FC bit should be ignored because EDAT is off */
> > +	p[0] = 42;
> 
> I'd suggest to set p[0] = 0 here...
> 
> > +	protect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> 
> ... and change the value to 42 after enabling the protection ...
> otherwise you don't really test the non-working write protection
> here, do you?

but this is not the write protection. here I'm setting the bit for
large pages. so first I write something, then I set the bit, then I
check if I can still read it. if not, it means that the FC bit was not
ignored (i.e. the entry was considered as a large page instead of a
normal segment table entry pointing to a page table)

Write protection for segment entries _should_ work even with EDAT off,
and that is in fact what the next test checks...

> > +	report(p[0] == 42, "pmd, fc=1, r/w");
> > +	unprotect_dat_entry(m, SEGMENT_ENTRY_FC, 4);
> > +	p[0] = 0;
> > +

...
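Claudio's point about "giving up" when the M bit is clear can be seen in a standalone mirror of `check_pgm_prot()`'s TEID decoding, using the mask values from the patch (this host-side `teid_matches` helper is hypothetical, written here only to illustrate the control flow):

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  (~((1UL << PAGE_SHIFT) - 1))
#define TEID_ADDR  PAGE_MASK
#define TEID_AI    0x003UL
#define TEID_M     0x004UL
#define TEID_A     0x008UL

/* Mirror of check_pgm_prot()'s TEID checks on a plain value:
 * if M is clear, the address/AI fields are not valid, so give up and
 * accept; otherwise require A clear, the right page, and AI == 0. */
static int teid_matches(uint64_t teid, uint64_t fault_addr)
{
	if (~teid & TEID_M)
		return 1;
	return (~teid & TEID_A) &&
	       ((teid & TEID_ADDR) == (fault_addr & PAGE_MASK)) &&
	       !(teid & TEID_AI);
}
```

This is exactly the "one specific type of protection exception" Thomas points at: any TEID with the A bit set, a non-zero AI, or a different page frame is rejected.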
this one here:

> > +	/* Segment protection should work even with EDAT off */
> > +	protect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> > +	expect_dat_fault();
> > +	p[0] = 42;
> > +	report(!p[0] && check_pgm_prot(m), "pmd, ro");
> > +	unprotect_dat_entry(m, SEGMENT_ENTRY_P, 4);
> > +
> > +	/* The FC bit should be ignored because EDAT is off */
> 
> Set p[0] to 0 again before enabling the protection? Or maybe use a
> different value than 42 below...?

why? we already checked that p[0] == 0, and if p[0] somehow still is
42, we are going to set it to 42 again

> > +	protect_dat_entry(m, REGION3_ENTRY_FC, 3);
> > +	p[0] = 42;

but! we should set it to 42 BEFORE setting the FC bit! I will fix this
and maybe add a few more comments to explain what's going on

> > +	report(p[0] == 42, "pud, fc=1, r/w");
> > +	unprotect_dat_entry(m, REGION3_ENTRY_FC, 3);
> > +	p[0] = 0;
> > +
> > +	/* Region1/2/3 protection should not work, because EDAT is off */
> > +	protect_dat_entry(m, REGION_ENTRY_P, 3);
> > +	p[0] = 42;
> > +	report(p[0] == 42, "pud, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 3);
> > +	p[0] = 0;
> > +
> > +	protect_dat_entry(m, REGION_ENTRY_P, 2);
> > +	p[0] = 42;
> > +	report(p[0] == 42, "p4d, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 2);
> > +	p[0] = 0;
> > +
> > +	protect_dat_entry(m, REGION_ENTRY_P, 1);
> > +	p[0] = 42;
> > +	report(p[0] == 42, "pgd, ro");
> > +	unprotect_dat_entry(m, REGION_ENTRY_P, 1);
> > +	p[0] = 0;
> > +
> > +	report_prefix_pop();
> > +}
> 
> Thomas

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-02-11 12:22 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-09 14:38 [kvm-unit-tests PATCH v1 0/4] s390: Add support for large pages Claudio Imbrenda
2021-02-09 14:38 ` [kvm-unit-tests PATCH v1 1/4] libcflat: add SZ_1M and SZ_2G Claudio Imbrenda
2021-02-09 15:21   ` Thomas Huth
2021-02-09 14:38 ` [kvm-unit-tests PATCH v1 2/4] s390x: lib: fix and improve pgtable.h Claudio Imbrenda
2021-02-11  9:09   ` Thomas Huth
2021-02-09 14:38 ` [kvm-unit-tests PATCH v1 3/4] s390x: mmu: add support for large pages Claudio Imbrenda
2021-02-11 10:06   ` Thomas Huth
2021-02-11 10:30     ` Claudio Imbrenda
2021-02-09 14:38 ` [kvm-unit-tests PATCH v1 4/4] s390x: edat test Claudio Imbrenda
2021-02-11 11:35   ` Thomas Huth
2021-02-11 12:18     ` Claudio Imbrenda