All of lore.kernel.org
* [PATCH v1 0/4] riscv: mm: add Svnapot support
@ 2022-04-11 14:15 panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 1/4] mm: modify pte format for Svnapot panqinglin2020
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: panqinglin2020 @ 2022-04-11 14:15 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, linux-riscv; +Cc: jeff, xuyinan, Qinglin Pan

From: Qinglin Pan <panqinglin2020@iscas.ac.cn>

Svnapot is a RISC-V extension for marking a set of contiguous 4K pages as
a single non-4K page. This patch set uses Svnapot in the Linux kernel's
boot process and in hugetlbfs.

Since Svnapot has only recently stabilized and there seems to be no
official way to determine at runtime whether the CPU supports Svnapot,
this patchset adds a Kconfig item in "Platform type"->"Svnapot support".
It defaults to off, and people can turn it on when their CPU supports
Svnapot.

QEMU support for Svnapot has been accepted but not yet merged into
master, so the QEMU we use to test this patchset is currently the one in
this repo (it contains the QEMU Svnapot patchset):
https://github.com/plctlab/plct-qemu/tree/plct-virtmem-dev

Tested on:
  - qemu rv64 with "Svnapot support" off.
  - plct-qemu rv64 with "Svnapot support" on.


Qinglin Pan (4):
  mm: modify pte format for Svnapot
  mm: support Svnapot in physical page linear-mapping
  mm: support Svnapot in hugetlb page
  mm: support Svnapot in huge vmap

 arch/riscv/Kconfig                    |  10 +-
 arch/riscv/include/asm/hugetlb.h      |  31 +++-
 arch/riscv/include/asm/page.h         |   2 +-
 arch/riscv/include/asm/pgtable-bits.h |  31 ++++
 arch/riscv/include/asm/pgtable.h      |  68 ++++++++
 arch/riscv/include/asm/vmalloc.h      |  20 +++
 arch/riscv/mm/hugetlbpage.c           | 236 +++++++++++++++++++++++++-
 arch/riscv/mm/init.c                  |  29 +++-
 8 files changed, 416 insertions(+), 11 deletions(-)

-- 
2.35.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v1 1/4] mm: modify pte format for Svnapot
  2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
@ 2022-04-11 14:15 ` panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 2/4] mm: support Svnapot in physical page linear-mapping panqinglin2020
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: panqinglin2020 @ 2022-04-11 14:15 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, linux-riscv; +Cc: jeff, xuyinan, Qinglin Pan

From: Qinglin Pan <panqinglin2020@iscas.ac.cn>

This patch modifies the PTE definition for Svnapot and adds some
functions in pgtable.h to mark a PTE as napot and to check whether a PTE
is a Svnapot PTE. So far only the 64KB napot size is supported in the
draft spec, so some macros have only a 64KB version.

Yours,
Qinglin

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 00fd9c548f26..b86033f67610 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -343,6 +343,13 @@ config FPU
 
 	  If you don't know what to do here, say Y.
 
+config SVNAPOT
+	bool "Svnapot support"
+	default n
+	help
+	  Select if your CPU supports Svnapot and you want to enable it when
+	  kernel is booting.
+
 endmenu
 
 menu "Kernel features"
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index a6b0c89824c2..b37934c60c4d 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -35,6 +35,37 @@
 
 #define _PAGE_PFN_SHIFT 10
 
+#ifdef CONFIG_SVNAPOT
+#define _PAGE_RESERVE_0_SHIFT 54
+#define _PAGE_RESERVE_1_SHIFT 55
+#define _PAGE_RESERVE_2_SHIFT 56
+#define _PAGE_RESERVE_3_SHIFT 57
+#define _PAGE_RESERVE_4_SHIFT 58
+#define _PAGE_RESERVE_5_SHIFT 59
+#define _PAGE_RESERVE_6_SHIFT 60
+#define _PAGE_RESERVE_7_SHIFT 61
+#define _PAGE_RESERVE_8_SHIFT 62
+#define _PAGE_NAPOT_SHIFT 63
+#define _PAGE_RESERVE_0 (1UL << 54)
+#define _PAGE_RESERVE_1 (1UL << 55)
+#define _PAGE_RESERVE_2 (1UL << 56)
+#define _PAGE_RESERVE_3 (1UL << 57)
+#define _PAGE_RESERVE_4 (1UL << 58)
+#define _PAGE_RESERVE_5 (1UL << 59)
+#define _PAGE_RESERVE_6 (1UL << 60)
+#define _PAGE_RESERVE_7 (1UL << 61)
+#define _PAGE_RESERVE_8 (1UL << 62)
+#define _PAGE_PFN_MASK (_PAGE_RESERVE_0 - (1UL << _PAGE_PFN_SHIFT))
+/* for now, Svnapot supports only 64KB */
+#define NAPOT_CONT64KB_ORDER 4UL
+#define NAPOT_CONT64KB_SHIFT (NAPOT_CONT64KB_ORDER + PAGE_SHIFT)
+#define NAPOT_CONT64KB_SIZE (1UL << NAPOT_CONT64KB_SHIFT)
+#define NAPOT_CONT64KB_MASK (NAPOT_CONT64KB_SIZE - 1)
+#define NAPOT_64KB_PTE_NUM (1UL << NAPOT_CONT64KB_ORDER)
+#define _PAGE_NAPOT      (1UL << _PAGE_NAPOT_SHIFT)
+#define NAPOT_64KB_MASK (7UL << _PAGE_PFN_SHIFT)
+#endif /*CONFIG_SVNAPOT*/
+
 /* Set of bits to preserve across pte_modify() */
 #define _PAGE_CHG_MASK  (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ |	\
 					  _PAGE_WRITE | _PAGE_EXEC |	\
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 046b44225623..f72cdb64f427 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -279,11 +279,39 @@ static inline pte_t pud_pte(pud_t pud)
 	return __pte(pud_val(pud));
 }
 
+#ifdef CONFIG_SVNAPOT
+/* Yields the page frame number (PFN) of a page table entry */
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	unsigned long val  = pte_val(pte);
+	unsigned long is_napot = val >> _PAGE_NAPOT_SHIFT;
+	unsigned long pfn_field = (val & _PAGE_PFN_MASK) >> _PAGE_PFN_SHIFT;
+	unsigned long res = (pfn_field - is_napot) & pfn_field;
+	return res;
+}
+
+static inline unsigned long pte_napot(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_NAPOT;
+}
+
+static inline pte_t pte_mknapot(pte_t pte, unsigned int order)
+{
+	unsigned long napot_bits = (1UL << (order - 1)) << _PAGE_PFN_SHIFT;
+	unsigned long lower_prot =
+		pte_val(pte) & ((1UL << _PAGE_PFN_SHIFT) - 1UL);
+	unsigned long upper_prot = (pte_val(pte) >> _PAGE_PFN_SHIFT)
+				   << _PAGE_PFN_SHIFT;
+
+	return __pte(upper_prot | napot_bits | lower_prot | _PAGE_NAPOT);
+}
+#else /* CONFIG_SVNAPOT */
 /* Yields the page frame number (PFN) of a page table entry */
 static inline unsigned long pte_pfn(pte_t pte)
 {
 	return (pte_val(pte) >> _PAGE_PFN_SHIFT);
 }
+#endif /* CONFIG_SVNAPOT */
 
 #define pte_page(x)     pfn_to_page(pte_pfn(x))
 
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v1 2/4] mm: support Svnapot in physical page linear-mapping
  2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 1/4] mm: modify pte format for Svnapot panqinglin2020
@ 2022-04-11 14:15 ` panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 3/4] mm: support Svnapot in hugetlb page panqinglin2020
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: panqinglin2020 @ 2022-04-11 14:15 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, linux-riscv; +Cc: jeff, xuyinan, Qinglin Pan

From: Qinglin Pan <panqinglin2020@iscas.ac.cn>

Svnapot is powerful when a physical region is mapped into a virtual
region, which is what the kernel does when mapping all allocatable
physical pages into the kernel address space. This patch modifies the
create_pte_mapping function used in the linear-mapping procedure so that
the kernel can use Svnapot when both the address and the length of a
physical region are 64KB-aligned. This code runs only when no larger
huge page size is suitable, so it complements the PMD_SIZE and PUD_SIZE
mappings.

This patch also changes best_map_size so that it is consulted once per
mapping step instead of only once per region, so a memory region can be
mapped by a mix of PMD_SIZE and 64KB napot mappings.

Tested by setting QEMU's memory to a 262272k region; the kernel boots
successfully.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 9535bea8688c..c98a1714c9c8 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -303,7 +303,21 @@ static void __init create_pte_mapping(pte_t *ptep,
 {
 	uintptr_t pte_idx = pte_index(va);
 
+#ifndef CONFIG_SVNAPOT
 	BUG_ON(sz != PAGE_SIZE);
+#else /*CONFIG_SVNAPOT*/
+	pte_t pte;
+	WARN_ON(sz != NAPOT_CONT64KB_SIZE && sz != PAGE_SIZE);
+	if (sz == NAPOT_CONT64KB_SIZE) {
+		do {
+			pte = pfn_pte(PFN_DOWN(pa), prot);
+			ptep[pte_idx] = pte_mknapot(pte, NAPOT_CONT64KB_ORDER);
+			pte_idx++;
+			sz -= PAGE_SIZE;
+		} while (sz > 0);
+		return;
+	}
+#endif /*CONFIG_SVNAPOT*/
 
 	if (pte_none(ptep[pte_idx]))
 		ptep[pte_idx] = pfn_pte(PFN_DOWN(pa), prot);
@@ -602,10 +616,17 @@ void __init create_pgd_mapping(pgd_t *pgdp,
 static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
 {
 	/* Upgrade to PMD_SIZE mappings whenever possible */
-	if ((base & (PMD_SIZE - 1)) || (size & (PMD_SIZE - 1)))
-		return PAGE_SIZE;
+	base &= PMD_SIZE - 1;
+	if (!base && size >= PMD_SIZE)
+		return PMD_SIZE;
+
+#ifdef CONFIG_SVNAPOT
+	base &= NAPOT_CONT64KB_SIZE - 1;
+	if (!base && size >= NAPOT_CONT64KB_SIZE)
+		return NAPOT_CONT64KB_SIZE;
+#endif /*CONFIG_SVNAPOT*/
 
-	return PMD_SIZE;
+	return PAGE_SIZE;
 }
 
 #ifdef CONFIG_XIP_KERNEL
@@ -1038,9 +1059,9 @@ static void __init setup_vm_final(void)
 		if (end >= __pa(PAGE_OFFSET) + memory_limit)
 			end = __pa(PAGE_OFFSET) + memory_limit;
 
-		map_size = best_map_size(start, end - start);
 		for (pa = start; pa < end; pa += map_size) {
 			va = (uintptr_t)__va(pa);
+			map_size = best_map_size(pa, end - pa);
 
 			create_pgd_mapping(swapper_pg_dir, va, pa, map_size,
 					   pgprot_from_va(va));
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v1 3/4] mm: support Svnapot in hugetlb page
  2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 1/4] mm: modify pte format for Svnapot panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 2/4] mm: support Svnapot in physical page linear-mapping panqinglin2020
@ 2022-04-11 14:15 ` panqinglin2020
  2022-04-11 14:15 ` [PATCH v1 4/4] mm: support Svnapot in huge vmap panqinglin2020
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: panqinglin2020 @ 2022-04-11 14:15 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, linux-riscv; +Cc: jeff, xuyinan, Qinglin Pan

From: Qinglin Pan <panqinglin2020@iscas.ac.cn>

Svnapot can be used to support 64KB hugetlb pages, so 64KB becomes a new
size option when using hugetlbfs. This patch adds a basic hugetlb page
implementation and supports 64KB as one of its sizes by using Svnapot.

To test, boot the kernel with a command line containing "default_hugepagesz=64K
hugepagesz=64K hugepages=20" and run a simple test like this:

#include <stdio.h>
#include <sys/mman.h>

int main() {
	void *addr;
	addr = mmap(NULL, 64 * 1024, PROT_WRITE | PROT_READ,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_64KB, -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("back from mmap\n");
	long *ptr = (long *)addr;
	unsigned int i = 0;
	/* touch one long in each 4K page of the 64K mapping */
	for (; i < 8 * 1024; i += 512) {
		printf("%p\n", (void *)ptr);
		*ptr = 0xdeafabcd12345678;
		ptr += 512;
	}
	ptr = (long *)addr;
	i = 0;
	for (; i < 8 * 1024; i += 512) {
		if (*ptr != 0xdeafabcd12345678) {
			printf("failed! 0x%lx\n", *ptr);
			break;
		}
		ptr += 512;
	}
	if (i == 8 * 1024)
		printf("simple test passed!\n");
	return 0;
}

The test should pass.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index b86033f67610..490d52228997 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -41,7 +41,7 @@ config RISCV
 	select ARCH_USE_MEMTEST
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_FRAME_POINTERS
-	select ARCH_WANT_GENERAL_HUGETLB
+	select ARCH_WANT_GENERAL_HUGETLB if !SVNAPOT
 	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
 	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
 	select BUILDTIME_TABLE_SORT if MMU
diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
index a5c2ca1d1cd8..8aec4a04dda9 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -2,7 +2,36 @@
 #ifndef _ASM_RISCV_HUGETLB_H
 #define _ASM_RISCV_HUGETLB_H
 
-#include <asm-generic/hugetlb.h>
 #include <asm/page.h>
 
+#ifdef CONFIG_SVNAPOT
+extern pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
+				       vm_flags_t flags);
+#define arch_make_huge_pte arch_make_huge_pte
+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
+extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+			    pte_t *ptep, pte_t pte);
+#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
+extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+				     unsigned long addr, pte_t *ptep);
+#define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
+extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
+				  unsigned long addr, pte_t *ptep);
+#define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
+extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+				      unsigned long addr, pte_t *ptep,
+				      pte_t pte, int dirty);
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
+extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
+				    unsigned long addr, pte_t *ptep);
+#define __HAVE_ARCH_HUGE_PTE_CLEAR
+extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
+			   pte_t *ptep, unsigned long sz);
+#define set_huge_swap_pte_at riscv_set_huge_swap_pte_at
+extern void riscv_set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
+				 pte_t *ptep, pte_t pte, unsigned long sz);
+#endif /*CONFIG_SVNAPOT*/
+
+#include <asm-generic/hugetlb.h>
+
 #endif /* _ASM_RISCV_HUGETLB_H */
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 1526e410e802..ef40ba329709 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -17,7 +17,7 @@
 #define PAGE_MASK	(~(PAGE_SIZE - 1))
 
 #ifdef CONFIG_64BIT
-#define HUGE_MAX_HSTATE		2
+#define HUGE_MAX_HSTATE		3
 #else
 #define HUGE_MAX_HSTATE		1
 #endif
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 932dadfdca54..371e35b5b334 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -2,6 +2,226 @@
 #include <linux/hugetlb.h>
 #include <linux/err.h>
 
+#ifdef CONFIG_SVNAPOT
+pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
+			unsigned long addr, unsigned long sz)
+{
+	pgd_t *pgdp = pgd_offset(mm, addr);
+	p4d_t *p4dp = p4d_alloc(mm, pgdp, addr);
+	pud_t *pudp = pud_alloc(mm, p4dp, addr);
+	pmd_t *pmdp = pmd_alloc(mm, pudp, addr);
+
+	if (sz == NAPOT_CONT64KB_SIZE) {
+		if (!pmdp)
+			return NULL;
+		WARN_ON(addr & (sz - 1));
+		return pte_alloc_map(mm, pmdp, addr);
+	}
+
+	return NULL;
+}
+
+pte_t *huge_pte_offset(struct mm_struct *mm,
+		       unsigned long addr, unsigned long sz)
+{
+	pgd_t *pgdp;
+	p4d_t *p4dp;
+	pud_t *pudp;
+	pmd_t *pmdp;
+	pte_t *ptep = NULL;
+
+	pgdp = pgd_offset(mm, addr);
+	if (!pgd_present(READ_ONCE(*pgdp)))
+		return NULL;
+
+	p4dp = p4d_offset(pgdp, addr);
+	if (!p4d_present(READ_ONCE(*p4dp)))
+		return NULL;
+
+	pudp = pud_offset(p4dp, addr);
+	if (!pud_present(READ_ONCE(*pudp)))
+		return NULL;
+
+	pmdp = pmd_offset(pudp, addr);
+	if (!pmd_present(READ_ONCE(*pmdp)))
+		return NULL;
+
+	if (sz == NAPOT_CONT64KB_SIZE)
+		ptep = pte_offset_kernel(pmdp, (addr & ~NAPOT_CONT64KB_MASK));
+
+	return ptep;
+}
+
+int napot_pte_num(pte_t pte)
+{
+	if (!(pte_val(pte) & NAPOT_64KB_MASK))
+		return NAPOT_64KB_PTE_NUM;
+
+	pr_warn("%s: unrecognized napot pte size 0x%lx\n",
+		__func__, pte_val(pte));
+	return 1;
+}
+
+static pte_t get_clear_flush(struct mm_struct *mm,
+			     unsigned long addr,
+			     pte_t *ptep,
+			     unsigned long pte_num)
+{
+	pte_t orig_pte = huge_ptep_get(ptep);
+	bool valid = pte_val(orig_pte);
+	unsigned long i, saddr = addr;
+
+	for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++) {
+		pte_t pte = ptep_get_and_clear(mm, addr, ptep);
+
+		if (pte_dirty(pte))
+			orig_pte = pte_mkdirty(orig_pte);
+
+		if (pte_young(pte))
+			orig_pte = pte_mkyoung(orig_pte);
+	}
+
+	if (valid) {
+		struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+
+		flush_tlb_range(&vma, saddr, addr);
+	}
+	return orig_pte;
+}
+
+static void clear_flush(struct mm_struct *mm,
+			     unsigned long addr,
+			     pte_t *ptep,
+			     unsigned long pte_num)
+{
+	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long i, saddr = addr;
+
+	for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+		pte_clear(mm, addr, ptep);
+
+	flush_tlb_range(&vma, saddr, addr);
+}
+
+pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
+				       vm_flags_t flags)
+{
+	if (shift == NAPOT_CONT64KB_SHIFT)
+		entry = pte_mknapot(entry, NAPOT_CONT64KB_SHIFT - PAGE_SHIFT);
+
+	return entry;
+}
+
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+			    pte_t *ptep, pte_t pte)
+{
+	int i;
+	int pte_num;
+
+	if (!pte_napot(pte)) {
+		set_pte_at(mm, addr, ptep, pte);
+		return;
+	}
+
+	pte_num = napot_pte_num(pte);
+	for (i = 0; i < pte_num; i++, ptep++, addr += PAGE_SIZE)
+		set_pte_at(mm, addr, ptep, pte);
+}
+
+int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+				      unsigned long addr, pte_t *ptep,
+				      pte_t pte, int dirty)
+{
+	pte_t orig_pte;
+	int i;
+	int pte_num;
+
+	if (!pte_napot(pte))
+		return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+
+	pte_num = napot_pte_num(pte);
+	ptep = huge_pte_offset(vma->vm_mm, addr, NAPOT_CONT64KB_SIZE);
+	orig_pte = huge_ptep_get(ptep);
+
+	if (pte_dirty(orig_pte))
+		pte = pte_mkdirty(pte);
+
+	if (pte_young(orig_pte))
+		pte = pte_mkyoung(pte);
+
+	for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+		ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+
+	return true;
+}
+
+pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+				     unsigned long addr, pte_t *ptep)
+{
+	int pte_num;
+	pte_t orig_pte = huge_ptep_get(ptep);
+
+	if (!pte_napot(orig_pte))
+		return ptep_get_and_clear(mm, addr, ptep);
+
+	pte_num = napot_pte_num(orig_pte);
+	return get_clear_flush(mm, addr, ptep, pte_num);
+}
+
+void huge_ptep_set_wrprotect(struct mm_struct *mm,
+				    unsigned long addr, pte_t *ptep)
+{
+	int i;
+	int pte_num;
+	pte_t pte = READ_ONCE(*ptep);
+
+	if (!pte_napot(pte))
+		return ptep_set_wrprotect(mm, addr, ptep);
+
+	pte_num = napot_pte_num(pte);
+	ptep = huge_pte_offset(mm, addr, NAPOT_CONT64KB_SIZE);
+
+	for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+		ptep_set_wrprotect(mm, addr, ptep);
+}
+
+void huge_ptep_clear_flush(struct vm_area_struct *vma,
+				  unsigned long addr, pte_t *ptep)
+{
+	int pte_num;
+	pte_t pte = READ_ONCE(*ptep);
+
+	if (!pte_napot(pte)) {
+		ptep_clear_flush(vma, addr, ptep);
+		return;
+	}
+
+	pte_num = napot_pte_num(pte);
+	clear_flush(vma->vm_mm, addr, ptep, pte_num);
+}
+
+void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
+			   pte_t *ptep, unsigned long sz)
+{
+	int i, pte_num;
+
+	pte_num = napot_pte_num(READ_ONCE(*ptep));
+	for (i = 0; i < pte_num; i++, addr += PAGE_SIZE, ptep++)
+		pte_clear(mm, addr, ptep);
+}
+
+void riscv_set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
+				 pte_t *ptep, pte_t pte, unsigned long sz)
+{
+	int i, pte_num;
+
+	pte_num = napot_pte_num(READ_ONCE(*ptep));
+
+	for (i = 0; i < pte_num; i++, ptep++)
+		set_pte(ptep, pte);
+}
+#endif /*CONFIG_SVNAPOT*/
+
 int pud_huge(pud_t pud)
 {
 	return pud_leaf(pud);
@@ -18,17 +238,25 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 		return true;
 	else if (IS_ENABLED(CONFIG_64BIT) && size == PUD_SIZE)
 		return true;
+#ifdef CONFIG_SVNAPOT
+	else if (size == NAPOT_CONT64KB_SIZE)
+		return true;
+#endif /*CONFIG_SVNAPOT*/
 	else
 		return false;
 }
 
-#ifdef CONFIG_CONTIG_ALLOC
-static __init int gigantic_pages_init(void)
+static __init int hugetlbpage_init(void)
 {
+#ifdef CONFIG_CONTIG_ALLOC
 	/* With CONTIG_ALLOC, we can allocate gigantic pages at runtime */
 	if (IS_ENABLED(CONFIG_64BIT))
 		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
+#endif /*CONFIG_CONTIG_ALLOC*/
+	hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
+#ifdef CONFIG_SVNAPOT
+	hugetlb_add_hstate(NAPOT_CONT64KB_SHIFT - PAGE_SHIFT);
+#endif /*CONFIG_SVNAPOT*/
 	return 0;
 }
-arch_initcall(gigantic_pages_init);
-#endif
+arch_initcall(hugetlbpage_init);
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v1 4/4] mm: support Svnapot in huge vmap
  2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
                   ` (2 preceding siblings ...)
  2022-04-11 14:15 ` [PATCH v1 3/4] mm: support Svnapot in hugetlb page panqinglin2020
@ 2022-04-11 14:15 ` panqinglin2020
  2022-04-11 14:39 ` [PATCH v1 0/4] riscv: mm: add Svnapot support 潘庆霖
  2022-05-20 17:10 ` Palmer Dabbelt
  5 siblings, 0 replies; 8+ messages in thread
From: panqinglin2020 @ 2022-04-11 14:15 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, linux-riscv; +Cc: jeff, xuyinan, Qinglin Pan

From: Qinglin Pan <panqinglin2020@iscas.ac.cn>

The HAVE_ARCH_HUGE_VMAP option can be used to implement
architecture-specific huge vmap sizes. This patch selects the option by
default and implements arch_vmap_pte_range_map_size for the Svnapot 64KB
size.

It can be tested by booting the kernel in QEMU with a PCI device, which
makes the kernel call the PCI driver via ioremap; the newly implemented
function is exercised on that path.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 490d52228997..c38b5920a0a8 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -68,6 +68,7 @@ config RISCV
 	select GENERIC_TIME_VSYSCALL if MMU && 64BIT
 	select GENERIC_VDSO_TIME_NS if HAVE_GENERIC_VDSO
 	select HAVE_ARCH_AUDITSYSCALL
+	select HAVE_ARCH_HUGE_VMAP
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE if !XIP_KERNEL
 	select HAVE_ARCH_KASAN if MMU && 64BIT
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index f72cdb64f427..510cce799a8a 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -680,6 +680,46 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+static inline int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
+{
+	return 0;
+}
+
+static inline int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
+{
+	return 0;
+}
+
+static inline int p4d_clear_huge(p4d_t *p4d)
+{
+	return 0;
+}
+
+static inline int pud_clear_huge(pud_t *pud)
+{
+	return 0;
+}
+
+static inline int pmd_clear_huge(pmd_t *pmd)
+{
+	return 0;
+}
+
+static inline int p4d_free_pud_page(p4d_t *p4d, unsigned long addr)
+{
+	return 0;
+}
+
+static inline int pud_free_pmd_page(pud_t *pud, unsigned long addr)
+{
+	return 0;
+}
+
+static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
+{
+	return 0;
+}
+
 /*
  * Encode and decode a swap entry
  *
diff --git a/arch/riscv/include/asm/vmalloc.h b/arch/riscv/include/asm/vmalloc.h
index ff9abc00d139..2c1a41c5ca8d 100644
--- a/arch/riscv/include/asm/vmalloc.h
+++ b/arch/riscv/include/asm/vmalloc.h
@@ -1,4 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
 #ifndef _ASM_RISCV_VMALLOC_H
 #define _ASM_RISCV_VMALLOC_H
 
+#include <asm/pgtable-bits.h>
+
+#ifdef CONFIG_SVNAPOT
+#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
+static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, unsigned long end,
+		u64 pfn, unsigned int max_page_shift)
+{
+	bool is_napot_addr = !(addr & NAPOT_CONT64KB_MASK);
+	bool pfn_align_napot = !(pfn & (NAPOT_64KB_PTE_NUM - 1UL));
+	bool space_enough = ((end - addr) >= NAPOT_CONT64KB_SIZE);
+
+	if (is_napot_addr && pfn_align_napot && space_enough
+			&& max_page_shift >= NAPOT_CONT64KB_SHIFT)
+		return NAPOT_CONT64KB_SIZE;
+
+	return PAGE_SIZE;
+}
+#endif /*CONFIG_SVNAPOT*/
+
 #endif /* _ASM_RISCV_VMALLOC_H */
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 0/4] riscv: mm: add Svnapot support
  2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
                   ` (3 preceding siblings ...)
  2022-04-11 14:15 ` [PATCH v1 4/4] mm: support Svnapot in huge vmap panqinglin2020
@ 2022-04-11 14:39 ` 潘庆霖
  2022-05-20 17:10 ` Palmer Dabbelt
  5 siblings, 0 replies; 8+ messages in thread
From: 潘庆霖 @ 2022-04-11 14:39 UTC (permalink / raw)
  To: paul.walmsley, palmer, aou, linux-riscv; +Cc: jeff, xuyinan

Hi all,

Sorry, a correction is needed here.

> 
> Qemu support for Svnapot has been accepted but still not merged into master.
> So the qemu which we use to test this patchset is current in this repo (it
> contains qemu Svnapot patchset):
> https://github.com/plctlab/plct-qemu/tree/plct-virtmem-dev
> 
> Tested on:
>   - qemu rv64 with "Svnapot support" off.
>   - plct-qemu rv64 with "Svnapot support" on.
> 

Svnapot support for QEMU has now been merged into the master branch.
Tests for this patchset can run on it directly, with the CPU option
specified as "-cpu rv64,svnapot=true".

Thanks,
Qinglin


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v1 0/4] riscv: mm: add Svnapot support
  2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
                   ` (4 preceding siblings ...)
  2022-04-11 14:39 ` [PATCH v1 0/4] riscv: mm: add Svnapot support 潘庆霖
@ 2022-05-20 17:10 ` Palmer Dabbelt
  2022-05-22  2:38   ` 潘庆霖
  5 siblings, 1 reply; 8+ messages in thread
From: Palmer Dabbelt @ 2022-05-20 17:10 UTC (permalink / raw)
  To: panqinglin2020
  Cc: Paul Walmsley, aou, linux-riscv, jeff, xuyinan, panqinglin2020

On Mon, 11 Apr 2022 07:15:32 PDT (-0700), panqinglin2020@iscas.ac.cn wrote:
> From: Qinglin Pan <panqinglin2020@iscas.ac.cn>
>
> Svnapot is a RISC-V extension for marking contiguous 4K pages as a non-4K
> page. This patch set is for using Svnapot in Linux Kernel's boot process
> and hugetlb fs.
>
> Since Svnapot is just stable recently, and there seems no official way to
> determine if the CPU supports Svnapot at runtime. This patchset adds a Kconfig
> item for using Svnapot in "Platform type"->"Svnapot support". Its default value
> is off, and people can set it on when their CPU supports Svnapot.
>
> Qemu support for Svnapot has been accepted but still not merged into master.
> So the qemu which we use to test this patchset is current in this repo (it
> contains qemu Svnapot patchset):
> https://github.com/plctlab/plct-qemu/tree/plct-virtmem-dev
>
> Tested on:
>   - qemu rv64 with "Svnapot support" off.
>   - plct-qemu rv64 with "Svnapot support" on.
>
>
> Qinglin Pan (4):
>   mm: modify pte format for Svnapot
>   mm: support Svnapot in physical page linear-mapping
>   mm: support Svnapot in hugetlb page
>   mm: support Svnapot in huge vmap
>
>  arch/riscv/Kconfig                    |  10 +-
>  arch/riscv/include/asm/hugetlb.h      |  31 +++-
>  arch/riscv/include/asm/page.h         |   2 +-
>  arch/riscv/include/asm/pgtable-bits.h |  31 ++++
>  arch/riscv/include/asm/pgtable.h      |  68 ++++++++
>  arch/riscv/include/asm/vmalloc.h      |  20 +++
>  arch/riscv/mm/hugetlbpage.c           | 236 +++++++++++++++++++++++++-
>  arch/riscv/mm/init.c                  |  29 +++-
>  8 files changed, 416 insertions(+), 11 deletions(-)

Sorry for being slow here, I got pretty buried this round.

This generally looks OK, but we definitely need dynamic detection.  It
should be super easy to do that with the new framework, as we essentially
just need to check for Svnapot on allocation.

Aside from that, just a minor comment: it feels like Svnapot is simple
enough that we should be able to fit it into the generic mapping code.
No big deal if it doesn't work and I haven't tried to do so, but I think
it's worth a shot.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Re: [PATCH v1 0/4] riscv: mm: add Svnapot support
  2022-05-20 17:10 ` Palmer Dabbelt
@ 2022-05-22  2:38   ` 潘庆霖
  0 siblings, 0 replies; 8+ messages in thread
From: 潘庆霖 @ 2022-05-22  2:38 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: Paul Walmsley, aou, linux-riscv, jeff, xuyinan

Hi Palmer,

> 
> Sorry for being slow here, I got pretty buried this round.
> 
> This generally looks OK, but we definately need dynamic detection.  It 
> should be super easy to do that with the new framework, as we essentialy 
> just need to check for Svnapot on allocation.
> 

Thanks for your comment. I also plan to detect Svnapot CPU support
dynamically and will do so in the next version.

> Aside from that, just a minor comment: it feels like Svnapot is simple 
> enough that we should be able to fit it into the generic mapping code.  
> No big deal if it doesn't work and I haven't tried to do so, but I think 
> it's worth a shot.

OK, I will try this in the next version too.

Yours,
Qinglin

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-05-22  2:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-11 14:15 [PATCH v1 0/4] riscv: mm: add Svnapot support panqinglin2020
2022-04-11 14:15 ` [PATCH v1 1/4] mm: modify pte format for Svnapot panqinglin2020
2022-04-11 14:15 ` [PATCH v1 2/4] mm: support Svnapot in physical page linear-mapping panqinglin2020
2022-04-11 14:15 ` [PATCH v1 3/4] mm: support Svnapot in hugetlb page panqinglin2020
2022-04-11 14:15 ` [PATCH v1 4/4] mm: support Svnapot in huge vmap panqinglin2020
2022-04-11 14:39 ` [PATCH v1 0/4] riscv: mm: add Svnapot support 潘庆霖
2022-05-20 17:10 ` Palmer Dabbelt
2022-05-22  2:38   ` 潘庆霖
