* [kvm-unit-tests PATCH 0/3] arm/arm64: mmu cleanups and fixes
@ 2022-10-06 11:12 Alexandru Elisei
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address Alexandru Elisei
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Alexandru Elisei @ 2022-10-06 11:12 UTC (permalink / raw)
  To: pbonzini, thuth, andrew.jones, kvm, kvmarm

I was writing a quick test when I noticed that arm's implementation of
__virt_to_phys(), which ends up calling virt_to_pte_phys(), doesn't handle
block mappings and returns a bogus value.

When fixing that, I realized that lib/vmalloc confuses the return value of
virt_to_pte_phys() with a page table entry, which is harmless, but still wrong.

I also got confused about mmu_get_pte() and get_pte(), so I (hopefully)
improved that by renaming mmu_get_pte() to follow_pte().

Tested on rockpro64: arm64, 4k and 64k pages, with qemu and kvmtool; arm,
with qemu and kvmtool. And on an odroid-c4: arm64, 4k, 16k and 64k pages
with qemu and kvmtool; arm, with qemu and kvmtool.

Alexandru Elisei (3):
  lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address
  arm/arm64: mmu: Teach virt_to_pte_phys() about block descriptors
  arm/arm64: mmu: Rename mmu_get_pte() -> follow_pte()

 lib/arm/asm/mmu-api.h |  2 +-
 lib/arm/mmu.c         | 88 +++++++++++++++++++++++++------------------
 lib/vmalloc.c         |  4 +-
 3 files changed, 55 insertions(+), 39 deletions(-)

-- 
2.37.0


* [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address
  2022-10-06 11:12 [kvm-unit-tests PATCH 0/3] arm/arm64: mmu cleanups and fixes Alexandru Elisei
@ 2022-10-06 11:12 ` Alexandru Elisei
  2022-10-06 11:35   ` Claudio Imbrenda
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 2/3] arm/arm64: mmu: Teach virt_to_pte_phys() about block descriptors Alexandru Elisei
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 3/3] arm/arm64: mmu: Rename mmu_get_pte() -> follow_pte() Alexandru Elisei
  2 siblings, 1 reply; 7+ messages in thread
From: Alexandru Elisei @ 2022-10-06 11:12 UTC (permalink / raw)
  To: pbonzini, thuth, andrew.jones, kvm, kvmarm
  Cc: Laurent Vivier, Janosch Frank, Claudio Imbrenda

All architectures that implement virt_to_pte_phys() (s390x, x86, arm and
arm64) return a physical address from the function. Teach vmalloc to treat
it as such, instead of confusing the return value with a page table entry.
Changing things the other way around (having the function return a page
table entry instead) is not feasible, because it is possible for an
architecture to use the upper bits of the table entry to store metadata
about the page.
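
For illustration (this is the generic arm64 stage 1 page descriptor layout,
given as an example of such metadata rather than anything this patch relies on):

/*
 * Rough arm64 stage 1 page descriptor layout, 4k granule, illustration only:
 *
 *   [63:51]  upper attributes (UXN, PXN, Contiguous, DBM, software bits)
 *   [47:12]  output address of the page
 *   [11:2]   lower attributes (nG, AF, SH, AP, NS, AttrIndx)
 *   [1:0]    descriptor type (0b11 for a page)
 *
 * A caller that wants a physical address has to strip the attribute bits;
 * returning the raw entry would hand back the metadata as well.
 */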

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/vmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/vmalloc.c b/lib/vmalloc.c
index 572682576cc3..0696b5da8190 100644
--- a/lib/vmalloc.c
+++ b/lib/vmalloc.c
@@ -169,7 +169,7 @@ static void vm_free(void *mem)
 	/* the pointer is not page-aligned, it was a single-page allocation */
 	if (!IS_ALIGNED((uintptr_t)mem, PAGE_SIZE)) {
 		assert(GET_MAGIC(mem) == VM_MAGIC);
-		page = virt_to_pte_phys(page_root, mem) & PAGE_MASK;
+		page = virt_to_pte_phys(page_root, mem);
 		assert(page);
 		free_page(phys_to_virt(page));
 		return;
@@ -183,7 +183,7 @@ static void vm_free(void *mem)
 	/* free all the pages including the metadata page */
 	ptr = (uintptr_t)m & PAGE_MASK;
 	for (i = 0 ; i < m->npages + 1; i++, ptr += PAGE_SIZE) {
-		page = virt_to_pte_phys(page_root, (void *)ptr) & PAGE_MASK;
+		page = virt_to_pte_phys(page_root, (void *)ptr);
 		assert(page);
 		free_page(phys_to_virt(page));
 	}
-- 
2.37.0


* [kvm-unit-tests PATCH 2/3] arm/arm64: mmu: Teach virt_to_pte_phys() about block descriptors
  2022-10-06 11:12 [kvm-unit-tests PATCH 0/3] arm/arm64: mmu cleanups and fixes Alexandru Elisei
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address Alexandru Elisei
@ 2022-10-06 11:12 ` Alexandru Elisei
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 3/3] arm/arm64: mmu: Rename mmu_get_pte() -> follow_pte() Alexandru Elisei
  2 siblings, 0 replies; 7+ messages in thread
From: Alexandru Elisei @ 2022-10-06 11:12 UTC (permalink / raw)
  To: pbonzini, thuth, andrew.jones, kvm, kvmarm

The arm and arm64 architectures allow a virtual address to be mapped using
a block descriptor (or huge page, as Linux calls it), and the function
mmu_set_range_sect() is made available for a test to do just that. But
virt_to_pte_phys() assumes that all virtual addresses are mapped with page
granularity, which can lead to erroneous addresses being returned in the
case of block mappings.
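
To illustrate with made-up numbers (4k pages, a 2M block descriptor whose
output address is 0x80200000, and a lookup of virt 0x80234567):

/*
 * Illustration only, the numbers are invented for this example:
 *
 * treating the block descriptor as a page descriptor (current behaviour):
 *   (desc & PHYS_MASK & -PAGE_SIZE) + (virt & (PAGE_SIZE - 1))
 *   = 0x80200000 + 0x567 = 0x80200567    <- bogus, VA bits [20:12] are lost
 *
 * selecting the mask based on the descriptor level (this patch):
 *   mask = pmd_huge(desc) ? PMD_MASK : PAGE_MASK;
 *   (desc & PHYS_MASK & mask) | (virt & ~mask)
 *   = 0x80200000 | 0x34567 = 0x80234567  <- the address the block actually maps
 */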

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/mmu.c | 89 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 54 insertions(+), 35 deletions(-)

diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index e1a72fe4941f..2aaa63d538c0 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -111,10 +111,61 @@ pteval_t *install_page(pgd_t *pgtable, phys_addr_t phys, void *virt)
 				 __pgprot(PTE_WBWA | PTE_USER));
 }
 
-phys_addr_t virt_to_pte_phys(pgd_t *pgtable, void *mem)
+/*
+ * NOTE: The Arm architecture might require the use of a
+ * break-before-make sequence before making changes to a PTE and
+ * certain conditions are met (see Arm ARM D5-2669 for AArch64 and
+ * B3-1378 for AArch32 for more details).
+ */
+pteval_t *mmu_get_pte(pgd_t *pgtable, uintptr_t vaddr)
 {
-	return (*get_pte(pgtable, (uintptr_t)mem) & PHYS_MASK & -PAGE_SIZE)
-		+ ((ulong)mem & (PAGE_SIZE - 1));
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	if (!mmu_enabled())
+		return NULL;
+
+	pgd = pgd_offset(pgtable, vaddr);
+	if (!pgd_valid(*pgd))
+		return NULL;
+
+	pud = pud_offset(pgd, vaddr);
+	if (!pud_valid(*pud))
+		return NULL;
+
+	pmd = pmd_offset(pud, vaddr);
+	if (!pmd_valid(*pmd))
+		return NULL;
+	if (pmd_huge(*pmd))
+		return &pmd_val(*pmd);
+
+	pte = pte_offset(pmd, vaddr);
+	if (!pte_valid(*pte))
+		return NULL;
+
+        return &pte_val(*pte);
+}
+
+phys_addr_t virt_to_pte_phys(pgd_t *pgtable, void *virt)
+{
+	phys_addr_t mask;
+	pteval_t *pteval;
+
+	pteval = mmu_get_pte(pgtable, (uintptr_t)virt);
+	if (!pteval || !pte_valid(__pte(*pteval))) {
+		install_page(pgtable, (phys_addr_t)(unsigned long)virt, virt);
+		return (phys_addr_t)(unsigned long)virt;
+	}
+
+	if (pmd_huge(__pmd(*pteval)))
+		mask = PMD_MASK;
+	else
+		mask = PAGE_MASK;
+
+	return (*pteval & PHYS_MASK & mask) |
+		((phys_addr_t)(unsigned long)virt & ~mask);
 }
 
 void mmu_set_range_ptes(pgd_t *pgtable, uintptr_t virt_offset,
@@ -231,38 +282,6 @@ unsigned long __phys_to_virt(phys_addr_t addr)
 	return addr;
 }
 
-/*
- * NOTE: The Arm architecture might require the use of a
- * break-before-make sequence before making changes to a PTE and
- * certain conditions are met (see Arm ARM D5-2669 for AArch64 and
- * B3-1378 for AArch32 for more details).
- */
-pteval_t *mmu_get_pte(pgd_t *pgtable, uintptr_t vaddr)
-{
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
-
-	if (!mmu_enabled())
-		return NULL;
-
-	pgd = pgd_offset(pgtable, vaddr);
-	assert(pgd_valid(*pgd));
-	pud = pud_offset(pgd, vaddr);
-	assert(pud_valid(*pud));
-	pmd = pmd_offset(pud, vaddr);
-	assert(pmd_valid(*pmd));
-
-	if (pmd_huge(*pmd))
-		return &pmd_val(*pmd);
-
-	pte = pte_offset(pmd, vaddr);
-	assert(pte_valid(*pte));
-
-        return &pte_val(*pte);
-}
-
 void mmu_clear_user(pgd_t *pgtable, unsigned long vaddr)
 {
 	pteval_t *p_pte = mmu_get_pte(pgtable, vaddr);
-- 
2.37.0


* [kvm-unit-tests PATCH 3/3] arm/arm64: mmu: Rename mmu_get_pte() -> follow_pte()
  2022-10-06 11:12 [kvm-unit-tests PATCH 0/3] arm/arm64: mmu cleanups and fixes Alexandru Elisei
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address Alexandru Elisei
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 2/3] arm/arm64: mmu: Teach virt_to_pte_phys() about block descriptors Alexandru Elisei
@ 2022-10-06 11:12 ` Alexandru Elisei
  2 siblings, 0 replies; 7+ messages in thread
From: Alexandru Elisei @ 2022-10-06 11:12 UTC (permalink / raw)
  To: pbonzini, thuth, andrew.jones, kvm, kvmarm

The function get_pte() from mmu.c returns a pointer to the PTE
associated with the requested virtual address, mapping the virtual
address in the process if it's not already mapped.

mmu_get_pte() returns a pointer to the PTE if and only if the virtual
address is mapped in pgtable, otherwise it returns NULL. Rename it to
follow_pte() to avoid any confusion with get_pte(). follow_pte() also
matches the name of the Linux kernel function with a similar purpose.

Also remove the mmu_enabled() check from the function, as the purpose of
the function is to get the mapping for the virtual address in the pgtable
supplied as the argument, not to translate the virtual address to a
physical address using the current translation; that's what
virt_to_phys() does.
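
A minimal usage sketch of the distinction (the helper below is hypothetical,
not part of this patch):

/* Sketch only: assumes the usual lib/arm includes (asm/mmu-api.h etc.). */
static bool is_mapped(pgd_t *pgtable, uintptr_t vaddr)
{
        /*
         * follow_pte() only walks pgtable: it returns a pointer to the
         * descriptor if vaddr is mapped, NULL otherwise. get_pte() (in
         * mmu.c) would instead create the mapping for vaddr.
         */
        return follow_pte(pgtable, vaddr) != NULL;
}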

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/asm/mmu-api.h | 2 +-
 lib/arm/mmu.c         | 9 +++------
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h
index 3d77cbfd8b24..6c1136d957f9 100644
--- a/lib/arm/asm/mmu-api.h
+++ b/lib/arm/asm/mmu-api.h
@@ -17,6 +17,6 @@ extern void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
 extern void mmu_set_range_ptes(pgd_t *pgtable, uintptr_t virt_offset,
 			       phys_addr_t phys_start, phys_addr_t phys_end,
 			       pgprot_t prot);
-extern pteval_t *mmu_get_pte(pgd_t *pgtable, uintptr_t vaddr);
+extern pteval_t *follow_pte(pgd_t *pgtable, uintptr_t vaddr);
 extern void mmu_clear_user(pgd_t *pgtable, unsigned long vaddr);
 #endif
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 2aaa63d538c0..ec3ce63f2316 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -117,16 +117,13 @@ pteval_t *install_page(pgd_t *pgtable, phys_addr_t phys, void *virt)
  * certain conditions are met (see Arm ARM D5-2669 for AArch64 and
  * B3-1378 for AArch32 for more details).
  */
-pteval_t *mmu_get_pte(pgd_t *pgtable, uintptr_t vaddr)
+pteval_t *follow_pte(pgd_t *pgtable, uintptr_t vaddr)
 {
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
-	if (!mmu_enabled())
-		return NULL;
-
 	pgd = pgd_offset(pgtable, vaddr);
 	if (!pgd_valid(*pgd))
 		return NULL;
@@ -153,7 +150,7 @@ phys_addr_t virt_to_pte_phys(pgd_t *pgtable, void *virt)
 	phys_addr_t mask;
 	pteval_t *pteval;
 
-	pteval = mmu_get_pte(pgtable, (uintptr_t)virt);
+	pteval = follow_pte(pgtable, (uintptr_t)virt);
 	if (!pteval || !pte_valid(__pte(*pteval))) {
 		install_page(pgtable, (phys_addr_t)(unsigned long)virt, virt);
 		return (phys_addr_t)(unsigned long)virt;
@@ -284,7 +281,7 @@ unsigned long __phys_to_virt(phys_addr_t addr)
 
 void mmu_clear_user(pgd_t *pgtable, unsigned long vaddr)
 {
-	pteval_t *p_pte = mmu_get_pte(pgtable, vaddr);
+	pteval_t *p_pte = follow_pte(pgtable, vaddr);
 	if (p_pte) {
 		pteval_t entry = *p_pte & ~PTE_USER;
 		WRITE_ONCE(*p_pte, entry);
-- 
2.37.0


* Re: [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address
  2022-10-06 11:12 ` [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address Alexandru Elisei
@ 2022-10-06 11:35   ` Claudio Imbrenda
  2022-10-06 12:09     ` Alexandru Elisei
  0 siblings, 1 reply; 7+ messages in thread
From: Claudio Imbrenda @ 2022-10-06 11:35 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: pbonzini, thuth, andrew.jones, kvm, kvmarm, Laurent Vivier,
	Janosch Frank

On Thu,  6 Oct 2022 12:12:39 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> All architectures that implement virt_to_pte_phys() (s390x, x86, arm and
> arm64) return a physical address from the function. Teach vmalloc to treat
> it as such, instead of confusing the return value with a page table entry.

I'm not sure I understand what you mean

> Changing things the other way around (having the function return a page
> table entry instead) is not feasible, because it is possible for an
> architecture to use the upper bits of the table entry to store metadata
> about the page.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Andrew Jones <andrew.jones@linux.dev>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Janosch Frank <frankja@linux.ibm.com>
> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  lib/vmalloc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/vmalloc.c b/lib/vmalloc.c
> index 572682576cc3..0696b5da8190 100644
> --- a/lib/vmalloc.c
> +++ b/lib/vmalloc.c
> @@ -169,7 +169,7 @@ static void vm_free(void *mem)
>  	/* the pointer is not page-aligned, it was a single-page allocation */
>  	if (!IS_ALIGNED((uintptr_t)mem, PAGE_SIZE)) {
>  		assert(GET_MAGIC(mem) == VM_MAGIC);
> -		page = virt_to_pte_phys(page_root, mem) & PAGE_MASK;
> +		page = virt_to_pte_phys(page_root, mem);

this will break things for small allocations, though. if the pointer is
not aligned, then the result of virt_to_pte_phys will also not be
aligned....

>  		assert(page);
>  		free_page(phys_to_virt(page));

...and phys_to_virt will also return an unaligned address, and
free_page will complain about it.
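
To make it concrete with invented numbers (say the allocation starts 0x20
bytes into a page whose physical address is 0x80045000):

/*
 * Illustration only, addresses invented:
 *
 * without the mask:
 *   virt_to_pte_phys(page_root, mem)             == 0x80045020  (unaligned)
 *   free_page(phys_to_virt(0x80045020))          -> complains, as noted above
 *
 * with "& PAGE_MASK":
 *   virt_to_pte_phys(page_root, mem) & PAGE_MASK == 0x80045000
 *   free_page(phys_to_virt(0x80045000))          -> frees the page as intended
 */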

>  		return;
> @@ -183,7 +183,7 @@ static void vm_free(void *mem)
>  	/* free all the pages including the metadata page */
>  	ptr = (uintptr_t)m & PAGE_MASK;

ptr gets page aligned here

>  	for (i = 0 ; i < m->npages + 1; i++, ptr += PAGE_SIZE) {
> -		page = virt_to_pte_phys(page_root, (void *)ptr) & PAGE_MASK;
> +		page = virt_to_pte_phys(page_root, (void *)ptr);

so virt_to_pte_phys will also return an aligned address;
I agree that & PAGE_MASK is redundant here

>  		assert(page);
>  		free_page(phys_to_virt(page));
>  	}


* Re: [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address
  2022-10-06 11:35   ` Claudio Imbrenda
@ 2022-10-06 12:09     ` Alexandru Elisei
  2022-10-06 14:50       ` Claudio Imbrenda
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandru Elisei @ 2022-10-06 12:09 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: pbonzini, thuth, andrew.jones, kvm, kvmarm, Laurent Vivier,
	Janosch Frank

Hi,

On Thu, Oct 06, 2022 at 01:35:52PM +0200, Claudio Imbrenda wrote:
> On Thu,  6 Oct 2022 12:12:39 +0100
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> > All architectures that implement virt_to_pte_phys() (s390x, x86, arm and
> > arm64) return a physical address from the function. Teach vmalloc to treat
> > it as such, instead of confusing the return value with a page table entry.
> 
> I'm not sure I understand what you mean

I thought that vmalloc uses PAGE_MASK because it expects virt_to_pte_phys()
to return a pteval (because of the "pte" part in the virt_to_pte_phys()
function name), which might have the [PAGE_SHIFT-1:0] bits used to store
page metadata by an architecture (like permissions), but like you've
explained below it uses PAGE_MASK to align the page address (which is
identity mapped) before passing it to the page allocator to be freed.

> 
> > Changing things the other way around (having the function return a page
> > table entry instead) is not feasible, because it is possible for an
> > architecture to use the upper bits of the table entry to store metadata
> > about the page.
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Thomas Huth <thuth@redhat.com>
> > Cc: Andrew Jones <andrew.jones@linux.dev>
> > Cc: Laurent Vivier <lvivier@redhat.com>
> > Cc: Janosch Frank <frankja@linux.ibm.com>
> > Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> >  lib/vmalloc.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/vmalloc.c b/lib/vmalloc.c
> > index 572682576cc3..0696b5da8190 100644
> > --- a/lib/vmalloc.c
> > +++ b/lib/vmalloc.c
> > @@ -169,7 +169,7 @@ static void vm_free(void *mem)
> >  	/* the pointer is not page-aligned, it was a single-page allocation */
> >  	if (!IS_ALIGNED((uintptr_t)mem, PAGE_SIZE)) {
> >  		assert(GET_MAGIC(mem) == VM_MAGIC);
> > -		page = virt_to_pte_phys(page_root, mem) & PAGE_MASK;
> > +		page = virt_to_pte_phys(page_root, mem);
> 
> this will break things for small allocations, though. if the pointer is
> not aligned, then the result of virt_to_pte_phys will also not be
> aligned....

I agree, I missed that part. Would be nice if it were written using
PAGE_ALIGN to avoid mistakes like mine in the future, but that's
unimportant.

> 
> >  		assert(page);
> >  		free_page(phys_to_virt(page));
> 
> ...and phys_to_virt will also return an unaligned address, and
> free_page will complain about it.
> 
> >  		return;
> > @@ -183,7 +183,7 @@ static void vm_free(void *mem)
> >  	/* free all the pages including the metadata page */
> >  	ptr = (uintptr_t)m & PAGE_MASK;
> 
> ptr gets page aligned here
> 
> >  	for (i = 0 ; i < m->npages + 1; i++, ptr += PAGE_SIZE) {
> > -		page = virt_to_pte_phys(page_root, (void *)ptr) & PAGE_MASK;
> > +		page = virt_to_pte_phys(page_root, (void *)ptr);
> 
> so virt_to_pte_phys will also return an aligned address;
> I agree that & PAGE_MASK is redundant here

You are correct, if we've ended up here it means that the pointer is
already page aligned, and it will be incremented by PAGE_SIZE each
iteration, hence virt_to_pte_phys() will also be page aligned.

I don't see much point in writing a patch just to remove the unnecessary
alignment here, so I'll drop this patch entirely.

Thank you for the prompt explanation!

Alex

> 
> >  		assert(page);
> >  		free_page(phys_to_virt(page));
> >  	}
> 

* Re: [kvm-unit-tests PATCH 1/3] lib/vmalloc: Treat virt_to_pte_phys() as returning a physical address
  2022-10-06 12:09     ` Alexandru Elisei
@ 2022-10-06 14:50       ` Claudio Imbrenda
  0 siblings, 0 replies; 7+ messages in thread
From: Claudio Imbrenda @ 2022-10-06 14:50 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: pbonzini, thuth, andrew.jones, kvm, kvmarm, Laurent Vivier,
	Janosch Frank

On Thu, 6 Oct 2022 13:09:08 +0100
Alexandru Elisei <alexandru.elisei@arm.com> wrote:

> Hi,
> 
> On Thu, Oct 06, 2022 at 01:35:52PM +0200, Claudio Imbrenda wrote:
> > On Thu,  6 Oct 2022 12:12:39 +0100
> > Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> >   
> > > All architectures that implement virt_to_pte_phys() (s390x, x86,
> > > arm and arm64) return a physical address from the function. Teach
> > > vmalloc to treat it as such, instead of confusing the return
> > > value with a page table entry.  
> > 
> > I'm not sure I understand what you mean  
> 
> I thought that vmalloc uses PAGE_MASK because it expects
> virt_to_pte_phys() to return a pteval (because of the "pte" part in
> the virt_to_pte_phys()

I agree that the name of the function is confusing; there are comments
in lib/vmalloc.h and for virt_to_pte_phys it says:

/* Walk the page table and resolve the virtual address to a physical
address */

> function name), which might have the [PAGE_SHIFT-1:0] bits used to store
> page metadata by an architecture (like permissions), but like you've
> explained below it uses PAGE_MASK to align the page address (which is
> identity mapped) before passing it to the page allocator to be freed.
> 
> >   
> > > Changing things the other way around (having the function return a page
> > > table entry instead) is not feasible, because it is possible for an
> > > architecture to use the upper bits of the table entry to store metadata
> > > about the page.
> > > 
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Thomas Huth <thuth@redhat.com>
> > > Cc: Andrew Jones <andrew.jones@linux.dev>
> > > Cc: Laurent Vivier <lvivier@redhat.com>
> > > Cc: Janosch Frank <frankja@linux.ibm.com>
> > > Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > ---
> > >  lib/vmalloc.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/lib/vmalloc.c b/lib/vmalloc.c
> > > index 572682576cc3..0696b5da8190 100644
> > > --- a/lib/vmalloc.c
> > > +++ b/lib/vmalloc.c
> > > @@ -169,7 +169,7 @@ static void vm_free(void *mem)
> > >  	/* the pointer is not page-aligned, it was a single-page allocation */
> > >  	if (!IS_ALIGNED((uintptr_t)mem, PAGE_SIZE)) {
> > >  		assert(GET_MAGIC(mem) == VM_MAGIC);
> > > -		page = virt_to_pte_phys(page_root, mem) & PAGE_MASK;
> > > +		page = virt_to_pte_phys(page_root, mem);  
> > 
> > this will break things for small allocations, though. if the pointer is
> > not aligned, then the result of virt_to_pte_phys will also not be
> > aligned....  
> 
> I agree, I missed that part. Would be nice if it were written using
> PAGE_ALIGN to avoid mistakes like mine in the future, but that's

PAGE_ALIGN rounds UP, though, and we need to round down.

I think it's easier and more readable to & PAGE_MASK, instead of a more
cumbersome ALIGN_DOWN((thing), PAGE_SIZE)
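
For example (one invented address, 4k pages):

/*
 * Illustration only:
 *
 *   addr                          = 0x80045678
 *   addr & PAGE_MASK              = 0x80045000  (rounds down, what vm_free() needs)
 *   PAGE_ALIGN(addr)              = 0x80046000  (rounds up to the next page boundary)
 *   ALIGN_DOWN(addr, PAGE_SIZE)   = 0x80045000  (same result, just more to type)
 */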

> unimportant.
> 
> >   
> > >  		assert(page);
> > >  		free_page(phys_to_virt(page));  
> > 
> > ...and phys_to_virt will also return an unaligned address, and
> > free_page will complain about it.
> >   
> > >  		return;
> > > @@ -183,7 +183,7 @@ static void vm_free(void *mem)
> > >  	/* free all the pages including the metadata page */
> > >  	ptr = (uintptr_t)m & PAGE_MASK;  
> > 
> > ptr gets page aligned here
> >   
> > >  	for (i = 0 ; i < m->npages + 1; i++, ptr += PAGE_SIZE) {
> > > -		page = virt_to_pte_phys(page_root, (void *)ptr) & PAGE_MASK;
> > > +		page = virt_to_pte_phys(page_root, (void *)ptr);  
> > 
> > so virt_to_pte_phys will also return an aligned address;
> > I agree that & PAGE_MASK is redundant here  
> 
> You are correct, if we've ended up here it means that the pointer is
> already page aligned, and it will be incremented by PAGE_SIZE each
> iteration, hence virt_to_pte_phys() will also be page aligned.
> 
> I don't see much point in writing a patch just to remove the unnecessary
> alignment here, so I'll drop this patch entirely.
> 
> Thank you for the prompt explanation!

I'm glad things have been clarified :)

> 
> Alex
> 
> >   
> > >  		assert(page);
> > >  		free_page(phys_to_virt(page));
> > >  	}  
> >   

