linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] riscv: vdso: map data page before vDSO code
@ 2021-08-29  9:47 Sergey Larin
  2021-08-29  9:59 ` Kefeng Wang
  0 siblings, 1 reply; 3+ messages in thread
From: Sergey Larin @ 2021-08-29  9:47 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou
  Cc: linux-riscv, linux-kernel, Sergey Larin

Current vDSO implementation assumes that the code size always fits in
single page, and the data page follows it:

	PROVIDE(_vdso_data = . + PAGE_SIZE);

However, this was not the case with my kernel build - the
shared object had the size of 4800 bytes. This, obviously, is more than
4096 and requires second page for the rest of the data.

CLOCK_REALTIME_COARSE clock became broken. It was always returning 0
because vDSO code was reading the second code page, not the
data page. Glibc uses this clock for the time() function.

So instead of computing the offset for the data page (it is necessary to
do in runtime - you can't know the size of the binary while you're
building it) simply move it behind the code like the ARM does:

	PROVIDE(_vdso_data = . - PAGE_SIZE);

This commit also fixes arch_vma_name for the data page - it was
reporting the same '[vdso]' name for it in my case.

Since I don't have the real hardware, the change was debugged with KGDB
in RVVM and also verified in QEMU.

Signed-off-by: Sergey Larin <cerg2010cerg2010@mail.ru>
---
 arch/riscv/kernel/vdso.c          | 22 +++++++++++-----------
 arch/riscv/kernel/vdso/vdso.lds.S |  2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
index 25a3b8849599..0c49390e9be3 100644
--- a/arch/riscv/kernel/vdso.c
+++ b/arch/riscv/kernel/vdso.c
@@ -44,13 +44,13 @@ static int __init vdso_init(void)
 		return -ENOMEM;
 	}
 
+	vdso_pagelist[0] = virt_to_page(vdso_data);
 	for (i = 0; i < vdso_pages; i++) {
 		struct page *pg;
 
 		pg = virt_to_page(vdso_start + (i << PAGE_SHIFT));
-		vdso_pagelist[i] = pg;
+		vdso_pagelist[i + 1] = pg;
 	}
-	vdso_pagelist[i] = virt_to_page(vdso_data);
 
 	return 0;
 }
@@ -77,21 +77,21 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 	 * install_special_mapping or the perf counter mmap tracking code
 	 * will fail to recognise it as a vDSO (since arch_vma_name fails).
 	 */
-	mm->context.vdso = (void *)vdso_base;
+	mm->context.vdso = (void *)vdso_base + PAGE_SIZE;
 
-	ret =
-	   install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT,
-		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
-		vdso_pagelist);
+	ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
+		(VM_READ | VM_MAYREAD), &vdso_pagelist[0]);
 
 	if (unlikely(ret)) {
 		mm->context.vdso = NULL;
 		goto end;
 	}
 
-	vdso_base += (vdso_pages << PAGE_SHIFT);
-	ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
-		(VM_READ | VM_MAYREAD), &vdso_pagelist[vdso_pages]);
+	vdso_base += PAGE_SIZE;
+	ret =
+	   install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT,
+		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
+		&vdso_pagelist[1]);
 
 	if (unlikely(ret))
 		mm->context.vdso = NULL;
@@ -105,7 +105,7 @@ const char *arch_vma_name(struct vm_area_struct *vma)
 	if (vma->vm_mm && (vma->vm_start == (long)vma->vm_mm->context.vdso))
 		return "[vdso]";
 	if (vma->vm_mm && (vma->vm_start ==
-			   (long)vma->vm_mm->context.vdso + PAGE_SIZE))
+			   (long)vma->vm_mm->context.vdso - PAGE_SIZE))
 		return "[vdso_data]";
 	return NULL;
 }
diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S
index e6f558bca71b..fd8a31075256 100644
--- a/arch/riscv/kernel/vdso/vdso.lds.S
+++ b/arch/riscv/kernel/vdso/vdso.lds.S
@@ -8,7 +8,7 @@ OUTPUT_ARCH(riscv)
 
 SECTIONS
 {
-	PROVIDE(_vdso_data = . + PAGE_SIZE);
+	PROVIDE(_vdso_data = . - PAGE_SIZE);
 	. = SIZEOF_HEADERS;
 
 	.hash		: { *(.hash) }			:text
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] riscv: vdso: map data page before vDSO code
  2021-08-29  9:47 [PATCH] riscv: vdso: map data page before vDSO code Sergey Larin
@ 2021-08-29  9:59 ` Kefeng Wang
  2021-08-29 10:16   ` Sergey Larin
  0 siblings, 1 reply; 3+ messages in thread
From: Kefeng Wang @ 2021-08-29  9:59 UTC (permalink / raw)
  To: Sergey Larin, Paul Walmsley, Palmer Dabbelt, Albert Ou
  Cc: linux-riscv, linux-kernel

Hi Sergey

There is already one fix,

https://patchwork.kernel.org/project/linux-riscv/list/?series=534877

On 2021/8/29 17:47, Sergey Larin wrote:
> Current vDSO implementation assumes that the code size always fits in
> single page, and the data page follows it:
>
> 	PROVIDE(_vdso_data = . + PAGE_SIZE);
>
> However, this was not the case with my kernel build - the
> shared object had the size of 4800 bytes. This, obviously, is more than
> 4096 and requires second page for the rest of the data.
>
> CLOCK_REALTIME_COARSE clock became broken. It was always returning 0
> because vDSO code was reading the second code page, not the
> data page. Glibc uses this clock for the time() function.
>
> So instead of computing the offset for the data page (it is necessary to
> do in runtime - you can't know the size of the binary while you're
> building it) simply move it behind the code like the ARM does:
>
> 	PROVIDE(_vdso_data = . - PAGE_SIZE);
>
> This commit also fixes arch_vma_name for the data page - it was
> reporting the same '[vdso]' name for it in my case.
>
> Since I don't have the real hardware, the change was debugged with KGDB
> in RVVM and also verified in QEMU.
>
> Signed-off-by: Sergey Larin <cerg2010cerg2010@mail.ru>
> ---
>   arch/riscv/kernel/vdso.c          | 22 +++++++++++-----------
>   arch/riscv/kernel/vdso/vdso.lds.S |  2 +-
>   2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
> index 25a3b8849599..0c49390e9be3 100644
> --- a/arch/riscv/kernel/vdso.c
> +++ b/arch/riscv/kernel/vdso.c
> @@ -44,13 +44,13 @@ static int __init vdso_init(void)
>   		return -ENOMEM;
>   	}
>   
> +	vdso_pagelist[0] = virt_to_page(vdso_data);
>   	for (i = 0; i < vdso_pages; i++) {
>   		struct page *pg;
>   
>   		pg = virt_to_page(vdso_start + (i << PAGE_SHIFT));
> -		vdso_pagelist[i] = pg;
> +		vdso_pagelist[i + 1] = pg;
>   	}
> -	vdso_pagelist[i] = virt_to_page(vdso_data);
>   
>   	return 0;
>   }
> @@ -77,21 +77,21 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
>   	 * install_special_mapping or the perf counter mmap tracking code
>   	 * will fail to recognise it as a vDSO (since arch_vma_name fails).
>   	 */
> -	mm->context.vdso = (void *)vdso_base;
> +	mm->context.vdso = (void *)vdso_base + PAGE_SIZE;
>   
> -	ret =
> -	   install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT,
> -		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
> -		vdso_pagelist);
> +	ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
> +		(VM_READ | VM_MAYREAD), &vdso_pagelist[0]);
>   
>   	if (unlikely(ret)) {
>   		mm->context.vdso = NULL;
>   		goto end;
>   	}
>   
> -	vdso_base += (vdso_pages << PAGE_SHIFT);
> -	ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
> -		(VM_READ | VM_MAYREAD), &vdso_pagelist[vdso_pages]);
> +	vdso_base += PAGE_SIZE;
> +	ret =
> +	   install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT,
> +		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
> +		&vdso_pagelist[1]);
>   
>   	if (unlikely(ret))
>   		mm->context.vdso = NULL;
> @@ -105,7 +105,7 @@ const char *arch_vma_name(struct vm_area_struct *vma)
>   	if (vma->vm_mm && (vma->vm_start == (long)vma->vm_mm->context.vdso))
>   		return "[vdso]";
>   	if (vma->vm_mm && (vma->vm_start ==
> -			   (long)vma->vm_mm->context.vdso + PAGE_SIZE))
> +			   (long)vma->vm_mm->context.vdso - PAGE_SIZE))
>   		return "[vdso_data]";
>   	return NULL;
>   }
> diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S
> index e6f558bca71b..fd8a31075256 100644
> --- a/arch/riscv/kernel/vdso/vdso.lds.S
> +++ b/arch/riscv/kernel/vdso/vdso.lds.S
> @@ -8,7 +8,7 @@ OUTPUT_ARCH(riscv)
>   
>   SECTIONS
>   {
> -	PROVIDE(_vdso_data = . + PAGE_SIZE);
> +	PROVIDE(_vdso_data = . - PAGE_SIZE);
>   	. = SIZEOF_HEADERS;
>   
>   	.hash		: { *(.hash) }			:text

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] riscv: vdso: map data page before vDSO code
  2021-08-29  9:59 ` Kefeng Wang
@ 2021-08-29 10:16   ` Sergey Larin
  0 siblings, 0 replies; 3+ messages in thread
From: Sergey Larin @ 2021-08-29 10:16 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, linux-kernel

On Sun, Aug 29, 2021 at 05:59:04PM +0800, Kefeng Wang wrote:
> Hi Sergey
> 
> There is already one fix,
> 
> https://patchwork.kernel.org/project/linux-riscv/list/?series=534877
> 

Oh, I missed it. That one looks cleaner. Thanks anyway!

> On 2021/8/29 17:47, Sergey Larin wrote:
> > Current vDSO implementation assumes that the code size always fits in
> > single page, and the data page follows it:
> > 
> > 	PROVIDE(_vdso_data = . + PAGE_SIZE);
> > 
> > However, this was not the case with my kernel build - the
> > shared object had the size of 4800 bytes. This, obviously, is more than
> > 4096 and requires second page for the rest of the data.
> > 
> > CLOCK_REALTIME_COARSE clock became broken. It was always returning 0
> > because vDSO code was reading the second code page, not the
> > data page. Glibc uses this clock for the time() function.
> > 
> > So instead of computing the offset for the data page (it is necessary to
> > do in runtime - you can't know the size of the binary while you're
> > building it) simply move it behind the code like the ARM does:
> > 
> > 	PROVIDE(_vdso_data = . - PAGE_SIZE);
> > 
> > This commit also fixes arch_vma_name for the data page - it was
> > reporting the same '[vdso]' name for it in my case.
> > 
> > Since I don't have the real hardware, the change was debugged with KGDB
> > in RVVM and also verified in QEMU.
> > 
> > Signed-off-by: Sergey Larin <cerg2010cerg2010@mail.ru>
> > ---
> >   arch/riscv/kernel/vdso.c          | 22 +++++++++++-----------
> >   arch/riscv/kernel/vdso/vdso.lds.S |  2 +-
> >   2 files changed, 12 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
> > index 25a3b8849599..0c49390e9be3 100644
> > --- a/arch/riscv/kernel/vdso.c
> > +++ b/arch/riscv/kernel/vdso.c
> > @@ -44,13 +44,13 @@ static int __init vdso_init(void)
> >   		return -ENOMEM;
> >   	}
> > +	vdso_pagelist[0] = virt_to_page(vdso_data);
> >   	for (i = 0; i < vdso_pages; i++) {
> >   		struct page *pg;
> >   		pg = virt_to_page(vdso_start + (i << PAGE_SHIFT));
> > -		vdso_pagelist[i] = pg;
> > +		vdso_pagelist[i + 1] = pg;
> >   	}
> > -	vdso_pagelist[i] = virt_to_page(vdso_data);
> >   	return 0;
> >   }
> > @@ -77,21 +77,21 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
> >   	 * install_special_mapping or the perf counter mmap tracking code
> >   	 * will fail to recognise it as a vDSO (since arch_vma_name fails).
> >   	 */
> > -	mm->context.vdso = (void *)vdso_base;
> > +	mm->context.vdso = (void *)vdso_base + PAGE_SIZE;
> > -	ret =
> > -	   install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT,
> > -		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
> > -		vdso_pagelist);
> > +	ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
> > +		(VM_READ | VM_MAYREAD), &vdso_pagelist[0]);
> >   	if (unlikely(ret)) {
> >   		mm->context.vdso = NULL;
> >   		goto end;
> >   	}
> > -	vdso_base += (vdso_pages << PAGE_SHIFT);
> > -	ret = install_special_mapping(mm, vdso_base, PAGE_SIZE,
> > -		(VM_READ | VM_MAYREAD), &vdso_pagelist[vdso_pages]);
> > +	vdso_base += PAGE_SIZE;
> > +	ret =
> > +	   install_special_mapping(mm, vdso_base, vdso_pages << PAGE_SHIFT,
> > +		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
> > +		&vdso_pagelist[1]);
> >   	if (unlikely(ret))
> >   		mm->context.vdso = NULL;
> > @@ -105,7 +105,7 @@ const char *arch_vma_name(struct vm_area_struct *vma)
> >   	if (vma->vm_mm && (vma->vm_start == (long)vma->vm_mm->context.vdso))
> >   		return "[vdso]";
> >   	if (vma->vm_mm && (vma->vm_start ==
> > -			   (long)vma->vm_mm->context.vdso + PAGE_SIZE))
> > +			   (long)vma->vm_mm->context.vdso - PAGE_SIZE))
> >   		return "[vdso_data]";
> >   	return NULL;
> >   }
> > diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S
> > index e6f558bca71b..fd8a31075256 100644
> > --- a/arch/riscv/kernel/vdso/vdso.lds.S
> > +++ b/arch/riscv/kernel/vdso/vdso.lds.S
> > @@ -8,7 +8,7 @@ OUTPUT_ARCH(riscv)
> >   SECTIONS
> >   {
> > -	PROVIDE(_vdso_data = . + PAGE_SIZE);
> > +	PROVIDE(_vdso_data = . - PAGE_SIZE);
> >   	. = SIZEOF_HEADERS;
> >   	.hash		: { *(.hash) }			:text

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-29 10:17 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-29  9:47 [PATCH] riscv: vdso: map data page before vDSO code Sergey Larin
2021-08-29  9:59 ` Kefeng Wang
2021-08-29 10:16   ` Sergey Larin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).