linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
@ 2019-09-30  5:59 Masahiro Yamada
  2019-09-30  7:57 ` Arnd Bergmann
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Masahiro Yamada @ 2019-09-30  5:59 UTC (permalink / raw)
  To: linux-arm-kernel, Russell King
  Cc: Linus Torvalds, Olof Johansson, Arnd Bergmann, Nick Desaulniers,
	Nicolas Saenz Julienne, Masahiro Yamada, Julien Thierry,
	Russell King, Stefan Agner, Thomas Gleixner, Vincent Whitchurch,
	linux-kernel

KernelCI reports that bcm2835_defconfig is no longer booting since
commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
forcibly"):

  https://lkml.org/lkml/2019/9/26/825

I also received a regression report from Nicolas Saenz Julienne:

  https://lkml.org/lkml/2019/9/27/263

This problem has cropped up on arch/arm/config/bcm2835_defconfig
because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
to prefer not inlining functions with -Os. I was able to reproduce
it with other boards and defconfig files by manually enabling
CONFIG_CC_OPTIMIZE_FOR_SIZE.

The __get_user_check() specifically uses r0, r1, r2 registers.
So, uaccess_save_and_enable() and uaccess_restore() must be inlined
in order to avoid those registers being overwritten in the callees.

Prior to commit 9012d011660e ("compiler: allow all arches to enable
CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
inlining functions, except on x86.

Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
So, __always_inline is now the only guaranteed way of forcible inlining.

I want to keep as much compiler's freedom as possible about the inlining
decision. So, I changed the function call order instead of adding
__always_inline around.

Call uaccess_save_and_enable() before assigning the __p ("r0"), and
uaccess_restore() after evacuating the __e ("r0").

Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

 arch/arm/include/asm/uaccess.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 303248e5b990..559f252d7e3c 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
 #define __get_user_check(x, p)						\
 	({								\
 		unsigned long __limit = current_thread_info()->addr_limit - 1; \
+		unsigned int __ua_flags = uaccess_save_and_enable();	\
 		register typeof(*(p)) __user *__p asm("r0") = (p);	\
 		register __inttype(x) __r2 asm("r2");			\
 		register unsigned long __l asm("r1") = __limit;		\
 		register int __e asm("r0");				\
-		unsigned int __ua_flags = uaccess_save_and_enable();	\
+		unsigned int __err;					\
 		switch (sizeof(*(__p))) {				\
 		case 1:							\
 			if (sizeof((x)) >= 8)				\
@@ -223,9 +224,10 @@ extern int __get_user_64t_4(void *);
 			break;						\
 		default: __e = __get_user_bad(); break;			\
 		}							\
-		uaccess_restore(__ua_flags);				\
+		__err = __e;						\
 		x = (typeof(*(p))) __r2;				\
-		__e;							\
+		uaccess_restore(__ua_flags);				\
+		__err;							\
 	})
 
 #define get_user(x, p)							\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
@ 2019-09-30  7:57 ` Arnd Bergmann
  2019-09-30  8:26   ` Masahiro Yamada
  2019-09-30  8:13 ` Nicolas Saenz Julienne
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Arnd Bergmann @ 2019-09-30  7:57 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux ARM, Russell King, Vincent Whitchurch, Nick Desaulniers,
	Russell King, Stefan Agner, linux-kernel, Julien Thierry,
	Olof Johansson, Thomas Gleixner, Linus Torvalds,
	Nicolas Saenz Julienne

On Mon, Sep 30, 2019 at 8:01 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
>
>   https://lkml.org/lkml/2019/9/26/825
>
> I also received a regression report from Nicolas Saenz Julienne:
>
>   https://lkml.org/lkml/2019/9/27/263
>
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
>
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
>
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
>
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
>
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
>
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
>
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

The patch looks reasonable to me, but I think we should also revert
commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
forcibly") in mainline for now, as it caused this to happen all the time rather
than only for users that expliticly select CONFIG_OPTIMIZE_INLINING.

We have had other bug reports starting with that commit that run into
similar issues, and I'm sure there are more of them. I don't mind having it
in linux-next for a while long, but I think it was premature to have it merged
into mainline.

        Arnd

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
  2019-09-30  7:57 ` Arnd Bergmann
@ 2019-09-30  8:13 ` Nicolas Saenz Julienne
  2019-09-30 17:00 ` Fabrizio Castro
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Nicolas Saenz Julienne @ 2019-09-30  8:13 UTC (permalink / raw)
  To: Masahiro Yamada, linux-arm-kernel, Russell King
  Cc: Linus Torvalds, Olof Johansson, Arnd Bergmann, Nick Desaulniers,
	Julien Thierry, Russell King, Stefan Agner, Thomas Gleixner,
	Vincent Whitchurch, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1904 bytes --]

On Mon, 2019-09-30 at 14:59 +0900, Masahiro Yamada wrote:
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
> 
>   https://lkml.org/lkml/2019/9/26/825
> 
> I also received a regression report from Nicolas Saenz Julienne:
> 
>   https://lkml.org/lkml/2019/9/27/263
> 
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
> 
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
> 
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
> 
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
> 
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
> 
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
> 
> Fixes: 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---

Thanks!

Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  7:57 ` Arnd Bergmann
@ 2019-09-30  8:26   ` Masahiro Yamada
  0 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2019-09-30  8:26 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux ARM, Russell King, Vincent Whitchurch, Nick Desaulniers,
	Russell King, Stefan Agner, linux-kernel, Julien Thierry,
	Olof Johansson, Thomas Gleixner, Linus Torvalds,
	Nicolas Saenz Julienne

Hi Arnd,

On Mon, Sep 30, 2019 at 4:57 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Mon, Sep 30, 2019 at 8:01 AM Masahiro Yamada
> <yamada.masahiro@socionext.com> wrote:
> >
> > KernelCI reports that bcm2835_defconfig is no longer booting since
> > commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> > forcibly"):
> >
> >   https://lkml.org/lkml/2019/9/26/825
> >
> > I also received a regression report from Nicolas Saenz Julienne:
> >
> >   https://lkml.org/lkml/2019/9/27/263
> >
> > This problem has cropped up on arch/arm/config/bcm2835_defconfig
> > because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> > to prefer not inlining functions with -Os. I was able to reproduce
> > it with other boards and defconfig files by manually enabling
> > CONFIG_CC_OPTIMIZE_FOR_SIZE.
> >
> > The __get_user_check() specifically uses r0, r1, r2 registers.
> > So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> > in order to avoid those registers being overwritten in the callees.
> >
> > Prior to commit 9012d011660e ("compiler: allow all arches to enable
> > CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> > inlining functions, except on x86.
> >
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
> >
> > I want to keep as much compiler's freedom as possible about the inlining
> > decision. So, I changed the function call order instead of adding
> > __always_inline around.
> >
> > Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> > uaccess_restore() after evacuating the __e ("r0").
> >
> > Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> > Reported-by: "kernelci.org bot" <bot@kernelci.org>
> > Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
>
> The patch looks reasonable to me, but I think we should also revert
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly") in mainline for now, as it caused this to happen all the time rather
> than only for users that expliticly select CONFIG_OPTIMIZE_INLINING.
>
> We have had other bug reports starting with that commit that run into
> similar issues, and I'm sure there are more of them. I don't mind having it
> in linux-next for a while long, but I think it was premature to have it merged
> into mainline.
>
>         Arnd


Hmm, I know you are testing randconfig build tests,
but how many people are testing the kernel boot in linux-next?

People and kernelci started to report the issue immediately
after it landed in the mainline...


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
  2019-09-30  7:57 ` Arnd Bergmann
  2019-09-30  8:13 ` Nicolas Saenz Julienne
@ 2019-09-30 17:00 ` Fabrizio Castro
  2019-09-30 17:50 ` Russell King - ARM Linux admin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Fabrizio Castro @ 2019-09-30 17:00 UTC (permalink / raw)
  To: Masahiro Yamada, linux-arm-kernel, Russell King
  Cc: Linus Torvalds, Olof Johansson, Arnd Bergmann, Nick Desaulniers,
	Nicolas Saenz Julienne, Julien Thierry, Russell King,
	Stefan Agner, Thomas Gleixner, Vincent Whitchurch, linux-kernel

> From: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> On Behalf Of Masahiro Yamada
> Sent: 30 September 2019 06:59
> Subject: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
> 
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
> 
>   https://lkml.org/lkml/2019/9/26/825
> 
> I also received a regression report from Nicolas Saenz Julienne:
> 
>   https://lkml.org/lkml/2019/9/27/263
> 
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
> 
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
> 
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
> 
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
> 
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
> 
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
> 
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Tested-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>

> ---
> 
>  arch/arm/include/asm/uaccess.h | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 303248e5b990..559f252d7e3c 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
>  #define __get_user_check(x, p)						\
>  	({								\
>  		unsigned long __limit = current_thread_info()->addr_limit - 1; \
> +		unsigned int __ua_flags = uaccess_save_and_enable();	\
>  		register typeof(*(p)) __user *__p asm("r0") = (p);	\
>  		register __inttype(x) __r2 asm("r2");			\
>  		register unsigned long __l asm("r1") = __limit;		\
>  		register int __e asm("r0");				\
> -		unsigned int __ua_flags = uaccess_save_and_enable();	\
> +		unsigned int __err;					\
>  		switch (sizeof(*(__p))) {				\
>  		case 1:							\
>  			if (sizeof((x)) >= 8)				\
> @@ -223,9 +224,10 @@ extern int __get_user_64t_4(void *);
>  			break;						\
>  		default: __e = __get_user_bad(); break;			\
>  		}							\
> -		uaccess_restore(__ua_flags);				\
> +		__err = __e;						\
>  		x = (typeof(*(p))) __r2;				\
> -		__e;							\
> +		uaccess_restore(__ua_flags);				\
> +		__err;							\
>  	})
> 
>  #define get_user(x, p)							\
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
                   ` (2 preceding siblings ...)
  2019-09-30 17:00 ` Fabrizio Castro
@ 2019-09-30 17:50 ` Russell King - ARM Linux admin
  2019-10-01  8:26   ` Masahiro Yamada
  2019-09-30 22:19 ` Nick Desaulniers
  2019-10-01  9:28 ` Geert Uytterhoeven
  5 siblings, 1 reply; 10+ messages in thread
From: Russell King - ARM Linux admin @ 2019-09-30 17:50 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-arm-kernel, Arnd Bergmann, Vincent Whitchurch,
	Nick Desaulniers, Stefan Agner, linux-kernel, Julien Thierry,
	Olof Johansson, Thomas Gleixner, Linus Torvalds,
	Nicolas Saenz Julienne

On Mon, Sep 30, 2019 at 02:59:25PM +0900, Masahiro Yamada wrote:
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
> 
>   https://lkml.org/lkml/2019/9/26/825
> 
> I also received a regression report from Nicolas Saenz Julienne:
> 
>   https://lkml.org/lkml/2019/9/27/263
> 
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
> 
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
> 
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
> 
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
> 
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
> 
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
> 
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
> 
>  arch/arm/include/asm/uaccess.h | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 303248e5b990..559f252d7e3c 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
>  #define __get_user_check(x, p)						\
>  	({								\
>  		unsigned long __limit = current_thread_info()->addr_limit - 1; \
> +		unsigned int __ua_flags = uaccess_save_and_enable();	\

If the compiler is moving uaccess_save_and_enable(), that's something
we really don't want - the idea is to _minimise_ the number of kernel
memory accesses between enabling userspace access and performing the
actual access.

Fixing it in this way widens the window for the kernel to be doing
something it shoulding in userspace.

So, the right solution is to ensure that the compiler always inlines
the uaccess_*() helpers - which should be nothing more than four
instructions for uaccess_save_and_enable() and two for the
restore.

I.O.W. it should look something like this:

     144:       ee134f10        mrc     15, 0, r4, cr3, cr0, {0}
     148:       e3c4200c        bic     r2, r4, #12
     14c:       e24e1001        sub     r1, lr, #1
     150:       e3822004        orr     r2, r2, #4
     154:       ee032f10        mcr     15, 0, r2, cr3, cr0, {0}
     158:       f57ff06f        isb     sy
     15c:       ebfffffe        bl      0 <__get_user_4>
     160:       ee034f10        mcr     15, 0, r4, cr3, cr0, {0}
     164:       f57ff06f        isb     sy

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
                   ` (3 preceding siblings ...)
  2019-09-30 17:50 ` Russell King - ARM Linux admin
@ 2019-09-30 22:19 ` Nick Desaulniers
  2019-10-01  8:28   ` Masahiro Yamada
  2019-10-01  9:28 ` Geert Uytterhoeven
  5 siblings, 1 reply; 10+ messages in thread
From: Nick Desaulniers @ 2019-09-30 22:19 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux ARM, Russell King, Linus Torvalds, Olof Johansson,
	Arnd Bergmann, Nicolas Saenz Julienne, Julien Thierry,
	Russell King, Stefan Agner, Thomas Gleixner, Vincent Whitchurch,
	LKML

On Sun, Sep 29, 2019 at 11:00 PM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
>
>   https://lkml.org/lkml/2019/9/26/825
>
> I also received a regression report from Nicolas Saenz Julienne:
>
>   https://lkml.org/lkml/2019/9/27/263
>
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
>
> The __get_user_check() specifically uses r0, r1, r2 registers.

Yep, that part is obvious, but...

> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.

Right, r0, r1, r2 are caller saved, meaning that __get_user_check must
save/restore them when making function calls. So
uaccess_save_and_enable() and uaccess_restore() should either be made
into macros (macros and typecheck (see include/linux/typecheck.h) are
peanut butter and chocolate), or occur at different points in the
function when those register variables are no longer in use.

>
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
>
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
>
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
>
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
>
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
>
>  arch/arm/include/asm/uaccess.h | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 303248e5b990..559f252d7e3c 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
>  #define __get_user_check(x, p)                                         \
>         ({                                                              \
>                 unsigned long __limit = current_thread_info()->addr_limit - 1; \
> +               unsigned int __ua_flags = uaccess_save_and_enable();    \
>                 register typeof(*(p)) __user *__p asm("r0") = (p);      \
>                 register __inttype(x) __r2 asm("r2");                   \
>                 register unsigned long __l asm("r1") = __limit;         \
>                 register int __e asm("r0");                             \

What does it mean for there to be two different local variables pinned
to the same register? Ie. it looks like __e and __p are defined to
exist in r0.  Would having one variable and an explicit cast result in
differing storage?

> -               unsigned int __ua_flags = uaccess_save_and_enable();    \
> +               unsigned int __err;                                     \
>                 switch (sizeof(*(__p))) {                               \
>                 case 1:                                                 \
>                         if (sizeof((x)) >= 8)                           \
> @@ -223,9 +224,10 @@ extern int __get_user_64t_4(void *);
>                         break;                                          \
>                 default: __e = __get_user_bad(); break;                 \

^ I think this assignment to __e should be replaced with an assignment
to __err?  We no longer need the register at this point and could skip
the assignment of x.

>                 }                                                       \
> -               uaccess_restore(__ua_flags);                            \
> +               __err = __e;                                            \
>                 x = (typeof(*(p))) __r2;                                \
> -               __e;                                                    \
> +               uaccess_restore(__ua_flags);                            \
> +               __err;                                                  \
>         })
>
>  #define get_user(x, p)                                                 \
> --
> 2.17.1
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30 17:50 ` Russell King - ARM Linux admin
@ 2019-10-01  8:26   ` Masahiro Yamada
  0 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2019-10-01  8:26 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Arnd Bergmann, Vincent Whitchurch, Nick Desaulniers,
	Linux Kernel Mailing List, Stefan Agner, Nicolas Saenz Julienne,
	Julien Thierry, Olof Johansson, Thomas Gleixner, Linus Torvalds,
	linux-arm-kernel

Hi Russell,

On Tue, Oct 1, 2019 at 2:50 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Mon, Sep 30, 2019 at 02:59:25PM +0900, Masahiro Yamada wrote:
> > KernelCI reports that bcm2835_defconfig is no longer booting since
> > commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> > forcibly"):
> >
> >   https://lkml.org/lkml/2019/9/26/825
> >
> > I also received a regression report from Nicolas Saenz Julienne:
> >
> >   https://lkml.org/lkml/2019/9/27/263
> >
> > This problem has cropped up on arch/arm/config/bcm2835_defconfig
> > because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> > to prefer not inlining functions with -Os. I was able to reproduce
> > it with other boards and defconfig files by manually enabling
> > CONFIG_CC_OPTIMIZE_FOR_SIZE.
> >
> > The __get_user_check() specifically uses r0, r1, r2 registers.
> > So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> > in order to avoid those registers being overwritten in the callees.
> >
> > Prior to commit 9012d011660e ("compiler: allow all arches to enable
> > CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> > inlining functions, except on x86.
> >
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
> >
> > I want to keep as much compiler's freedom as possible about the inlining
> > decision. So, I changed the function call order instead of adding
> > __always_inline around.
> >
> > Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> > uaccess_restore() after evacuating the __e ("r0").
> >
> > Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> > Reported-by: "kernelci.org bot" <bot@kernelci.org>
> > Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> > ---
> >
> >  arch/arm/include/asm/uaccess.h | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> > index 303248e5b990..559f252d7e3c 100644
> > --- a/arch/arm/include/asm/uaccess.h
> > +++ b/arch/arm/include/asm/uaccess.h
> > @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
> >  #define __get_user_check(x, p)                                               \
> >       ({                                                              \
> >               unsigned long __limit = current_thread_info()->addr_limit - 1; \
> > +             unsigned int __ua_flags = uaccess_save_and_enable();    \
>
> If the compiler is moving uaccess_save_and_enable(), that's something
> we really don't want

Hmm, based on my poor knowledge about compilers,
I do not know if this re-arrangement happens...

> - the idea is to _minimise_ the number of kernel
> memory accesses between enabling userspace access and performing the
> actual access.
>
> Fixing it in this way widens the window for the kernel to be doing
> something it shoulding in userspace.
>
> So, the right solution is to ensure that the compiler always inlines
> the uaccess_*() helpers - which should be nothing more than four
> instructions for uaccess_save_and_enable() and two for the
> restore.
>

OK, I will use __always_inline to avoid
any potential behavior change.

Thanks.


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30 22:19 ` Nick Desaulniers
@ 2019-10-01  8:28   ` Masahiro Yamada
  0 siblings, 0 replies; 10+ messages in thread
From: Masahiro Yamada @ 2019-10-01  8:28 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Linux ARM, Russell King, Linus Torvalds, Olof Johansson,
	Arnd Bergmann, Nicolas Saenz Julienne, Julien Thierry,
	Russell King, Stefan Agner, Thomas Gleixner, Vincent Whitchurch,
	LKML

Hi Nick,

On Tue, Oct 1, 2019 at 7:19 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Sun, Sep 29, 2019 at 11:00 PM Masahiro Yamada
> <yamada.masahiro@socionext.com> wrote:
> >
> > KernelCI reports that bcm2835_defconfig is no longer booting since
> > commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> > forcibly"):
> >
> >   https://lkml.org/lkml/2019/9/26/825
> >
> > I also received a regression report from Nicolas Saenz Julienne:
> >
> >   https://lkml.org/lkml/2019/9/27/263
> >
> > This problem has cropped up on arch/arm/config/bcm2835_defconfig
> > because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> > to prefer not inlining functions with -Os. I was able to reproduce
> > it with other boards and defconfig files by manually enabling
> > CONFIG_CC_OPTIMIZE_FOR_SIZE.
> >
> > The __get_user_check() specifically uses r0, r1, r2 registers.
>
> Yep, that part is obvious, but...
>
> > So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> > in order to avoid those registers being overwritten in the callees.
>
> Right, r0, r1, r2 are caller saved, meaning that __get_user_check must
> save/restore them when making function calls. So
> uaccess_save_and_enable() and uaccess_restore() should either be made
> into macros (macros and typecheck (see include/linux/typecheck.h) are
> peanut butter and chocolate), or occur at different points in the
> function when those register variables are no longer in use.
>
> >
> > Prior to commit 9012d011660e ("compiler: allow all arches to enable
> > CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> > inlining functions, except on x86.
> >
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
> >
> > I want to keep as much compiler's freedom as possible about the inlining
> > decision. So, I changed the function call order instead of adding
> > __always_inline around.
> >
> > Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> > uaccess_restore() after evacuating the __e ("r0").
> >
> > Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> > Reported-by: "kernelci.org bot" <bot@kernelci.org>
> > Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> > ---
> >
> >  arch/arm/include/asm/uaccess.h | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> > index 303248e5b990..559f252d7e3c 100644
> > --- a/arch/arm/include/asm/uaccess.h
> > +++ b/arch/arm/include/asm/uaccess.h
> > @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
> >  #define __get_user_check(x, p)                                         \
> >         ({                                                              \
> >                 unsigned long __limit = current_thread_info()->addr_limit - 1; \
> > +               unsigned int __ua_flags = uaccess_save_and_enable();    \
> >                 register typeof(*(p)) __user *__p asm("r0") = (p);      \
> >                 register __inttype(x) __r2 asm("r2");                   \
> >                 register unsigned long __l asm("r1") = __limit;         \
> >                 register int __e asm("r0");                             \
>
> What does it mean for there to be two different local variables pinned
> to the same register? Ie. it looks like __e and __p are defined to
> exist in r0.  Would having one variable and an explicit cast result in
> differing storage?

In my understanding,
__p is input (a pointer to the user-space)
__e is output (return value)

Maybe, does it use two variables for clarification?


>
> > -               unsigned int __ua_flags = uaccess_save_and_enable();    \
> > +               unsigned int __err;                                     \
> >                 switch (sizeof(*(__p))) {                               \
> >                 case 1:                                                 \
> >                         if (sizeof((x)) >= 8)                           \
> > @@ -223,9 +224,10 @@ extern int __get_user_64t_4(void *);
> >                         break;                                          \
> >                 default: __e = __get_user_bad(); break;                 \
>
> ^ I think this assignment to __e should be replaced with an assignment
> to __err?  We no longer need the register at this point and could skip
> the assignment of x.

Right, but '__err = __e' is necessary for non-default cases.




-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
  2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
                   ` (4 preceding siblings ...)
  2019-09-30 22:19 ` Nick Desaulniers
@ 2019-10-01  9:28 ` Geert Uytterhoeven
  5 siblings, 0 replies; 10+ messages in thread
From: Geert Uytterhoeven @ 2019-10-01  9:28 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Linux ARM, Russell King, Arnd Bergmann, Vincent Whitchurch,
	Nick Desaulniers, Russell King, Stefan Agner,
	Linux Kernel Mailing List, Julien Thierry, Olof Johansson,
	Thomas Gleixner, Linus Torvalds, Nicolas Saenz Julienne,
	Linux-Renesas

On Mon, Sep 30, 2019 at 8:01 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
>
>   https://lkml.org/lkml/2019/9/26/825
>
> I also received a regression report from Nicolas Saenz Julienne:
>
>   https://lkml.org/lkml/2019/9/27/263
>
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
>
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
>
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
>
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
>
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
>
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
>
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Thanks, this fixes the issues I was seeing on r8a7791/koelsch.

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-10-01  9:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-30  5:59 [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined Masahiro Yamada
2019-09-30  7:57 ` Arnd Bergmann
2019-09-30  8:26   ` Masahiro Yamada
2019-09-30  8:13 ` Nicolas Saenz Julienne
2019-09-30 17:00 ` Fabrizio Castro
2019-09-30 17:50 ` Russell King - ARM Linux admin
2019-10-01  8:26   ` Masahiro Yamada
2019-09-30 22:19 ` Nick Desaulniers
2019-10-01  8:28   ` Masahiro Yamada
2019-10-01  9:28 ` Geert Uytterhoeven

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).