elf: align AT_RANDOM bytes

Message ID 20190529213708.GA10729@avx2
State New
Series
  • elf: align AT_RANDOM bytes

Commit Message

Alexey Dobriyan May 29, 2019, 9:37 p.m. UTC
AT_RANDOM content is always misaligned on x86_64:

	$ LD_SHOW_AUXV=1 /bin/true | grep AT_RANDOM
	AT_RANDOM:       0x7fff02101019

glibc copies the first few bytes for stack protector purposes; aligned
access should be slightly faster.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 fs/binfmt_elf.c |   10 ++++++++++
 1 file changed, 10 insertions(+)
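
[ Note: for reference, a minimal userspace sketch (not part of the patch;
  the output wording is illustrative) that checks the AT_RANDOM pointer
  via glibc's getauxval(), the same pointer LD_SHOW_AUXV prints above: ]

	#include <stdio.h>
	#include <stdint.h>
	#include <elf.h>
	#include <sys/auxv.h>

	int main(void)
	{
		/* getauxval(AT_RANDOM) returns the address of the 16
		 * kernel-supplied random bytes placed on the stack. */
		unsigned char *rnd = (unsigned char *)getauxval(AT_RANDOM);

		printf("AT_RANDOM at %p, %saligned for long\n",
		       (void *)rnd,
		       ((uintptr_t)rnd % sizeof(long)) ? "mis" : "");
		return 0;
	}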

Comments

Andrew Morton May 29, 2019, 10:20 p.m. UTC | #1
On Thu, 30 May 2019 00:37:08 +0300 Alexey Dobriyan <adobriyan@gmail.com> wrote:

> AT_RANDOM content is always misaligned on x86_64:
> 
> 	$ LD_SHOW_AUXV=1 /bin/true | grep AT_RANDOM
> 	AT_RANDOM:       0x7fff02101019
> 
> glibc copies first few bytes for stack protector stuff, aligned
> access should be slightly faster.

I just don't understand the implications of this.  Is there
(badly-behaved) userspace out there which makes assumptions about the
current alignment?

How much faster, anyway?  How frequently is the AT_RANDOM record
accessed?

I often have questions such as these about your performance/space
tweaks :(.  Please try to address them as a matter of course when
preparing changelogs?

And let's Cc Kees, who wrote the thing.

> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -144,11 +144,15 @@ static int padzero(unsigned long elf_bss)
>  #define STACK_ALLOC(sp, len) ({ \
>  	elf_addr_t __user *old_sp = (elf_addr_t __user *)sp; sp += len; \
>  	old_sp; })
> +#define STACK_ALIGN(sp, align)	\
> +	((typeof(sp))(((unsigned long)sp + (int)align - 1) & ~((int)align - 1)))

I suspect plain old ALIGN() could be used here.

>  #else
>  #define STACK_ADD(sp, items) ((elf_addr_t __user *)(sp) - (items))
>  #define STACK_ROUND(sp, items) \
>  	(((unsigned long) (sp - items)) &~ 15UL)
>  #define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
> +#define STACK_ALIGN(sp, align)	\
> +	((typeof(sp))((unsigned long)sp & ~((int)align - 1)))

And maybe there's a helper which does this, dunno.

>  #endif
>  
>  #ifndef ELF_BASE_PLATFORM
> @@ -217,6 +221,12 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
>  			return -EFAULT;
>  	}
>  
> +	/*
> +	 * glibc copies first bytes for stack protector purposes
> +	 * which are misaligned on x86_64 because strlen("x86_64") + 1 == 7.
> +	 */
> +	p = STACK_ALIGN(p, sizeof(long));
> +
>  	/*
>  	 * Generate 16 random bytes for userspace PRNG seeding.
>  	 */
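
[ Note: a sketch of how the generic helpers Andrew refers to --
  ALIGN()/ALIGN_DOWN() from <linux/kernel.h> -- could express the same
  rounding, assuming sp is cast through unsigned long as in the posted
  macros.  Illustrative only, not the applied patch: ]

	#ifdef CONFIG_STACK_GROWSUP
	/* stack grows up: round the pointer up to the next boundary */
	#define STACK_ALIGN(sp, align) \
		((typeof(sp))ALIGN((unsigned long)(sp), (align)))
	#else
	/* stack grows down: round the pointer down */
	#define STACK_ALIGN(sp, align) \
		((typeof(sp))ALIGN_DOWN((unsigned long)(sp), (align)))
	#endif
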
Kees Cook May 29, 2019, 11 p.m. UTC | #2
On Wed, May 29, 2019 at 03:20:20PM -0700, Andrew Morton wrote:
> On Thu, 30 May 2019 00:37:08 +0300 Alexey Dobriyan <adobriyan@gmail.com> wrote:
> 
> > AT_RANDOM content is always misaligned on x86_64:
> > 
> > 	$ LD_SHOW_AUXV=1 /bin/true | grep AT_RANDOM
> > 	AT_RANDOM:       0x7fff02101019
> > 
> > glibc copies first few bytes for stack protector stuff, aligned
> > access should be slightly faster.
> 
> I just don't understand the implications of this.  Is there
> (badly-behaved) userspace out there which makes assumptions about the
> current alignment?
> 
> How much faster, anyway?  How frequently is the AT_RANDOM record
> accessed?
> 
> I often have questions such as these about your performance/space
> tweaks :(.  Please try to address them as a matter of course when
> preparing changelogs?
> 
> And let's Cc Kees, who wrote the thing.
> 
> > --- a/fs/binfmt_elf.c
> > +++ b/fs/binfmt_elf.c
> > @@ -144,11 +144,15 @@ static int padzero(unsigned long elf_bss)
> >  #define STACK_ALLOC(sp, len) ({ \
> >  	elf_addr_t __user *old_sp = (elf_addr_t __user *)sp; sp += len; \
> >  	old_sp; })
> > +#define STACK_ALIGN(sp, align)	\
> > +	((typeof(sp))(((unsigned long)sp + (int)align - 1) & ~((int)align - 1)))
> 
> I suspect plain old ALIGN() could be used here.
> 
> >  #else
> >  #define STACK_ADD(sp, items) ((elf_addr_t __user *)(sp) - (items))
> >  #define STACK_ROUND(sp, items) \
> >  	(((unsigned long) (sp - items)) &~ 15UL)
> >  #define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
> > +#define STACK_ALIGN(sp, align)	\
> > +	((typeof(sp))((unsigned long)sp & ~((int)align - 1)))
> 
> And maybe there's a helper which does this, dunno.
> 
> >  #endif
> >  
> >  #ifndef ELF_BASE_PLATFORM
> > @@ -217,6 +221,12 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> >  			return -EFAULT;
> >  	}
> >  
> > +	/*
> > +	 * glibc copies first bytes for stack protector purposes
> > +	 * which are misaligned on x86_64 because strlen("x86_64") + 1 == 7.
> > +	 */
> > +	p = STACK_ALIGN(p, sizeof(long));
> > +

I have no objection to eating some bytes here. Though perhaps things could just
be reordered to leave all the aligned things together and put all the
strings later?

-Kees

> >  	/*
> >  	 * Generate 16 random bytes for userspace PRNG seeding.
> >  	 */
>
Alexey Dobriyan May 30, 2019, 6:44 a.m. UTC | #3
On Wed, May 29, 2019 at 03:20:20PM -0700, Andrew Morton wrote:
> On Thu, 30 May 2019 00:37:08 +0300 Alexey Dobriyan <adobriyan@gmail.com> wrote:
> 
> > AT_RANDOM content is always misaligned on x86_64:
> > 
> > 	$ LD_SHOW_AUXV=1 /bin/true | grep AT_RANDOM
> > 	AT_RANDOM:       0x7fff02101019
> > 
> > glibc copies first few bytes for stack protector stuff, aligned
> > access should be slightly faster.
> 
> I just don't understand the implications of this.  Is there
> (badly-behaved) userspace out there which makes assumptions about the
> current alignment?

I don't think so: glibc has getauxval(AT_RANDOM), and userspace should
use whatever it returns as a "char[16]" base pointer.

> How much faster, anyway?  How frequently is the AT_RANDOM record
> accessed?

I don't think it is measurable :-\

It is accessed twice per execve: first by the kernel putting the data there,
then by glibc fetching the first sizeof(uintptr_t) bytes for the stack canary.
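
[ Note: a hedged sketch of that second access, glibc-style canary setup;
  the function and variable names here are illustrative, not glibc's
  actual internals: ]

	#include <string.h>
	#include <stdint.h>
	#include <elf.h>
	#include <sys/auxv.h>

	static uintptr_t stack_guard;	/* illustrative name */

	static void init_stack_guard(void)
	{
		const void *rnd = (const void *)getauxval(AT_RANDOM);
		uintptr_t guard = 0;

		if (rnd)
			/* copy the first sizeof(uintptr_t) of the 16
			 * AT_RANDOM bytes; memcpy tolerates the
			 * misaligned source */
			memcpy(&guard, rnd, sizeof(guard));
		stack_guard = guard;
	}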

Here is the stack layout at the beginning of execution:

....10  e8 03 00 00 00 00 00 00  17 00 00 00 00 00 00 00  |................|
....20  00 00 00 00 00 00 00 00  19 00 00 00 00 00 00 00  |................|
				 AT_RANDOM=25
....30  79 dd ff ff ff 7f 00 00  1a 00 00 00 00 00 00 00  |y...............|
	AT_RANDOM pointer
....40  00 00 00 00 00 00 00 00  1f 00 00 00 00 00 00 00  |................|
....50  e2 ef ff ff ff 7f 00 00  0f 00 00 00 00 00 00 00  |................|
....60  89 dd ff ff ff 7f 00 00  00 00 00 00 00 00 00 00  |................|
				    AT_RANDOM bytes (misaligned)
....70  00 00 00 00 00 00 00 00  00|a2 ef 76 37 0c 0c 69  |...........v7..i|
....80  04 32 68 e4 68 2d 53 cf  a5|78 38 36 5f 36 34 00  |.2h.h-S..x86_64.|
	AT_RANDOM------------------|"x86_64" (misaligned)

		 argv[0], envp[0]
....90  00 00 00|2f 68 6f 6d 65  2f 61 64 2f 73 2d 74 65  |.../home/ad/s-te|
....a0  73 74 2f 61 2e 6f 75 74  00 47 53 5f 4c 49 42 3d  |st/a.out.GS_LIB=|

> I often have questions such as these about your performance/space
> tweaks :(.  Please try to address them as a matter of course when
> preparing changelogs?

OK.
Alexey Dobriyan May 30, 2019, 6:47 a.m. UTC | #4
On Wed, May 29, 2019 at 04:00:23PM -0700, Kees Cook wrote:
> On Wed, May 29, 2019 at 03:20:20PM -0700, Andrew Morton wrote:
> > On Thu, 30 May 2019 00:37:08 +0300 Alexey Dobriyan <adobriyan@gmail.com> wrote:
> > 
> > > AT_RANDOM content is always misaligned on x86_64:
> > > 
> > > 	$ LD_SHOW_AUXV=1 /bin/true | grep AT_RANDOM
> > > 	AT_RANDOM:       0x7fff02101019
> > > 
> > > glibc copies first few bytes for stack protector stuff, aligned
> > > access should be slightly faster.
> > 
> > I just don't understand the implications of this.  Is there
> > (badly-behaved) userspace out there which makes assumptions about the
> > current alignment?
> > 
> > How much faster, anyway?  How frequently is the AT_RANDOM record
> > accessed?
> > 
> > I often have questions such as these about your performance/space
> > tweaks :(.  Please try to address them as a matter of course when
> > preparing changelogs?
> > 
> > And let's Cc Kees, who wrote the thing.
> > 
> > > --- a/fs/binfmt_elf.c
> > > +++ b/fs/binfmt_elf.c
> > > @@ -144,11 +144,15 @@ static int padzero(unsigned long elf_bss)
> > >  #define STACK_ALLOC(sp, len) ({ \
> > >  	elf_addr_t __user *old_sp = (elf_addr_t __user *)sp; sp += len; \
> > >  	old_sp; })
> > > +#define STACK_ALIGN(sp, align)	\
> > > +	((typeof(sp))(((unsigned long)sp + (int)align - 1) & ~((int)align - 1)))
> > 
> > I suspect plain old ALIGN() could be used here.
> > 
> > >  #else
> > >  #define STACK_ADD(sp, items) ((elf_addr_t __user *)(sp) - (items))
> > >  #define STACK_ROUND(sp, items) \
> > >  	(((unsigned long) (sp - items)) &~ 15UL)
> > >  #define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
> > > +#define STACK_ALIGN(sp, align)	\
> > > +	((typeof(sp))((unsigned long)sp & ~((int)align - 1)))
> > 
> > And maybe there's a helper which does this, dunno.
> > 
> > >  #endif
> > >  
> > >  #ifndef ELF_BASE_PLATFORM
> > > @@ -217,6 +221,12 @@ create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> > >  			return -EFAULT;
> > >  	}
> > >  
> > > +	/*
> > > +	 * glibc copies first bytes for stack protector purposes
> > > +	 * which are misaligned on x86_64 because strlen("x86_64") + 1 == 7.
> > > +	 */
> > > +	p = STACK_ALIGN(p, sizeof(long));
> > > +
> 
> I have no objection to eating some bytes here. Though perhaps things could just
> be reordered to leave all the aligned things together and put all the
> strings later?

In fact there should be no bytes wasted. The auxv array is aligned, and
the whole stack is aligned once more to 16 bytes. On x86_64 the AT_RANDOM
content and the "x86_64" AT_PLATFORM string are placed higher, so that
one byte doesn't change anything.
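
[ Note: illustrative arithmetic only, with made-up addresses: the final
  16-byte STACK_ROUND absorbs the byte dropped by aligning p, so the
  resulting stack pointer is the same either way. ]

	#include <stdio.h>

	int main(void)
	{
		unsigned long p = 0x7fffffffe089UL;	/* misaligned by 1 */
		unsigned long q = p & ~(sizeof(long) - 1UL); /* aligned down */
		unsigned long items = 40;	/* argv/envp/auxv slots */

		/* STACK_ROUND(sp, items): ((sp - items)) & ~15UL
		 * both lines print the same address */
		printf("%#lx\n", (p - items * sizeof(unsigned long)) & ~15UL);
		printf("%#lx\n", (q - items * sizeof(unsigned long)) & ~15UL);
		return 0;
	}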

Patch

--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -144,11 +144,15 @@  static int padzero(unsigned long elf_bss)
 #define STACK_ALLOC(sp, len) ({ \
 	elf_addr_t __user *old_sp = (elf_addr_t __user *)sp; sp += len; \
 	old_sp; })
+#define STACK_ALIGN(sp, align)	\
+	((typeof(sp))(((unsigned long)sp + (int)align - 1) & ~((int)align - 1)))
 #else
 #define STACK_ADD(sp, items) ((elf_addr_t __user *)(sp) - (items))
 #define STACK_ROUND(sp, items) \
 	(((unsigned long) (sp - items)) &~ 15UL)
 #define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
+#define STACK_ALIGN(sp, align)	\
+	((typeof(sp))((unsigned long)sp & ~((int)align - 1)))
 #endif
 
 #ifndef ELF_BASE_PLATFORM
@@ -217,6 +221,12 @@  create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
 			return -EFAULT;
 	}
 
+	/*
+	 * glibc copies first bytes for stack protector purposes
+	 * which are misaligned on x86_64 because strlen("x86_64") + 1 == 7.
+	 */
+	p = STACK_ALIGN(p, sizeof(long));
+
 	/*
 	 * Generate 16 random bytes for userspace PRNG seeding.
 	 */