All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Thomas Weißschuh" <linux@weissschuh.net>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Mark Brown <broonie@kernel.org>, Willy Tarreau <w@1wt.eu>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Sebastian Ott <sebott@redhat.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH RFC] binfmt_elf: fully allocate bss pages
Date: Fri, 15 Sep 2023 00:18:52 +0200	[thread overview]
Message-ID: <987fb5f4-416f-482a-9564-b12f7423e19a@t-8ch.de> (raw)
In-Reply-To: <87fs3gwn53.fsf@email.froward.int.ebiederm.org>

On 2023-09-14 14:49:44-0500, Eric W. Biederman wrote:
> Thomas Weißschuh <linux@weissschuh.net> writes:
> 
> > When allocating the pages for bss the start address needs to be rounded
> > down instead of up.
> > Otherwise the start of the bss segment may be unmapped.
> >
> > The was reported to happen on Aarch64:
> 
> Those program headers you quote look corrupt.

To reproduce:

$ cat test.c
char foo[1];

void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
{
	__asm__ volatile (
		"mov x0, 123\n"
		"mov x8, 93\n"       /* NR_exit == 93 */
		"svc #0\n"
	);
	__builtin_unreachable();
}

$ aarch64-linux-gnu-gcc -fno-stack-protector -o nolibc-test -nostdlib -static test.c

Note:
it works in qemu-user, newer versions need the workaround from
https://gitlab.com/qemu-project/qemu/-/issues/1854
The issue in qemu-user seems to be related to the question at hand.

> The address 0x41ffe8 is not 0x10000 aligned.
> 
> I don't think anything in the elf specification allows that.
> 
> The most common way to have bss is for a elf segment to have a larger
> memsize than filesize.  In which case rounding up is the correct way to
> handle things.
> 
> We definitely need to verify the appended bss case works, before
> taking this patch, or we will get random application failures
> because parts of the data segment are being zeroed, or the binaries
> won't load because the bss won't be able to map over the initialized data.

My hope in posting this patch was also for the bots to uncover any
obvious breakage. So far there were no reports.

> The note segment living at a conflicting virtual address also looks
> suspicious.   It is probably harmless, as note segments are not
> loaded.
> 
> 
> Are you by any chance using an experimental linker?

I'm using GNU ld 2.41 as supplied by my distro.
(ArchLinux, aarch64-linux-gnu-binutils 2.41-2)

> In general every segment in an elf executable needs to be aligned to the
> SYSVABI's architecture page size.  I think that is 64k on ARM.  Which it
> looks like the linker tried to implement by setting the alignment to
> 0x10000, and then ignored by putting a byte offset beginning to the
> page.

Looking at Figure A-5 of [0] this seems not to be the case.
It shows p_vaddr=0x8048100 and p_align=0x1000.
(On x86_64 with PAGE_SIZE=0x1000)

> At a minimum someone needs to sort through what the elf specification
> says needs to happen is a weird case like this where the start address
> of a load segment does not match the alignment of the segment.

I'll take a look.

> To see how common this is I looked at a binary known to be working, and
> my /usr/bin/ls binary has one segment that has one of these unaligned
> starts as well.

Same for my /usr/bin/busybox, also the .data and .bss segment.

> So it must be defined to work somewhere but I need to see the definition
> to even have a good opinion on the nonsense of saying an unaligned value
> should be aligned.

Figure 2-1 from [0]:

p_align:

Loadable process segments must have congruent values for p_vaddr and
p_offset, modulo the page size.This member gives the value to which the
segments are aligned in memory and in the file. Values 0 and 1 mean that no
alignment is required. Otherwise, p_align should be a positive, integral power of
2, and p_addr should equal p_offset, modulo p_align.

0x41ffe8 (p_vaddr)  % 0x1000 = 0xfe8
0x00ffe8 (p_offset) % 0x1000 = 0xfe8

0x41ffe8 (p_addr)   % 0x10000 = 0xffe8
0x00ffe8 (p_offset) % 0x10000 = 0xffe8

So this seems to be satisfied.

> All I know is that we need to limit our support to what memory mapping
> pieces from the elf executable can support.  Which at a minimum requires:
> 	virt_addr % ELF_MIN_ALIGN == file_offset % ELF_MIN_ALIGN

Aarch64 can also handle 4k pages so this invariant should be satisfied.
4k pages seems to be the default for the kernel, too.

[0] https://refspecs.linuxfoundation.org/elf/elf.pdf

> > Memory allocated by set_brk():
> > Before: start=0x420000 end=0x420000
> > After:  start=0x41f000 end=0x420000
> >
> > The triggering binary looks like this:
> >
> >     Elf file type is EXEC (Executable file)
> >     Entry point 0x400144
> >     There are 4 program headers, starting at offset 64
> >
> >     Program Headers:
> >       Type           Offset             VirtAddr           PhysAddr
> >                      FileSiz            MemSiz              Flags  Align
> >       LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
> >                      0x0000000000000178 0x0000000000000178  R E    0x10000
> >       LOAD           0x000000000000ffe8 0x000000000041ffe8 0x000000000041ffe8
> >                      0x0000000000000000 0x0000000000000008  RW     0x10000
> >       NOTE           0x0000000000000120 0x0000000000400120 0x0000000000400120
> >                      0x0000000000000024 0x0000000000000024  R      0x4
> >       GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
> >                      0x0000000000000000 0x0000000000000000  RW     0x10
> >
> >      Section to Segment mapping:
> >       Segment Sections...
> >        00     .note.gnu.build-id .text .eh_frame
> >        01     .bss
> >        02     .note.gnu.build-id
> >        03
> >
> > Reported-by: Sebastian Ott <sebott@redhat.com>
> > Closes: https://lore.kernel.org/lkml/5d49767a-fbdc-fbe7-5fb2-d99ece3168cb@redhat.com/
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> > ---
> >
> > I'm not really familiar with the ELF loading process, so putting this
> > out as RFC.
> >
> > A example binary compiled with aarch64-linux-gnu-gcc 13.2.0 is available
> > at https://test.t-8ch.de/binfmt-bss-repro.bin
> > ---
> >  fs/binfmt_elf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> > index 7b3d2d491407..4008a57d388b 100644
> > --- a/fs/binfmt_elf.c
> > +++ b/fs/binfmt_elf.c
> > @@ -112,7 +112,7 @@ static struct linux_binfmt elf_format = {
> >  
> >  static int set_brk(unsigned long start, unsigned long end, int prot)
> >  {
> > -	start = ELF_PAGEALIGN(start);
> > +	start = ELF_PAGESTART(start);
> >  	end = ELF_PAGEALIGN(end);
> >  	if (end > start) {
> >  		/*
> >
> > ---
> > base-commit: aed8aee11130a954356200afa3f1b8753e8a9482
> > change-id: 20230914-bss-alloc-f523fa61718c
> >
> > Best regards,

  reply	other threads:[~2023-09-14 22:19 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-14 15:59 [PATCH RFC] binfmt_elf: fully allocate bss pages Thomas Weißschuh
2023-09-14 19:49 ` Eric W. Biederman
2023-09-14 22:18   ` Thomas Weißschuh [this message]
2023-09-15 19:35 ` Sebastian Ott
2023-09-15 22:15 ` Pedro Falcato
2023-09-15 22:41   ` Thomas Weißschuh
2023-09-18 14:11 ` kernel test robot
2023-09-21 10:36 ` Sebastian Ott
2023-09-25  0:50   ` Eric W. Biederman
2023-09-25  9:20     ` Sebastian Ott
2023-09-25  9:50       ` Eric W. Biederman
2023-09-25 12:59       ` [PATCH] binfmt_elf: Support segments with 0 filesz and misaligned starts Eric W. Biederman
2023-09-25 13:00         ` kernel test robot
2023-09-25 13:07           ` Eric W. Biederman
2023-09-25 15:27         ` Sebastian Ott
2023-09-25 17:06           ` Kees Cook
2023-09-26  3:27             ` Eric W. Biederman
2023-09-27  2:34               ` Kees Cook
2023-09-26 13:49         ` Dan Carpenter
2023-09-26 14:42           ` [PATCH v2] " Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=987fb5f4-416f-482a-9564-b12f7423e19a@t-8ch.de \
    --to=linux@weissschuh.net \
    --cc=brauner@kernel.org \
    --cc=broonie@kernel.org \
    --cc=ebiederm@xmission.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=sebott@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.