linux-kernel.vger.kernel.org archive mirror
* MAP_FIXED_NOREPLACE appears to break older i386 binaries
@ 2019-10-05 23:32 Russell King - ARM Linux admin
  2019-10-06  0:06 ` Linus Torvalds
  2019-10-06  0:18 ` Linus Torvalds
  0 siblings, 2 replies; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-10-05 23:32 UTC (permalink / raw)
  To: Michal Hocko, Linus Torvalds; +Cc: linux-kernel

Under a 4.19 kernel (debian stable), I am surprised to find that some
previously working i386 binaries no longer work, whereas others are
fine.  ls, for example, dies with a SEGV, but bash is fine.

Looking at the kernel log reveals:

[13117.361000] 20899 (ls): Uhuuh, elf segment at 0000000008065000 requested but the memory is mapped already
[13120.367221] 20935 (vdir): Uhuuh, elf segment at 0000000008065000 requested but the memory is mapped already
[13122.891253] 20936 (ls): Uhuuh, elf segment at 0000000008065000 requested but the memory is mapped already
[13137.719143] 20940 (ls): Uhuuh, elf segment at 0000000008065000 requested but the memory is mapped already
[13139.202469] 20978 (ls): Uhuuh, elf segment at 0000000008065000 requested but the memory is mapped already
[13158.093533] 21007 (ls): Uhuuh, elf segment at 0000000008065000 requested but the memory is mapped already
[13221.920939] 21021 (objdump): Uhuuh, elf segment at 00000000080a1000 requested but the memory is mapped already

Looking at /bin/ls:

Program Header:
    PHDR off    0x00000034 vaddr 0x08048034 paddr 0x08048034 align 2**2
         filesz 0x00000120 memsz 0x00000120 flags r-x
  INTERP off    0x00000154 vaddr 0x08048154 paddr 0x08048154 align 2**0
         filesz 0x00000013 memsz 0x00000013 flags r--
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x0001d620 memsz 0x0001d620 flags r-x
    LOAD off    0x0001d950 vaddr 0x08065950 paddr 0x08065950 align 2**12
         filesz 0x00000a50 memsz 0x000016e4 flags rw-
 DYNAMIC off    0x0001dec4 vaddr 0x08065ec4 paddr 0x08065ec4 align 2**2
         filesz 0x00000100 memsz 0x00000100 flags rw-
    NOTE off    0x00000168 vaddr 0x08048168 paddr 0x08048168 align 2**2
         filesz 0x00000044 memsz 0x00000044 flags r--
EH_FRAME off    0x00018e68 vaddr 0x08060e68 paddr 0x08060e68 align 2**2
         filesz 0x00000774 memsz 0x00000774 flags r--
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rw-
   RELRO off    0x0001d950 vaddr 0x08065950 paddr 0x08065950 align 2**0
         filesz 0x000006b0 memsz 0x000006b0 flags r--

Note that the executable part of ls extends from 0x08048000 for
0x0001d620 bytes in memory and file, which takes that up to
0x08065620.  The rw data section starts at 0x08065950.
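
The page arithmetic behind that observation can be checked with a
minimal sketch, assuming 4 KiB pages. PAGE_SIZE and page_start() are
illustrative names, not anything from the kernel; the numbers are
copied from the two LOAD headers above.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, as on i386


def page_start(addr: int) -> int:
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


# Values copied from the two LOAD program headers of /bin/ls above.
text_vaddr, text_memsz = 0x08048000, 0x0001D620
data_vaddr = 0x08065950

text_end = text_vaddr + text_memsz          # first byte past the r-x segment
last_text_page = page_start(text_end - 1)   # page holding the last r-x byte
first_data_page = page_start(data_vaddr)    # page holding the first rw- byte

print(hex(text_end), hex(last_text_page), hex(first_data_page))
```

Both pages come out as 0x08065000, which is the address the kernel log
complains about for ls.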

Seems we've broken older i386 binaries with commit ad55eac74f20
("elf: enforce MAP_FIXED on overlaying elf segments").  Maybe the
MAP_FIXED_NOREPLACE stuff needs to have an on/off switch?

Here's the objdump -h output for the same binary:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  08048154  08048154  00000154  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag 00000020  08048168  08048168  00000168  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  08048188  08048188  00000188  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .gnu.hash     0000003c  080481ac  080481ac  000001ac  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .dynsym       00000840  080481e8  080481e8  000001e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .gnu.liblist  000000c8  08048a28  08048a28  00000a28  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .gnu.version  00000108  08049020  08049020  00001020  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .gnu.version_r 000000c0  08049128  08049128  00001128  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 .rel.dyn      00000048  080491e8  080491e8  000011e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .rel.plt      00000390  08049230  08049230  00001230  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 .init         00000023  080495c0  080495c0  000015c0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 11 .plt          00000730  080495f0  080495f0  000015f0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 12 .text         00013274  08049d20  08049d20  00001d20  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 13 .fini         00000014  0805cf94  0805cf94  00014f94  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 14 .rodata       00003ea8  0805cfc0  0805cfc0  00014fc0  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 15 .eh_frame_hdr 00000774  08060e68  08060e68  00018e68  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 16 .eh_frame     0000341c  080615dc  080615dc  000195dc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 17 .dynstr       0000064c  080649f8  080649f8  0001c9f8  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 18 .gnu.conflict 000005dc  08065044  08065044  0001d044  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 19 .init_array   00000004  08065950  08065950  0001d950  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 20 .fini_array   00000004  08065954  08065954  0001d954  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 21 .jcr          00000004  08065958  08065958  0001d958  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 22 .data.rel.ro  00000544  08065980  08065980  0001d980  2**6
                  CONTENTS, ALLOC, LOAD, DATA
 23 .dynamic      00000100  08065ec4  08065ec4  0001dec4  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 24 .got          00000024  08065fc4  08065fc4  0001dfc4  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 25 .got.plt      000001d4  08066000  08066000  0001e000  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 26 .data         000001a0  08066200  08066200  0001e200  2**6
                  CONTENTS, ALLOC, LOAD, DATA
 27 .bss          00000c74  080663c0  080663c0  0001e3a0  2**6
                  ALLOC
 28 .gnu_debuglink 00000010  00000000  00000000  0001e3a0  2**2
                  CONTENTS, READONLY
 29 .gnu_debugdata 00001170  00000000  00000000  0001e3b0  2**0
                  CONTENTS, READONLY
 30 .gnu.prelink_undo 000005dc  00000000  00000000  0001f520  2**2
                  CONTENTS, READONLY

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


* Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries
  2019-10-05 23:32 MAP_FIXED_NOREPLACE appears to break older i386 binaries Russell King - ARM Linux admin
@ 2019-10-06  0:06 ` Linus Torvalds
  2019-10-06  0:18 ` Linus Torvalds
  1 sibling, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2019-10-06  0:06 UTC (permalink / raw)
  To: Russell King - ARM Linux admin, Kees Cook
  Cc: Michal Hocko, Linux Kernel Mailing List

On Sat, Oct 5, 2019 at 4:32 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> Under a 4.19 kernel (debian stable), I am surprised to find that some
> previously working i386 binaries no longer work, whereas others are
> fine.  ls, for example, dies with a SEGV, but bash is fine.

Hmm. Is this with some recent stable kernel update? Or has it been
going on for a while and you only noticed now for some reason?

If it's recent, I'd be inclined to blame bbdc6076d2e5 ("binfmt_elf:
move brk out of mmap when doing direct loader exec") which afaik made
it into 4.19.75 and might be in that debian-stable.

And if it's that, then I think that it should be fixed by 7be3cb019db1
("binfmt_elf: Do not move brk for INTERP-less ET_EXEC") which is in
the current queue.

Adding Kees to the cc, in case he goes "No, silly Linus, you're being
stupid", or can confirm that yeah, that was the behavior for the
problem case.

Kees, original report with more information at

   https://lore.kernel.org/lkml/20191005233227.GB25745@shell.armlinux.org.uk/

And if that isn't the case, maybe you can send over one of the broken
binaries in private email for testing?

                       Linus


* Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries
  2019-10-05 23:32 MAP_FIXED_NOREPLACE appears to break older i386 binaries Russell King - ARM Linux admin
  2019-10-06  0:06 ` Linus Torvalds
@ 2019-10-06  0:18 ` Linus Torvalds
  2019-10-06 13:09   ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2019-10-06  0:18 UTC (permalink / raw)
  To: Russell King - ARM Linux admin, Kees Cook
  Cc: Michal Hocko, Linux Kernel Mailing List

Duh.

I only looked at recent issues in this area, and overlooked your
sentence in between the two ELF section dumps, and it appears that you
have already bisected it to something else:

On Sat, Oct 5, 2019 at 4:32 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> Seems we've broken older i386 binaries with commit ad55eac74f20
> ("elf: enforce MAP_FIXED on overlaying elf segments").  Maybe the
> MAP_FIXED_NOREPLACE stuff needs to have an on/off switch?

I guess the "can you send people binaries for testing" ends up being
the right thing to do, and Michal can figure it out.

Sorry for the noise.

              Linus


* Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries
  2019-10-06  0:18 ` Linus Torvalds
@ 2019-10-06 13:09   ` Russell King - ARM Linux admin
  2019-10-06 18:07     ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-10-06 13:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kees Cook, Michal Hocko, Linux Kernel Mailing List

On Sat, Oct 05, 2019 at 05:18:05PM -0700, Linus Torvalds wrote:
> Duh.
> 
> I only looked at recent issues in this area, and overlooked your
> sentence in between the two ELF section dumps, and it appears that you
> have already bisected it to something else:

I hadn't - I'd looked at the changes and identified a likely culprit
that fit with the symptoms and the layout of the binary.

What I'm basically trying to do is update my laptop - it was running
an x86_64 4.5.7 kernel but with 32-bit userland.  I've just installed
Debian Stable into a separate partition with a view to seeing
whether I like it, which means migrating stuff over - and I hit a
problem with the newer Evolution not wanting to recognise the
configuration/data from the previous version.

So I thought... I can just chroot into the old setup, run up evolution
there, export its configuration, so I can import it into the newer
version without having to go through a reboot cycle.

The chroot and exec of bin/bash in the old setup was successful, as
was dmesg, but useful tools like ls failed with a segfault.

The difference between working binaries and non-working binaries seems
to be whether the r-x and rw- LOAD segments in the ELF program headers
overlap on a page.  Here's bash:

    LOAD off    0x00000000 vaddr 0x08047000 paddr 0x08047000 align 2**12
         filesz 0x000bbb08 memsz 0x000bbb08 flags r-x
    LOAD off    0x000bc000 vaddr 0x08103000 paddr 0x08103000 align 2**12
         filesz 0x00004864 memsz 0x00009648 flags rw-

So, the r-x load covers 0x08047000-0x08102b08, and the following rw-
load covers 0x08103000 onwards - so next page.  dmesg is similar:

    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x00009c64 memsz 0x00009c64 flags r-x
    LOAD off    0x00009e28 vaddr 0x08052e28 paddr 0x08052e28 align 2**12
         filesz 0x000028ce memsz 0x000028ce flags rw-

0x08048000-0x08051c64 vs 0x08052e28 - so next page.  In contrast, ls:

    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x0001d620 memsz 0x0001d620 flags r-x
    LOAD off    0x0001d950 vaddr 0x08065950 paddr 0x08065950 align 2**12
         filesz 0x00000a50 memsz 0x000016e4 flags rw-

0x08048000-0x08065620 vs 0x08065950 - so same page, and fails.
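
The same comparison for all three binaries, as a small sketch assuming
4 KiB pages. shares_page() and the binaries table are hypothetical
helpers; the values are copied from the headers quoted above.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def shares_page(text_vaddr, text_memsz, data_vaddr):
    """True if the r-x segment's last byte and the rw- segment's first
    byte fall in the same page."""
    last_text_page = (text_vaddr + text_memsz - 1) // PAGE_SIZE
    first_data_page = data_vaddr // PAGE_SIZE
    return last_text_page == first_data_page


# (r-x vaddr, r-x memsz, rw- vaddr), copied from the headers quoted above.
binaries = {
    "bash":  (0x08047000, 0x000BBB08, 0x08103000),
    "dmesg": (0x08048000, 0x00009C64, 0x08052E28),
    "ls":    (0x08048000, 0x0001D620, 0x08065950),
}

results = {name: shares_page(*hdr) for name, hdr in binaries.items()}
for name, overlap in results.items():
    print(f"{name}: {'same page' if overlap else 'next page'}")
```

Only ls reports "same page", matching which binaries fail.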

Looking at the commit I referred to, what we end up with is:

- Initially, elf_fixed is MAP_FIXED_NOREPLACE and load_addr_set is false
- elf_brk and elf_bss are initially zero
- The first LOAD requests a mapping for 0x08048000 .. 0x08065fff inclusive
- Since this is an executable (ET_EXEC) image, we use elf_fixed to set
  the MAP_FIXED* flags, so this mapping is established with
  MAP_FIXED_NOREPLACE.
- load_addr_set is now set to true
- elf_bss is set to vaddr + filesz => 0x08065620
- elf_brk is set to vaddr + memsz => 0x08065620
- Moving on to the second LOAD, this is a mapping starting at 0x08065950
- Since elf_brk > elf_bss is false, we don't take that path through the
  code, which _would_ have set elf_fixed to MAP_FIXED (the only case in
  which we would do so - for the BSS.)
- As load_addr_set is true, we again use elf_fixed to set the
  MAP_FIXED* flags.  elf_fixed is still MAP_FIXED_NOREPLACE, so this
  mapping uses MAP_FIXED_NOREPLACE.
- Since this mapping overlaps the previous mapping, it fails with the
  error mentioned.
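
The steps above can be replayed with a toy model. This is a sketch of
the walkthrough only, not the kernel's code: elf_fixed, elf_bss and
elf_brk follow the names used above, while pages(), load_segments() and
the "mapped" set are hypothetical, and 4 KiB pages are assumed.

```python
import errno

PAGE_SIZE = 0x1000  # assumption: 4 KiB pages
MAP_FIXED, MAP_FIXED_NOREPLACE = "MAP_FIXED", "MAP_FIXED_NOREPLACE"


def pages(vaddr, size):
    """Page numbers touched by [vaddr, vaddr + size)."""
    return set(range(vaddr // PAGE_SIZE, (vaddr + size - 1) // PAGE_SIZE + 1))


def load_segments(segments):
    """Toy model of the steps listed above; not the kernel's code.

    segments: list of (vaddr, filesz, memsz) for each PT_LOAD, in order.
    Returns (0 or -errno, list of the flag chosen for each mapping attempt).
    """
    mapped = set()            # pages already mapped in the new address space
    elf_bss = elf_brk = 0
    flags_used = []
    for vaddr, filesz, memsz in segments:
        elf_fixed = MAP_FIXED_NOREPLACE
        if elf_brk > elf_bss:
            # An earlier segment had memsz > filesz: model its bss area as
            # mapped, and allow this segment to map over it.
            mapped |= pages(elf_bss, elf_brk - elf_bss)
            elf_fixed = MAP_FIXED
        flags_used.append(elf_fixed)
        wanted = pages(vaddr, filesz)
        if elf_fixed == MAP_FIXED_NOREPLACE and wanted & mapped:
            return -errno.EEXIST, flags_used   # "memory is mapped already"
        mapped |= wanted
        elf_bss = max(elf_bss, vaddr + filesz)
        elf_brk = max(elf_brk, vaddr + memsz)
    return 0, flags_used


ls = [(0x08048000, 0x1D620, 0x1D620), (0x08065950, 0xA50, 0x16E4)]
bash = [(0x08047000, 0xBBB08, 0xBBB08), (0x08103000, 0x4864, 0x9648)]
print(load_segments(ls))
print(load_segments(bash))
```

With the ls headers the second mapping attempt returns -EEXIST while
both attempts use MAP_FIXED_NOREPLACE; with the bash headers the model
loads both segments.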

Since the ELF load_binary() method returns -EEXIST, we end up in this
code path in fs/exec.c:

                if (retval < 0 && !bprm->mm) {
                        /* we got to flush_old_exec() and failed after it */
                        read_unlock(&binfmt_lock);
                        force_sigsegv(SIGSEGV);
                        return retval;
                }

and the program is killed with a SIGSEGV.

So, from a code inspection point of view, it seems that this is likely
the culprit.

I don't yet have the debian stable system set up enough to build kernels;
that may be today's project, but I'd first like to solve the original
issue (migrating the evolution setup) so I can first see whether it's
going to be worth me continuing, or whether I persist with my existing
setup.

However, I think it _is_ worth highlighting that we seem to have broken
binary compatibility with older i386 userspace with newer kernels.



* Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries
  2019-10-06 13:09   ` Russell King - ARM Linux admin
@ 2019-10-06 18:07     ` Linus Torvalds
  2019-10-06 21:14       ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2019-10-06 18:07 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Kees Cook, Michal Hocko, Linux Kernel Mailing List

On Sun, Oct 6, 2019 at 6:09 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> However, I think it _is_ worth highlighting that we seem to have broken
> binary compatibility with older i386 userspace with newer kernels.

Yes, we should get this fixed. But I continue to ask you to point to
the actual binaries for testing..

                Linus


* Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries
  2019-10-06 18:07     ` Linus Torvalds
@ 2019-10-06 21:14       ` Linus Torvalds
  2019-10-07  1:22         ` Kees Cook
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2019-10-06 21:14 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Kees Cook, Michal Hocko, Linux Kernel Mailing List

On Sun, Oct 6, 2019 at 11:07 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Yes, we should get this fixed. But I continue to ask you to point to
> the actual binaries for testing..

Just to bring the resolution back publicly to lkml after rmk sent me
test binaries in private email, the end result is commit b212921b13bd
("elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings").

             Linus


* Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries
  2019-10-06 21:14       ` Linus Torvalds
@ 2019-10-07  1:22         ` Kees Cook
  0 siblings, 0 replies; 7+ messages in thread
From: Kees Cook @ 2019-10-07  1:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Russell King - ARM Linux admin, Michal Hocko, Linux Kernel Mailing List

On Sun, Oct 06, 2019 at 02:14:59PM -0700, Linus Torvalds wrote:
> On Sun, Oct 6, 2019 at 11:07 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Yes, we should get this fixed. But I continue to ask you to point to
> > the actual binaries for testing..
> 
> Just to bring the resolution back publicly to lkml after rmk sent me
> test binaries in private email, the end result is commit b212921b13bd
> ("elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings").

Unsurprisingly, I'm not a fan of reverting that patch, but obviously we
must since it breaks old userspace. :) I'm travelling tomorrow, so if
Michal doesn't fix it before I'm back, I'll take a stab at it. I'd like
to retain the general best-effort-defense of "don't let mappings collide"
but I think it'll require us retaining more details about what the ELF
told us to collide with. (i.e. the LOADs can collide, but not into
stack, brk, etc.)

And better yet, we need self-tests here. execve has SO many corner
cases... I'd like to figure out some way to capture all these.

-- 
Kees Cook
