linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Recent-ish changes in binfmt_elf made my program segfault
@ 2024-01-22 12:01 Jan Bujak
  2024-01-22 14:54 ` Pedro Falcato
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jan Bujak @ 2024-01-22 12:01 UTC (permalink / raw)
  To: ebiederm, keescook; +Cc: linux-mm, linux-kernel, viro, brauner, linux-fsdevel

Hi.

I recently updated my kernel and one of my programs started segfaulting.

The issue seems to be related to how the kernel interprets PT_LOAD headers;
consider the following program headers (from 'readelf' of my reproduction):

Program Headers:
   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
   LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
   LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
   LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
   LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
   LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000

Old kernels load this ELF file in the following way ('/proc/self/maps'):

00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction

And new kernels do it like this:

00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
00011000-00012000 rw-p 00000000 00:00 0
00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction

That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
sections to which it tries to write to, and since the kernel doesn't map
them anymore it crashes.

I bisected the issue to the following commit:

commit 585a018627b4d7ed37387211f667916840b5c5ea
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Sep 28 20:24:29 2023 -0700

     binfmt_elf: Support segments with 0 filesz and misaligned starts

I can confirm that with this commit the issue reproduces, and with it
reverted it doesn't.

I have prepared a minimal reproduction of the problem available here,
along with all of the scripts I used for bisecting:

https://github.com/koute/linux-elf-loading-bug

You can either compile it from source (requires Rust and LLD), or there's
a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
so you can easily check with 'objdump -d' that it isn't malicious).

On old kernels this will run fine, and on new kernels it will segfault.

Thanks!


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 12:01 Recent-ish changes in binfmt_elf made my program segfault Jan Bujak
@ 2024-01-22 14:54 ` Pedro Falcato
  2024-01-22 15:23   ` Jan Bujak
  2024-01-22 16:43 ` Eric W. Biederman
  2024-01-24  6:59 ` Linux regression tracking #adding (Thorsten Leemhuis)
  2 siblings, 1 reply; 18+ messages in thread
From: Pedro Falcato @ 2024-01-22 14:54 UTC (permalink / raw)
  To: Jan Bujak
  Cc: ebiederm, keescook, linux-mm, linux-kernel, viro, brauner, linux-fsdevel

On Mon, Jan 22, 2024 at 12:16 PM Jan Bujak <j@exia.io> wrote:
>
> Hi.
>
> I recently updated my kernel and one of my programs started segfaulting.
>
> The issue seems to be related to how the kernel interprets PT_LOAD headers;
> consider the following program headers (from 'readelf' of my reproduction):
>
> Program Headers:
>    Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
>    LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
>    LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
>    LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
>    LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
>    LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000
>
> Old kernels load this ELF file in the following way ('/proc/self/maps'):
>
> 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> 00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
> 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>
> And new kernels do it like this:
>
> 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> 00011000-00012000 rw-p 00000000 00:00 0
> 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>
> That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
> sections to which it tries to write to, and since the kernel doesn't map
> them anymore it crashes.
>
> I bisected the issue to the following commit:
>
> commit 585a018627b4d7ed37387211f667916840b5c5ea
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Thu Sep 28 20:24:29 2023 -0700
>
>      binfmt_elf: Support segments with 0 filesz and misaligned starts
>
> I can confirm that with this commit the issue reproduces, and with it
> reverted it doesn't.
>
> I have prepared a minimal reproduction of the problem available here,
> along with all of the scripts I used for bisecting:
>
> https://github.com/koute/linux-elf-loading-bug
>
> You can either compile it from source (requires Rust and LLD), or there's
> a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
> so you can easily check with 'objdump -d' that it isn't malicious).
>
> On old kernels this will run fine, and on new kernels it will segfault.

Hi!

Where did you get that linker script?

FWIW, I catched this possible issue in review, and this was already
discussed (see my email and Eric's reply):
https://lore.kernel.org/all/CAKbZUD3E2if8Sncy+M2YKncc_Zh08-86W6U5wR0ZMazShxbHHA@mail.gmail.com/

This was my original testcase
(https://github.com/heatd/elf-bug-questionmark), which convinced the
loader to map .data over a cleared .bss. Your bug seems similar, but
does the inverse: maps .bss over .data.

-- 
Pedro

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 14:54 ` Pedro Falcato
@ 2024-01-22 15:23   ` Jan Bujak
  2024-02-27  2:23     ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Bujak @ 2024-01-22 15:23 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: ebiederm, keescook, linux-mm, linux-kernel, viro, brauner, linux-fsdevel

On 1/22/24 23:54, Pedro Falcato wrote:
> Hi!
>
> Where did you get that linker script?
>
> FWIW, I catched this possible issue in review, and this was already
> discussed (see my email and Eric's reply):
> https://lore.kernel.org/all/CAKbZUD3E2if8Sncy+M2YKncc_Zh08-86W6U5wR0ZMazShxbHHA@mail.gmail.com/
>
> This was my original testcase
> (https://github.com/heatd/elf-bug-questionmark), which convinced the
> loader to map .data over a cleared .bss. Your bug seems similar, but
> does the inverse: maps .bss over .data.
>

I wrote the linker script myself from scratch.

Thank you for the link to the previous discussion. So assuming this
breakage was intended my question here is - doesn't this run afoul
of the "we do not break userspace" rule?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 12:01 Recent-ish changes in binfmt_elf made my program segfault Jan Bujak
  2024-01-22 14:54 ` Pedro Falcato
@ 2024-01-22 16:43 ` Eric W. Biederman
  2024-01-22 20:48   ` Kees Cook
  2024-01-24  6:59 ` Linux regression tracking #adding (Thorsten Leemhuis)
  2 siblings, 1 reply; 18+ messages in thread
From: Eric W. Biederman @ 2024-01-22 16:43 UTC (permalink / raw)
  To: Jan Bujak; +Cc: keescook, linux-mm, linux-kernel, viro, brauner, linux-fsdevel

Jan Bujak <j@exia.io> writes:

> Hi.
>
> I recently updated my kernel and one of my programs started segfaulting.
>
> The issue seems to be related to how the kernel interprets PT_LOAD headers;
> consider the following program headers (from 'readelf' of my reproduction):
>
> Program Headers:
>   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
>   LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
>   LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
>   LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
>   LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
>   LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000
>
> Old kernels load this ELF file in the following way ('/proc/self/maps'):
>
> 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> 00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
> 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>
> And new kernels do it like this:
>
> 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> 00011000-00012000 rw-p 00000000 00:00 0
> 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>
> That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
> sections to which it tries to write to, and since the kernel doesn't map
> them anymore it crashes.
>
> I bisected the issue to the following commit:
>
> commit 585a018627b4d7ed37387211f667916840b5c5ea
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Thu Sep 28 20:24:29 2023 -0700
>
>     binfmt_elf: Support segments with 0 filesz and misaligned starts
>
> I can confirm that with this commit the issue reproduces, and with it
> reverted it doesn't.
>
> I have prepared a minimal reproduction of the problem available here,
> along with all of the scripts I used for bisecting:
>
> https://github.com/koute/linux-elf-loading-bug
>
> You can either compile it from source (requires Rust and LLD), or there's
> a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
> so you can easily check with 'objdump -d' that it isn't malicious).
>
> On old kernels this will run fine, and on new kernels it will
> segfault.

Frankly your ELF binary is buggy, and probably the best fix would be to
fix the linker script that is used to generate your binary.

The problem is the SYSV ABI defines everything in terms of pages and so
placing two ELF segments on the same page results in undefined behavior.

The code was fixed to honor your .bss segment and now your .data segment
is being stomped, because you defined them to overlap.

Ideally your linker script would place both your .data and .bss in
the same segment.  That would both fix the issue and give you a more
compact elf binary, while not changing the generated code at all.


That said regressions suck and it would be good if we could update the
code to do something reasonable in this case.

We can perhaps we can update the .bss segment to just memset an existing
page if one has already been mapped.  Which would cleanly handle a case
like yours.  I need to think about that for a moment to see what the
code would look like to do that.

Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 16:43 ` Eric W. Biederman
@ 2024-01-22 20:48   ` Kees Cook
  2024-01-22 21:01     ` Eric W. Biederman
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2024-01-22 20:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jan Bujak, linux-mm, linux-kernel, viro, brauner, linux-fsdevel

On Mon, Jan 22, 2024 at 10:43:59AM -0600, Eric W. Biederman wrote:
> Jan Bujak <j@exia.io> writes:
> 
> > Hi.
> >
> > I recently updated my kernel and one of my programs started segfaulting.
> >
> > The issue seems to be related to how the kernel interprets PT_LOAD headers;
> > consider the following program headers (from 'readelf' of my reproduction):
> >
> > Program Headers:
> >   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
> >   LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
> >   LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
> >   LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
> >   LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
> >   LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000
> >
> > Old kernels load this ELF file in the following way ('/proc/self/maps'):
> >
> > 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> > 00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
> > 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> > 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
> >
> > And new kernels do it like this:
> >
> > 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> > 00011000-00012000 rw-p 00000000 00:00 0
> > 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> > 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
> >
> > That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
> > sections to which it tries to write to, and since the kernel doesn't map
> > them anymore it crashes.
> >
> > I bisected the issue to the following commit:
> >
> > commit 585a018627b4d7ed37387211f667916840b5c5ea
> > Author: Eric W. Biederman <ebiederm@xmission.com>
> > Date:   Thu Sep 28 20:24:29 2023 -0700
> >
> >     binfmt_elf: Support segments with 0 filesz and misaligned starts
> >
> > I can confirm that with this commit the issue reproduces, and with it
> > reverted it doesn't.
> >
> > I have prepared a minimal reproduction of the problem available here,
> > along with all of the scripts I used for bisecting:
> >
> > https://github.com/koute/linux-elf-loading-bug
> >
> > You can either compile it from source (requires Rust and LLD), or there's
> > a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
> > so you can easily check with 'objdump -d' that it isn't malicious).
> >
> > On old kernels this will run fine, and on new kernels it will
> > segfault.
> 
> Frankly your ELF binary is buggy, and probably the best fix would be to
> fix the linker script that is used to generate your binary.
> 
> The problem is the SYSV ABI defines everything in terms of pages and so
> placing two ELF segments on the same page results in undefined behavior.
> 
> The code was fixed to honor your .bss segment and now your .data segment
> is being stomped, because you defined them to overlap.
> 
> Ideally your linker script would place both your .data and .bss in
> the same segment.  That would both fix the issue and give you a more
> compact elf binary, while not changing the generated code at all.
> 
> 
> That said regressions suck and it would be good if we could update the
> code to do something reasonable in this case.
> 
> We can perhaps we can update the .bss segment to just memset an existing
> page if one has already been mapped.  Which would cleanly handle a case
> like yours.  I need to think about that for a moment to see what the
> code would look like to do that.

It's the "if one has already been mapped" part which might
become expensive...

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 20:48   ` Kees Cook
@ 2024-01-22 21:01     ` Eric W. Biederman
  2024-01-22 22:12       ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Eric W. Biederman @ 2024-01-22 21:01 UTC (permalink / raw)
  To: Kees Cook; +Cc: Jan Bujak, linux-mm, linux-kernel, viro, brauner, linux-fsdevel

Kees Cook <keescook@chromium.org> writes:

> On Mon, Jan 22, 2024 at 10:43:59AM -0600, Eric W. Biederman wrote:
>> Jan Bujak <j@exia.io> writes:
>> 
>> > Hi.
>> >
>> > I recently updated my kernel and one of my programs started segfaulting.
>> >
>> > The issue seems to be related to how the kernel interprets PT_LOAD headers;
>> > consider the following program headers (from 'readelf' of my reproduction):
>> >
>> > Program Headers:
>> >   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
>> >   LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
>> >   LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
>> >   LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
>> >   LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
>> >   LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000
>> >
>> > Old kernels load this ELF file in the following way ('/proc/self/maps'):
>> >
>> > 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
>> > 00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
>> > 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
>> > 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>> >
>> > And new kernels do it like this:
>> >
>> > 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
>> > 00011000-00012000 rw-p 00000000 00:00 0
>> > 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
>> > 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>> >
>> > That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
>> > sections to which it tries to write to, and since the kernel doesn't map
>> > them anymore it crashes.
>> >
>> > I bisected the issue to the following commit:
>> >
>> > commit 585a018627b4d7ed37387211f667916840b5c5ea
>> > Author: Eric W. Biederman <ebiederm@xmission.com>
>> > Date:   Thu Sep 28 20:24:29 2023 -0700
>> >
>> >     binfmt_elf: Support segments with 0 filesz and misaligned starts
>> >
>> > I can confirm that with this commit the issue reproduces, and with it
>> > reverted it doesn't.
>> >
>> > I have prepared a minimal reproduction of the problem available here,
>> > along with all of the scripts I used for bisecting:
>> >
>> > https://github.com/koute/linux-elf-loading-bug
>> >
>> > You can either compile it from source (requires Rust and LLD), or there's
>> > a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
>> > so you can easily check with 'objdump -d' that it isn't malicious).
>> >
>> > On old kernels this will run fine, and on new kernels it will
>> > segfault.
>> 
>> Frankly your ELF binary is buggy, and probably the best fix would be to
>> fix the linker script that is used to generate your binary.
>> 
>> The problem is the SYSV ABI defines everything in terms of pages and so
>> placing two ELF segments on the same page results in undefined behavior.
>> 
>> The code was fixed to honor your .bss segment and now your .data segment
>> is being stomped, because you defined them to overlap.
>> 
>> Ideally your linker script would place both your .data and .bss in
>> the same segment.  That would both fix the issue and give you a more
>> compact elf binary, while not changing the generated code at all.
>> 
>> 
>> That said regressions suck and it would be good if we could update the
>> code to do something reasonable in this case.
>> 
>> We can perhaps we can update the .bss segment to just memset an existing
>> page if one has already been mapped.  Which would cleanly handle a case
>> like yours.  I need to think about that for a moment to see what the
>> code would look like to do that.
>
> It's the "if one has already been mapped" part which might
> become expensive...

I am wondering if perhaps we can add MAP_FIXED_NOREPLACE and take
some appropriate action if there is already a mapping there.

Such as printing a warning and skipping the action entirely for
a pure bss segment.  That would essentially replicate the previous
behavior.

At a minimum adding MAP_FIXED_NOREPLACE should allow us to
deterministically detect and warn about problems, making it easier
for people to understand why their binary won't run.

Eric




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 21:01     ` Eric W. Biederman
@ 2024-01-22 22:12       ` Kees Cook
  2024-02-01 10:47         ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2024-01-22 22:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jan Bujak, linux-mm, linux-kernel, viro, brauner, linux-fsdevel

On Mon, Jan 22, 2024 at 03:01:06PM -0600, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Mon, Jan 22, 2024 at 10:43:59AM -0600, Eric W. Biederman wrote:
> >> Jan Bujak <j@exia.io> writes:
> >> 
> >> > Hi.
> >> >
> >> > I recently updated my kernel and one of my programs started segfaulting.
> >> >
> >> > The issue seems to be related to how the kernel interprets PT_LOAD headers;
> >> > consider the following program headers (from 'readelf' of my reproduction):
> >> >
> >> > Program Headers:
> >> >   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
> >> >   LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
> >> >   LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
> >> >   LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
> >> >   LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
> >> >   LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000
> >> >
> >> > Old kernels load this ELF file in the following way ('/proc/self/maps'):
> >> >
> >> > 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> >> > 00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
> >> > 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> >> > 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
> >> >
> >> > And new kernels do it like this:
> >> >
> >> > 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
> >> > 00011000-00012000 rw-p 00000000 00:00 0
> >> > 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
> >> > 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
> >> >
> >> > That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
> >> > sections to which it tries to write to, and since the kernel doesn't map
> >> > them anymore it crashes.
> >> >
> >> > I bisected the issue to the following commit:
> >> >
> >> > commit 585a018627b4d7ed37387211f667916840b5c5ea
> >> > Author: Eric W. Biederman <ebiederm@xmission.com>
> >> > Date:   Thu Sep 28 20:24:29 2023 -0700
> >> >
> >> >     binfmt_elf: Support segments with 0 filesz and misaligned starts
> >> >
> >> > I can confirm that with this commit the issue reproduces, and with it
> >> > reverted it doesn't.
> >> >
> >> > I have prepared a minimal reproduction of the problem available here,
> >> > along with all of the scripts I used for bisecting:
> >> >
> >> > https://github.com/koute/linux-elf-loading-bug
> >> >
> >> > You can either compile it from source (requires Rust and LLD), or there's
> >> > a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
> >> > so you can easily check with 'objdump -d' that it isn't malicious).
> >> >
> >> > On old kernels this will run fine, and on new kernels it will
> >> > segfault.
> >> 
> >> Frankly your ELF binary is buggy, and probably the best fix would be to
> >> fix the linker script that is used to generate your binary.
> >> 
> >> The problem is the SYSV ABI defines everything in terms of pages and so
> >> placing two ELF segments on the same page results in undefined behavior.
> >> 
> >> The code was fixed to honor your .bss segment and now your .data segment
> >> is being stomped, because you defined them to overlap.
> >> 
> >> Ideally your linker script would place both your .data and .bss in
> >> the same segment.  That would both fix the issue and give you a more
> >> compact elf binary, while not changing the generated code at all.
> >> 
> >> 
> >> That said regressions suck and it would be good if we could update the
> >> code to do something reasonable in this case.
> >> 
> >> We can perhaps we can update the .bss segment to just memset an existing
> >> page if one has already been mapped.  Which would cleanly handle a case
> >> like yours.  I need to think about that for a moment to see what the
> >> code would look like to do that.
> >
> > It's the "if one has already been mapped" part which might
> > become expensive...
> 
> I am wondering if perhaps we can add MAP_FIXED_NOREPLACE and take
> some appropriate action if there is already a mapping there.

Yeah, in the general case we had to back out MAP_FIXED_NOREPLACE usage
for individual LOADs because there were so many cases of overlapping
LOADs. :( Currently it's only used during the initial mapping (when
"total_size" is set), to avoid colliding with the stack.

But, as you suggest, if we only use it for filesz==0, it could work.

> Such as printing a warning and skipping the action entirely for
> a pure bss segment.  That would essentially replicate the previous
> behavior.

Instead of failing, perhaps we just fallback to not using
MAP_FIXED_NOREPLACE and do the memset? (And maybe pr_warn_once?)

> At a minimum adding MAP_FIXED_NOREPLACE should allow us to
> deterministically detect and warn about problems, making it easier
> for people to understand why their binary won't run.

Yeah, it seems like it's the vm_brk_flags() that is clobber the mapping,
so we have to skip that for the MAP_FIXED_NOREPLACE fails on a filesz==0
case?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 12:01 Recent-ish changes in binfmt_elf made my program segfault Jan Bujak
  2024-01-22 14:54 ` Pedro Falcato
  2024-01-22 16:43 ` Eric W. Biederman
@ 2024-01-24  6:59 ` Linux regression tracking #adding (Thorsten Leemhuis)
  2 siblings, 0 replies; 18+ messages in thread
From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2024-01-24  6:59 UTC (permalink / raw)
  To: Linux kernel regressions list; +Cc: linux-mm, linux-kernel, linux-fsdevel

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 22.01.24 13:01, Jan Bujak wrote:
> 
> I recently updated my kernel and one of my programs started segfaulting.
> 
> [...]
> I bisected the issue to the following commit:
> 
> commit 585a018627b4d7ed37387211f667916840b5c5ea
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Thu Sep 28 20:24:29 2023 -0700
> 
>     binfmt_elf: Support segments with 0 filesz and misaligned starts
> [...]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 585a018627b4d7ed37387211f667916840b5c5e
#regzbot title binfmt_elf: programs started segfaulting
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 22:12       ` Kees Cook
@ 2024-02-01 10:47         ` Linux regression tracking (Thorsten Leemhuis)
  2024-02-04 23:27           ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-02-01 10:47 UTC (permalink / raw)
  To: Kees Cook, Eric W. Biederman
  Cc: Jan Bujak, linux-mm, linux-kernel, viro, brauner, linux-fsdevel,
	Linux kernel regressions list

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Eric, what's the status wrt. to this regression? Things from here look
stalled, but I might be missing something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 22.01.24 23:12, Kees Cook wrote:
> On Mon, Jan 22, 2024 at 03:01:06PM -0600, Eric W. Biederman wrote:
>> Kees Cook <keescook@chromium.org> writes:
>>
>>> On Mon, Jan 22, 2024 at 10:43:59AM -0600, Eric W. Biederman wrote:
>>>> Jan Bujak <j@exia.io> writes:
>>>>
>>>>> Hi.
>>>>>
>>>>> I recently updated my kernel and one of my programs started segfaulting.
>>>>>
>>>>> The issue seems to be related to how the kernel interprets PT_LOAD headers;
>>>>> consider the following program headers (from 'readelf' of my reproduction):
>>>>>
>>>>> Program Headers:
>>>>>   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
>>>>>   LOAD  0x001000 0x10000   0x10000   0x000010 0x000010 R   0x1000
>>>>>   LOAD  0x002000 0x11000   0x11000   0x000010 0x000010 RW  0x1000
>>>>>   LOAD  0x002010 0x11010   0x11010   0x000000 0x000004 RW  0x1000
>>>>>   LOAD  0x003000 0x12000   0x12000   0x0000d2 0x0000d2 R E 0x1000
>>>>>   LOAD  0x004000 0x20000   0x20000   0x000004 0x000004 RW  0x1000
>>>>>
>>>>> Old kernels load this ELF file in the following way ('/proc/self/maps'):
>>>>>
>>>>> 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
>>>>> 00011000-00012000 rw-p 00002000 00:02 131  ./bug-reproduction
>>>>> 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
>>>>> 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>>>>>
>>>>> And new kernels do it like this:
>>>>>
>>>>> 00010000-00011000 r--p 00001000 00:02 131  ./bug-reproduction
>>>>> 00011000-00012000 rw-p 00000000 00:00 0
>>>>> 00012000-00013000 r-xp 00003000 00:02 131  ./bug-reproduction
>>>>> 00020000-00021000 rw-p 00004000 00:02 131  ./bug-reproduction
>>>>>
>>>>> That map between 0x11000 and 0x12000 is the program's '.data' and '.bss'
>>>>> sections to which it tries to write to, and since the kernel doesn't map
>>>>> them anymore it crashes.
>>>>>
>>>>> I bisected the issue to the following commit:
>>>>>
>>>>> commit 585a018627b4d7ed37387211f667916840b5c5ea
>>>>> Author: Eric W. Biederman <ebiederm@xmission.com>
>>>>> Date:   Thu Sep 28 20:24:29 2023 -0700
>>>>>
>>>>>     binfmt_elf: Support segments with 0 filesz and misaligned starts
>>>>>
>>>>> I can confirm that with this commit the issue reproduces, and with it
>>>>> reverted it doesn't.
>>>>>
>>>>> I have prepared a minimal reproduction of the problem available here,
>>>>> along with all of the scripts I used for bisecting:
>>>>>
>>>>> https://github.com/koute/linux-elf-loading-bug
>>>>>
>>>>> You can either compile it from source (requires Rust and LLD), or there's
>>>>> a prebuilt binary in 'bin/bug-reproduction` which you can run. (It's tiny,
>>>>> so you can easily check with 'objdump -d' that it isn't malicious).
>>>>>
>>>>> On old kernels this will run fine, and on new kernels it will
>>>>> segfault.
>>>>
>>>> Frankly your ELF binary is buggy, and probably the best fix would be to
>>>> fix the linker script that is used to generate your binary.
>>>>
>>>> The problem is the SYSV ABI defines everything in terms of pages and so
>>>> placing two ELF segments on the same page results in undefined behavior.
>>>>
>>>> The code was fixed to honor your .bss segment and now your .data segment
>>>> is being stomped, because you defined them to overlap.
>>>>
>>>> Ideally your linker script would place both your .data and .bss in
>>>> the same segment.  That would both fix the issue and give you a more
>>>> compact elf binary, while not changing the generated code at all.
>>>>
>>>>
>>>> That said regressions suck and it would be good if we could update the
>>>> code to do something reasonable in this case.
>>>>
>>>> We can perhaps we can update the .bss segment to just memset an existing
>>>> page if one has already been mapped.  Which would cleanly handle a case
>>>> like yours.  I need to think about that for a moment to see what the
>>>> code would look like to do that.
>>>
>>> It's the "if one has already been mapped" part which might
>>> become expensive...
>>
>> I am wondering if perhaps we can add MAP_FIXED_NOREPLACE and take
>> some appropriate action if there is already a mapping there.
> 
> Yeah, in the general case we had to back out MAP_FIXED_NOREPLACE usage
> for individual LOADs because there were so many cases of overlapping
> LOADs. :( Currently it's only used during the initial mapping (when
> "total_size" is set), to avoid colliding with the stack.
> 
> But, as you suggest, if we only use it for filesz==0, it could work.
> 
>> Such as printing a warning and skipping the action entirely for
>> a pure bss segment.  That would essentially replicate the previous
>> behavior.
> 
> Instead of failing, perhaps we just fallback to not using
> MAP_FIXED_NOREPLACE and do the memset? (And maybe pr_warn_once?)
> 
>> At a minimum adding MAP_FIXED_NOREPLACE should allow us to
>> deterministically detect and warn about problems, making it easier
>> for people to understand why their binary won't run.
> 
> Yeah, it seems like it's the vm_brk_flags() that is clobber the mapping,
> so we have to skip that for the MAP_FIXED_NOREPLACE fails on a filesz==0
> case?
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-02-01 10:47         ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-02-04 23:27           ` Kees Cook
  2024-02-26  5:54             ` Linux regression tracking (Thorsten Leemhuis)
  2024-03-25 15:26             ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 2 replies; 18+ messages in thread
From: Kees Cook @ 2024-02-04 23:27 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Eric W. Biederman, Jan Bujak, linux-mm, linux-kernel, viro,
	brauner, linux-fsdevel

On Thu, Feb 01, 2024 at 11:47:02AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
> 
> Eric, what's the status wrt. to this regression? Things from here look
> stalled, but I might be missing something.
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

If Eric doesn't beat me to it, I'm hoping to look at this more this
coming week.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-02-04 23:27           ` Kees Cook
@ 2024-02-26  5:54             ` Linux regression tracking (Thorsten Leemhuis)
  2024-03-25 15:26             ` Linux regression tracking (Thorsten Leemhuis)
  1 sibling, 0 replies; 18+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-02-26  5:54 UTC (permalink / raw)
  To: Kees Cook, Linux regressions mailing list
  Cc: Eric W. Biederman, Jan Bujak, linux-mm, linux-kernel, viro,
	brauner, linux-fsdevel

On 05.02.24 00:27, Kees Cook wrote:
> On Thu, Feb 01, 2024 at 11:47:02AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>> for once, to make this easily accessible to everyone.
>>
>> Eric, what's the status wrt. to this regression? Things from here look
>> stalled, but I might be missing something.
>>
> If Eric doesn't beat me to it, I'm hoping to look at this more this
> coming week.

Friendly reminder that this regression is still out there, as it seems
things stalled. Or was there some progress or even a fix? Then please
tell me!

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-01-22 15:23   ` Jan Bujak
@ 2024-02-27  2:23     ` Kees Cook
  2024-02-27 15:35       ` Eric W. Biederman
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2024-02-27  2:23 UTC (permalink / raw)
  To: Jan Bujak
  Cc: Pedro Falcato, ebiederm, linux-mm, linux-kernel, viro, brauner,
	linux-fsdevel

On Tue, Jan 23, 2024 at 12:23:27AM +0900, Jan Bujak wrote:
> On 1/22/24 23:54, Pedro Falcato wrote:
> > Hi!
> > 
> > Where did you get that linker script?
> > 
> > FWIW, I catched this possible issue in review, and this was already
> > discussed (see my email and Eric's reply):
> > https://lore.kernel.org/all/CAKbZUD3E2if8Sncy+M2YKncc_Zh08-86W6U5wR0ZMazShxbHHA@mail.gmail.com/
> > 
> > This was my original testcase
> > (https://github.com/heatd/elf-bug-questionmark), which convinced the
> > loader to map .data over a cleared .bss. Your bug seems similar, but
> > does the inverse: maps .bss over .data.
> > 
> 
> I wrote the linker script myself from scratch.

Do you still need this addressed, or have you been able to adjust the
linker script? (I ask to try to assess the priority of needing to fix
this behavior change...)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-02-27  2:23     ` Kees Cook
@ 2024-02-27 15:35       ` Eric W. Biederman
  2024-02-27 17:22         ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Eric W. Biederman @ 2024-02-27 15:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jan Bujak, Pedro Falcato, linux-mm, linux-kernel, viro, brauner,
	linux-fsdevel

Kees Cook <keescook@chromium.org> writes:

> On Tue, Jan 23, 2024 at 12:23:27AM +0900, Jan Bujak wrote:
>> On 1/22/24 23:54, Pedro Falcato wrote:
>> > Hi!
>> > 
>> > Where did you get that linker script?
>> > 
>> > FWIW, I catched this possible issue in review, and this was already
>> > discussed (see my email and Eric's reply):
>> > https://lore.kernel.org/all/CAKbZUD3E2if8Sncy+M2YKncc_Zh08-86W6U5wR0ZMazShxbHHA@mail.gmail.com/
>> > 
>> > This was my original testcase
>> > (https://github.com/heatd/elf-bug-questionmark), which convinced the
>> > loader to map .data over a cleared .bss. Your bug seems similar, but
>> > does the inverse: maps .bss over .data.
>> > 
>> 
>> I wrote the linker script myself from scratch.
>
> Do you still need this addressed, or have you been able to adjust the
> linker script? (I ask to try to assess the priority of needing to fix
> this behavior change...)

Kees, I haven't had a chance to test this yet but it occurred to me
that there is an easy way to handle this.  In our in-memory copy
of the elf program headers we can just merge the two segments
together.

I believe the diff below accomplishes that, and should fix issue.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 5397b552fbeb..01df7dd1f3b4 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -924,6 +926,31 @@ static int load_elf_binary(struct linux_binprm *bprm)
 	elf_ppnt = elf_phdata;
 	for (i = 0; i < elf_ex->e_phnum; i++, elf_ppnt++)
 		switch (elf_ppnt->p_type) {
+		case PT_LOAD:
+		{
+			/*
+			 * Historically linux ignored all but the
+			 * final .bss segment.  Now that linux honors
+			 * all .bss segments, a .bss segment that
+			 * logically is not overlapping but is
+			 * overlapping when it's edges are rounded up
+			 * to page size causes programs to fail.
+			 *
+			 * Handle that case by merging .bss segments
+			 * into the segment they follow.
+			 */
+			if (((i + 1) >= elf_ex->e_phnum) ||
+			    (elf_ppnt[1].p_type != PT_LOAD) ||
+			    (elf_ppnt[1].p_filesz != 0))
+				continue;
+			unsigned long end =
+				elf_ppnt[0].p_vaddr + elf_ppnt[0].p_memsz;
+			if (elf_ppnt[1].p_vaddr != end)
+				continue;
+			elf_ppnt[0].p_memsz += elf_ppnt[1].p_memsz;
+			elf_ppnt[1].p_type = PT_NULL;
+			break;
+		}
 		case PT_GNU_STACK:
 			if (elf_ppnt->p_flags & PF_X)
 				executable_stack = EXSTACK_ENABLE_X;

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-02-27 15:35       ` Eric W. Biederman
@ 2024-02-27 17:22         ` Kees Cook
  2024-02-27 20:59           ` Eric W. Biederman
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2024-02-27 17:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jan Bujak, Pedro Falcato, linux-mm, linux-kernel, viro, brauner,
	linux-fsdevel

On Tue, Feb 27, 2024 at 09:35:39AM -0600, Eric W. Biederman wrote:
> Kees Cook <keescook@chromium.org> writes:
> 
> > On Tue, Jan 23, 2024 at 12:23:27AM +0900, Jan Bujak wrote:
> >> On 1/22/24 23:54, Pedro Falcato wrote:
> >> > Hi!
> >> > 
> >> > Where did you get that linker script?
> >> > 
> >> > FWIW, I catched this possible issue in review, and this was already
> >> > discussed (see my email and Eric's reply):
> >> > https://lore.kernel.org/all/CAKbZUD3E2if8Sncy+M2YKncc_Zh08-86W6U5wR0ZMazShxbHHA@mail.gmail.com/
> >> > 
> >> > This was my original testcase
> >> > (https://github.com/heatd/elf-bug-questionmark), which convinced the
> >> > loader to map .data over a cleared .bss. Your bug seems similar, but
> >> > does the inverse: maps .bss over .data.
> >> > 
> >> 
> >> I wrote the linker script myself from scratch.
> >
> > Do you still need this addressed, or have you been able to adjust the
> > linker script? (I ask to try to assess the priority of needing to fix
> > this behavior change...)
> 
> Kees, I haven't had a chance to test this yet but it occurred to me
> that there is an easy way to handle this.  In our in-memory copy
> of the elf program headers we can just merge the two segments
> together.
> 
> I believe the diff below accomplishes that, and should fix issue.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 5397b552fbeb..01df7dd1f3b4 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -924,6 +926,31 @@ static int load_elf_binary(struct linux_binprm *bprm)
>  	elf_ppnt = elf_phdata;
>  	for (i = 0; i < elf_ex->e_phnum; i++, elf_ppnt++)
>  		switch (elf_ppnt->p_type) {
> +		case PT_LOAD:
> +		{
> +			/*
> +			 * Historically linux ignored all but the
> +			 * final .bss segment.  Now that linux honors
> +			 * all .bss segments, a .bss segment that
> +			 * logically is not overlapping but is
> +			 * overlapping when it's edges are rounded up
> +			 * to page size causes programs to fail.
> +			 *
> +			 * Handle that case by merging .bss segments
> +			 * into the segment they follow.
> +			 */
> +			if (((i + 1) >= elf_ex->e_phnum) ||
> +			    (elf_ppnt[1].p_type != PT_LOAD) ||
> +			    (elf_ppnt[1].p_filesz != 0))
> +				continue;
> +			unsigned long end =
> +				elf_ppnt[0].p_vaddr + elf_ppnt[0].p_memsz;
> +			if (elf_ppnt[1].p_vaddr != end)
> +				continue;
> +			elf_ppnt[0].p_memsz += elf_ppnt[1].p_memsz;
> +			elf_ppnt[1].p_type = PT_NULL;
> +			break;
> +		}
>  		case PT_GNU_STACK:
>  			if (elf_ppnt->p_flags & PF_X)
>  				executable_stack = EXSTACK_ENABLE_X;

I don't think this is safe -- it isn't looking at flags, etc. e.g.,
something like this could break:

  Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
  LOAD  0x003000 0x12000   0x12000   0x001000 0x001000 R E 0x1000
  LOAD  0x004000 0x13000   0x13000   0x000000 0x001000 RW  0x1000

Hmm

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-02-27 17:22         ` Kees Cook
@ 2024-02-27 20:59           ` Eric W. Biederman
  0 siblings, 0 replies; 18+ messages in thread
From: Eric W. Biederman @ 2024-02-27 20:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jan Bujak, Pedro Falcato, linux-mm, linux-kernel, viro, brauner,
	linux-fsdevel

Kees Cook <keescook@chromium.org> writes:

> On Tue, Feb 27, 2024 at 09:35:39AM -0600, Eric W. Biederman wrote:
>> Kees Cook <keescook@chromium.org> writes:
>> 
>> > On Tue, Jan 23, 2024 at 12:23:27AM +0900, Jan Bujak wrote:
>> >> On 1/22/24 23:54, Pedro Falcato wrote:
>> >> > Hi!
>> >> > 
>> >> > Where did you get that linker script?
>> >> > 
>> >> > FWIW, I catched this possible issue in review, and this was already
>> >> > discussed (see my email and Eric's reply):
>> >> > https://lore.kernel.org/all/CAKbZUD3E2if8Sncy+M2YKncc_Zh08-86W6U5wR0ZMazShxbHHA@mail.gmail.com/
>> >> > 
>> >> > This was my original testcase
>> >> > (https://github.com/heatd/elf-bug-questionmark), which convinced the
>> >> > loader to map .data over a cleared .bss. Your bug seems similar, but
>> >> > does the inverse: maps .bss over .data.
>> >> > 
>> >> 
>> >> I wrote the linker script myself from scratch.
>> >
>> > Do you still need this addressed, or have you been able to adjust the
>> > linker script? (I ask to try to assess the priority of needing to fix
>> > this behavior change...)
>> 
>> Kees, I haven't had a chance to test this yet but it occurred to me
>> that there is an easy way to handle this.  In our in-memory copy
>> of the elf program headers we can just merge the two segments
>> together.
>> 
>> I believe the diff below accomplishes that, and should fix issue.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> 
>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>> index 5397b552fbeb..01df7dd1f3b4 100644
>> --- a/fs/binfmt_elf.c
>> +++ b/fs/binfmt_elf.c
>> @@ -924,6 +926,31 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>  	elf_ppnt = elf_phdata;
>>  	for (i = 0; i < elf_ex->e_phnum; i++, elf_ppnt++)
>>  		switch (elf_ppnt->p_type) {
>> +		case PT_LOAD:
>> +		{
>> +			/*
>> +			 * Historically linux ignored all but the
>> +			 * final .bss segment.  Now that linux honors
>> +			 * all .bss segments, a .bss segment that
>> +			 * logically is not overlapping but is
>> +			 * overlapping when it's edges are rounded up
>> +			 * to page size causes programs to fail.
>> +			 *
>> +			 * Handle that case by merging .bss segments
>> +			 * into the segment they follow.
>> +			 */
>> +			if (((i + 1) >= elf_ex->e_phnum) ||
>> +			    (elf_ppnt[1].p_type != PT_LOAD) ||
>> +			    (elf_ppnt[1].p_filesz != 0))
>> +				continue;
>> +			unsigned long end =
>> +				elf_ppnt[0].p_vaddr + elf_ppnt[0].p_memsz;
>> +			if (elf_ppnt[1].p_vaddr != end)
>> +				continue;
>> +			elf_ppnt[0].p_memsz += elf_ppnt[1].p_memsz;
>> +			elf_ppnt[1].p_type = PT_NULL;
>> +			break;
>> +		}
>>  		case PT_GNU_STACK:
>>  			if (elf_ppnt->p_flags & PF_X)
>>  				executable_stack = EXSTACK_ENABLE_X;
>
> I don't think this is safe -- it isn't looking at flags, etc. e.g.,
> something like this could break:
>
>   Type  Offset   VirtAddr  PhysAddr  FileSiz  MemSiz   Flg Align
>   LOAD  0x003000 0x12000   0x12000   0x001000 0x001000 R E 0x1000
>   LOAD  0x004000 0x13000   0x13000   0x000000 0x001000 RW  0x1000

Yes.  I think it should be modified to only do something is the break
is not on a page boundary (which will automatically limit it's effect
to where we need to do something for backwards compatibility).

Still with a few tweaks and testing I think that is a good path forward
for dealing with the ``regression'' case.

Eric


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-02-04 23:27           ` Kees Cook
  2024-02-26  5:54             ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-03-25 15:26             ` Linux regression tracking (Thorsten Leemhuis)
  2024-03-25 16:56               ` Kees Cook
  1 sibling, 1 reply; 18+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-03-25 15:26 UTC (permalink / raw)
  To: Kees Cook, Linux regressions mailing list
  Cc: Eric W. Biederman, Jan Bujak, linux-mm, linux-kernel, viro,
	brauner, linux-fsdevel

On 05.02.24 00:27, Kees Cook wrote:
> On Thu, Feb 01, 2024 at 11:47:02AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>> for once, to make this easily accessible to everyone.
>>
>> Eric, what's the status wrt. to this regression? Things from here look
>> stalled, but I might be missing something.
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> 
> If Eric doesn't beat me to it, I'm hoping to look at this more this
> coming week.

Friendly reminder: that was quite a while ago by now and it seems
neither you nor Eric looked into this. Or was there some progress and I
just missed it?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-03-25 15:26             ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-03-25 16:56               ` Kees Cook
  2024-03-25 17:08                 ` Thorsten Leemhuis
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2024-03-25 16:56 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Eric W. Biederman, Jan Bujak, linux-mm, linux-kernel, viro,
	brauner, linux-fsdevel

On Mon, Mar 25, 2024 at 04:26:56PM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 05.02.24 00:27, Kees Cook wrote:
> > On Thu, Feb 01, 2024 at 11:47:02AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> >> for once, to make this easily accessible to everyone.
> >>
> >> Eric, what's the status wrt. to this regression? Things from here look
> >> stalled, but I might be missing something.
> >>
> >> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> > 
> > If Eric doesn't beat me to it, I'm hoping to look at this more this
> > coming week.
> 
> Friendly reminder: that was quite a while ago by now and it seems
> neither you nor Eric looked into this. Or was there some progress and I
> just missed it?

The original reporter hasn't responded to questions and no one else has
mentioned this issue, so I think we can remove this from the tracker.

#regzbot resolve: regression is not visible without manually constructing broken ELF headers


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Recent-ish changes in binfmt_elf made my program segfault
  2024-03-25 16:56               ` Kees Cook
@ 2024-03-25 17:08                 ` Thorsten Leemhuis
  0 siblings, 0 replies; 18+ messages in thread
From: Thorsten Leemhuis @ 2024-03-25 17:08 UTC (permalink / raw)
  To: Kees Cook, Linux regressions mailing list
  Cc: Eric W. Biederman, Jan Bujak, linux-mm, linux-kernel, viro,
	brauner, linux-fsdevel

On 25.03.24 17:56, Kees Cook wrote:

> The original reporter hasn't responded to questions and no one else has
> mentioned this issue, so I think we can remove this from the tracker.
> 
> #regzbot resolve: regression is not visible without manually constructing broken ELF headers

Ahh, that's how it is sometimes, thx for the update! Ciao Thorsten

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-03-25 17:09 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-22 12:01 Recent-ish changes in binfmt_elf made my program segfault Jan Bujak
2024-01-22 14:54 ` Pedro Falcato
2024-01-22 15:23   ` Jan Bujak
2024-02-27  2:23     ` Kees Cook
2024-02-27 15:35       ` Eric W. Biederman
2024-02-27 17:22         ` Kees Cook
2024-02-27 20:59           ` Eric W. Biederman
2024-01-22 16:43 ` Eric W. Biederman
2024-01-22 20:48   ` Kees Cook
2024-01-22 21:01     ` Eric W. Biederman
2024-01-22 22:12       ` Kees Cook
2024-02-01 10:47         ` Linux regression tracking (Thorsten Leemhuis)
2024-02-04 23:27           ` Kees Cook
2024-02-26  5:54             ` Linux regression tracking (Thorsten Leemhuis)
2024-03-25 15:26             ` Linux regression tracking (Thorsten Leemhuis)
2024-03-25 16:56               ` Kees Cook
2024-03-25 17:08                 ` Thorsten Leemhuis
2024-01-24  6:59 ` Linux regression tracking #adding (Thorsten Leemhuis)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).