regressions.lists.linux.dev archive mirror
* Bug 215720 - brk() regression on AArch64 on static-pie binary -- issue with ASLR and a guard page?
@ 2022-03-28 13:21 Thorsten Leemhuis
  2022-04-09 11:49 ` Thorsten Leemhuis
From: Thorsten Leemhuis @ 2022-03-28 13:21 UTC
  To: H.J. Lu; +Cc: regressions, Linux Kernel Mailing List, Victor Stinner

Hi, this is your Linux kernel regression tracker.

I noticed a regression report in bugzilla.kernel.org that, afaics, nobody
has acted upon since it was reported about a week ago. That's why I decided
to forward it to the lists and the author of the culprit. To quote from
https://bugzilla.kernel.org/show_bug.cgi?id=215720:

>  Victor Stinner 2022-03-22 02:24:57 UTC
> 
> Created attachment 300597 [details]
> empty.c reproducer
> 
> I found a brk() syscall regression of Linux kernel 5.17 on AArch64.
> 
> A git bisect found the change "fs/binfmt_elf: use PT_LOAD p_align values for static PIE": commit 9630f0d60fec5fbcaa4435a66f75df1dc9704b66, a change related to bz#215275.
> 
> Program to reproduce the bug, empty.c (attached to the issue):
> ---
> _Thread_local int var1 = 0;
> int main() {
>     volatile int x = 1;
>     var1 = x;
>     return 0;
> }
> ---
> 
> Build the program as a static PIE program:
> 
>     gcc -std=c11 -static-pie -g empty.c -o empty -O2
> 
> The program fails randomly; it takes 100 to 6000 runs to reproduce the crash.
> 
> Short shell loop to reproduce the crash:
> ---
> $ i=0; while true; do ./empty; rc=$?; i=$(($i + 1)); echo "$i: $(date): $rc"; if [ $rc -ne 0 ]; then break; fi; done
> (...)
> 159: Tue Mar 22 01:54:22 CET 2022: 0
> 160: Tue Mar 22 01:54:22 CET 2022: 0
> Segmentation fault (core dumped)
> 161: Tue Mar 22 01:54:22 CET 2022: 139
> ---
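
For readers who prefer a small harness over the shell loop quoted above,
here is a minimal sketch in C. It is not part of the bug report, and the
function name first_failing_run() is made up for illustration. It reruns
a program until one run fails and reports whether the failure was a
signal, which is what the shell's exit status 139 (128 + 11, SIGSEGV)
encodes.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `prog` (no arguments) up to max_runs times.
 * Returns the 1-based index of the first run that did not exit with
 * status 0, 0 if every run succeeded, or -1 if fork()/waitpid() failed.
 */
int first_failing_run(const char *prog, int max_runs)
{
    for (int i = 1; i <= max_runs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: replace ourselves with the program under test. */
            execlp(prog, prog, (char *)NULL);
            _exit(127); /* only reached if exec failed */
        }

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;

        if (WIFSIGNALED(status)) {
            /* A shell would report this case as 128 + signal number. */
            fprintf(stderr, "run %d: killed by signal %d\n",
                    i, WTERMSIG(status));
            return i;
        }
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return i;
    }
    return 0;
}
```

Calling first_failing_run("./empty", 6000) would roughly mirror the loop
above, given the 100 to 6000 runs the report mentions.
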
> 
> Disabling ASLR (write 0 to /proc/sys/kernel/randomize_va_space) works
> around the bug.
> 
> Instead of the "empty.c" program, the "ldconfig -V > /dev/null" command can be used, since ldconfig is a standard static-pie program.
> 
> strace when the program works:
> ---
> brk(NULL)                               = 0xaaaac3961000
> brk(0xaaaac3961b78)                     = 0xaaaac3961b78
> ---
> 
> strace when the bug occurs:
> ---
> brk(NULL)                               = 0xaaaabf3c3000
> brk(0xaaaabf3c3b78)                     = 0xaaaabf3c3000
> ---
> 
> The following test of the brk() syscall fails when the bug occurs:
> ---
> 	/* Check against existing mmap mappings. */
> 	next = find_vma(mm, oldbrk);
> 	if (next && newbrk + PAGE_SIZE > vm_start_gap(next))
> 		goto out;
> ---
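
To make the arithmetic of that check concrete, the following user-space
model in C replays it with the addresses from the failing strace. This is
only a sketch under two assumptions that the report does not state: pages
are 4 KiB, and the kernel rounds the requested break up to a page
boundary before the comparison. The parameter next_start_gap stands in
for vm_start_gap(next), and all names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, consistent with the 0x...000 break addresses. */
#define MODEL_PAGE_SIZE 0x1000ULL

/* Round an address up to the next page boundary. */
uint64_t model_page_align(uint64_t addr)
{
    return (addr + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Model of the quoted test: the request is rejected when a following
 * mapping exists and the page-aligned new break plus one extra page
 * would reach past the (gap-adjusted) start of that mapping.
 */
bool brk_rejected(uint64_t requested_brk, bool have_next,
                  uint64_t next_start_gap)
{
    uint64_t newbrk = model_page_align(requested_brk);

    return have_next && newbrk + MODEL_PAGE_SIZE > next_start_gap;
}
```

Under these assumptions, brk_rejected(0xaaaabf3c3b78, true, start) is
true for any following mapping whose gap-adjusted start lies below
0xaaaabf3c5000, that is, within one page of the rounded-up break
0xaaaabf3c4000.
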
> 
> Note: When the bug occurs, the program crashes with SIGSEGV: the glibc __libc_setup_tls() function calls sbrk(2936) to allocate TLS variables, but it doesn't handle the memory allocation failure.
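
For comparison, a caller that does check the result could look like the
sketch below. It relies only on the documented sbrk() contract (it
returns (void *)-1 and sets errno to ENOMEM on failure); checked_sbrk()
is a hypothetical helper and not glibc code. The value 2936 in the
report is 0xb78, the offset seen in the brk() calls of the strace
output.

```c
#define _DEFAULT_SOURCE 1
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: grow the program break by `increment` bytes.
 * Returns the previous break on success, or NULL on failure with
 * errno left as set by sbrk() (ENOMEM).
 */
void *checked_sbrk(intptr_t increment)
{
    void *prev = sbrk(increment);

    if (prev == (void *)-1)
        return NULL; /* the break was not moved */
    return prev;
}
```

A caller would test the return value against NULL before touching the
new memory, instead of assuming that the break was moved.
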
> 
> Note: I originally discovered this kernel regression while checking Python
> buildbot failures on our Fedora Rawhide AArch64 machine.
> 
> * Fedora downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2066147
> * Python issue: https://bugs.python.org/issue47078
> 
> [reply] [−] Comment 1 Victor Stinner 2022-03-22 02:41:00 UTC
> 
> See also the binutils issue: "p_align in ELF program headers should not exceed section alignment"
> https://sourceware.org/bugzilla/show_bug.cgi?id=28689
> 
> See also this old (kernel 4.18) fixed x86-64 kernel bug: "kernel: brk can grow the heap into the area reserved for the stack"
> https://bugzilla.redhat.com/show_bug.cgi?id=1749633


Could somebody take a look into this? Or was this discussed somewhere
else already? Or even fixed?

Anyway, to get this tracked:

#regzbot introduced: 9630f0d60fec5fbcaa4435a66f75df1dc9704b66
#regzbot from: Victor Stinner <vstinner@redhat.com>
#regzbot title: brk() regression on AArch64 on static-pie binary
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215720

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

-- 
Additional information about regzbot:

If you want to know more about regzbot, check out its web-interface, the
getting started guide, and the reference documentation:

https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md

The last two documents will explain how you can interact with regzbot
yourself if you want to.

Hint for reporters: when reporting a regression it's in your interest to
CC the regression list and tell regzbot about the issue, as that ensures
the regression makes it onto the radar of the Linux kernel's regression
tracker -- that's in your interest, as it ensures your report won't fall
through the cracks unnoticed.

Hint for developers: you normally don't need to care about regzbot once
it's involved. Fix the issue as you normally would, just remember to
include 'Link:' tags in the patch description pointing to all reports
about the issue. This has been expected from developers even before
regzbot showed up for reasons explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'.


* Re: Bug 215720 - brk() regression on AArch64 on static-pie binary -- issue with ASLR and a guard page?
  2022-03-28 13:21 Bug 215720 - brk() regression on AArch64 on static-pie binary -- issue with ASLR and a guard page? Thorsten Leemhuis
@ 2022-04-09 11:49 ` Thorsten Leemhuis
  2022-04-16  4:41   ` Bug 215720 - brk() regression on AArch64 on static-pie binary -- issue with ASLR and a guard page? #forregzbot Thorsten Leemhuis
From: Thorsten Leemhuis @ 2022-04-09 11:49 UTC
  To: H.J. Lu
  Cc: regressions, Linux Kernel Mailing List, Victor Stinner, Mike Rapoport

Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

Hey, what's up here? Or was this regression fixed already?

H.J. Lu: reminder, this is caused by a patch of yours.

Mike, if you have a minute: '925346c129da' ("fs/binfmt_elf: fix PT_LOAD
p_align values for loaders") in 'next' contains a 'Fixes:' tag for the
culprit of this regression, but I assume it fixes a different issue?

Ciao, Thorsten

#regzbot poke



* Re: Bug 215720 - brk() regression on AArch64 on static-pie binary -- issue with ASLR and a guard page? #forregzbot
  2022-04-09 11:49 ` Thorsten Leemhuis
@ 2022-04-16  4:41   ` Thorsten Leemhuis
From: Thorsten Leemhuis @ 2022-04-16  4:41 UTC
  To: regressions; +Cc: Linux Kernel Mailing List

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

#regzbot fixed-by: aeb7923733d100

