* Bisected stability regression in 6.6
@ 2023-11-11  6:31 matoro
  2023-11-11  7:02 ` Bagas Sanjaya
  2023-11-11 21:21 ` Helge Deller
  0 siblings, 2 replies; 12+ messages in thread
From: matoro @ 2023-11-11  6:31 UTC (permalink / raw)
  To: linux-parisc, deller, Linux Kernel Mailing List, Sam James

Hi Helge, I have bisected a regression in 6.6 which is causing userspace 
segfaults at a significantly increased rate.  There seems to be 
a pathological case triggered by the ninja build tool.  The test case I have 
been using is cmake with ninja backend to attempt to build the nghttp2 
package.  In 6.6, this segfaults, not at the same location every time, but 
with enough reliability that I was able to use it as a bisection regression 
case, including immediately after a reboot.  In the kernel log, these show up 
as "trap #15: Data TLB miss fault" messages.  Now these messages can and do 
show up in 6.5 causing segfaults, but never immediately after a reboot and 
infrequently enough that the system is stable.  With kernel 6.6 I am 
completely unable to build nghttp2 under any circumstances.

I have bisected this down to the following commit:

$ git bisect good
3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
commit 3033cd4307681c60db6d08f398a64484b36e0b0f
Author: Helge Deller <deller@gmx.de>
Date:   Sat Aug 19 00:53:28 2023 +0200

     parisc: Use generic mmap top-down layout and brk randomization

     parisc uses a top-down layout by default that exactly fits the generic
     functions, so get rid of arch specific code and use the generic version
     by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.

     Note that on parisc the stack always grows up and a "unlimited stack"
     simply means that the value as defined in CONFIG_STACK_MAX_DEFAULT_SIZE_MB
     should be used. So RLIM_INFINITY is not an indicator to use the legacy
     memory layout.

     Signed-off-by: Helge Deller <deller@gmx.de>

  arch/parisc/Kconfig             | 17 +++++++++++++
  arch/parisc/kernel/process.c    | 14 -----------
  arch/parisc/kernel/sys_parisc.c | 54 +----------------------------------------
  mm/util.c                       |  5 +++-
  4 files changed, 22 insertions(+), 68 deletions(-)

I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe ("parisc: Add 
nop instructions after TLB inserts") on top of 6.6, but it does NOT fix the 
issue.

Let me know if there is anything I can answer on this.  I can provide full 
remote access with BMC if it would help.


* Re: Bisected stability regression in 6.6
  2023-11-11  6:31 Bisected stability regression in 6.6 matoro
@ 2023-11-11  7:02 ` Bagas Sanjaya
  2023-11-22  9:07   ` Linux regression tracking #update (Thorsten Leemhuis)
  2023-11-11 21:21 ` Helge Deller
  1 sibling, 1 reply; 12+ messages in thread
From: Bagas Sanjaya @ 2023-11-11  7:02 UTC (permalink / raw)
  To: matoro, Linux PA-RISC Mailing List, Helge Deller,
	Linux Kernel Mailing List, Sam James,
	Linux Memory Management List, Linux Regressions
  Cc: James E.J. Bottomley, Andrew Morton, Peter Zijlstra,
	Rafael J. Wysocki, Gautham R. Shenoy, Josh Poimboeuf,
	Thomas Gleixner, Jens Axboe, John David Anglin


On Sat, Nov 11, 2023 at 01:31:01AM -0500, matoro wrote:
> Hi Helge, I have bisected a regression in 6.6 which is causing userspace
> segfaults at a significantly increased rate in kernel 6.6.  There seems to
> be a pathological case triggered by the ninja build tool.  The test case I
> have been using is cmake with ninja backend to attempt to build the nghttp2
> package.  In 6.6, this segfaults, not at the same location every time, but
> with enough reliability that I was able to use it as a bisection regression
> case, including immediately after a reboot.  In the kernel log, these show
> up as "trap #15: Data TLB miss fault" messages.  Now these messages can and
> do show up in 6.5 causing segfaults, but never immediately after a reboot
> and infrequently enough that the system is stable.  With kernel 6.6 I am
> completely unable to build nghttp2 under any circumstances.
> 
> I have bisected this down to the following commit:
> 
> $ git bisect good
> 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
> commit 3033cd4307681c60db6d08f398a64484b36e0b0f
> Author: Helge Deller <deller@gmx.de>
> Date:   Sat Aug 19 00:53:28 2023 +0200
> 
>     parisc: Use generic mmap top-down layout and brk randomization
> 
>     parisc uses a top-down layout by default that exactly fits the generic
>     functions, so get rid of arch specific code and use the generic version
>     by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
> 
>     Note that on parisc the stack always grows up and a "unlimited stack"
>     simply means that the value as defined in
> CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>     should be used. So RLIM_INFINITY is not an indicator to use the legacy
>     memory layout.
> 
>     Signed-off-by: Helge Deller <deller@gmx.de>
> 
>  arch/parisc/Kconfig             | 17 +++++++++++++
>  arch/parisc/kernel/process.c    | 14 -----------
>  arch/parisc/kernel/sys_parisc.c | 54
> +----------------------------------------
>  mm/util.c                       |  5 +++-
>  4 files changed, 22 insertions(+), 68 deletions(-)
> 
> I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe ("parisc: Add
> nop instructions after TLB inserts") on top of 6.6, but it does NOT fix the
> issue.
> 
> Let me know if there is anything I can answer on this.  I can provide full
> remote access with BMC if it would help.

Thanks for the regression report. I'm adding it to regzbot:

#regzbot ^introduced: 3033cd4307681c

-- 
An old man doll... just what I always wanted! - Clara



* Re: Bisected stability regression in 6.6
  2023-11-11  6:31 Bisected stability regression in 6.6 matoro
  2023-11-11  7:02 ` Bagas Sanjaya
@ 2023-11-11 21:21 ` Helge Deller
  2023-11-11 21:27   ` Sam James
  2023-11-11 21:28   ` matoro
  1 sibling, 2 replies; 12+ messages in thread
From: Helge Deller @ 2023-11-11 21:21 UTC (permalink / raw)
  To: matoro, linux-parisc, Linux Kernel Mailing List, Sam James

On 11/11/23 07:31, matoro wrote:
> Hi Helge, I have bisected a regression in 6.6 which is causing
> userspace segfaults at a significantly increased rate in kernel 6.6.
> There seems to be a pathological case triggered by the ninja build
> tool.  The test case I have been using is cmake with ninja backend to
> attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
> the same location every time, but with enough reliability that I was
> able to use it as a bisection regression case, including immediately
> after a reboot.  In the kernel log, these show up as "trap #15: Data
> TLB miss fault" messages.  Now these messages can and do show up in
> 6.5 causing segfaults, but never immediately after a reboot and
> infrequently enough that the system is stable.  With kernel 6.6 I am
> completely unable to build nghttp2 under any circumstances.
>
> I have bisected this down to the following commit:
>
> $ git bisect good
> 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
> commit 3033cd4307681c60db6d08f398a64484b36e0b0f
> Author: Helge Deller <deller@gmx.de>
> Date:   Sat Aug 19 00:53:28 2023 +0200
>
>      parisc: Use generic mmap top-down layout and brk randomization
>
>      parisc uses a top-down layout by default that exactly fits the generic
>      functions, so get rid of arch specific code and use the generic version
>      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
>
>      Note that on parisc the stack always grows up and a "unlimited stack"
>      simply means that the value as defined in CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>      should be used. So RLIM_INFINITY is not an indicator to use the legacy
>      memory layout.
>
>      Signed-off-by: Helge Deller <deller@gmx.de>
>
>   arch/parisc/Kconfig             | 17 +++++++++++++
>   arch/parisc/kernel/process.c    | 14 -----------
>   arch/parisc/kernel/sys_parisc.c | 54 +----------------------------------------
>   mm/util.c                       |  5 +++-
>   4 files changed, 22 insertions(+), 68 deletions(-)

Thanks for your report!
I think it's quite unlikely that this patch introduces such a bad regression.
I'd suspect some other bad commit, but I'll try to reproduce.

In any case, do you have CONFIG_BPF_JIT enabled? If so, could you try
to reproduce with CONFIG_BPF_JIT disabled?
The JIT is quite new in v6.6 and I did face some crashes and disabling
it helped me so far.

> I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe
> ("parisc: Add nop instructions after TLB inserts") on top of 6.6, but
> it does NOT fix the issue.

Ok.

Helge


* Re: Bisected stability regression in 6.6
  2023-11-11 21:21 ` Helge Deller
@ 2023-11-11 21:27   ` Sam James
  2023-11-11 23:33     ` matoro
  2023-11-11 21:28   ` matoro
  1 sibling, 1 reply; 12+ messages in thread
From: Sam James @ 2023-11-11 21:27 UTC (permalink / raw)
  To: Helge Deller; +Cc: matoro, linux-parisc, Linux Kernel Mailing List, Sam James


Helge Deller <deller@gmx.de> writes:

> On 11/11/23 07:31, matoro wrote:
>> Hi Helge, I have bisected a regression in 6.6 which is causing
>> userspace segfaults at a significantly increased rate in kernel 6.6.
>> There seems to be a pathological case triggered by the ninja build
>> tool.  The test case I have been using is cmake with ninja backend to
>> attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
>> the same location every time, but with enough reliability that I was
>> able to use it as a bisection regression case, including immediately
>> after a reboot.  In the kernel log, these show up as "trap #15: Data
>> TLB miss fault" messages.  Now these messages can and do show up in
>> 6.5 causing segfaults, but never immediately after a reboot and
>> infrequently enough that the system is stable.  With kernel 6.6 I am
>> completely unable to build nghttp2 under any circumstances.
>>
>> I have bisected this down to the following commit:
>>
>> $ git bisect good
>> 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
>> commit 3033cd4307681c60db6d08f398a64484b36e0b0f
>> Author: Helge Deller <deller@gmx.de>
>> Date:   Sat Aug 19 00:53:28 2023 +0200
>>
>>      parisc: Use generic mmap top-down layout and brk randomization
>>
>>      parisc uses a top-down layout by default that exactly fits the generic
>>      functions, so get rid of arch specific code and use the generic version
>>      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
>>
>>      Note that on parisc the stack always grows up and a "unlimited stack"
>>      simply means that the value as defined in CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>>      should be used. So RLIM_INFINITY is not an indicator to use the legacy
>>      memory layout.
>>
>>      Signed-off-by: Helge Deller <deller@gmx.de>
>>
>>   arch/parisc/Kconfig             | 17 +++++++++++++
>>   arch/parisc/kernel/process.c    | 14 -----------
>>   arch/parisc/kernel/sys_parisc.c | 54 +----------------------------------------
>>   mm/util.c                       |  5 +++-
>>   4 files changed, 22 insertions(+), 68 deletions(-)
>
> Thanks for your report!
> I think it's quite unlikely that this patch introduces such a bad regression.
> I'd suspect some other bad commit, but I'll try to reproduce.

matoro, does a revert apply cleanly? Does it help?

>
> In any case, do you have CONFIG_BPF_JIT enabled? If so, could you try
> to reproduce with CONFIG_BPF_JIT disabled?
> The JIT is quite new in v6.6 and I did face some crashes and disabling
> it helped me so far.
>
>> I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe
>> ("parisc: Add nop instructions after TLB inserts") on top of 6.6, but
>> it does NOT fix the issue.
>
> Ok.
>
> Helge



* Re: Bisected stability regression in 6.6
  2023-11-11 21:21 ` Helge Deller
  2023-11-11 21:27   ` Sam James
@ 2023-11-11 21:28   ` matoro
  1 sibling, 0 replies; 12+ messages in thread
From: matoro @ 2023-11-11 21:28 UTC (permalink / raw)
  To: Helge Deller; +Cc: linux-parisc, Linux Kernel Mailing List, Sam James

On 2023-11-11 16:21, Helge Deller wrote:
> On 11/11/23 07:31, matoro wrote:
>> Hi Helge, I have bisected a regression in 6.6 which is causing
>> userspace segfaults at a significantly increased rate in kernel 6.6.
>> There seems to be a pathological case triggered by the ninja build
>> tool.  The test case I have been using is cmake with ninja backend to
>> attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
>> the same location every time, but with enough reliability that I was
>> able to use it as a bisection regression case, including immediately
>> after a reboot.  In the kernel log, these show up as "trap #15: Data
>> TLB miss fault" messages.  Now these messages can and do show up in
>> 6.5 causing segfaults, but never immediately after a reboot and
>> infrequently enough that the system is stable.  With kernel 6.6 I am
>> completely unable to build nghttp2 under any circumstances.
>> 
>> I have bisected this down to the following commit:
>> 
>> $ git bisect good
>> 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
>> commit 3033cd4307681c60db6d08f398a64484b36e0b0f
>> Author: Helge Deller <deller@gmx.de>
>> Date:   Sat Aug 19 00:53:28 2023 +0200
>> 
>>      parisc: Use generic mmap top-down layout and brk randomization
>> 
>>      parisc uses a top-down layout by default that exactly fits the generic
>>      functions, so get rid of arch specific code and use the generic 
>> version
>>      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
>> 
>>      Note that on parisc the stack always grows up and a "unlimited stack"
>>      simply means that the value as defined in 
>> CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>>      should be used. So RLIM_INFINITY is not an indicator to use the legacy
>>      memory layout.
>> 
>>      Signed-off-by: Helge Deller <deller@gmx.de>
>> 
>>   arch/parisc/Kconfig             | 17 +++++++++++++
>>   arch/parisc/kernel/process.c    | 14 -----------
>>   arch/parisc/kernel/sys_parisc.c | 54 
>> +----------------------------------------
>>   mm/util.c                       |  5 +++-
>>   4 files changed, 22 insertions(+), 68 deletions(-)
> 
> Thanks for your report!
> I think it's quite unlikely that this patch introduces such a bad 
> regression.
> I'd suspect some other bad commit, but I'll try to reproduce.
> 
> In any case, do you have CONFIG_BPF_JIT enabled? If so, could you try
> to reproduce with CONFIG_BPF_JIT disabled?
> The JIT is quite new in v6.6 and I did face some crashes and disabling
> it helped me so far.
> 
>> I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe
>> ("parisc: Add nop instructions after TLB inserts") on top of 6.6, but
>> it does NOT fix the issue.
> 
> Ok.
> 
> Helge

Nope, I use "make olddefconfig" when upgrading and it appears to be 
default-disabled.

$ grep -i "config_bpf_jit" /usr/src/linux-6.6.0-gentoo/.config
# CONFIG_BPF_JIT is not set


* Re: Bisected stability regression in 6.6
  2023-11-11 21:27   ` Sam James
@ 2023-11-11 23:33     ` matoro
  2023-11-12  1:22       ` Dr. David Alan Gilbert
  2023-11-12 20:22       ` Helge Deller
  0 siblings, 2 replies; 12+ messages in thread
From: matoro @ 2023-11-11 23:33 UTC (permalink / raw)
  To: Sam James; +Cc: Helge Deller, linux-parisc, Linux Kernel Mailing List

On 2023-11-11 16:27, Sam James wrote:
> Helge Deller <deller@gmx.de> writes:
> 
>> On 11/11/23 07:31, matoro wrote:
>>> Hi Helge, I have bisected a regression in 6.6 which is causing
>>> userspace segfaults at a significantly increased rate in kernel 6.6.
>>> There seems to be a pathological case triggered by the ninja build
>>> tool.  The test case I have been using is cmake with ninja backend to
>>> attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
>>> the same location every time, but with enough reliability that I was
>>> able to use it as a bisection regression case, including immediately
>>> after a reboot.  In the kernel log, these show up as "trap #15: Data
>>> TLB miss fault" messages.  Now these messages can and do show up in
>>> 6.5 causing segfaults, but never immediately after a reboot and
>>> infrequently enough that the system is stable.  With kernel 6.6 I am
>>> completely unable to build nghttp2 under any circumstances.
>>> 
>>> I have bisected this down to the following commit:
>>> 
>>> $ git bisect good
>>> 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
>>> commit 3033cd4307681c60db6d08f398a64484b36e0b0f
>>> Author: Helge Deller <deller@gmx.de>
>>> Date:   Sat Aug 19 00:53:28 2023 +0200
>>> 
>>>      parisc: Use generic mmap top-down layout and brk randomization
>>> 
>>>      parisc uses a top-down layout by default that exactly fits the 
>>> generic
>>>      functions, so get rid of arch specific code and use the generic 
>>> version
>>>      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
>>> 
>>>      Note that on parisc the stack always grows up and a "unlimited stack"
>>>      simply means that the value as defined in 
>>> CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>>>      should be used. So RLIM_INFINITY is not an indicator to use the 
>>> legacy
>>>      memory layout.
>>> 
>>>      Signed-off-by: Helge Deller <deller@gmx.de>
>>> 
>>>   arch/parisc/Kconfig             | 17 +++++++++++++
>>>   arch/parisc/kernel/process.c    | 14 -----------
>>>   arch/parisc/kernel/sys_parisc.c | 54 
>>> +----------------------------------------
>>>   mm/util.c                       |  5 +++-
>>>   4 files changed, 22 insertions(+), 68 deletions(-)
>> 
>> Thanks for your report!
>> I think it's quite unlikely that this patch introduces such a bad 
>> regression.
>> I'd suspect some other bad commit, but I'll try to reproduce.
> 
> matoro, does a revert apply cleanly? Does it help?

Yes, I just tested this and it cleanly reverts on linux-6.6.y and the revert 
does fix the issue.

>> 
>> In any case, do you have CONFIG_BPF_JIT enabled? If so, could you try
>> to reproduce with CONFIG_BPF_JIT disabled?
>> The JIT is quite new in v6.6 and I did face some crashes and disabling
>> it helped me so far.
>> 
>>> I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe
>>> ("parisc: Add nop instructions after TLB inserts") on top of 6.6, but
>>> it does NOT fix the issue.
>> 
>> Ok.
>> 
>> Helge


* Re: Bisected stability regression in 6.6
  2023-11-11 23:33     ` matoro
@ 2023-11-12  1:22       ` Dr. David Alan Gilbert
  2023-11-12  8:03         ` Helge Deller
  2023-11-12 20:22       ` Helge Deller
  1 sibling, 1 reply; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2023-11-12  1:22 UTC (permalink / raw)
  To: matoro, HelgeDeller, deller
  Cc: Sam James, linux-parisc, Linux Kernel Mailing List

* matoro (matoro_mailinglist_kernel@matoro.tk) wrote:
> On 2023-11-11 16:27, Sam James wrote:
> > Helge Deller <deller@gmx.de> writes:
> > 
> > > On 11/11/23 07:31, matoro wrote:
> > > > Hi Helge, I have bisected a regression in 6.6 which is causing
> > > > userspace segfaults at a significantly increased rate in kernel 6.6.
> > > > There seems to be a pathological case triggered by the ninja build
> > > > tool.  The test case I have been using is cmake with ninja backend to
> > > > attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
> > > > the same location every time, but with enough reliability that I was
> > > > able to use it as a bisection regression case, including immediately
> > > > after a reboot.  In the kernel log, these show up as "trap #15: Data
> > > > TLB miss fault" messages.  Now these messages can and do show up in
> > > > 6.5 causing segfaults, but never immediately after a reboot and
> > > > infrequently enough that the system is stable.  With kernel 6.6 I am
> > > > completely unable to build nghttp2 under any circumstances.
> > > > 
> > > > I have bisected this down to the following commit:
> > > > 
> > > > $ git bisect good
> > > > 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
> > > > commit 3033cd4307681c60db6d08f398a64484b36e0b0f
> > > > Author: Helge Deller <deller@gmx.de>
> > > > Date:   Sat Aug 19 00:53:28 2023 +0200
> > > > 
> > > >      parisc: Use generic mmap top-down layout and brk randomization
> > > > 
> > > >      parisc uses a top-down layout by default that exactly fits
> > > > the generic
> > > >      functions, so get rid of arch specific code and use the
> > > > generic version
> > > >      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
> > > > 
> > > >      Note that on parisc the stack always grows up and a "unlimited stack"
> > > >      simply means that the value as defined in
> > > > CONFIG_STACK_MAX_DEFAULT_SIZE_MB
> > > >      should be used. So RLIM_INFINITY is not an indicator to use
> > > > the legacy
> > > >      memory layout.
> > > > 
> > > >      Signed-off-by: Helge Deller <deller@gmx.de>
> > > > 
> > > >   arch/parisc/Kconfig             | 17 +++++++++++++
> > > >   arch/parisc/kernel/process.c    | 14 -----------
> > > >   arch/parisc/kernel/sys_parisc.c | 54
> > > > +----------------------------------------
> > > >   mm/util.c                       |  5 +++-
> > > >   4 files changed, 22 insertions(+), 68 deletions(-)
> > > 
> > > Thanks for your report!
> > > I think it's quite unlikely that this patch introduces such a bad
> > > regression.
> > > I'd suspect some other bad commit, but I'll try to reproduce.
> > 
> > matoro, does a revert apply cleanly? Does it help?
> 
> Yes, I just tested this and it cleanly reverts on linux-6.6.y and the revert
> does fix the issue.

Helge:
  In that patch is:

diff --git a/mm/util.c b/mm/util.c
index dd12b9531ac4c..8810206444977 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -396,7 +396,10 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
        if (current->personality & ADDR_COMPAT_LAYOUT)
                return 1;

-       if (rlim_stack->rlim_cur == RLIM_INFINITY)
+       /* On parisc the stack always grows up - so a unlimited stack should
+        * not be an indicator to use the legacy memory layout. */
+       if (rlim_stack->rlim_cur == RLIM_INFINITY &&
+               !IS_ENABLED(CONFIG_STACK_GROWSUP))
                return 1;

        return sysctl_legacy_va_layout;

is that:
   '!IS_ENABLED(CONFIG_STACK_GROWSUP))'

 the right way around?

That feels inverted to me;  non-parisc don't have that config
set, so !IS_ENABLED... is true,  so they return 1 instead of checking
the flag?

Dave

> > > 
> > > In any case, do you have CONFIG_BPF_JIT enabled? If so, could you try
> > > to reproduce with CONFIG_BPF_JIT disabled?
> > > The JIT is quite new in v6.6 and I did face some crashes and disabling
> > > it helped me so far.
> > > 
> > > > I have tried applying ad4aa06e1d92b06ed56c7240252927bd60632efe
> > > > ("parisc: Add nop instructions after TLB inserts") on top of 6.6, but
> > > > it does NOT fix the issue.
> > > 
> > > Ok.
> > > 
> > > Helge
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/


* Re: Bisected stability regression in 6.6
  2023-11-12  1:22       ` Dr. David Alan Gilbert
@ 2023-11-12  8:03         ` Helge Deller
  2023-11-12 12:07           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 12+ messages in thread
From: Helge Deller @ 2023-11-12  8:03 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, matoro, HelgeDeller
  Cc: Sam James, linux-parisc, Linux Kernel Mailing List

On 11/12/23 02:22, Dr. David Alan Gilbert wrote:
> * matoro (matoro_mailinglist_kernel@matoro.tk) wrote:
>> On 2023-11-11 16:27, Sam James wrote:
>>> Helge Deller <deller@gmx.de> writes:
>>>
>>>> On 11/11/23 07:31, matoro wrote:
>>>>> Hi Helge, I have bisected a regression in 6.6 which is causing
>>>>> userspace segfaults at a significantly increased rate in kernel 6.6.
>>>>> There seems to be a pathological case triggered by the ninja build
>>>>> tool.  The test case I have been using is cmake with ninja backend to
>>>>> attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
>>>>> the same location every time, but with enough reliability that I was
>>>>> able to use it as a bisection regression case, including immediately
>>>>> after a reboot.  In the kernel log, these show up as "trap #15: Data
>>>>> TLB miss fault" messages.  Now these messages can and do show up in
>>>>> 6.5 causing segfaults, but never immediately after a reboot and
>>>>> infrequently enough that the system is stable.  With kernel 6.6 I am
>>>>> completely unable to build nghttp2 under any circumstances.
>>>>>
>>>>> I have bisected this down to the following commit:
>>>>>
>>>>> $ git bisect good
>>>>> 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
>>>>> commit 3033cd4307681c60db6d08f398a64484b36e0b0f
>>>>> Author: Helge Deller <deller@gmx.de>
>>>>> Date:   Sat Aug 19 00:53:28 2023 +0200
>>>>>
>>>>>       parisc: Use generic mmap top-down layout and brk randomization
>>>>>
>>>>>       parisc uses a top-down layout by default that exactly fits
>>>>> the generic
>>>>>       functions, so get rid of arch specific code and use the
>>>>> generic version
>>>>>       by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
>>>>>
>>>>>       Note that on parisc the stack always grows up and a "unlimited stack"
>>>>>       simply means that the value as defined in
>>>>> CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>>>>>       should be used. So RLIM_INFINITY is not an indicator to use
>>>>> the legacy
>>>>>       memory layout.
>>>>>
>>>>>       Signed-off-by: Helge Deller <deller@gmx.de>
>>>>>
>>>>>    arch/parisc/Kconfig             | 17 +++++++++++++
>>>>>    arch/parisc/kernel/process.c    | 14 -----------
>>>>>    arch/parisc/kernel/sys_parisc.c | 54
>>>>> +----------------------------------------
>>>>>    mm/util.c                       |  5 +++-
>>>>>    4 files changed, 22 insertions(+), 68 deletions(-)
>>>>
>>>> Thanks for your report!
>>>> I think it's quite unlikely that this patch introduces such a bad
>>>> regression.
>>>> I'd suspect some other bad commit, but I'll try to reproduce.
>>>
>>> matoro, does a revert apply cleanly? Does it help?
>>
>> Yes, I just tested this and it cleanly reverts on linux-6.6.y and the revert
>> does fix the issue.
>
> Helge:
>    In that patch is:
>
> diff --git a/mm/util.c b/mm/util.c
> index dd12b9531ac4c..8810206444977 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -396,7 +396,10 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
>          if (current->personality & ADDR_COMPAT_LAYOUT)
>                  return 1;
>
> -       if (rlim_stack->rlim_cur == RLIM_INFINITY)
> +       /* On parisc the stack always grows up - so a unlimited stack should
> +        * not be an indicator to use the legacy memory layout. */
> +       if (rlim_stack->rlim_cur == RLIM_INFINITY &&
> +               !IS_ENABLED(CONFIG_STACK_GROWSUP))
>                  return 1;
>
>          return sysctl_legacy_va_layout;
>
> is that:
>     '!IS_ENABLED(CONFIG_STACK_GROWSUP))'
>
>   the right way around?
>
> That feels inverted to me;  non-parisc don't have that config
> set, so !IS_ENABLED... is true,  so they return 1 instead of checking
> the flag?

Right. For non-parisc the behaviour didn't change with my patch, and this
is intended. If rlim_stack->rlim_cur == RLIM_INFINITY, non-parisc return 1 as before.

Note that matoro reported a regression specifically on the parisc platform.

This change:
-       if (rlim_stack->rlim_cur == RLIM_INFINITY)
+       if (rlim_stack->rlim_cur == RLIM_INFINITY &&
+               !IS_ENABLED(CONFIG_STACK_GROWSUP))
just changes the behaviour on parisc.
On parisc "rlim_stack->rlim_cur == RLIM_INFINITY" is always true, unless the user
changed the stack limit manually. If unchanged, mmap_is_legacy() should return
sysctl_legacy_va_layout, otherwise 1.

So, I think that part of the patch is OK.
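
For illustration only, here is a minimal userspace sketch of the check before
and after the change. It is not kernel code: the function names and the
boolean parameters (standing in for ADDR_COMPAT_LAYOUT, the stack rlimit
comparison, IS_ENABLED(CONFIG_STACK_GROWSUP) and sysctl_legacy_va_layout) are
hypothetical and exist only to make the two variants comparable.
stack_is_unlimited() uses the standard getrlimit() call to show where the
"unlimited" input would come from in a userspace process.

```c
#include <stdbool.h>
#include <sys/resource.h>

/*
 * Userspace model of the mmap_is_legacy() decision quoted above.
 * All names below are hypothetical stand-ins, not kernel APIs.
 */

/* Generic check before the patch: an unlimited stack forces legacy layout. */
int legacy_before(bool addr_compat_layout, bool stack_unlimited,
                  int sysctl_legacy_va_layout)
{
    if (addr_compat_layout)
        return 1;
    if (stack_unlimited)
        return 1;
    return sysctl_legacy_va_layout;
}

/*
 * Check after the patch. stack_growsup models
 * IS_ENABLED(CONFIG_STACK_GROWSUP), which is set on parisc.
 */
int legacy_after(bool addr_compat_layout, bool stack_unlimited,
                 bool stack_growsup, int sysctl_legacy_va_layout)
{
    if (addr_compat_layout)
        return 1;
    if (stack_unlimited && !stack_growsup)
        return 1;
    return sysctl_legacy_va_layout;
}

/* Returns 1 if the soft stack limit is unlimited, 0 if not, -1 on error. */
int stack_is_unlimited(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    return rl.rlim_cur == RLIM_INFINITY;
}
```

With stack_growsup false (the non-parisc case), legacy_after() gives the same
result as legacy_before() for every input. With stack_growsup true, an
unlimited stack no longer forces the legacy layout and the result falls
through to the sysctl value.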

Helge


* Re: Bisected stability regression in 6.6
  2023-11-12  8:03         ` Helge Deller
@ 2023-11-12 12:07           ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2023-11-12 12:07 UTC (permalink / raw)
  To: Helge Deller; +Cc: matoro, Sam James, linux-parisc, Linux Kernel Mailing List

* Helge Deller (deller@gmx.de) wrote:
> On 11/12/23 02:22, Dr. David Alan Gilbert wrote:
> > * matoro (matoro_mailinglist_kernel@matoro.tk) wrote:
> > > On 2023-11-11 16:27, Sam James wrote:
> > > > Helge Deller <deller@gmx.de> writes:
> > > > 
> > > > > On 11/11/23 07:31, matoro wrote:
> > > > > > Hi Helge, I have bisected a regression in 6.6 which is causing
> > > > > > userspace segfaults at a significantly increased rate in kernel 6.6.
> > > > > > There seems to be a pathological case triggered by the ninja build
> > > > > > tool.  The test case I have been using is cmake with ninja backend to
> > > > > > attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
> > > > > > the same location every time, but with enough reliability that I was
> > > > > > able to use it as a bisection regression case, including immediately
> > > > > > after a reboot.  In the kernel log, these show up as "trap #15: Data
> > > > > > TLB miss fault" messages.  Now these messages can and do show up in
> > > > > > 6.5 causing segfaults, but never immediately after a reboot and
> > > > > > infrequently enough that the system is stable.  With kernel 6.6 I am
> > > > > > completely unable to build nghttp2 under any circumstances.
> > > > > > 
> > > > > > I have bisected this down to the following commit:
> > > > > > 
> > > > > > $ git bisect good
> > > > > > 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
> > > > > > commit 3033cd4307681c60db6d08f398a64484b36e0b0f
> > > > > > Author: Helge Deller <deller@gmx.de>
> > > > > > Date:   Sat Aug 19 00:53:28 2023 +0200
> > > > > > 
> > > > > >       parisc: Use generic mmap top-down layout and brk randomization
> > > > > > 
> > > > > >       parisc uses a top-down layout by default that exactly fits
> > > > > > the generic
> > > > > >       functions, so get rid of arch specific code and use the
> > > > > > generic version
> > > > > >       by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
> > > > > > 
> > > > > >       Note that on parisc the stack always grows up and a "unlimited stack"
> > > > > >       simply means that the value as defined in
> > > > > > CONFIG_STACK_MAX_DEFAULT_SIZE_MB
> > > > > >       should be used. So RLIM_INFINITY is not an indicator to use
> > > > > > the legacy
> > > > > >       memory layout.
> > > > > > 
> > > > > >       Signed-off-by: Helge Deller <deller@gmx.de>
> > > > > > 
> > > > > >    arch/parisc/Kconfig             | 17 +++++++++++++
> > > > > >    arch/parisc/kernel/process.c    | 14 -----------
> > > > > >    arch/parisc/kernel/sys_parisc.c | 54
> > > > > > +----------------------------------------
> > > > > >    mm/util.c                       |  5 +++-
> > > > > >    4 files changed, 22 insertions(+), 68 deletions(-)
> > > > > 
> > > > > Thanks for your report!
> > > > > I think it's quite unlikely that this patch introduces such a bad
> > > > > regression.
> > > > > I'd suspect some other bad commit, but I'll try to reproduce.
> > > > 
> > > > matoro, does a revert apply cleanly? Does it help?
> > > 
> > > Yes, I just tested this and it cleanly reverts on linux-6.6.y and the revert
> > > does fix the issue.
> > 
> > Helge:
> >    In that patch is:
> > 
> > diff --git a/mm/util.c b/mm/util.c
> > index dd12b9531ac4c..8810206444977 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -396,7 +396,10 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
> >          if (current->personality & ADDR_COMPAT_LAYOUT)
> >                  return 1;
> > 
> > -       if (rlim_stack->rlim_cur == RLIM_INFINITY)
> > +       /* On parisc the stack always grows up - so a unlimited stack should
> > +        * not be an indicator to use the legacy memory layout. */
> > +       if (rlim_stack->rlim_cur == RLIM_INFINITY &&
> > +               !IS_ENABLED(CONFIG_STACK_GROWSUP))
> >                  return 1;
> > 
> >          return sysctl_legacy_va_layout;
> > 
> > is that:
> >     '!IS_ENABLED(CONFIG_STACK_GROWSUP))'
> > 
> >   the right way around?
> > 
> > That feels inverted to me;  non-parisc don't have that config
> > set, so !IS_ENABLED... is true,  so they return 1 instead of checking
> > the flag?
> 
> Right. For non-parisc the behaviour didn't change with my patch, and this
> is intended. If rlim_stack->rlim_cur == RLIM_INFINITY, non-parisc return 1 as before.
> 
> Note that matoro reported a regression specifically on the parisc platform.

Oh, that I missed.

> This change:
> -       if (rlim_stack->rlim_cur == RLIM_INFINITY)
> +       if (rlim_stack->rlim_cur == RLIM_INFINITY &&
> +               !IS_ENABLED(CONFIG_STACK_GROWSUP))
> just changes the behaviour on parisc.
> On parisc "rlim_stack->rlim_cur == RLIM_INFINITY" is always true, unless the user
> changed the stack limit manually. If unchanged, mmap_is_legacy() should return
> sysctl_legacy_va_layout, otherwise 1.
> 
> So, I think that part of the patch is OK.

OK, thanks for the clarification.
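
The decision logic discussed above can be sketched as a small userspace
model. This is only an illustration of the patched check quoted in the diff,
not kernel code: the function name, its parameters and MODEL_RLIM_INFINITY
are hypothetical stand-ins for current->personality, rlim_stack->rlim_cur,
IS_ENABLED(CONFIG_STACK_GROWSUP) and sysctl_legacy_va_layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for RLIM_INFINITY; the kernel's value may differ. */
#define MODEL_RLIM_INFINITY (~0UL)

/*
 * Userspace model of the patched mmap_is_legacy() check quoted above.
 * All parameters replace kernel state and are assumptions of this sketch:
 *   addr_compat   - (current->personality & ADDR_COMPAT_LAYOUT) != 0
 *   rlim_cur      - rlim_stack->rlim_cur
 *   stack_growsup - IS_ENABLED(CONFIG_STACK_GROWSUP)
 *   sysctl_legacy - sysctl_legacy_va_layout, modelled as 0 or 1
 */
int model_mmap_is_legacy(bool addr_compat, unsigned long rlim_cur,
                         bool stack_growsup, int sysctl_legacy)
{
    assert(sysctl_legacy == 0 || sysctl_legacy == 1);

    if (addr_compat)
        return 1;

    /* An unlimited stack selects the legacy layout only if the stack grows down. */
    if (rlim_cur == MODEL_RLIM_INFINITY && !stack_growsup)
        return 1;

    return sysctl_legacy;
}
```

With an unlimited stack and the sysctl modelled as 0, the sketch returns 1
when stack_growsup is false (the unchanged non-parisc behaviour) and 0 when
it is true (the parisc case).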

Dave
(P.S. Sorry for screwing up one email address in the header.)

> Helge
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

* Re: Bisected stability regression in 6.6
  2023-11-11 23:33     ` matoro
  2023-11-12  1:22       ` Dr. David Alan Gilbert
@ 2023-11-12 20:22       ` Helge Deller
  2023-11-12 23:37         ` matoro
  1 sibling, 1 reply; 12+ messages in thread
From: Helge Deller @ 2023-11-12 20:22 UTC (permalink / raw)
  To: matoro; +Cc: Sam James, Helge Deller, linux-parisc, Linux Kernel Mailing List

* matoro <matoro_mailinglist_kernel@matoro.tk>:
> On 2023-11-11 16:27, Sam James wrote:
> > Helge Deller <deller@gmx.de> writes:
> > 
> > > On 11/11/23 07:31, matoro wrote:
> > > > Hi Helge, I have bisected a regression in 6.6 which is causing
> > > > userspace segfaults at a significantly increased rate in kernel 6.6.
> > > > There seems to be a pathological case triggered by the ninja build
> > > > tool.  The test case I have been using is cmake with ninja backend to
> > > > attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
> > > > the same location every time, but with enough reliability that I was
> > > > able to use it as a bisection regression case, including immediately
> > > > after a reboot.  In the kernel log, these show up as "trap #15: Data
> > > > TLB miss fault" messages.  Now these messages can and do show up in
> > > > 6.5 causing segfaults, but never immediately after a reboot and
> > > > infrequently enough that the system is stable.  With kernel 6.6 I am
> > > > completely unable to build nghttp2 under any circumstances.
> > > > 
> > > > I have bisected this down to the following commit:
> > > > 
> > > > $ git bisect good
> > > > 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
> > > > commit 3033cd4307681c60db6d08f398a64484b36e0b0f
> > > > Author: Helge Deller <deller@gmx.de>
> > > > Date:   Sat Aug 19 00:53:28 2023 +0200
> > > > 
> > > >      parisc: Use generic mmap top-down layout and brk randomization
> > > > 
> > > >      parisc uses a top-down layout by default that exactly fits
> > > > the generic
> > > >      functions, so get rid of arch specific code and use the
> > > > generic version
> > > >      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
> > > > 
> > > >      Note that on parisc the stack always grows up and a "unlimited stack"
> > > >      simply means that the value as defined in
> > > > CONFIG_STACK_MAX_DEFAULT_SIZE_MB
> > > >      should be used. So RLIM_INFINITY is not an indicator to use
> > > > the legacy
> > > >      memory layout.
> > > > 
> > > >      Signed-off-by: Helge Deller <deller@gmx.de>
> > > > 
> > > >   arch/parisc/Kconfig             | 17 +++++++++++++
> > > >   arch/parisc/kernel/process.c    | 14 -----------
> > > >   arch/parisc/kernel/sys_parisc.c | 54
> > > > +----------------------------------------
> > > >   mm/util.c                       |  5 +++-
> > > >   4 files changed, 22 insertions(+), 68 deletions(-)
> > > 
> > > Thanks for your report!
> > > I think it's quite unlikely that this patch introduces such a bad
> > > regression.

I was wrong.
Indeed, by switching to the generic implementation with this patch
the calculation of mmap_base is wrong for parisc (because parisc
is the only architecture left where the stack grows upwards).

Could you please test the patch below? It fixed the crashes
when building nghttp2 for me.
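
To illustrate the difference between the two calculations, here is a minimal
userspace sketch. It models only the two return expressions visible in the
patch below. Everything else is an assumption for illustration: the function
names, the 4 KiB page size, and passing the already clamped gap and the
result of mmap_upper_limit() in as plain numbers.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE UINT64_C(4096) /* assumption: 4 KiB pages */

/* Round addr up to the next page boundary, in the spirit of PAGE_ALIGN(). */
uint64_t model_page_align(uint64_t addr)
{
    return (addr + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Stack grows down: the base sits below STACK_TOP, separated by a gap that
 * is sized from the stack rlimit (the caller passes the final gap here).
 */
uint64_t model_mmap_base_generic(uint64_t stack_top, uint64_t gap, uint64_t rnd)
{
    assert(gap + rnd <= stack_top);
    return model_page_align(stack_top - gap - rnd);
}

/*
 * Stack grows up: the maximum stack size is already reserved at the top of
 * the task, so the base starts directly below that reservation.
 * upper_limit stands in for the value returned by mmap_upper_limit().
 */
uint64_t model_mmap_base_growsup(uint64_t upper_limit, uint64_t rnd)
{
    assert(rnd <= upper_limit);
    return model_page_align(upper_limit - rnd);
}
```

The addresses passed in are hypothetical; the point is only that the
grows-up variant needs no gap computation of its own because the stack
reservation is already folded into the upper limit.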

Helge

---

From: Helge Deller <deller@gmx.de>
Subject: [PATCH] parisc: Adjust ARCH_MMAP_RND_BITS* to previous values

Matoro reported various userspace crashes in kernel 6.6 and bisected it to
commit 3033cd430768 ("parisc: Use generic mmap top-down layout and brk
randomization").

The problem is that mmap_base is calculated incorrectly for the
stack-grows-upwards case (as on parisc). On parisc, mmap_base is simply
located just below the stack start.

Reported-by: matoro <matoro_mailinglist_kernel@matoro.tk>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 3033cd430768 ("parisc: Use generic mmap top-down layout and brk randomization")
Cc:  <stable@vger.kernel.org> # v6.6+

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index a15ab147af2e..68cbe666510a 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -138,11 +138,11 @@ config ARCH_MMAP_RND_COMPAT_BITS_MIN
 	default 8
 
 config ARCH_MMAP_RND_BITS_MAX
-	default 24 if 64BIT
-	default 17
+	default 18 if 64BIT
+	default 13
 
 config ARCH_MMAP_RND_COMPAT_BITS_MAX
-	default 17
+	default 13
 
 # unless you want to implement ACPI on PA-RISC ... ;-)
 config PM
diff --git a/arch/parisc/include/asm/elf.h b/arch/parisc/include/asm/elf.h
index 140eaa97bf21..2d73d3c3cd37 100644
--- a/arch/parisc/include/asm/elf.h
+++ b/arch/parisc/include/asm/elf.h
@@ -349,15 +349,7 @@ struct pt_regs;	/* forward declaration... */
 
 #define ELF_HWCAP	0
 
-/* Masks for stack and mmap randomization */
-#define BRK_RND_MASK	(is_32bit_task() ? 0x07ffUL : 0x3ffffUL)
-#define MMAP_RND_MASK	(is_32bit_task() ? 0x1fffUL : 0x3ffffUL)
-#define STACK_RND_MASK	MMAP_RND_MASK
-
-struct mm_struct;
-extern unsigned long arch_randomize_brk(struct mm_struct *);
-#define arch_randomize_brk arch_randomize_brk
-
+#define STACK_RND_MASK	0x7ff	/* 8MB of VA */
 
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
 struct linux_binprm;
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index ff6cbdb6903b..ece4b3046515 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -47,6 +47,8 @@
 
 #ifndef __ASSEMBLY__
 
+struct rlimit;
+unsigned long mmap_upper_limit(struct rlimit *rlim_stack);
 unsigned long calc_max_stack_size(unsigned long stack_max);
 
 /*
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index ab896eff7a1d..98af719d5f85 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -77,7 +77,7 @@ unsigned long calc_max_stack_size(unsigned long stack_max)
  * indicating that "current" should be used instead of a passed-in
  * value from the exec bprm as done with arch_pick_mmap_layout().
  */
-static unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
+unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
 {
 	unsigned long stack_base;
 
diff --git a/mm/util.c b/mm/util.c
index 8cbbfd3a3d59..0b7e715a71f2 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -414,6 +414,15 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
 
 static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
 {
+#ifdef CONFIG_STACK_GROWSUP
+	/*
+	 * For an upwards growing stack the calculation is much simpler.
+	 * Memory for the maximum stack size is reserved at the top of the
+	 * task. mmap_base starts directly below the stack and grows
+	 * downwards.
+	 */
+	return PAGE_ALIGN(mmap_upper_limit(rlim_stack) - rnd);
+#else
 	unsigned long gap = rlim_stack->rlim_cur;
 	unsigned long pad = stack_guard_gap;
 
@@ -431,6 +440,7 @@ static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
 		gap = MAX_GAP;
 
 	return PAGE_ALIGN(STACK_TOP - gap - rnd);
+#endif
 }
 
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
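
As a rough cross-check of the numbers in the patch above, the sketch below
converts randomization bits and masks into address-space spans. It assumes
4 KiB pages and that the random value counts pages; both helper names are
hypothetical and not part of the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u /* assumption: 4 KiB pages */

/* Bytes of address space covered by `bits` random bits of page index. */
uint64_t model_rnd_span_bytes(unsigned int bits)
{
    assert(bits < 64u - MODEL_PAGE_SHIFT);
    return (UINT64_C(1) << bits) << MODEL_PAGE_SHIFT;
}

/* Number of bits in a contiguous low-order mask such as 0x3ffff. */
unsigned int model_mask_bits(uint64_t mask)
{
    unsigned int bits = 0;

    assert((mask & (mask + 1)) == 0); /* mask must have the form 2^n - 1 */
    while (mask) {
        bits++;
        mask >>= 1;
    }
    return bits;
}
```

Under these assumptions the removed masks 0x3ffff and 0x1fff are 18 and 13
bits wide, which matches the new ARCH_MMAP_RND_BITS_MAX defaults, and the
mask 0x7ff (11 bits) spans 8 MiB, consistent with the "8MB of VA" comment.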

* Re: Bisected stability regression in 6.6
  2023-11-12 20:22       ` Helge Deller
@ 2023-11-12 23:37         ` matoro
  0 siblings, 0 replies; 12+ messages in thread
From: matoro @ 2023-11-12 23:37 UTC (permalink / raw)
  To: Helge Deller; +Cc: Sam James, linux-parisc, Linux Kernel Mailing List

On 2023-11-12 15:22, Helge Deller wrote:
> * matoro <matoro_mailinglist_kernel@matoro.tk>:
>> On 2023-11-11 16:27, Sam James wrote:
>> > Helge Deller <deller@gmx.de> writes:
>> >
>> > > On 11/11/23 07:31, matoro wrote:
>> > > > Hi Helge, I have bisected a regression in 6.6 which is causing
>> > > > userspace segfaults at a significantly increased rate in kernel 6.6.
>> > > > There seems to be a pathological case triggered by the ninja build
>> > > > tool.  The test case I have been using is cmake with ninja backend to
>> > > > attempt to build the nghttp2 package.  In 6.6, this segfaults, not at
>> > > > the same location every time, but with enough reliability that I was
>> > > > able to use it as a bisection regression case, including immediately
>> > > > after a reboot.  In the kernel log, these show up as "trap #15: Data
>> > > > TLB miss fault" messages.  Now these messages can and do show up in
>> > > > 6.5 causing segfaults, but never immediately after a reboot and
>> > > > infrequently enough that the system is stable.  With kernel 6.6 I am
>> > > > completely unable to build nghttp2 under any circumstances.
>> > > >
>> > > > I have bisected this down to the following commit:
>> > > >
>> > > > $ git bisect good
>> > > > 3033cd4307681c60db6d08f398a64484b36e0b0f is the first bad commit
>> > > > commit 3033cd4307681c60db6d08f398a64484b36e0b0f
>> > > > Author: Helge Deller <deller@gmx.de>
>> > > > Date:   Sat Aug 19 00:53:28 2023 +0200
>> > > >
>> > > >      parisc: Use generic mmap top-down layout and brk randomization
>> > > >
>> > > >      parisc uses a top-down layout by default that exactly fits the generic
>> > > >      functions, so get rid of arch specific code and use the generic version
>> > > >      by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.
>> > > >
>> > > >      Note that on parisc the stack always grows up and a "unlimited stack"
>> > > >      simply means that the value as defined in CONFIG_STACK_MAX_DEFAULT_SIZE_MB
>> > > >      should be used. So RLIM_INFINITY is not an indicator to use the legacy
>> > > >      memory layout.
>> > > >
>> > > >      Signed-off-by: Helge Deller <deller@gmx.de>
>> > > >
>> > > >   arch/parisc/Kconfig             | 17 +++++++++++++
>> > > >   arch/parisc/kernel/process.c    | 14 -----------
>> > > >   arch/parisc/kernel/sys_parisc.c | 54 +----------------------------------------
>> > > >   mm/util.c                       |  5 +++-
>> > > >   4 files changed, 22 insertions(+), 68 deletions(-)
>> > >
>> > > Thanks for your report!
>> > > I think it's quite unlikely that this patch introduces such a bad
>> > > regression.
> 
> I was wrong.
> Indeed, by switching to the generic implementation with this patch
> the calculation of mmap_base is wrong for parisc (because parisc
> is the only architecture left where the stack grows upwards).
> 
> Could you please test the patch below? It fixed the crashes
> when building nghttp2 for me.
> 
> Helge
> 
> ---
> 
> From: Helge Deller <deller@gmx.de>
> Subject: [PATCH] parisc: Adjust ARCH_MMAP_RND_BITS* to previous values
> 
> Matoro reported various userspace crashes in kernel 6.6 and bisected it to
> commit 3033cd430768 ("parisc: Use generic mmap top-down layout and brk
> randomization").
> 
> The problem is that mmap_base is calculated incorrectly for the
> stack-grows-upwards case (as on parisc). On parisc, mmap_base is simply
> located just below the stack start.
> 
> Reported-by: matoro <matoro_mailinglist_kernel@matoro.tk>
> Signed-off-by: Helge Deller <deller@gmx.de>
> Fixes: 3033cd430768 ("parisc: Use generic mmap top-down layout and brk randomization")
> Cc:  <stable@vger.kernel.org> # v6.6+
> 
> diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
> index a15ab147af2e..68cbe666510a 100644
> --- a/arch/parisc/Kconfig
> +++ b/arch/parisc/Kconfig
> @@ -138,11 +138,11 @@ config ARCH_MMAP_RND_COMPAT_BITS_MIN
>  	default 8
> 
>  config ARCH_MMAP_RND_BITS_MAX
> -	default 24 if 64BIT
> -	default 17
> +	default 18 if 64BIT
> +	default 13
> 
>  config ARCH_MMAP_RND_COMPAT_BITS_MAX
> -	default 17
> +	default 13
> 
>  # unless you want to implement ACPI on PA-RISC ... ;-)
>  config PM
> diff --git a/arch/parisc/include/asm/elf.h b/arch/parisc/include/asm/elf.h
> index 140eaa97bf21..2d73d3c3cd37 100644
> --- a/arch/parisc/include/asm/elf.h
> +++ b/arch/parisc/include/asm/elf.h
> @@ -349,15 +349,7 @@ struct pt_regs;	/* forward declaration... */
> 
>  #define ELF_HWCAP	0
> 
> -/* Masks for stack and mmap randomization */
> -#define BRK_RND_MASK	(is_32bit_task() ? 0x07ffUL : 0x3ffffUL)
> -#define MMAP_RND_MASK	(is_32bit_task() ? 0x1fffUL : 0x3ffffUL)
> -#define STACK_RND_MASK	MMAP_RND_MASK
> -
> -struct mm_struct;
> -extern unsigned long arch_randomize_brk(struct mm_struct *);
> -#define arch_randomize_brk arch_randomize_brk
> -
> +#define STACK_RND_MASK	0x7ff	/* 8MB of VA */
> 
>  #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
>  struct linux_binprm;
> diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
> index ff6cbdb6903b..ece4b3046515 100644
> --- a/arch/parisc/include/asm/processor.h
> +++ b/arch/parisc/include/asm/processor.h
> @@ -47,6 +47,8 @@
> 
>  #ifndef __ASSEMBLY__
> 
> +struct rlimit;
> +unsigned long mmap_upper_limit(struct rlimit *rlim_stack);
>  unsigned long calc_max_stack_size(unsigned long stack_max);
> 
>  /*
> diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
> index ab896eff7a1d..98af719d5f85 100644
> --- a/arch/parisc/kernel/sys_parisc.c
> +++ b/arch/parisc/kernel/sys_parisc.c
> @@ -77,7 +77,7 @@ unsigned long calc_max_stack_size(unsigned long stack_max)
>   * indicating that "current" should be used instead of a passed-in
>   * value from the exec bprm as done with arch_pick_mmap_layout().
>   */
> -static unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
> +unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
>  {
>  	unsigned long stack_base;
> 
> diff --git a/mm/util.c b/mm/util.c
> index 8cbbfd3a3d59..0b7e715a71f2 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -414,6 +414,15 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
> 
>  static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
>  {
> +#ifdef CONFIG_STACK_GROWSUP
> +	/*
> +	 * For an upwards growing stack the calculation is much simpler.
> +	 * Memory for the maximum stack size is reserved at the top of the
> +	 * task. mmap_base starts directly below the stack and grows
> +	 * downwards.
> +	 */
> +	return PAGE_ALIGN(mmap_upper_limit(rlim_stack) - rnd);
> +#else
>  	unsigned long gap = rlim_stack->rlim_cur;
>  	unsigned long pad = stack_guard_gap;
> 
> @@ -431,6 +440,7 @@ static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
>  		gap = MAX_GAP;
> 
>  	return PAGE_ALIGN(STACK_TOP - gap - rnd);
> +#endif
>  }
> 
>  void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)

Works here!  Thanks Helge!!

* Re: Bisected stability regression in 6.6
  2023-11-11  7:02 ` Bagas Sanjaya
@ 2023-11-22  9:07   ` Linux regression tracking #update (Thorsten Leemhuis)
  0 siblings, 0 replies; 12+ messages in thread
From: Linux regression tracking #update (Thorsten Leemhuis) @ 2023-11-22  9:07 UTC (permalink / raw)
  To: Bagas Sanjaya, Linux PA-RISC Mailing List,
	Linux Kernel Mailing List, Sam James,
	Linux Memory Management List, Linux Regressions

[TLDR: This mail is primarily relevant for Linux kernel regression
tracking. See link in footer if these mails annoy you.]

On 11.11.23 08:02, Bagas Sanjaya wrote:
> On Sat, Nov 11, 2023 at 01:31:01AM -0500, matoro wrote:
>> Hi Helge, I have bisected a regression in 6.6 which is causing userspace
>> segfaults at a significantly increased rate in kernel 6.6. 
> #regzbot ^introduced: 3033cd4307681c

#regzbot fix: 5f74f820f6fc844b95f9
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

Thread overview: 12+ messages
2023-11-11  6:31 Bisected stability regression in 6.6 matoro
2023-11-11  7:02 ` Bagas Sanjaya
2023-11-22  9:07   ` Linux regression tracking #update (Thorsten Leemhuis)
2023-11-11 21:21 ` Helge Deller
2023-11-11 21:27   ` Sam James
2023-11-11 23:33     ` matoro
2023-11-12  1:22       ` Dr. David Alan Gilbert
2023-11-12  8:03         ` Helge Deller
2023-11-12 12:07           ` Dr. David Alan Gilbert
2023-11-12 20:22       ` Helge Deller
2023-11-12 23:37         ` matoro
2023-11-11 21:28   ` matoro