linux-kernel.vger.kernel.org archive mirror
* regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
@ 2024-01-19 10:46 Mikhail Gavrilov
  2024-01-19 10:54 ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-01-19 10:46 UTC (permalink / raw)
  To: andreyknvl, elver, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

[-- Attachment #1: Type: text/plain, Size: 2081 bytes --]

Hi,
I run a kernel with the KASAN sanitizer enabled every day
because I want to catch hard-to-reproduce bugs.
Everything worked fine until commit 773688a6cb24b0b3c2ba40354d883348a2befa38.
After that commit, the whole system becomes jerky whenever I compile something:
the sound is interrupted and the cursor stutters if I try to do
anything while all the cores are loaded.

> git bisect bad
773688a6cb24b0b3c2ba40354d883348a2befa38 is the first bad commit
commit 773688a6cb24b0b3c2ba40354d883348a2befa38
Author: Andrey Konovalov <andreyknvl@google.com>
Date:   Mon Nov 20 18:47:19 2023 +0100

    kasan: use stack_depot_put for Generic mode

    Evict alloc/free stack traces from the stack depot for Generic KASAN once
    they are evicted from the quarantine.

    For auxiliary stack traces, evict the oldest stack trace once a new one is
    saved (KASAN only keeps references to the last two).

    Also evict all saved stack traces on krealloc.

    To avoid double-evicting and mis-evicting stack traces (in case KASAN's
    metadata was corrupted), reset KASAN's per-object metadata that stores
    stack depot handles when the object is initialized and when it's evicted
    from the quarantine.

    Note that stack_depot_put is no-op if the handle is 0.

    Link: https://lkml.kernel.org/r/5cef104d9b842899489b4054fe8d1339a71acee0.1700502145.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
    Reviewed-by: Marco Elver <elver@google.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Evgenii Stepanov <eugenis@google.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

 mm/kasan/common.c     |  3 ++-
 mm/kasan/generic.c    | 22 ++++++++++++++++++----
 mm/kasan/quarantine.c | 26 ++++++++++++++++++++------
 3 files changed, 40 insertions(+), 11 deletions(-)

I have attached my build .config and kernel log.
Could someone look into this, please?

-- 
Best Regards,
Mike Gavrilov.

[-- Attachment #2: .config.zip --]
[-- Type: application/zip, Size: 65161 bytes --]

[-- Attachment #3: bisect-performance-regression-KASAN-log.zip --]
[-- Type: application/zip, Size: 1239 bytes --]

[-- Attachment #4: dmesg-performance-regression-KASAN.zip --]
[-- Type: application/zip, Size: 44240 bytes --]
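
To make the commit message quoted above more concrete, here is a minimal toy
model in Python of a reference-counted store of stack traces. It is only a
sketch of the general idea (save takes a reference, put drops one, the record
is evicted when the count reaches zero, and a handle of 0 is ignored). It is
not the kernel's implementation, which is C code in lib/stackdepot.c; the class
ToyStackDepot and its method names are hypothetical and chosen purely for
illustration.

```python
class ToyStackDepot:
    """Toy refcounted store of stack traces; handle 0 means 'no trace'."""

    def __init__(self):
        self._next_handle = 1
        self._by_trace = {}  # trace tuple -> handle
        self._records = {}   # handle -> [trace tuple, refcount]

    def save(self, trace):
        """Store a trace (deduplicated) and take one reference on it."""
        trace = tuple(trace)
        handle = self._by_trace.get(trace)
        if handle is None:
            handle = self._next_handle
            self._next_handle += 1
            self._by_trace[trace] = handle
            self._records[handle] = [trace, 0]
        self._records[handle][1] += 1
        return handle

    def put(self, handle):
        """Drop one reference; evict the record when none remain."""
        if handle == 0:
            return  # mirrors the "no-op if the handle is 0" note above
        record = self._records[handle]
        record[1] -= 1
        if record[1] == 0:
            del self._by_trace[record[0]]
            del self._records[handle]

    def __len__(self):
        return len(self._records)


depot = ToyStackDepot()
h1 = depot.save(["kmalloc", "foo"])
h2 = depot.save(["kmalloc", "foo"])  # identical trace shares one record
depot.put(h1)                        # one reference left, record stays
depot.put(h2)                        # last reference dropped, record evicted
```

In this model every save and put touches a shared counter. In the real code
that counter update has to be safe against concurrent CPUs, which is one
plausible reason the change is visible under heavy parallel load.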

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-01-19 10:46 regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load Mikhail Gavrilov
@ 2024-01-19 10:54 ` Marco Elver
  2024-01-19 10:59   ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-01-19 10:54 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: andreyknvl, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Fri, 19 Jan 2024 at 11:46, Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> Hi,
> I run a kernel with the KASAN sanitizer enabled every day
> because I want to catch hard-to-reproduce bugs.
> Everything worked fine until commit 773688a6cb24b0b3c2ba40354d883348a2befa38.
> After that commit, the whole system becomes jerky whenever I compile something:
> the sound is interrupted and the cursor stutters if I try to do
> anything while all the cores are loaded.
>
> > git bisect bad
> 773688a6cb24b0b3c2ba40354d883348a2befa38 is the first bad commit
> commit 773688a6cb24b0b3c2ba40354d883348a2befa38
> Author: Andrey Konovalov <andreyknvl@google.com>
> Date:   Mon Nov 20 18:47:19 2023 +0100
>
>     kasan: use stack_depot_put for Generic mode
[...]
>  mm/kasan/common.c     |  3 ++-
>  mm/kasan/generic.c    | 22 ++++++++++++++++++----
>  mm/kasan/quarantine.c | 26 ++++++++++++++++++++------
>  3 files changed, 40 insertions(+), 11 deletions(-)
>
> I attached here my build .config and kernel log.
> Who could dig into it, please?

I was afraid this would happen - could you try this patch series:
https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/

Thanks,
-- Marco


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-01-19 10:54 ` Marco Elver
@ 2024-01-19 10:59   ` Marco Elver
  2024-01-19 17:54     ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-01-19 10:59 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: glider, dvyukov, eugenis, Oscar Salvador, Vlastimil Babka,
	Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List, Andrey Konovalov

On Fri, 19 Jan 2024 at 11:54, Marco Elver <elver@google.com> wrote:
>
> On Fri, 19 Jan 2024 at 11:46, Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
> >
> > Hi,
> > I run a kernel with the KASAN sanitizer enabled every day
> > because I want to catch hard-to-reproduce bugs.
> > Everything worked fine until commit 773688a6cb24b0b3c2ba40354d883348a2befa38.
> > After that commit, the whole system becomes jerky whenever I compile something:
> > the sound is interrupted and the cursor stutters if I try to do
> > anything while all the cores are loaded.
> >
> > > git bisect bad
> > 773688a6cb24b0b3c2ba40354d883348a2befa38 is the first bad commit
> > commit 773688a6cb24b0b3c2ba40354d883348a2befa38
> > Author: Andrey Konovalov <andreyknvl@google.com>
> > Date:   Mon Nov 20 18:47:19 2023 +0100
> >
> >     kasan: use stack_depot_put for Generic mode
> [...]
> >  mm/kasan/common.c     |  3 ++-
> >  mm/kasan/generic.c    | 22 ++++++++++++++++++----
> >  mm/kasan/quarantine.c | 26 ++++++++++++++++++++------
> >  3 files changed, 40 insertions(+), 11 deletions(-)
> >
> > I attached here my build .config and kernel log.
> > Who could dig into it, please?
>
> I was afraid this would happen - could you try this patch series:
> https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/ [1]

In addition, could you give some additional details about the number
of CPUs in your system?
And if possible, do you have a way to measure performance besides the
obvious lagging of the system? It would be interesting to know if the
fix in [1] regains performance fully.

One major difference is still that an atomic RMW is in the fast paths.
This could be fixed by reverting
773688a6cb24b0b3c2ba40354d883348a2befa38 on top of everything else,
but we're not sure yet that's necessary because the cost of an atomic
RMW really depends on the system you're working with.


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-01-19 10:59   ` Marco Elver
@ 2024-01-19 17:54     ` Mikhail Gavrilov
  2024-01-29 22:25       ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-01-19 17:54 UTC (permalink / raw)
  To: Marco Elver
  Cc: glider, dvyukov, eugenis, Oscar Salvador, Vlastimil Babka,
	Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List, Andrey Konovalov

On Fri, Jan 19, 2024 at 4:00 PM Marco Elver <elver@google.com> wrote:
> I was afraid this would happen - could you try this patch series:
> https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/ [1]

Thanks, this patch series definitely helped.
I can again work at the computer when something is compiling in the background.
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>

> In addition, could you give some additional details about the number
> of CPUs in your system?

Hardware probe: https://linux-hardware.org/?probe=ba941d7a4e
CPU: AMD Ryzen 9 7950X

> And if possible, do you have a way to measure performance besides the
> obvious lagging of the system? It would be interesting to know if the
> fix in [1] regains performance fully.

perf-2d5524635b00.data -
https://mega.nz/file/Q0ACSI4a#QQ8Ntbw5zvP_YZMsXPzSr-PxLVCw8fwg2RJaVOghoOQ
perf-773688a6cb24.data -
https://mega.nz/file/F8wAgBZI#OQ75qLFyf2diFXrDs9bP6_5xDevVrs1KlNdeupWSJSQ
perf-with-patchset.data -
https://mega.nz/file/l8ZXnI6Y#SmrZpH2Em6xzlIZgJe50PwSw-zLK_4whRjx3t_058kE

> perf diff perf-2d5524635b00.data perf-773688a6cb24.data
No kallsyms or vmlinux with build-id c64a03a51e9503a251dbec8e5267fb3ae51914f2 was found
# Event 'cycles:P'
#
# Baseline  Delta Abs  Shared Object          Symbol
# ........  .........  .....................  ......................
#
    59.91%    +23.05%  [kernel.vmlinux]       [k] 0xffffffff940065c0
    17.88%     -7.89%  cc1                    [.] 0x0000000000207020
     9.39%     -6.30%  cc1plus                [.] 0x0000000000225110
     1.37%     -1.29%  libpython3.12.so.1.0   [.] 0x00000000000647e0
     1.16%     -0.84%  libcef.so              [.] 0x00000000021720e0
     1.27%     -0.67%  as                     [.] 0x0000000000002090
     0.78%     -0.54%  steamclient.so         [.] 0x00000000001ed915
     0.77%     -0.33%  chrome                 [.] 0x0000000002892080
     0.54%     -0.32%  libc.so.6              [.] _int_malloc
     0.30%     -0.23%  libpixman-1.so.0.43.0  [.] 0x00000000000078a7
     0.31%     -0.19%  libc.so.6              [.] _int_free


> perf diff perf-2d5524635b00.data perf-with-patchset.data
# Event 'cycles:P'
#
# Baseline  Delta Abs  Shared Object          Symbol
# ........  .........  .....................  ..........................
#
    17.88%    +12.61%  cc1                    [.] 0x0000000000207020
               +3.89%  [kernel.vmlinux]       [k] unwind_next_frame
               +3.53%  [kernel.vmlinux]       [k] kasan_check_range
               +2.54%  [kernel.vmlinux]       [k] debug_check_no_obj_freed
     9.39%     +2.10%  cc1plus                [.] 0x0000000000225110
               +1.87%  [kernel.vmlinux]       [k] rcu_is_watching
               +1.41%  [kernel.vmlinux]       [k] lock_release
               +1.24%  [kernel.vmlinux]       [k] __orc_find
               +1.21%  [kernel.vmlinux]       [k] lock_acquire
               +1.08%  [kernel.vmlinux]       [k] stack_trace_consume_entry
               +1.08%  [kernel.vmlinux]       [k] check_preemption_disabled
     1.37%     +1.01%  libpython3.12.so.1.0   [.] 0x00000000000647e0
               +0.96%  [kernel.vmlinux]       [k] stack_access_ok
     1.27%     +0.79%  as                     [.] 0x0000000000002090

Thanks!

-- 
Best Regards,
Mike Gavrilov.
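
For readers who want to summarize a perf diff listing like the second one
above, a small Python sketch can total the "Delta Abs" column per shared
object. The rows below are copied by hand from the visible part of that
listing only, so the totals cover just those rows; the function name
delta_by_object is a hypothetical helper, not part of perf. Note also that the
kernel rows have no baseline value in the listing, so they appear to be
symbols that could not be matched in the baseline profile rather than
measured increases.

```python
from collections import defaultdict

# (delta in percentage points, shared object, symbol), copied from the
# visible rows of the "perf-with-patchset" listing above.
ROWS = [
    (12.61, "cc1", "0x0000000000207020"),
    (3.89, "[kernel.vmlinux]", "unwind_next_frame"),
    (3.53, "[kernel.vmlinux]", "kasan_check_range"),
    (2.54, "[kernel.vmlinux]", "debug_check_no_obj_freed"),
    (2.10, "cc1plus", "0x0000000000225110"),
    (1.87, "[kernel.vmlinux]", "rcu_is_watching"),
    (1.41, "[kernel.vmlinux]", "lock_release"),
    (1.24, "[kernel.vmlinux]", "__orc_find"),
    (1.21, "[kernel.vmlinux]", "lock_acquire"),
    (1.08, "[kernel.vmlinux]", "stack_trace_consume_entry"),
    (1.08, "[kernel.vmlinux]", "check_preemption_disabled"),
    (1.01, "libpython3.12.so.1.0", "0x00000000000647e0"),
    (0.96, "[kernel.vmlinux]", "stack_access_ok"),
    (0.79, "as", "0x0000000000002090"),
]


def delta_by_object(rows):
    """Sum the delta column for each shared object."""
    totals = defaultdict(float)
    for delta, shared_object, _symbol in rows:
        totals[shared_object] += delta
    return dict(totals)


totals = delta_by_object(ROWS)
for shared_object, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{total:6.2f}  {shared_object}")
```

Grouping by shared object makes it easier to compare this listing with the
first one, where the kernel shows up as a single unresolved address.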


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-01-19 17:54     ` Mikhail Gavrilov
@ 2024-01-29 22:25       ` Mikhail Gavrilov
  2024-01-29 23:14         ` Andrey Konovalov
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-01-29 22:25 UTC (permalink / raw)
  To: Marco Elver
  Cc: glider, dvyukov, eugenis, Oscar Salvador, Vlastimil Babka,
	Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List, Andrey Konovalov

[-- Attachment #1: Type: text/plain, Size: 2632 bytes --]

On Fri, Jan 19, 2024 at 10:54 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
I continued to search for regressions in the 6.8 kernel
and found another one.

cc478e0b6bdffd20561e1a07941a65f6c8962cab is the first bad commit
commit cc478e0b6bdffd20561e1a07941a65f6c8962cab
Author: Andrey Konovalov <andreyknvl@gmail.com>
Date:   Tue Jan 9 23:12:34 2024 +0100

    kasan: avoid resetting aux_lock

    With commit 63b85ac56a64 ("kasan: stop leaking stack trace handles"),
    KASAN zeroes out alloc meta when an object is freed.  The zeroed out data
    purposefully includes alloc and auxiliary stack traces but also
    accidentally includes aux_lock.

    As aux_lock is only initialized for each object slot during slab creation,
    when the freed slot is reallocated, saving auxiliary stack traces for the
    new object leads to lockdep reports when taking the zeroed out aux_lock.

    Arguably, we could reinitialize aux_lock when the object is reallocated,
    but a simpler solution is to avoid zeroing out aux_lock when an object
    gets freed.

    Link: https://lkml.kernel.org/r/20240109221234.90929-1-andrey.konovalov@linux.dev
    Fixes: 63b85ac56a64 ("kasan: stop leaking stack trace handles")
    Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Closes: https://lore.kernel.org/linux-next/5cc0f83c-e1d6-45c5-be89-9b86746fe731@paulmck-laptop/
    Reviewed-by: Marco Elver <elver@google.com>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

 mm/kasan/generic.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


This time I noticed an FPS drop in the game "Shadow of the Tomb Raider".
To measure performance I used the game's built-in benchmark.
Before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab, on
commit aaa2c9a97c22af5bf011f6dd8e0538219b45af88, I got 111 FPS [1].
On commit cc478e0b6bdffd20561e1a07941a65f6c8962cab I get only 63 FPS [2].
Unfortunately, the stackdepot patchset, which I applied on top of
6.8-rc2, did not restore the initial performance [3].

[1] https://i.postimg.cc/tgvwPTkz/c11-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.png
[2] https://i.postimg.cc/pX8vHDCM/c10-cc478e0b6bdffd20561e1a07941a65f6c8962cab.png
[3] https://i.postimg.cc/hvWCb7dV/6-8-0-0-rc2-with-stackdepot.png

-- 
Best Regards,
Mike Gavrilov.

[-- Attachment #2: bisect-performance-regression-in-games2.zip --]
[-- Type: application/zip, Size: 1235 bytes --]

[-- Attachment #3: .config.zip --]
[-- Type: application/zip, Size: 65242 bytes --]
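
The bug described in the "kasan: avoid resetting aux_lock" commit message
quoted above can be pictured with a toy Python model: a per-object metadata
record holds stack handles plus a lock that is set up only once, and a reset
that wipes the whole record also wipes the lock. This is an illustration of
the idea only, not kernel code; ToyAllocMeta, reset_everything and
reset_keep_lock are hypothetical names, and a Python threading.Lock merely
stands in for the kernel's lock.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ToyAllocMeta:
    alloc_handle: int = 0
    aux_handles: list = field(default_factory=lambda: [0, 0])
    # Created once, standing in for initialization at slab creation.
    aux_lock: object = field(default_factory=threading.Lock)


def reset_everything(meta):
    """Wipe the whole record, including the lock (the accidental part)."""
    meta.alloc_handle = 0
    meta.aux_handles = [0, 0]
    meta.aux_lock = None


def reset_keep_lock(meta):
    """Wipe only the stack handles and leave the lock untouched."""
    meta.alloc_handle = 0
    meta.aux_handles = [0, 0]


meta = ToyAllocMeta(alloc_handle=7, aux_handles=[3, 4])
reset_keep_lock(meta)
with meta.aux_lock:  # still usable after the reset
    meta.aux_handles[0] = 5
```

After reset_everything the same "with meta.aux_lock" statement would fail,
which is the toy analogue of taking a zeroed-out lock on a reallocated slot.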


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-01-29 22:25       ` Mikhail Gavrilov
@ 2024-01-29 23:14         ` Andrey Konovalov
  2024-02-01 22:08           ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Andrey Konovalov @ 2024-01-29 23:14 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Marco Elver, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Mon, Jan 29, 2024 at 11:25 PM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Fri, Jan 19, 2024 at 10:54 PM Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
> >
> I continued to search regressions in 6.8 kernel.
> And found another one.
>
> cc478e0b6bdffd20561e1a07941a65f6c8962cab is the first bad commit
> commit cc478e0b6bdffd20561e1a07941a65f6c8962cab
> Author: Andrey Konovalov <andreyknvl@gmail.com>
> Date:   Tue Jan 9 23:12:34 2024 +0100
>
>     kasan: avoid resetting aux_lock
>
> This time I noticed an FPS drop in the game "Shadow of the Tomb Raider".
> To measure performance I used the game's built-in benchmark.
> Before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab, on
> commit aaa2c9a97c22af5bf011f6dd8e0538219b45af88, I got 111 FPS [1].
> On commit cc478e0b6bdffd20561e1a07941a65f6c8962cab I get only 63 FPS [2].
> Unfortunately, the stackdepot patchset, which I applied on top of
> 6.8-rc2, did not restore the initial performance [3].
>
> [1] https://i.postimg.cc/tgvwPTkz/c11-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.png
> [2] https://i.postimg.cc/pX8vHDCM/c10-cc478e0b6bdffd20561e1a07941a65f6c8962cab.png
> [3] https://i.postimg.cc/hvWCb7dV/6-8-0-0-rc2-with-stackdepot.png

Hi Mikhail,

Please try to apply these two patches on top:
https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/

They effectively revert the change you mentioned.

Thank you for testing!


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-01-29 23:14         ` Andrey Konovalov
@ 2024-02-01 22:08           ` Mikhail Gavrilov
  2024-02-02  9:00             ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-02-01 22:08 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Marco Elver, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Tue, Jan 30, 2024 at 4:14 AM Andrey Konovalov <andreyknvl@gmail.com> wrote:
> Hi Mikhail,
>
> Please try to apply these two patches on top:
> https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/
>
> They effectively revert the change you mentioned.
>

I applied these patches on top of 6.8-rc2 and
6.8-git6764c317b6bb, but unfortunately performance has not changed and
is still at the regressed level.
Maybe we can try something else?

-- 
Best Regards,
Mike Gavrilov.


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-01 22:08           ` Mikhail Gavrilov
@ 2024-02-02  9:00             ` Marco Elver
  2024-02-02 16:35               ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-02-02  9:00 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Thu, 1 Feb 2024 at 23:08, Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Tue, Jan 30, 2024 at 4:14 AM Andrey Konovalov <andreyknvl@gmail.com> wrote:
> > Hi Mikhail,
> >
> > Please try to apply these two patches on top:
> > https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/ [1]
> >
> > They effectively revert the change you mentioned.
> >
>
> I applied these patches on top of 6.8-rc2 and
> 6.8-git6764c317b6bb, but unfortunately performance has not changed and
> is still at the regressed level.
> Maybe we can try something else?

That's strange - the patches at [1] definitely revert the change you
bisected to. It's possible there is some other strange side-effect. (I
assume that you are still running all this with a KASAN kernel.)

Just so I understand it right:
You say before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab the
game's FPS were good. But that is strange, because at that point we're
already doing stackdepot refcounting, i.e. after commit
773688a6cb24b0b3c2ba40354d883348a2befa38 which you reported as the
initial performance regression. The patches at [2] fixed that problem.

So now it's unclear to me how the simple change in
cc478e0b6bdffd20561e1a07941a65f6c8962cab causes the performance
problem, when in fact this is already with KASAN stackdepot
refcounting enabled but without the performance fixes from [1] and
[2].

[2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/

My questions now would be:
- What was the game's FPS in the last stable kernel (v6.7)?
- Can you collect another set of performance profiles between good and
bad? Maybe it would show where the time in the kernel is spent.
- Could it be an inconclusive bisection?

Thanks,
-- Marco


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-02  9:00             ` Marco Elver
@ 2024-02-02 16:35               ` Mikhail Gavrilov
  2024-02-02 16:47                 ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-02-02 16:35 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Fri, Feb 2, 2024 at 2:00 PM Marco Elver <elver@google.com> wrote:
>
> > Maybe we can try something else?
>
> That's strange - the patches at [1] definitely revert the change you
> bisected to. It's possible there is some other strange side-effect. (I
> assume that you are still running all this with a KASAN kernel.)

Yes. The build .config did not change between kernel builds.

> Just so I understand it right:
> You say before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab the
> game's FPS were good. But that is strange, because at that point we're
> already doing stackdepot refcounting, i.e. after commit
> 773688a6cb24b0b3c2ba40354d883348a2befa38 which you reported as the
> initial performance regression. The patches at [2] fixed that problem.
>
> So now it's unclear to me how the simple change in
> cc478e0b6bdffd20561e1a07941a65f6c8962cab causes the performance
> problem, when in fact this is already with KASAN stackdepot
> refcounting enabled but without the performance fixes from [1] and
> [2].
>
> [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/
>
> My questions now would be:
> - What was the game's FPS in the last stable kernel (v6.7)?

[6.7] - 83 FPS - 13060 frames during benchmark.

> - Can you collect another set of performance profiles between good and
> bad? Maybe it would show where the time in the kernel is spent.

Yes,
please look at [aaa2c9a97c22 perf] and [cc478e0b6bdf perf]

> perf diff perf-git-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.data perf-git-cc478e0b6bdffd20561e1a07941a65f6c8962cab.data
No kallsyms or vmlinux with build-id de2a040f828394c5ce34802389239c2a0668fcc7 was found
No kallsyms or vmlinux with build-id 33ab1cd545f96f5ffc2a402a4c4cfa647fd727a0 was found
# Event 'cycles:P'
#
# Baseline  Delta Abs  Shared Object          Symbol
# ........  .........  .....................  ..................................
#
    48.48%    +21.75%  [kernel.kallsyms]      [k] 0xffffffff860065c0
    36.13%    -16.49%  ShadowOfTheTombRaider  [.] 0x00000000001d7f5e
     4.43%     -2.10%  libvulkan_radeon.so    [.] 0x000000000006b870
     3.28%     -0.63%  libcef.so              [.] 0x00000000021720e0
     1.11%     -0.53%  libc.so.6              [.] syscall
     0.65%     -0.24%  libc.so.6              [.] __memmove_avx512_unaligned_erms
     0.31%     -0.14%  libc.so.6              [.] __memset_avx512_unaligned_erms
     0.26%     -0.13%  libm.so.6              [.] __powf_fma
     0.20%     -0.10%  [amdgpu]               [k] amdgpu_bo_placement_from_domain
     0.22%     -0.09%  [amdgpu]               [k] amdgpu_vram_mgr_compatible
     0.67%     -0.09%  armada-drm_dri.so      [.] 0x00000000000192b4
     0.15%     -0.08%  libc.so.6              [.] sem_post@GLIBC_2.2.5
     0.16%     -0.07%  [amdgpu]               [k] amdgpu_vm_bo_update
     0.14%     -0.07%  [amdgpu]               [k] amdgpu_bo_list_entry_cmp
     0.13%     -0.06%  libm.so.6              [.] powf@GLIBC_2.2.5
     0.14%     -0.06%  libMangoHud.so         [.] 0x000000000001c4c0
     0.10%     -0.06%  libc.so.6              [.] __futex_abstimed_wait_common
     0.19%     -0.05%  libGLESv2.so           [.] 0x0000000000160a11
     0.07%     -0.04%  libc.so.6              [.] __new_sem_wait_slow64.constprop.0
     0.10%     -0.04%  radeonsi_dri.so        [.] 0x0000000000019454
     0.05%     -0.03%  [amdgpu]               [k] optc1_get_position
     0.05%     -0.03%  libc.so.6              [.] sem_wait@@GLIBC_2.34
     0.22%     -0.02%  [vdso]                 [.] 0x00000000000005a0
     0.10%     -0.02%  libc.so.6              [.] __memcmp_evex_movbe
               +0.02%  [JIT] tid 8383         [.] 0x00007f2de0052823


> - Could it be an inconclusive bisection?

I checked twice:
[6.7] - 83 FPS
[aaa2c9a97c22] - 111 FPS
[cc478e0b6bdf] - 64 FPS
[6.8-rc2 with patches] - 82 FPS


[6.7] https://i.postimg.cc/15yyzZBr/v6-7.png
[6.7 perf] https://mega.nz/file/QwJ3hbob#RslLFVYgz1SWMcPR3eF9uEpFuqxdgkwXSatWts-1wVA

[aaa2c9a97c22] https://i.postimg.cc/Sxv4VYhg/git-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.png
[aaa2c9a97c22 perf]
https://mega.nz/file/dwQxha4J#2_nBF6uNzY11VX-T-Lr_-60WIMrbl1YEvPgY4CuXqEc

[cc478e0b6bdf] https://i.postimg.cc/W3cQfMfw/git-cc478e0b6bdffd20561e1a07941a65f6c8962cab.png
[cc478e0b6bdf perf]
https://mega.nz/file/hl5kwLTC#_4Fg1KBXCnQ-8OElY7EYmPOoDG6ZeZYnKFjamWpklWw

[6.8-rc2 with patches] https://i.postimg.cc/26dPpVsR/v6-8-rc2-with-patches.png
[6.8-rc2 with patches perf]
https://mega.nz/file/NxgTAb4L#0KO_WU-svpDw60Y3148RZhELPcUtFg3_VCDzJqSyz34

-- 
Best Regards,
Mike Gavrilov.
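
The four benchmark figures listed above are easier to compare as relative
changes against the v6.7 result. A short Python sketch of that arithmetic
follows; the FPS values are taken from the message, while the names FPS and
percent_change are only illustrative.

```python
# Benchmark results reported above, in frames per second.
FPS = {
    "6.7": 83,
    "aaa2c9a97c22": 111,
    "cc478e0b6bdf": 64,
    "6.8-rc2 with patches": 82,
}


def percent_change(value, baseline):
    """Relative change of value against baseline, in percent (one decimal)."""
    return round((value - baseline) / baseline * 100, 1)


changes = {name: percent_change(fps, FPS["6.7"]) for name, fps in FPS.items()}
for name, change in changes.items():
    print(f"{name:>22}: {change:+.1f}%")
```

Relative to 6.7, aaa2c9a97c22 is roughly 34% faster, cc478e0b6bdf roughly 23%
slower, and 6.8-rc2 with the patches is within about 1%.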


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-02 16:35               ` Mikhail Gavrilov
@ 2024-02-02 16:47                 ` Marco Elver
  2024-02-02 17:19                   ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-02-02 16:47 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Fri, 2 Feb 2024 at 17:35, Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Fri, Feb 2, 2024 at 2:00 PM Marco Elver <elver@google.com> wrote:
> >
> > > Maybe we can try something else?
> >
> > That's strange - the patches at [1] definitely revert the change you
> > bisected to. It's possible there is some other strange side-effect. (I
> > assume that you are still running all this with a KASAN kernel.)
>
> Yes. build .config not changed between kernel builds.
>
> > Just so I understand it right:
> > You say before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab the
> > game's FPS were good. But that is strange, because at that point we're
> > already doing stackdepot refcounting, i.e. after commit
> > 773688a6cb24b0b3c2ba40354d883348a2befa38 which you reported as the
> > initial performance regression. The patches at [2] fixed that problem.
> >
> > So now it's unclear to me how the simple change in
> > cc478e0b6bdffd20561e1a07941a65f6c8962cab causes the performance
> > problem, when in fact this is already with KASAN stackdepot
> > refcounting enabled but without the performance fixes from [1] and
> > [2].
> >
> > [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/
> >
> > My questions now would be:
> > - What was the game's FPS in the last stable kernel (v6.7)?
>
> [6.7] - 83 FPS - 13060 frames during benchmark.
>
> > - Can you collect another set of performance profiles between good and
> > bad? Maybe it would show where the time in the kernel is spent.
>
> Yes,
> please look at [aaa2c9a97c22 perf] and [cc478e0b6bdf perf]
>
> > perf diff perf-git-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.data perf-git-cc478e0b6bdffd20561e1a07941a65f6c8962cab.data
> No kallsyms or vmlinux with build-id de2a040f828394c5ce34802389239c2a0668fcc7 was found
> No kallsyms or vmlinux with build-id 33ab1cd545f96f5ffc2a402a4c4cfa647fd727a0 was found
> # Event 'cycles:P'
> #
> # Baseline  Delta Abs  Shared Object          Symbol
> # ........  .........  .....................  ..................................
> #
>     48.48%    +21.75%  [kernel.kallsyms]      [k] 0xffffffff860065c0
>     36.13%    -16.49%  ShadowOfTheTombRaider  [.] 0x00000000001d7f5e
>      4.43%     -2.10%  libvulkan_radeon.so    [.] 0x000000000006b870
>      3.28%     -0.63%  libcef.so              [.] 0x00000000021720e0
>      1.11%     -0.53%  libc.so.6              [.] syscall
>      0.65%     -0.24%  libc.so.6              [.] __memmove_avx512_unaligned_erms
>      0.31%     -0.14%  libc.so.6              [.] __memset_avx512_unaligned_erms
>      0.26%     -0.13%  libm.so.6              [.] __powf_fma
>      0.20%     -0.10%  [amdgpu]               [k] amdgpu_bo_placement_from_domain
>      0.22%     -0.09%  [amdgpu]               [k] amdgpu_vram_mgr_compatible
>      0.67%     -0.09%  armada-drm_dri.so      [.] 0x00000000000192b4
>      0.15%     -0.08%  libc.so.6              [.] sem_post@GLIBC_2.2.5
>      0.16%     -0.07%  [amdgpu]               [k] amdgpu_vm_bo_update
>      0.14%     -0.07%  [amdgpu]               [k] amdgpu_bo_list_entry_cmp
>      0.13%     -0.06%  libm.so.6              [.] powf@GLIBC_2.2.5
>      0.14%     -0.06%  libMangoHud.so         [.] 0x000000000001c4c0
>      0.10%     -0.06%  libc.so.6              [.] __futex_abstimed_wait_common
>      0.19%     -0.05%  libGLESv2.so           [.] 0x0000000000160a11
>      0.07%     -0.04%  libc.so.6              [.] __new_sem_wait_slow64.constprop.0
>      0.10%     -0.04%  radeonsi_dri.so        [.] 0x0000000000019454
>      0.05%     -0.03%  [amdgpu]               [k] optc1_get_position
>      0.05%     -0.03%  libc.so.6              [.] sem_wait@@GLIBC_2.34
>      0.22%     -0.02%  [vdso]                 [.] 0x00000000000005a0
>      0.10%     -0.02%  libc.so.6              [.] __memcmp_evex_movbe
>                +0.02%  [JIT] tid 8383         [.] 0x00007f2de0052823
>
>
> > - Could it be an inconclusive bisection?
>
> I checked twice:
> [6.7] - 83 FPS
> [aaa2c9a97c22] - 111 FPS
> [cc478e0b6bdf] - 64 FPS
> [6.8-rc2 with patches] - 82 FPS
>
>
> [6.7] https://i.postimg.cc/15yyzZBr/v6-7.png
> [6.7 perf] https://mega.nz/file/QwJ3hbob#RslLFVYgz1SWMcPR3eF9uEpFuqxdgkwXSatWts-1wVA
>
> [aaa2c9a97c22] https://i.postimg.cc/Sxv4VYhg/git-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.png
> [aaa2c9a97c22 perf]
> https://mega.nz/file/dwQxha4J#2_nBF6uNzY11VX-T-Lr_-60WIMrbl1YEvPgY4CuXqEc
>
> [cc478e0b6bdf] https://i.postimg.cc/W3cQfMfw/git-cc478e0b6bdffd20561e1a07941a65f6c8962cab.png
> [cc478e0b6bdf perf]
> https://mega.nz/file/hl5kwLTC#_4Fg1KBXCnQ-8OElY7EYmPOoDG6ZeZYnKFjamWpklWw
>
> [6.8-rc2 with patches] https://i.postimg.cc/26dPpVsR/v6-8-rc2-with-patches.png
> [6.8-rc2 with patches perf]
> https://mega.nz/file/NxgTAb4L#0KO_WU-svpDw60Y3148RZhELPcUtFg3_VCDzJqSyz34

Thanks a lot for these results. There's definitely something strange
going on - I'll try to have a detailed look some time next week.

In the meantime, this much is clear: there does not seem to be a regression
between 6.7 and 6.8-rc with the patches, which is what I was
expecting. The fact that aaa2c9a97c22 is so much better could indicate
that until cc478e0b6bdf there was either a bug which turned something
into a no-op - or the memset()s were acting as some kind of
prefetching hint to the CPU, which in turn caused a significant
reduction in cache misses. I think at this point we're not trying to
fix a regression, because we're on par with 6.7, but trying to make
sense of this information so we can optimize the code properly rather
than by luck (though I'm not sure that is feasible). Hrm....


* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-02 16:47                 ` Marco Elver
@ 2024-02-02 17:19                   ` Marco Elver
  2024-02-02 20:14                     ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-02-02 17:19 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Fri, 2 Feb 2024 at 17:47, Marco Elver <elver@google.com> wrote:
>
> On Fri, 2 Feb 2024 at 17:35, Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
> >
> > On Fri, Feb 2, 2024 at 2:00 PM Marco Elver <elver@google.com> wrote:
> > >
> > > > Maybe we can try something else?
> > >
> > > That's strange - the patches at [1] definitely revert the change you
> > > bisected to. It's possible there is some other strange side-effect. (I
> > > assume that you are still running all this with a KASAN kernel.)
> >
> > Yes. The build .config did not change between kernel builds.
> >
> > > Just so I understand it right:
> > > You say before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab the
> > > game's FPS were good. But that is strange, because at that point we're
> > > already doing stackdepot refcounting, i.e. after commit
> > > 773688a6cb24b0b3c2ba40354d883348a2befa38 which you reported as the
> > > initial performance regression. The patches at [2] fixed that problem.
> > >
> > > So now it's unclear to me how the simple change in
> > > cc478e0b6bdffd20561e1a07941a65f6c8962cab causes the performance
> > > problem, when in fact this is already with KASAN stackdepot
> > > refcounting enabled but without the performance fixes from [1] and
> > > [2].
> > >
> > > [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/
> > >
> > > My questions now would be:
> > > - What was the game's FPS in the last stable kernel (v6.7)?
> >
> > [6.7] - 83 FPS - 13060 frames during benchmark.
> >
> > > - Can you collect another set of performance profiles between good and
> > > bad? Maybe it would show where the time in the kernel is spent.
> >
> > Yes,
> > please look at [aaa2c9a97c22 perf] and [cc478e0b6bdf perf]
> >
> > > perf diff perf-git-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.data perf-git-cc478e0b6bdffd20561e1a07941a65f6c8962cab.data
> > No kallsyms or vmlinux with build-id
> > de2a040f828394c5ce34802389239c2a0668fcc7 was found
> > No kallsyms or vmlinux with build-id
> > 33ab1cd545f96f5ffc2a402a4c4cfa647fd727a0 was found
> > # Event 'cycles:P'
> > #
> > # Baseline  Delta Abs  Shared Object          Symbol
> > # ........  .........  .....................  ..................................
> > #
> >     48.48%    +21.75%  [kernel.kallsyms]      [k] 0xffffffff860065c0
> >     36.13%    -16.49%  ShadowOfTheTombRaider  [.] 0x00000000001d7f5e
> >      4.43%     -2.10%  libvulkan_radeon.so    [.] 0x000000000006b870
> >      3.28%     -0.63%  libcef.so              [.] 0x00000000021720e0
> >      1.11%     -0.53%  libc.so.6              [.] syscall
> >      0.65%     -0.24%  libc.so.6              [.] __memmove_avx512_unaligned_erms
> >      0.31%     -0.14%  libc.so.6              [.] __memset_avx512_unaligned_erms
> >      0.26%     -0.13%  libm.so.6              [.] __powf_fma
> >      0.20%     -0.10%  [amdgpu]               [k] amdgpu_bo_placement_from_domain
> >      0.22%     -0.09%  [amdgpu]               [k] amdgpu_vram_mgr_compatible
> >      0.67%     -0.09%  armada-drm_dri.so      [.] 0x00000000000192b4
> >      0.15%     -0.08%  libc.so.6              [.] sem_post@GLIBC_2.2.5
> >      0.16%     -0.07%  [amdgpu]               [k] amdgpu_vm_bo_update
> >      0.14%     -0.07%  [amdgpu]               [k] amdgpu_bo_list_entry_cmp
> >      0.13%     -0.06%  libm.so.6              [.] powf@GLIBC_2.2.5
> >      0.14%     -0.06%  libMangoHud.so         [.] 0x000000000001c4c0
> >      0.10%     -0.06%  libc.so.6              [.] __futex_abstimed_wait_common
> >      0.19%     -0.05%  libGLESv2.so           [.] 0x0000000000160a11
> >      0.07%     -0.04%  libc.so.6              [.] __new_sem_wait_slow64.constprop.0
> >      0.10%     -0.04%  radeonsi_dri.so        [.] 0x0000000000019454
> >      0.05%     -0.03%  [amdgpu]               [k] optc1_get_position
> >      0.05%     -0.03%  libc.so.6              [.] sem_wait@@GLIBC_2.34
> >      0.22%     -0.02%  [vdso]                 [.] 0x00000000000005a0
> >      0.10%     -0.02%  libc.so.6              [.] __memcmp_evex_movbe
> >                +0.02%  [JIT] tid 8383         [.] 0x00007f2de0052823
> >
> >
> > > - Could it be an inconclusive bisection?
> >
> > I checked twice:
> > [6.7] - 83 FPS
> > [aaa2c9a97c22] - 111 FPS
> > [cc478e0b6bdf] - 64 FPS
> > [6.8-rc2 with patches] - 82 FPS
> >
> >
> > [6.7] https://i.postimg.cc/15yyzZBr/v6-7.png
> > [6.7 perf] https://mega.nz/file/QwJ3hbob#RslLFVYgz1SWMcPR3eF9uEpFuqxdgkwXSatWts-1wVA
> >
> > [aaa2c9a97c22] https://i.postimg.cc/Sxv4VYhg/git-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.png
> > [aaa2c9a97c22 perf]
> > https://mega.nz/file/dwQxha4J#2_nBF6uNzY11VX-T-Lr_-60WIMrbl1YEvPgY4CuXqEc
> >
> > [cc478e0b6bdf] https://i.postimg.cc/W3cQfMfw/git-cc478e0b6bdffd20561e1a07941a65f6c8962cab.png
> > [cc478e0b6bdf perf]
> > https://mega.nz/file/hl5kwLTC#_4Fg1KBXCnQ-8OElY7EYmPOoDG6ZeZYnKFjamWpklWw
> >
> > [6.8-rc2 with patches] https://i.postimg.cc/26dPpVsR/v6-8-rc2-with-patches.png
> > [6.8-rc2 with patches perf]
> > https://mega.nz/file/NxgTAb4L#0KO_WU-svpDw60Y3148RZhELPcUtFg3_VCDzJqSyz34
>
> Thanks a lot for these results. There's definitely something strange
> going on - I'll try to have a detailed look some time next week.
>
> In the meantime, this is clear: there does not seem to be a regression
> between 6.7 and 6.8-rc with the patches, which is what I was
> expecting. The fact that aaa2c9a97c22 is so much better could indicate
> that until cc478e0b6bdf there was either a bug which turned something
> into a no-op - or, the memsets() were acting as some kind of
> prefetching hint to the CPU, which in turn caused a significant
> reduction in cache misses. I think at this point we're not trying to
> fix a regression, because we're on par with 6.7, but trying to make
> sense of this information so that we can optimize the code properly
> rather than by luck (though I'm not sure that is feasible). Hrm....

Your config has lockdep enabled, right? Because cc478e0b6bdf was
fixing an issue with lockdep, does your kernel before that commit show
some lockdep errors? Because if lockdep encounters an error it usually
turns itself off right away, which would explain the improved
performance. :-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-02 17:19                   ` Marco Elver
@ 2024-02-02 20:14                     ` Mikhail Gavrilov
  2024-02-19  9:48                       ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-02-02 20:14 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

[-- Attachment #1: Type: text/plain, Size: 929 bytes --]

On Fri, Feb 2, 2024 at 10:20 PM Marco Elver <elver@google.com> wrote:
>
> Your config has lockdep enabled, right?

Yes.

> Because cc478e0b6bdf was fixing an issue with lockdep, does your kernel
> before that commit show some lockdep errors?

Let's check it, I attached the kernel log of aaa2c9a97c22.

mikhail@primary-ws ~> uname -r
6.7.0-c11-aaa2c9a97c22af5bf011f6dd8e0538219b45af88+
mikhail@primary-ws ~> sudo dmesg | grep lockdep
[sudo] password for mikhail:
[    3.115891] rcu: RCU lockdep checking is enabled.
[    3.125718] The code is fine but needs lockdep annotation, or maybe
[    3.125786]  ? lockdep_init_map_type+0x1a5/0x840
[   12.967789] INFO: lockdep is turned off.

> Because if lockdep encounters an error it usually
> turns itself off right away, which would explain the improved
> performance. :-)

You are right.
Thanks for digging into it!
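
For anyone repeating this check, the grep above can be wrapped in a small
helper. This is a minimal sketch: it keys on the "INFO: lockdep is turned
off." line shown in the log above, and the function name lockdep_state is
an illustrative assumption. The message is only printed when some later
report triggers it, so an "on" result is not proof that lockdep is still
active.

```shell
# Reads a kernel log on stdin and prints "off" if the log contains the
# "lockdep is turned off" notice, "on" otherwise.
# lockdep_state is a hypothetical helper name chosen for this sketch.
lockdep_state() {
    if grep -qF 'INFO: lockdep is turned off.'; then
        echo off
    else
        echo on
    fi
}

# Usage (reading the kernel log may require root):
#   sudo dmesg | lockdep_state
```

A result of "off" on a kernel used as the "good" side of a bisection means
its performance numbers are not comparable with kernels where lockdep kept
running.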

-- 
Best Regards,
Mike Gavrilov.

[-- Attachment #2: dmesg-aaa2c9a97c22af5bf011f6dd8e0538219b45af88.zip --]
[-- Type: application/zip, Size: 48088 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-02 20:14                     ` Mikhail Gavrilov
@ 2024-02-19  9:48                       ` Mikhail Gavrilov
  2024-02-19  9:52                         ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-02-19  9:48 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Andrew Morton, Linux List Kernel Mailing,
	Linux Memory Management List

On Sat, Feb 3, 2024 at 1:14 AM Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> You are right.
> Thanks for digging into it!
>

The revert [2] is still not merged, at least as of 4f5e5092fdbf.
Is there any plan to merge it or find another approach?

[2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/

-- 
Best Regards,
Mike Gavrilov.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-19  9:48                       ` Mikhail Gavrilov
@ 2024-02-19  9:52                         ` Marco Elver
  2024-02-19 10:09                           ` Vlastimil Babka
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-02-19  9:52 UTC (permalink / raw)
  To: Mikhail Gavrilov, Andrew Morton
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Vlastimil Babka, Linux List Kernel Mailing,
	Linux Memory Management List

On Mon, 19 Feb 2024 at 10:48, Mikhail Gavrilov
<mikhail.v.gavrilov@gmail.com> wrote:
>
> On Sat, Feb 3, 2024 at 1:14 AM Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
> >
> > You are right.
> > Thanks for digging into it!
> >
>
> The revert [2] is still not merged, at least as of 4f5e5092fdbf.
> Is there any plan to merge it or find another approach?
>
> [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/

I think it's already in -mm and -next. It just takes time, which is a
good thing, after all we want to let -next testing confirm nothing is
wrong with it.

Andrew, is this planned for the next merge window or as a "hot fix"
for the current rc? Given it has the right "Fixes" tags it will make
it to stable kernels eventually, but I also think that the previous
"slow" version is almost unusable on big systems, so it may be
worthwhile considering the current rc.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-19  9:52                         ` Marco Elver
@ 2024-02-19 10:09                           ` Vlastimil Babka
  2024-02-19 23:28                             ` Andrew Morton
  0 siblings, 1 reply; 23+ messages in thread
From: Vlastimil Babka @ 2024-02-19 10:09 UTC (permalink / raw)
  To: Marco Elver, Mikhail Gavrilov, Andrew Morton
  Cc: Andrey Konovalov, glider, dvyukov, eugenis, Oscar Salvador,
	Linux List Kernel Mailing, Linux Memory Management List

On 2/19/24 10:52, Marco Elver wrote:
> On Mon, 19 Feb 2024 at 10:48, Mikhail Gavrilov
> <mikhail.v.gavrilov@gmail.com> wrote:
>>
>> On Sat, Feb 3, 2024 at 1:14 AM Mikhail Gavrilov
>> <mikhail.v.gavrilov@gmail.com> wrote:
>> >
>> > You are right.
>> > Thanks for digging into it!
>> >
>>
>> The revert [2] is still not merged, at least as of 4f5e5092fdbf.
>> Is there any plan to merge it or find another approach?
>>
>> [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/
> 
> I think it's already in -mm and -next. It just takes time, which is a
> good thing, after all we want to let -next testing confirm nothing is
> wrong with it.
> 
> Andrew, is this planned for the next merge window or as a "hot fix"
> for the current rc? Given it has the right "Fixes" tags it will make
> it to stable kernels eventually, but I also think that the previous
> "slow" version is almost unusable on big systems, so it may be
> worthwhile considering the current rc.

Yeah it would be best to fix in 6.8 to prevent regressions.

> Thanks,
> -- Marco


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-19 10:09                           ` Vlastimil Babka
@ 2024-02-19 23:28                             ` Andrew Morton
  2024-02-19 23:50                               ` Vlastimil Babka
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2024-02-19 23:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Marco Elver, Mikhail Gavrilov, Andrey Konovalov, glider, dvyukov,
	eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On Mon, 19 Feb 2024 11:09:23 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 2/19/24 10:52, Marco Elver wrote:
> > On Mon, 19 Feb 2024 at 10:48, Mikhail Gavrilov
> > <mikhail.v.gavrilov@gmail.com> wrote:
> >>
> >> On Sat, Feb 3, 2024 at 1:14 AM Mikhail Gavrilov
> >> <mikhail.v.gavrilov@gmail.com> wrote:
> >> >
> >> > You are right.
> >> > Thanks for digging into it!
> >> >
> >>
> >> The revert [2] is still not merged, at least as of 4f5e5092fdbf.
> >> Is there any plan to merge it or find another approach?
> >>
> >> [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/
> > 
> > I think it's already in -mm and -next. It just takes time, which is a
> > good thing, after all we want to let -next testing confirm nothing is
> > wrong with it.
> > 
> > Andrew, is this planned for the next merge window or as a "hot fix"
> > for the current rc? Given it has the right "Fixes" tags it will make
> > it to stable kernels eventually, but I also think that the previous
> > "slow" version is almost unusable on big systems, so it may be
> > worthwhile considering the current rc.
> 
> Yeah it would be best to fix in 6.8 to prevent regressions.
> 

I'm all confused.

4434a56ec209 ("stackdepot: make fast paths lock-less again") was
mainlined for v6.8-rc3.

That patch Fixed: 108be8def46e ("lib/stackdepot: allow users to evict
stack traces") which was mainlined for v6.8-rc1, so 4434a56ec209 did
not need a cc:stable?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-19 23:28                             ` Andrew Morton
@ 2024-02-19 23:50                               ` Vlastimil Babka
  2024-02-20  5:37                                 ` Mikhail Gavrilov
  0 siblings, 1 reply; 23+ messages in thread
From: Vlastimil Babka @ 2024-02-19 23:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Marco Elver, Mikhail Gavrilov, Andrey Konovalov, glider, dvyukov,
	eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List



On 2/20/24 00:28, Andrew Morton wrote:
> On Mon, 19 Feb 2024 11:09:23 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
> 
>> On 2/19/24 10:52, Marco Elver wrote:
>>> On Mon, 19 Feb 2024 at 10:48, Mikhail Gavrilov
>>> <mikhail.v.gavrilov@gmail.com> wrote:
>>>>
>>>> On Sat, Feb 3, 2024 at 1:14 AM Mikhail Gavrilov
>>>> <mikhail.v.gavrilov@gmail.com> wrote:
>>>>>
>>>>> You are right.
>>>>> Thanks for digging into it!
>>>>>
>>>>
>>>> The revert [2] is still not merged, at least as of 4f5e5092fdbf.
>>>> Is there any plan to merge it or find another approach?
>>>>
>>>> [2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@google.com/
>>>
>>> I think it's already in -mm and -next. It just takes time, which is a
>>> good thing, after all we want to let -next testing confirm nothing is
>>> wrong with it.
>>>
>>> Andrew, is this planned for the next merge window or as a "hot fix"
>>> for the current rc? Given it has the right "Fixes" tags it will make
>>> it to stable kernels eventually, but I also think that the previous
>>> "slow" version is almost unusable on big systems, so it may be
>>> worthwhile considering the current rc.
>>
>> Yeah it would be best to fix in 6.8 to prevent regressions.
>>
> 
> I'm all confused.
> 
> 4434a56ec209 ("stackdepot: make fast paths lock-less again") was
> mainlined for v6.8-rc3.

Uh sorry, I just trusted the info that it's not merged and didn't verify
it myself. Yeah, I can see it is there.

> That patch Fixed: 108be8def46e ("lib/stackdepot: allow users to evict
> stack traces") which was mainlined for v6.8-rc1, so 4434a56ec209 did
> not need a cc:stable?

That's right.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-19 23:50                               ` Vlastimil Babka
@ 2024-02-20  5:37                                 ` Mikhail Gavrilov
  2024-02-20 17:30                                   ` Andrew Morton
  0 siblings, 1 reply; 23+ messages in thread
From: Mikhail Gavrilov @ 2024-02-20  5:37 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Marco Elver, Andrey Konovalov, glider, dvyukov,
	eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On Tue, Feb 20, 2024 at 4:50 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > I'm all confused.
> >
> > 4434a56ec209 ("stackdepot: make fast paths lock-less again") was
> > mainlined for v6.8-rc3.
>
> Uh sorry, I just trusted the info that it's not merged and didn't verify
> it myself. Yeah, I can see it is there.
>

Wait, I am talking about these two patches, which are not merged yet:
[PATCH v2 1/2] stackdepot: use variable size records for non-evictable entries
[PATCH v2 2/2] kasan: revert eviction of stack traces in generic mode
https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/
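
Since a patch posted to the list has no mainline commit ID yet, one way to
check whether it has landed is to search the history for its subject line.
The sketch below assumes a local kernel checkout; the helper name
has_subject and the v6.7 lower bound in the usage comment are illustrative
assumptions.

```shell
# Succeeds if a line exactly equal to the given subject appears on stdin.
# Input is expected to be one commit subject per line.
# has_subject is a hypothetical helper name chosen for this sketch.
has_subject() {
    grep -qxF -- "$1"
}

# Usage, from a kernel checkout (4f5e5092fdbf is the commit mentioned earlier
# in this thread; v6.7 as the lower bound is an assumption):
#   git log --format=%s v6.7..4f5e5092fdbf |
#       has_subject 'kasan: revert eviction of stack traces in generic mode' &&
#       echo merged
```

Matching on the exact subject avoids confusing the two pending patches with
the earlier, already merged "stackdepot: make fast paths lock-less again".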

--
Best Regards,
Mike Gavrilov.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-20  5:37                                 ` Mikhail Gavrilov
@ 2024-02-20 17:30                                   ` Andrew Morton
  2024-02-20 18:16                                     ` Vlastimil Babka
  0 siblings, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2024-02-20 17:30 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Vlastimil Babka, Marco Elver, Andrey Konovalov, glider, dvyukov,
	eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On Tue, 20 Feb 2024 10:37:03 +0500 Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:

> On Tue, Feb 20, 2024 at 4:50 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> > >
> > > I'm all confused.
> > >
> > > 4434a56ec209 ("stackdepot: make fast paths lock-less again") was
> > > mainlined for v6.8-rc3.
> >
> > Uh sorry, I just trusted the info that it's not merged and didn't verify
> > it myself. Yeah, I can see it is there.
> >
> 
> Wait, I am talking about these two patches, which are not merged yet:
> [PATCH v2 1/2] stackdepot: use variable size records for non-evictable entries
> [PATCH v2 2/2] kasan: revert eviction of stack traces in generic mode
> https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/

I can move those into the 6.8-rc hotfixes queue, and it appears a
cc:stable will not be required.

However I'm not seeing anything in the changelogs to indicate that
we're fixing a dramatic performance regression, nor why that
regression is occurring.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-20 17:30                                   ` Andrew Morton
@ 2024-02-20 18:16                                     ` Vlastimil Babka
  2024-02-20 18:51                                       ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Vlastimil Babka @ 2024-02-20 18:16 UTC (permalink / raw)
  To: Andrew Morton, Mikhail Gavrilov
  Cc: Marco Elver, Andrey Konovalov, glider, dvyukov, eugenis,
	Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On 2/20/24 18:30, Andrew Morton wrote:
> On Tue, 20 Feb 2024 10:37:03 +0500 Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:
> 
>> On Tue, Feb 20, 2024 at 4:50 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>> > >
>> > > I'm all confused.
>> > >
>> > > 4434a56ec209 ("stackdepot: make fast paths lock-less again") was
>> > > mainlined for v6.8-rc3.
>> >
>> > Uh sorry, I just trusted the info that it's not merged and didn't verify
>> > it myself. Yeah, I can see it is there.
>> >
>> 
>> Wait, I am talking about these two patches, which are not merged yet:
>> [PATCH v2 1/2] stackdepot: use variable size records for non-evictable entries
>> [PATCH v2 2/2] kasan: revert eviction of stack traces in generic mode
>> https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/
> 
> I can move those into the 6.8-rc hotfixes queue, and it appears a
> cc:stable will not be required.
> 
> However I'm not seeing anything in the changelogs to indicate that
> we're fixing a dramatic performance regression, nor why that
> regression is occurring.

We also seem to have an unhappy bot with the 2/2 patch :/ although it's not yet
clear if it's a genuine issue.

https://lore.kernel.org/all/202402201506.b7e4b9b6-oliver.sang@intel.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-20 18:16                                     ` Vlastimil Babka
@ 2024-02-20 18:51                                       ` Marco Elver
  2024-02-26  9:25                                         ` Marco Elver
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-02-20 18:51 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mikhail Gavrilov, Andrey Konovalov, glider,
	dvyukov, eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On Tue, 20 Feb 2024 at 19:16, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 2/20/24 18:30, Andrew Morton wrote:
> > On Tue, 20 Feb 2024 10:37:03 +0500 Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:
> >
> >> On Tue, Feb 20, 2024 at 4:50 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >> > >
> >> > > I'm all confused.
> >> > >
> >> > > 4434a56ec209 ("stackdepot: make fast paths lock-less again") was
> >> > > mainlined for v6.8-rc3.
> >> >
> >> > Uh sorry, I just trusted the info that it's not merged and didn't verify
> >> > it myself. Yeah, I can see it is there.
> >> >
> >>
> >> Wait, I am talking about these two patches, which are not merged yet:
> >> [PATCH v2 1/2] stackdepot: use variable size records for non-evictable entries
> >> [PATCH v2 2/2] kasan: revert eviction of stack traces in generic mode
> >> https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/
> >
> > I can move those into the 6.8-rc hotfixes queue, and it appears a
> > cc:stable will not be required.
> >
> > However I'm not seeing anything in the changelogs to indicate that
> > we're fixing a dramatic performance regression, nor why that
> > regression is occurring.

It's primarily fixing a regression of memory usage overhead for
stackdepot users in general. Performance is mostly fixed, but patch
2/2 ("kasan: revert eviction of stack traces in generic mode") also
helps with KASAN performance because entries that were being
repeatedly evicted-then-reallocated are just allocated once and with
increasing system uptime the slow path will be taken much less.
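
The evict-then-reallocate effect described above can be illustrated with a
toy model. This is a sketch under stated assumptions, not the stack depot
implementation: each "+id" line stands for saving a stack trace, each "-id"
line for dropping a reference, and a "slow path" is counted whenever a trace
has to be inserted because no record for it is present. The function name
count_slow_paths and the event format are made up for this example.

```shell
# Toy model, not kernel code: count slow-path insertions for a stream of
# save ("+id") and put ("-id") events read from stdin.
# $1 = "evict" (drop a record when its refcount reaches 0) or "keep".
# count_slow_paths is a hypothetical helper name chosen for this sketch.
count_slow_paths() {
    awk -v mode="$1" '
        /^\+/ { id = substr($0, 2)
                # A record that is not present must be inserted (slow path).
                if (!(id in present)) { slow++; present[id] = 1 }
                refs[id]++ }
        /^-/  { id = substr($0, 2)
                if (refs[id] > 0) refs[id]--
                # With eviction, the last put removes the record again.
                if (mode == "evict" && refs[id] == 0) delete present[id] }
        END   { print slow + 0 }
    '
}

# The same trace saved and released three times in a row:
printf '%s\n' +a -a +a -a +a -a | count_slow_paths evict
printf '%s\n' +a -a +a -a +a -a | count_slow_paths keep
```

In this model the eviction variant pays for an insertion on every save,
while the variant without eviction pays once per distinct trace, which is
why the slow path becomes rarer as uptime grows.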

> We also seem to have an unhappy bot with the 2/2 patch :/ although it's not yet
> clear if it's a genuine issue.
>
> https://lore.kernel.org/all/202402201506.b7e4b9b6-oliver.sang@intel.com/

While it would be nice if 6.8 would not regress over 6.7 (performance
is mostly fixed, memory usage is not), waiting for confirmation of what
the rcutorture issue from the bot is about might be good.

Mikhail: since you are testing mainline, in about 4 weeks the fixes
should then reach 6.9-rc in the next merge window. Until then, if it's
not too difficult for you, you can apply those 2 patches in your own
tree.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-20 18:51                                       ` Marco Elver
@ 2024-02-26  9:25                                         ` Marco Elver
  2024-02-26 10:12                                           ` Vlastimil Babka
  0 siblings, 1 reply; 23+ messages in thread
From: Marco Elver @ 2024-02-26  9:25 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mikhail Gavrilov, Andrey Konovalov, glider,
	dvyukov, eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On Tue, 20 Feb 2024 at 19:51, Marco Elver <elver@google.com> wrote:
>
> On Tue, 20 Feb 2024 at 19:16, Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > On 2/20/24 18:30, Andrew Morton wrote:
> > > On Tue, 20 Feb 2024 10:37:03 +0500 Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> wrote:
> > >
> > >> On Tue, Feb 20, 2024 at 4:50 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> > >> > >
> > >> > > I'm all confused.
> > >> > >
> > >> > > 4434a56ec209 ("stackdepot: make fast paths lock-less again") was
> > >> > > mainlined for v6.8-rc3.
> > >> >
> > >> > Uh sorry, I just trusted the info that it's not merged and didn't verify
> > >> > it myself. Yeah, I can see it is there.
> > >> >
> > >>
> > >> Wait, I am talking about these two patches, which are not merged yet:
> > >> [PATCH v2 1/2] stackdepot: use variable size records for non-evictable entries
> > >> [PATCH v2 2/2] kasan: revert eviction of stack traces in generic mode
> > >> https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@google.com/
> > >
> > > I can move those into the 6.8-rc hotfixes queue, and it appears a
> > > cc:stable will not be required.
> > >
> > > However I'm not seeing anything in the changelogs to indicate that
> > > we're fixing a dramatic performance regression, nor why that
> > > regression is occurring.
>
> It's primarily fixing a regression of memory usage overhead for
> stackdepot users in general. Performance is mostly fixed, but patch
> 2/2 ("kasan: revert eviction of stack traces in generic mode") also
> helps with KASAN performance because entries that were being
> repeatedly evicted-then-reallocated are just allocated once and with
> increasing system uptime the slow path will be taken much less.
>
> > We also seem to have an unhappy bot with the 2/2 patch :/ although it's not yet
> > clear if it's a genuine issue.
> >
> > https://lore.kernel.org/all/202402201506.b7e4b9b6-oliver.sang@intel.com/

This was confirmed to be a non-bug by RCU devs.

> While it would be nice if 6.8 would not regress over 6.7 (performance
> is mostly fixed, memory usage is not), waiting for confirmation what
> the rcutorture issue from the bot is about might be good.
>
> Mikhail: since you are testing mainline, in about 4 weeks the fixes
> should then reach 6.9-rc in the next merge window. Until then, if it's
> not too difficult for you, you can apply those 2 patches in your own
> tree.

There are more issues that are fixed by "[PATCH v2 1/2] stackdepot:
use variable size records for non-evictable entries". See
https://lore.kernel.org/all/ZdxYXQdZDuuhcqiv@elver.google.com/

This will eventually reach stable, but it might be good to reconsider
mainlining it earlier.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load
  2024-02-26  9:25                                         ` Marco Elver
@ 2024-02-26 10:12                                           ` Vlastimil Babka
  0 siblings, 0 replies; 23+ messages in thread
From: Vlastimil Babka @ 2024-02-26 10:12 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Mikhail Gavrilov, Andrey Konovalov, glider,
	dvyukov, eugenis, Oscar Salvador, Linux List Kernel Mailing,
	Linux Memory Management List

On 2/26/24 10:25, Marco Elver wrote:
> On Tue, 20 Feb 2024 at 19:51, Marco Elver <elver@google.com> wrote:
>>
>> While it would be nice if 6.8 would not regress over 6.7 (performance
>> is mostly fixed, memory usage is not), waiting for confirmation what
>> the rcutorture issue from the bot is about might be good.
>>
>> Mikhail: since you are testing mainline, in about 4 weeks the fixes
>> should then reach 6.9-rc in the next merge window. Until then, if it's
>> not too difficult for you, you can apply those 2 patches in your own
>> tree.
> 
> There are more issues that are fixed by "[PATCH v2 1/2] stackdepot:
> use variable size records for non-evictable entries". See
> https://lore.kernel.org/all/ZdxYXQdZDuuhcqiv@elver.google.com/
> 
> This will eventually reach stable, but it might be good to reconsider
> mainlining it earlier.

I believe I can see that patch, together with "kasan: revert eviction of
stack traces in generic mode" in mm-hotfixes-stable so it should be on track
for 6.8.

> Thanks,
> -- Marco


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-02-26 10:12 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-19 10:46 regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load Mikhail Gavrilov
2024-01-19 10:54 ` Marco Elver
2024-01-19 10:59   ` Marco Elver
2024-01-19 17:54     ` Mikhail Gavrilov
2024-01-29 22:25       ` Mikhail Gavrilov
2024-01-29 23:14         ` Andrey Konovalov
2024-02-01 22:08           ` Mikhail Gavrilov
2024-02-02  9:00             ` Marco Elver
2024-02-02 16:35               ` Mikhail Gavrilov
2024-02-02 16:47                 ` Marco Elver
2024-02-02 17:19                   ` Marco Elver
2024-02-02 20:14                     ` Mikhail Gavrilov
2024-02-19  9:48                       ` Mikhail Gavrilov
2024-02-19  9:52                         ` Marco Elver
2024-02-19 10:09                           ` Vlastimil Babka
2024-02-19 23:28                             ` Andrew Morton
2024-02-19 23:50                               ` Vlastimil Babka
2024-02-20  5:37                                 ` Mikhail Gavrilov
2024-02-20 17:30                                   ` Andrew Morton
2024-02-20 18:16                                     ` Vlastimil Babka
2024-02-20 18:51                                       ` Marco Elver
2024-02-26  9:25                                         ` Marco Elver
2024-02-26 10:12                                           ` Vlastimil Babka
