From: Yegor Yefremov <yegorslists@googlemail.com>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>, Tony Lindgren <tony@atomide.com>,
	Linux-OMAP <linux-omap@vger.kernel.org>,
	linux-clk <linux-clk@vger.kernel.org>,
	Stephen Boyd <sboyd@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: am335x: 5.18.x: system stalling
Date: Wed, 1 Jun 2022 12:46:30 +0200
Message-ID: <CAGm1_ks8g3RNwOkC8C_B2eYz56cEA7L-6CRdmqmNwSvAg-JP_g@mail.gmail.com>
In-Reply-To: <CAMj1kXHUoDQ0xZ4yBx9uT6D9=6xfOsJoWLoOKho_-=Z9uYS30w@mail.gmail.com>

On Wed, Jun 1, 2022 at 12:06 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 1 Jun 2022 at 12:04, Yegor Yefremov <yegorslists@googlemail.com> wrote:
> >
> > On Wed, Jun 1, 2022 at 11:28 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Wed, 1 Jun 2022 at 10:08, Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Wed, 1 Jun 2022 at 09:59, Arnd Bergmann <arnd@arndb.de> wrote:
> > > > >
> > > > > On Wed, Jun 1, 2022 at 9:36 AM Yegor Yefremov
> > > > > <yegorslists@googlemail.com> wrote:
> > > > > > On Tue, May 31, 2022 at 5:23 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > > > > I've pushed a modified branch now, with that fix on the broken commit,
> > > > > > > and another change to make CONFIG_IRQSTACKS user-selectable rather
> > > > > > > than always enabled. That should tell us if the problem is in the SMP
> > > > > > > patching or in the irqstacks.
> > > > > > >
> > > > > > > Can you test the top of this branch with CONFIG_IRQSTACKS disabled,
> > > > > > > and (if that still stalls) retest the fixed commit f0191ea5c2e5 ("[PART 1]
> > > > > > > ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems")?
> > > > > >
> > > > > > 1. the top of this branch with CONFIG_IRQSTACKS disabled stalls
> > > > > > > 2. f0191ea5c2e5 with the same config does not stall
> > > > >
> > > > > Ok, perfect, that does narrow down the problem quite a bit: The final
> > > > > patch has seven changes, all of which can be done individually because
> > > > > in each case the simplified version in f0191ea5c2e5 is meant to run
> > > > > the exact same instructions as the version after the change, when running
> > > > > on a uniprocessor machine such as your am335x.
> > > > >
> > > > > You have already shown earlier that the get_current() and
> > > > > __my_cpu_offset() functions are not to blame here, as reverting
> > > > > only those does not change the behavior.
> > > > >
> > > > > This leaves the is_smp() check in set_current(), and the
> > > > > four macros in <asm/assembler.h>. I don't see anything obviously
> > > > > wrong with any of those five, but I would bet on the macros
> > > > > here. Can you try bisecting into this commit, maybe reverting
> > > > > the changes to set_current and get_current first, and then
> > > > > narrowing it down to (hopefully) a single macro that causes the
> > > > > problem?
> > > > >
> > > >
> > > > set_current() is never called by the primary CPU, which is why the
> > > > is_smp() check was removed from there in 57a420435edcb0b94 ("ARM: drop
> > > > pointless SMP check on secondary startup path").
> > > >
> > > > So that leaves only the four macros in asm/assembler.h, but I don't
> > > > see anything obviously wrong with those either.
> > >
> > > I pushed a patch on top of Arnd's branch at the link below that gets
> > > rid of the subsections, and uses normal branches (and code patching)
> > > to switch between the thread ID register and the LDR to retrieve the
> > > CPU offset and the current pointer. I have no explanation whether or
> > > why it could make a difference, but I think it's worth a try.
> >
> > The link to your repo is missing.
> >
>
> Oops, sorry :-)
>
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=am335x-stall-test

I have tested your branch and it stalls:

[   69.924298] rcu: INFO: rcu_sched self-detected stall on CPU
[   69.930986] rcu:     0-...!: (2600 ticks this GP) idle=6f5/1/0x40000004 softirq=2257/2257 fqs=0
[   69.940551]  (t=2600 jiffies g=3413 q=11)
[   69.945187] rcu: rcu_sched kthread timer wakeup didn't happen for 2599 jiffies! g3413 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[   69.957111] rcu:     Possible timer handling issue on cpu=0 timer-softirq=1261
[   69.964668] rcu: rcu_sched kthread starved for 2600 jiffies! g3413 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
[   69.975638] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[   69.985170] rcu: RCU grace-period kthread stack dump:
[   69.990708] task:rcu_sched       state:I stack:    0 pid:   10 ppid:     2 flags:0x00000000
[   70.000250] [<c0b683b4>] (__schedule) from [<c0b68cf8>] (schedule+0x54/0xe8)
[   70.008705] [<c0b68cf8>] (schedule) from [<c0b6f4fc>] (schedule_timeout+0xa8/0x210)
[   70.017449] [<c0b6f4fc>] (schedule_timeout) from [<c01d8594>] (rcu_gp_fqs_loop+0x118/0x6b4)
[   70.026875] [<c01d8594>] (rcu_gp_fqs_loop) from [<c01dc4c4>] (rcu_gp_kthread+0x138/0x30c)
[   70.036074] [<c01dc4c4>] (rcu_gp_kthread) from [<c0164dd8>] (kthread+0x13c/0x164)
[   70.044559] [<c0164dd8>] (kthread) from [<c0100150>] (ret_from_fork+0x14/0x44)
[   70.052732] rcu: Stack dump where RCU GP kthread last ran:
[   70.058773] NMI backtrace for cpu 0
[   70.062840] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.16.0-rc1 #1
[   70.070003] Hardware name: Generic AM33XX (Flattened Device Tree)
[   70.076698] Workqueue: events dbs_work_handler
[   70.082258] [<c01115f0>] (unwind_backtrace) from [<c010bfd4>] (show_stack+0x10/0x14)
[   70.091113] [<c010bfd4>] (show_stack) from [<d00299f0>] (0xd00299f0)
[   70.099045] NMI backtrace for cpu 0
[   70.103188] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.16.0-rc1 #1
[   70.110357] Hardware name: Generic AM33XX (Flattened Device Tree)
[   70.117027] Workqueue: events dbs_work_handler
[   70.122491] [<c01115f0>] (unwind_backtrace) from [<c010bfd4>] (show_stack+0x10/0x14)
[   70.131254] [<c010bfd4>] (show_stack) from [<d00299f0>] (0xd00299f0)
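
In case it helps when comparing stall reports from the different test
kernels, here is a minimal sketch of a script that pulls a few fields out
of a report like the one in this message. The helper name summarize_stall
and the choice of fields are only assumptions for illustration, and the
sample text is three lines copied from the log in this message:

```python
import re

# Three lines copied from the stall report in this message
# (mail-wrapped lines rejoined).
SAMPLE = """\
[   69.924298] rcu: INFO: rcu_sched self-detected stall on CPU
[   69.930986] rcu:     0-...!: (2600 ticks this GP) idle=6f5/1/0x40000004 softirq=2257/2257 fqs=0
[   69.964668] rcu: rcu_sched kthread starved for 2600 jiffies! g3413 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
"""

# Hypothetical field names mapped to the patterns that locate them.
PATTERNS = {
    "ticks": r"\((\d+) ticks this GP\)",
    "fqs": r"\bfqs=(\d+)",
    "starved_jiffies": r"kthread starved for (\d+) jiffies",
    "cpu": r"->cpu=(\d+)",
}


def summarize_stall(log: str) -> dict:
    """Return the integer fields found in an RCU stall report.

    Fields whose pattern does not match are simply left out.
    """
    summary = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, log)
        if match:
            summary[name] = int(match.group(1))
    return summary


if __name__ == "__main__":
    print(summarize_stall(SAMPLE))
```

Running it over the dmesg output of each test build would give one short
summary per boot, which may be easier to compare than the full traces.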

