From: Arnd Bergmann <arnd@arndb.de>
To: Tony Lindgren <tony@atomide.com>
Cc: Yegor Yefremov <yegorslists@googlemail.com>,
    Ard Biesheuvel <ardb@kernel.org>,
    Arnd Bergmann <arnd@arndb.de>,
    Linux-OMAP <linux-omap@vger.kernel.org>,
    linux-clk <linux-clk@vger.kernel.org>,
    Stephen Boyd <sboyd@kernel.org>,
    Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: am335x: 5.18.x: system stalling
Date: Thu, 12 May 2022 10:14:15 +0200
Message-ID: <CAK8P3a3817c8JMd=vqCjmY_kvBshhzSetgMfEihZ-NdcVZgJpQ@mail.gmail.com>
In-Reply-To: <Ynyd9HeFNmGQiovY@atomide.com>

On Thu, May 12, 2022 at 7:41 AM Tony Lindgren <tony@atomide.com> wrote:
> Adding Ard and Arnd for vmap stack.

Thanks!

> * Yegor Yefremov <yegorslists@googlemail.com> [220511 14:16]:
> > On Thu, May 5, 2022 at 7:08 AM Tony Lindgren <tony@atomide.com> wrote:
> > > * Yegor Yefremov <yegorslists@googlemail.com> [220504 10:35]:
> > Maybe Ard and Arnd have some ideas what might be going wrong here.
> Basically anything trying to use a physical address on stack will
> fail in weird ways like we've seen for smc and wl1251.

For this, the first step should be to enable CONFIG_DMA_API_DEBUG.
If any device is getting the wrong DMA address for a stack variable,
this should print a helpful debug message to the console.

> > > > [ 88.408578] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [ 88.415777] (detected by 0, t=2602 jiffies, g=2529, q=17)
> > > > [ 88.422026] rcu: All QSes seen, last rcu_sched kthread activity
> > > > 2602 (-21160--23762), jiffies_till_next_fqs=1, root ->qsmask 0x0
> > > > [ 88.434445] rcu: rcu_sched kthread starved for 2602 jiffies! g2529
> > > > f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > > [ 88.445274] rcu: Unless rcu_sched kthread gets sufficient CPU
> > > > time, OOM is now expected behavior.
> > > > [ 88.454859] rcu: RCU grace-period kthread stack dump:

I looked for a smoking gun in the backtrace but didn't really find
anything, so I'm guessing the problem is something that happened
between the last timer tick and the time the rcu_gp_kthread actually
ran, maybe some DMA timeout in a device driver running with
interrupts disabled.

> > > > [ 88.807588] omap3_noncore_dpll_program from clk_change_rate+0x23c/0x4f8
> > > > [ 88.815375] clk_change_rate from clk_core_set_rate_nolock+0x1b0/0x29c
> > > > [ 88.822936] clk_core_set_rate_nolock from clk_set_rate+0x30/0x64
> > > > [ 88.830056] clk_set_rate from _set_opp+0x254/0x51c
> > > > [ 88.835835] _set_opp from dev_pm_opp_set_rate+0xec/0x228
> > > > [ 88.842073] dev_pm_opp_set_rate from __cpufreq_driver_target+0x584/0x700
> > > > [ 88.849792] __cpufreq_driver_target from od_dbs_update+0xb4/0x168
> > > > [ 88.856953] od_dbs_update from dbs_work_handler+0x2c/0x60
> > > > [ 88.863441] dbs_work_handler from process_one_work+0x284/0x72c
> > > > [ 88.870411] process_one_work from worker_thread+0x28/0x4b0
> > > > [ 88.876973] worker_thread from kthread+0xe4/0x104
> > > > [ 88.882692] kthread from ret_from_fork+0x14/0x28

The only thing I see that is slightly unusual here is that the timer
tick happened exactly during the cpufreq transition. Is this always
the same backtrace when you run into the bug? What happens when you
disable the omap3 cpufreq driver or set it to run at a fixed
frequency?

        Arnd
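[For readers following along: a minimal sketch of turning on the DMA
API debugging suggested above, assuming a standard kernel source tree
and a board with debugfs mounted. The scripts/config invocation and
the debugfs path are the usual upstream ones; check them against your
kernel version.]

```shell
# In the kernel source tree: enable DMA API debugging and rebuild.
./scripts/config --enable DMA_API_DEBUG
make olddefconfig

# After booting the new kernel, violations are printed to the console
# (dmesg); a running error count is also exposed via debugfs:
cat /sys/kernel/debug/dma-api/error_count
```

A mapping of a stack (vmalloc) address should then trigger a warning
such as "DMA-API: device driver maps memory from stack" pointing at
the offending driver.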
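[As a sketch of the "fixed frequency" experiment: the OPP can be
pinned from userspace through the standard cpufreq sysfs interface,
without rebuilding the kernel. The 600000 kHz value below is only an
example; pick one from the board's own list. Run as root on the
target.]

```shell
# Pin cpu0 to one OPP so od_dbs_update never triggers a transition.
cd /sys/devices/system/cpu/cpu0/cpufreq
cat scaling_available_frequencies      # list of valid values, in kHz
echo userspace > scaling_governor      # take the ondemand governor out
echo 600000 > scaling_setspeed         # example value; use one from the list
cat scaling_cur_freq                   # confirm the frequency is pinned
```

If the RCU stalls disappear with the frequency pinned, that points at
the DPLL reprogramming path in the backtrace above.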