From: Ionela Voinescu <ionela.voinescu@arm.com> To: Jerome Brunet <jbrunet@baylibre.com> Cc: Jassi Brar <jassisinghbrar@gmail.com>, Kevin Hilman <khilman@baylibre.com>, linux-amlogic@lists.infradead.org, Da Xue <da@libre.computer>, linux-kernel@vger.kernel.org Subject: Re: [PATCH] mailbox: cancel timer before starting it Date: Thu, 15 Oct 2020 14:46:28 +0100 [thread overview] Message-ID: <20201015134628.GA11989@arm.com> (raw) In-Reply-To: <20200923123916.1115962-1-jbrunet@baylibre.com> Hi guys, On Wednesday 23 Sep 2020 at 14:39:16 (+0200), Jerome Brunet wrote: > If the txdone is done by polling, it is possible for msg_submit() to start > the timer while txdone_hrtimer() callback is running. If the timer needs > recheduling, it could already be enqueued by the time hrtimer_forward_now() > is called, leading hrtimer to loudly complain. > > WARNING: CPU: 3 PID: 74 at kernel/time/hrtimer.c:932 hrtimer_forward+0xc4/0x110 > CPU: 3 PID: 74 Comm: kworker/u8:1 Not tainted 5.9.0-rc2-00236-gd3520067d01c-dirty #5 > Hardware name: Libre Computer AML-S805X-AC (DT) > Workqueue: events_freezable_power_ thermal_zone_device_check > pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--) > pc : hrtimer_forward+0xc4/0x110 > lr : txdone_hrtimer+0xf8/0x118 > [...] > > Canceling the timer before starting it ensure that the timer callback is > not running when the timer is started, solving this race condition. > > Fixes: 0cc67945ea59 ("mailbox: switch to hrtimer for tx_complete polling") > Reported-by: Da Xue <da@libre.computer> > Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> > --- > drivers/mailbox/mailbox.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c > index 0b821a5b2db8..34f9ab01caef 100644 > --- a/drivers/mailbox/mailbox.c > +++ b/drivers/mailbox/mailbox.c > @@ -82,9 +82,13 @@ static void msg_submit(struct mbox_chan *chan) > exit: > spin_unlock_irqrestore(&chan->lock, flags); > > - if (!err && (chan->txdone_method & TXDONE_BY_POLL)) > - /* kick start the timer immediately to avoid delays */ > + if (!err && (chan->txdone_method & TXDONE_BY_POLL)) { > + /* Disable the timer if already active ... */ > + hrtimer_cancel(&chan->mbox->poll_hrt); > + > + /* ... and kick start it immediately to avoid delays */ > hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL); > + } > } > > static void tx_tick(struct mbox_chan *chan, int r) I've tracked a regression back to this commit. Details to reproduce: - HEAD: (linux-next) * 62c04453381e Jerome Brunet 3 weeks ago mailbox: cancel timer before starting it - Platform: arm64 Juno R0 and Juno R2 [1] - Partial log: [ 0.000000] Booting Linux on physical CPU 0x0000000100 [0x410fd030] [ 0.000000] Linux version 5.9.0-rc8-01722-g62c04453381e () (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 9.2.1 20191025, GNU ld (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 2.33.1.20191209) #175 SMP PREEMPT Thu Oct 15 14:17:41 BST 2020 [ 0.000000] Machine model: ARM Juno development board (r0) [..] [ 1.714340] mhu 2b1f0000.mhu: ARM MHU Mailbox registered [ 1.722768] NET: Registered protocol family 17 [ 1.727364] 9pnet: Installing 9P2000 support [ 1.731689] Key type dns_resolver registered [ 1.735474] usb 1-1: new high-speed USB device number 2 using ehci-platform [ 1.736407] registered taskstats version 1 [ 1.747061] Loading compiled-in X.509 certificates [ 1.755885] scpi_protocol scpi: SCP Protocol 1.2 Firmware 1.21.0 version [ 1.770484] cpu cpu0: EM: created perf domain [ 1.778505] cpu cpu1: EM: created perf domain [ 1.807449] scpi_clocks scpi:clocks: failed to register clock 'pxlclk' [ 1.897593] hub 1-1:1.0: USB hub found [ 1.901656] hub 1-1:1.0: 4 ports detected [ 2.559453] atkbd serio0: keyboard reset failed on 1c060000.kmi [ 22.787431] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 22.793536] rcu: 1-...0: (1 ticks this GP) idle=222/1/0x4000000000000002 softirq=63/64 fqs=2626 [ 22.802421] (detected by 2, t=5255 jiffies, g=-991, q=9) [ 22.807823] Task dump for CPU 1: [ 22.811049] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 22.820980] Call trace: [ 22.823429] __switch_to+0x138/0x198 [ 23.583444] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 5443 jiffies s: 49 root: 0x2/. [ 23.593995] rcu: blocking rcu_node structures: [ 23.598449] Task dump for CPU 1: [ 23.601680] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 23.611619] Call trace: [ 23.614064] __switch_to+0x138/0x198 [ 85.807430] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 85.813534] rcu: 1-...0: (1 ticks this GP) idle=222/1/0x4000000000000002 softirq=63/64 fqs=10502 [ 85.822506] (detected by 2, t=21009 jiffies, g=-991, q=9) [ 85.827994] Task dump for CPU 1: [ 85.831220] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 85.841150] Call trace: [ 85.843596] __switch_to+0x138/0x198 [ 87.071446] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 21315 jiffies s: 49 root: 0x2/. [ 87.082088] rcu: blocking rcu_node structures: [ 87.086540] Task dump for CPU 1: [ 87.089773] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 87.099708] Call trace: [ 87.102155] __switch_to+0x138/0x198 [ 148.827442] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 148.833565] rcu: 1-...0: (1 ticks this GP) idle=222/1/0x4000000000000002 softirq=63/64 fqs=18377 [ 148.842543] (detected by 4, t=36762 jiffies, g=-991, q=9) [ 148.848037] Task dump for CPU 1: [ 148.851268] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 148.861207] Call trace: [ 148.863663] __switch_to+0x138/0x198 [ 150.559443] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 37187 jiffies s: 49 root: 0x2/. [ 150.570082] rcu: blocking rcu_node structures: [ 150.574535] Task dump for CPU 1: [ 150.577767] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 150.587705] Call trace: [ 150.590151] __switch_to+0x138/0x198 - Commit working as expected: * 558e4c36ec9f Krzysztof Kozlowski 7 weeks ago maiblox: mediatek: Fix handling of platform_get_irq() error [1] https://developer.arm.com/tools-and-software/development-boards/juno-development-board Thank you, Ionela. > -- > 2.25.4 > > > _______________________________________________ > linux-amlogic mailing list > linux-amlogic@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-amlogic
WARNING: multiple messages have this Message-ID (diff)
From: Ionela Voinescu <ionela.voinescu@arm.com> To: Jerome Brunet <jbrunet@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com>, Jassi Brar <jassisinghbrar@gmail.com>, Da Xue <da@libre.computer>, linux-kernel@vger.kernel.org, linux-amlogic@lists.infradead.org Subject: Re: [PATCH] mailbox: cancel timer before starting it Date: Thu, 15 Oct 2020 14:46:28 +0100 [thread overview] Message-ID: <20201015134628.GA11989@arm.com> (raw) In-Reply-To: <20200923123916.1115962-1-jbrunet@baylibre.com> Hi guys, On Wednesday 23 Sep 2020 at 14:39:16 (+0200), Jerome Brunet wrote: > If the txdone is done by polling, it is possible for msg_submit() to start > the timer while txdone_hrtimer() callback is running. If the timer needs > recheduling, it could already be enqueued by the time hrtimer_forward_now() > is called, leading hrtimer to loudly complain. > > WARNING: CPU: 3 PID: 74 at kernel/time/hrtimer.c:932 hrtimer_forward+0xc4/0x110 > CPU: 3 PID: 74 Comm: kworker/u8:1 Not tainted 5.9.0-rc2-00236-gd3520067d01c-dirty #5 > Hardware name: Libre Computer AML-S805X-AC (DT) > Workqueue: events_freezable_power_ thermal_zone_device_check > pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--) > pc : hrtimer_forward+0xc4/0x110 > lr : txdone_hrtimer+0xf8/0x118 > [...] > > Canceling the timer before starting it ensure that the timer callback is > not running when the timer is started, solving this race condition. > > Fixes: 0cc67945ea59 ("mailbox: switch to hrtimer for tx_complete polling") > Reported-by: Da Xue <da@libre.computer> > Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> > --- > drivers/mailbox/mailbox.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c > index 0b821a5b2db8..34f9ab01caef 100644 > --- a/drivers/mailbox/mailbox.c > +++ b/drivers/mailbox/mailbox.c > @@ -82,9 +82,13 @@ static void msg_submit(struct mbox_chan *chan) > exit: > spin_unlock_irqrestore(&chan->lock, flags); > > - if (!err && (chan->txdone_method & TXDONE_BY_POLL)) > - /* kick start the timer immediately to avoid delays */ > + if (!err && (chan->txdone_method & TXDONE_BY_POLL)) { > + /* Disable the timer if already active ... */ > + hrtimer_cancel(&chan->mbox->poll_hrt); > + > + /* ... and kick start it immediately to avoid delays */ > hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL); > + } > } > > static void tx_tick(struct mbox_chan *chan, int r) I've tracked a regression back to this commit. Details to reproduce: - HEAD: (linux-next) * 62c04453381e Jerome Brunet 3 weeks ago mailbox: cancel timer before starting it - Platform: arm64 Juno R0 and Juno R2 [1] - Partial log: [ 0.000000] Booting Linux on physical CPU 0x0000000100 [0x410fd030] [ 0.000000] Linux version 5.9.0-rc8-01722-g62c04453381e () (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 9.2.1 20191025, GNU ld (GNU Toolchain for the A-profile Architecture 9.2-2019.12 (arm-9.10)) 2.33.1.20191209) #175 SMP PREEMPT Thu Oct 15 14:17:41 BST 2020 [ 0.000000] Machine model: ARM Juno development board (r0) [..] [ 1.714340] mhu 2b1f0000.mhu: ARM MHU Mailbox registered [ 1.722768] NET: Registered protocol family 17 [ 1.727364] 9pnet: Installing 9P2000 support [ 1.731689] Key type dns_resolver registered [ 1.735474] usb 1-1: new high-speed USB device number 2 using ehci-platform [ 1.736407] registered taskstats version 1 [ 1.747061] Loading compiled-in X.509 certificates [ 1.755885] scpi_protocol scpi: SCP Protocol 1.2 Firmware 1.21.0 version [ 1.770484] cpu cpu0: EM: created perf domain [ 1.778505] cpu cpu1: EM: created perf domain [ 1.807449] scpi_clocks scpi:clocks: failed to register clock 'pxlclk' [ 1.897593] hub 1-1:1.0: USB hub found [ 1.901656] hub 1-1:1.0: 4 ports detected [ 2.559453] atkbd serio0: keyboard reset failed on 1c060000.kmi [ 22.787431] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 22.793536] rcu: 1-...0: (1 ticks this GP) idle=222/1/0x4000000000000002 softirq=63/64 fqs=2626 [ 22.802421] (detected by 2, t=5255 jiffies, g=-991, q=9) [ 22.807823] Task dump for CPU 1: [ 22.811049] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 22.820980] Call trace: [ 22.823429] __switch_to+0x138/0x198 [ 23.583444] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 5443 jiffies s: 49 root: 0x2/. [ 23.593995] rcu: blocking rcu_node structures: [ 23.598449] Task dump for CPU 1: [ 23.601680] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 23.611619] Call trace: [ 23.614064] __switch_to+0x138/0x198 [ 85.807430] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 85.813534] rcu: 1-...0: (1 ticks this GP) idle=222/1/0x4000000000000002 softirq=63/64 fqs=10502 [ 85.822506] (detected by 2, t=21009 jiffies, g=-991, q=9) [ 85.827994] Task dump for CPU 1: [ 85.831220] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 85.841150] Call trace: [ 85.843596] __switch_to+0x138/0x198 [ 87.071446] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 21315 jiffies s: 49 root: 0x2/. [ 87.082088] rcu: blocking rcu_node structures: [ 87.086540] Task dump for CPU 1: [ 87.089773] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 87.099708] Call trace: [ 87.102155] __switch_to+0x138/0x198 [ 148.827442] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 148.833565] rcu: 1-...0: (1 ticks this GP) idle=222/1/0x4000000000000002 softirq=63/64 fqs=18377 [ 148.842543] (detected by 4, t=36762 jiffies, g=-991, q=9) [ 148.848037] Task dump for CPU 1: [ 148.851268] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 148.861207] Call trace: [ 148.863663] __switch_to+0x138/0x198 [ 150.559443] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-... } 37187 jiffies s: 49 root: 0x2/. [ 150.570082] rcu: blocking rcu_node structures: [ 150.574535] Task dump for CPU 1: [ 150.577767] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x0000002a [ 150.587705] Call trace: [ 150.590151] __switch_to+0x138/0x198 - Commit working as expected: * 558e4c36ec9f Krzysztof Kozlowski 7 weeks ago maiblox: mediatek: Fix handling of platform_get_irq() error [1] https://developer.arm.com/tools-and-software/development-boards/juno-development-board Thank you, Ionela. > -- > 2.25.4 > > > _______________________________________________ > linux-amlogic mailing list > linux-amlogic@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-amlogic _______________________________________________ linux-amlogic mailing list linux-amlogic@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-amlogic
next prev parent reply other threads:[~2020-10-15 13:46 UTC|newest] Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-09-23 12:39 [PATCH] mailbox: cancel timer before starting it Jerome Brunet 2020-09-23 12:39 ` Jerome Brunet 2020-09-24 16:50 ` Kevin Hilman 2020-09-24 16:50 ` Kevin Hilman 2020-10-15 13:46 ` Ionela Voinescu [this message] 2020-10-15 13:46 ` Ionela Voinescu 2020-10-15 13:58 ` Jerome Brunet 2020-10-15 13:58 ` Jerome Brunet 2020-10-15 14:29 ` Ionela Voinescu 2020-10-15 14:29 ` Ionela Voinescu 2020-10-16 8:44 ` Sudeep Holla 2020-10-16 8:44 ` Sudeep Holla 2020-10-16 9:02 ` Jerome Brunet 2020-10-16 9:02 ` Jerome Brunet 2020-10-16 9:34 ` Sudeep Holla 2020-10-16 9:34 ` Sudeep Holla 2020-10-16 11:47 ` Sudeep Holla 2020-10-16 11:47 ` Sudeep Holla 2020-10-15 18:45 ` Jassi Brar 2020-10-15 18:45 ` Jassi Brar 2020-10-16 8:52 ` Ionela Voinescu 2020-10-16 8:52 ` Ionela Voinescu 2020-10-16 9:00 ` Jerome Brunet 2020-10-16 9:00 ` Jerome Brunet 2020-10-16 13:52 ` Sudeep Holla 2020-10-16 13:52 ` Sudeep Holla 2020-10-16 17:33 ` Jassi Brar 2020-10-16 17:33 ` Jassi Brar 2020-10-16 18:54 ` Jerome Brunet 2020-10-16 18:54 ` Jerome Brunet 2020-10-16 19:27 ` Jassi Brar 2020-10-16 19:27 ` Jassi Brar 2022-07-19 7:08 ` Da Xue 2022-07-19 7:08 ` Da Xue 2020-10-16 11:52 ` Sudeep Holla 2020-10-16 11:52 ` Sudeep Holla
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20201015134628.GA11989@arm.com \ --to=ionela.voinescu@arm.com \ --cc=da@libre.computer \ --cc=jassisinghbrar@gmail.com \ --cc=jbrunet@baylibre.com \ --cc=khilman@baylibre.com \ --cc=linux-amlogic@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.