linux-kernel.vger.kernel.org archive mirror
* [PATCH] irq_work: Fix IRQ_WORK_BUZY bit clearing
@ 2019-11-13 17:12 Frederic Weisbecker
  2019-11-14 15:56 ` Leonard Crestez
  2019-11-15  9:54 ` [tip: irq/core] irq_work: Fix IRQ_WORK_BUSY " tip-bot2 for Frederic Weisbecker
  0 siblings, 2 replies; 4+ messages in thread
From: Frederic Weisbecker @ 2019-11-13 17:12 UTC
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Leonard Crestez, Peter Zijlstra,
	Thomas Gleixner, Paul E . McKenney, kernel test robot

While attempting to clear the buzy bit at the end of a work execution,
atomic_cmpxchg() expects the value of the flags with the pending bit
cleared as the old value. However we are passing by mistake the value of
the flags before we actually cleared the pending bit.

As a result, clearing the buzy bit fails and irq_work_sync() may stall:

	watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [blktrace:4948]
	CPU: 0 PID: 4948 Comm: blktrace Not tainted 5.4.0-rc7-00003-gfeb4a51323bab #1
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
	RIP: 0010:irq_work_sync+0x4/0x10
	Call Trace:
	  relay_close_buf+0x19/0x50
	  relay_close+0x64/0x100
	  blk_trace_free+0x1f/0x50
	  __blk_trace_remove+0x1e/0x30
	  blk_trace_ioctl+0x11b/0x140
	  blkdev_ioctl+0x6c1/0xa40
	  block_ioctl+0x39/0x40
	  do_vfs_ioctl+0xa5/0x700
	  ksys_ioctl+0x70/0x80
	  __x64_sys_ioctl+0x16/0x20
	  do_syscall_64+0x5b/0x1d0
	  entry_SYSCALL_64_after_hwframe+0x44/0xa9

So clear the appropriate bit before passing the old flags to cmpxchg().

Reported-by: kernel test robot <rong.a.chen@intel.com>
Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
Fixes: feb4a51323ba ("irq_work: Slightly simplify IRQ_WORK_PENDING clearing")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 kernel/irq_work.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 49c53f80a13a..828cc30774bc 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -158,6 +158,7 @@ static void irq_work_run_list(struct llist_head *list)
 		 * Clear the BUSY bit and return to the free state if
 		 * no-one else claimed it meanwhile.
 		 */
+		flags &= ~IRQ_WORK_PENDING;
 		(void)atomic_cmpxchg(&work->flags, flags, flags & ~IRQ_WORK_BUSY);
 	}
 }
-- 
2.23.0



* Re: [PATCH] irq_work: Fix IRQ_WORK_BUZY bit clearing
  2019-11-13 17:12 [PATCH] irq_work: Fix IRQ_WORK_BUZY bit clearing Frederic Weisbecker
@ 2019-11-14 15:56 ` Leonard Crestez
  2019-11-15  8:51   ` Naresh Kamboju
  2019-11-15  9:54 ` [tip: irq/core] irq_work: Fix IRQ_WORK_BUSY " tip-bot2 for Frederic Weisbecker
  1 sibling, 1 reply; 4+ messages in thread
From: Leonard Crestez @ 2019-11-14 15:56 UTC
  To: Frederic Weisbecker, Ingo Molnar
  Cc: LKML, Peter Zijlstra, Thomas Gleixner, Paul E . McKenney,
	kernel test robot, Viresh Kumar

On 13.11.2019 19:12, Frederic Weisbecker wrote:
> While attempting to clear the buzy bit at the end of a work execution,
> atomic_cmpxchg() expects the value of the flags with the pending bit
> cleared as the old value. However we are passing by mistake the value of
> the flags before we actually cleared the pending bit.

Busy is spelled with an S

> 
> As a result, clearing the buzy bit fails and irq_work_sync() may stall:
> 
> 	watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [blktrace:4948]
> 	CPU: 0 PID: 4948 Comm: blktrace Not tainted 5.4.0-rc7-00003-gfeb4a51323bab #1
> 	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> 	RIP: 0010:irq_work_sync+0x4/0x10
> 	Call Trace:
> 	  relay_close_buf+0x19/0x50
> 	  relay_close+0x64/0x100
> 	  blk_trace_free+0x1f/0x50
> 	  __blk_trace_remove+0x1e/0x30
> 	  blk_trace_ioctl+0x11b/0x140
> 	  blkdev_ioctl+0x6c1/0xa40
> 	  block_ioctl+0x39/0x40
> 	  do_vfs_ioctl+0xa5/0x700
> 	  ksys_ioctl+0x70/0x80
> 	  __x64_sys_ioctl+0x16/0x20
> 	  do_syscall_64+0x5b/0x1d0
> 	  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> So clear the appropriate bit before passing the old flags to cmpxchg().
> 
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
> Fixes: feb4a51323ba ("irq_work: Slightly simplify IRQ_WORK_PENDING clearing")
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>

Tested-by: Leonard Crestez <leonard.crestez@nxp.com>

Without this patch switching cpufreq governors hangs on arm64.

> ---
>   kernel/irq_work.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index 49c53f80a13a..828cc30774bc 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -158,6 +158,7 @@ static void irq_work_run_list(struct llist_head *list)
>   		 * Clear the BUSY bit and return to the free state if
>   		 * no-one else claimed it meanwhile.
>   		 */
> +		flags &= ~IRQ_WORK_PENDING;
>   		(void)atomic_cmpxchg(&work->flags, flags, flags & ~IRQ_WORK_BUSY);
>   	}
>   }
> 



* Re: [PATCH] irq_work: Fix IRQ_WORK_BUZY bit clearing
  2019-11-14 15:56 ` Leonard Crestez
@ 2019-11-15  8:51   ` Naresh Kamboju
  0 siblings, 0 replies; 4+ messages in thread
From: Naresh Kamboju @ 2019-11-15  8:51 UTC
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Leonard Crestez, LKML, Peter Zijlstra,
	Thomas Gleixner, Paul E . McKenney, kernel test robot,
	Viresh Kumar, lkft-triage

Hi Frederic,

Thanks for this fix.

On Thu, 14 Nov 2019 at 21:26, Leonard Crestez <leonard.crestez@nxp.com> wrote:
>
> On 13.11.2019 19:12, Frederic Weisbecker wrote:
> > While attempting to clear the buzy bit at the end of a work execution,
> > atomic_cmpxchg() expects the value of the flags with the pending bit
> > cleared as the old value. However we are passing by mistake the value of
> > the flags before we actually cleared the pending bit.
>
> Busy is spelled with an S
>
> >
> > As a result, clearing the buzy bit fails and irq_work_sync() may stall:
> >
> >       watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [blktrace:4948]
> >       CPU: 0 PID: 4948 Comm: blktrace Not tainted 5.4.0-rc7-00003-gfeb4a51323bab #1
> >       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> >       RIP: 0010:irq_work_sync+0x4/0x10
> >       Call Trace:
> >         relay_close_buf+0x19/0x50
> >         relay_close+0x64/0x100
> >         blk_trace_free+0x1f/0x50
> >         __blk_trace_remove+0x1e/0x30
> >         blk_trace_ioctl+0x11b/0x140
> >         blkdev_ioctl+0x6c1/0xa40
> >         block_ioctl+0x39/0x40
> >         do_vfs_ioctl+0xa5/0x700
> >         ksys_ioctl+0x70/0x80
> >         __x64_sys_ioctl+0x16/0x20
> >         do_syscall_64+0x5b/0x1d0
> >         entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > So clear the appropriate bit before passing the old flags to cmpxchg().
> >
> > Reported-by: kernel test robot <rong.a.chen@intel.com>
> > Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
> > Fixes: feb4a51323ba ("irq_work: Slightly simplify IRQ_WORK_PENDING clearing")
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@kernel.org>
>
> Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
>
> Without this patch switching cpufreq governors hangs on arm64.

Right.

This patch solves two problems:
1) juno-r2 now boots successfully.
2) The "rcu_sched self-detected stall on CPU" problem on x86_64 is gone.

Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>

I hope this gets merged into linux-next.

ref:
https://lkft.validation.linaro.org/scheduler/job/1010542#L260
https://lkft.validation.linaro.org/scheduler/job/1010793#L493

- Naresh


* [tip: irq/core] irq_work: Fix IRQ_WORK_BUSY bit clearing
  2019-11-13 17:12 [PATCH] irq_work: Fix IRQ_WORK_BUZY bit clearing Frederic Weisbecker
  2019-11-14 15:56 ` Leonard Crestez
@ 2019-11-15  9:54 ` tip-bot2 for Frederic Weisbecker
  1 sibling, 0 replies; 4+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2019-11-15  9:54 UTC
  To: linux-tip-commits
  Cc: kernel test robot, Leonard Crestez, Frederic Weisbecker,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     e9838bd51169af87ae248336d4c3fc59184a0e46
Gitweb:        https://git.kernel.org/tip/e9838bd51169af87ae248336d4c3fc59184a0e46
Author:        Frederic Weisbecker <frederic@kernel.org>
AuthorDate:    Wed, 13 Nov 2019 18:12:01 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 15 Nov 2019 10:48:37 +01:00

irq_work: Fix IRQ_WORK_BUSY bit clearing

While attempting to clear the busy bit at the end of a work execution,
atomic_cmpxchg() expects the value of the flags with the pending bit
cleared as the old value. However by mistake the value of the flags is
passed without clearing the pending bit first.

As a result, clearing the busy bit fails and irq_work_sync() may stall:

 watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [blktrace:4948]
 CPU: 0 PID: 4948 Comm: blktrace Not tainted 5.4.0-rc7-00003-gfeb4a51323bab #1
 RIP: 0010:irq_work_sync+0x4/0x10
 Call Trace:
  relay_close_buf+0x19/0x50
  relay_close+0x64/0x100
  blk_trace_free+0x1f/0x50
  __blk_trace_remove+0x1e/0x30
  blk_trace_ioctl+0x11b/0x140
  blkdev_ioctl+0x6c1/0xa40
  block_ioctl+0x39/0x40
  do_vfs_ioctl+0xa5/0x700
  ksys_ioctl+0x70/0x80
  __x64_sys_ioctl+0x16/0x20
  do_syscall_64+0x5b/0x1d0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

So clear the appropriate bit before passing the old flags to cmpxchg().

Fixes: feb4a51323ba ("irq_work: Slightly simplify IRQ_WORK_PENDING clearing")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
Link: https://lkml.kernel.org/r/20191113171201.14032-1-frederic@kernel.org

---
 kernel/irq_work.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 49c53f8..828cc30 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -158,6 +158,7 @@ static void irq_work_run_list(struct llist_head *list)
 		 * Clear the BUSY bit and return to the free state if
 		 * no-one else claimed it meanwhile.
 		 */
+		flags &= ~IRQ_WORK_PENDING;
 		(void)atomic_cmpxchg(&work->flags, flags, flags & ~IRQ_WORK_BUSY);
 	}
 }
