All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier
@ 2021-09-30 10:37 Sebastian Andrzej Siewior
  2021-09-30 10:56 ` [RFC] Is lib/irq_poll still considered useful? Sebastian Andrzej Siewior
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30 10:37 UTC (permalink / raw)
  To: linux-kernel, linux-scsi
  Cc: Peter Zijlstra, Thomas Gleixner, Christoph Hellwig,
	Greg Kroah-Hartman, Sebastian Andrzej Siewior

__raise_softirq_irqoff() adds a bit to the pending sofirq mask and this
is it. The softirq won't be handled in a deterministic way but randomly
when an interrupt fires and handles softirq in its irq_exit() routine or
if something randomly checks and handles pending softirqs in the call
chain before the CPU goes idle.

Add a local_bh_disable/enable() around the IRQ-off section which will
handle pending softirqs.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 lib/irq_poll.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/irq_poll.c b/lib/irq_poll.c
index 2f17b488d58e1..2b9f797642f60 100644
--- a/lib/irq_poll.c
+++ b/lib/irq_poll.c
@@ -191,11 +191,13 @@ static int irq_poll_cpu_dead(unsigned int cpu)
 	 * If a CPU goes away, splice its entries to the current CPU
 	 * and trigger a run of the softirq
 	 */
+	local_bh_disable();
 	local_irq_disable();
 	list_splice_init(&per_cpu(blk_cpu_iopoll, cpu),
 			 this_cpu_ptr(&blk_cpu_iopoll));
 	__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
 	local_irq_enable();
+	local_bh_enable();
 
 	return 0;
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC] Is lib/irq_poll still considered useful?
  2021-09-30 10:37 [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
@ 2021-09-30 10:56 ` Sebastian Andrzej Siewior
  2021-10-01  4:24   ` Christoph Hellwig
  2021-10-18  7:45 ` [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
  2021-10-18 10:53 ` Christoph Hellwig
  2 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-30 10:56 UTC (permalink / raw)
  To: linux-kernel, linux-scsi
  Cc: Peter Zijlstra, Thomas Gleixner, Christoph Hellwig,
	Greg Kroah-Hartman, Doug Ledford, Jason Gunthorpe, linux-rdma,
	Subbu Seetharaman, Ketan Mukadam, Jitendra Bhivare,
	James E.J. Bottomley, Martin K. Petersen, Manoj N. Kumar,
	Matthew R. Ochs, Uma Krishnan, Brian King, James Smart,
	Dick Kennedy, Kashyap Desai, Sumit Saxena, Shivasharan S,
	megaraidlinux.pdl, Sathya Prakash, Sreekanth Reddy,
	Suganath Prabu Subramani, MPT-FusionLinux.pdl

I was looking at irq_poll and for missing scheduling points.
It raised the question why are there 7 driver still using irq_poll and
not moved on to something else like threaded interrupts or kworker.

There is:
- Infiband can complete direct, irq_poll and kworker.
- be2iscsi only irq_poll.
- cxlflash only irq_poll. Not sure how IRQs are acked.
- ipr direct or irq_poll, can be configured. Now sure how IRQs are acked.
- lpfc kworker and/or irq_poll. Not sure all invocations are from
  interrupts like context [0].
- megaraid irq_poll. Not sure all invocations are from interrupts like
  context [0].
- mpt3sas irq_poll or io_uring io poll. Not sure all invocations are
  from interrupts like context [0].

[0] If irq_poll_sched() is not used from an interrupt (as in interrupt
service routine, timer handler, tasklet (not encouraging just noticed))
but from task context (as in kworker for instance) then irq-poll handler
will not be invoked right away. Instead it will be delayed to random
point in time until an interrupt fires or something down the stack
performs a pending softirq check.
Waking ksoftirqd itself isn't helping much since it will set a
NEED_RESCHED bit in the task_struct which local_irq_restore() isn't
testing for. So the scheduling event is delayed until spin_unlock() for
instance.

Is there a reason for the remaining user of irq_poll to keep using it?

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] Is lib/irq_poll still considered useful?
  2021-09-30 10:56 ` [RFC] Is lib/irq_poll still considered useful? Sebastian Andrzej Siewior
@ 2021-10-01  4:24   ` Christoph Hellwig
  2021-10-01  6:41     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2021-10-01  4:24 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, linux-scsi, Peter Zijlstra, Thomas Gleixner,
	Christoph Hellwig, Greg Kroah-Hartman, Doug Ledford,
	Jason Gunthorpe, linux-rdma, Subbu Seetharaman, Ketan Mukadam,
	Jitendra Bhivare, James E.J. Bottomley, Martin K. Petersen,
	Manoj N. Kumar, Matthew R. Ochs, Uma Krishnan, Brian King,
	James Smart, Dick Kennedy, Kashyap Desai, Sumit Saxena,
	Shivasharan S, megaraidlinux.pdl, Sathya Prakash,
	Sreekanth Reddy, Suganath Prabu Subramani, MPT-FusionLinux.pdl

On Thu, Sep 30, 2021 at 12:56:05PM +0200, Sebastian Andrzej Siewior wrote:
> Is there a reason for the remaining user of irq_poll to keep using it?

At least for RDMA there are workloads where the latency difference
matters.  That's why we added both the irq_poll and workqueue mode
to thew new CQ API a few years ago.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] Is lib/irq_poll still considered useful?
  2021-10-01  4:24   ` Christoph Hellwig
@ 2021-10-01  6:41     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-10-01  6:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-scsi, Peter Zijlstra, Thomas Gleixner,
	Greg Kroah-Hartman, Doug Ledford, Jason Gunthorpe, linux-rdma,
	Subbu Seetharaman, Ketan Mukadam, Jitendra Bhivare,
	James E.J. Bottomley, Martin K. Petersen, Manoj N. Kumar,
	Matthew R. Ochs, Uma Krishnan, Brian King, James Smart,
	Dick Kennedy, Kashyap Desai, Sumit Saxena, Shivasharan S,
	megaraidlinux.pdl, Sathya Prakash, Sreekanth Reddy,
	Suganath Prabu Subramani, MPT-FusionLinux.pdl

On 2021-10-01 05:24:49 [+0100], Christoph Hellwig wrote:
> On Thu, Sep 30, 2021 at 12:56:05PM +0200, Sebastian Andrzej Siewior wrote:
> > Is there a reason for the remaining user of irq_poll to keep using it?
> 
> At least for RDMA there are workloads where the latency difference
> matters.  That's why we added both the irq_poll and workqueue mode
> to thew new CQ API a few years ago.

Would it work for them to move to threaded interrupts or is the NAPI
like behaviour (delay after a while to the next jiffy) the killer
feature?

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier
  2021-09-30 10:37 [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
  2021-09-30 10:56 ` [RFC] Is lib/irq_poll still considered useful? Sebastian Andrzej Siewior
@ 2021-10-18  7:45 ` Sebastian Andrzej Siewior
  2021-10-18 10:53 ` Christoph Hellwig
  2 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-10-18  7:45 UTC (permalink / raw)
  To: linux-kernel, linux-scsi
  Cc: Peter Zijlstra, Thomas Gleixner, Christoph Hellwig, Greg Kroah-Hartman

On 2021-09-30 12:37:54 [+0200], To linux-kernel@vger.kernel.org wrote:
> __raise_softirq_irqoff() adds a bit to the pending sofirq mask and this
> is it. The softirq won't be handled in a deterministic way but randomly
> when an interrupt fires and handles softirq in its irq_exit() routine or
> if something randomly checks and handles pending softirqs in the call
> chain before the CPU goes idle.
> 
> Add a local_bh_disable/enable() around the IRQ-off section which will
> handle pending softirqs.

ping

> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  lib/irq_poll.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/irq_poll.c b/lib/irq_poll.c
> index 2f17b488d58e1..2b9f797642f60 100644
> --- a/lib/irq_poll.c
> +++ b/lib/irq_poll.c
> @@ -191,11 +191,13 @@ static int irq_poll_cpu_dead(unsigned int cpu)
>  	 * If a CPU goes away, splice its entries to the current CPU
>  	 * and trigger a run of the softirq
>  	 */
> +	local_bh_disable();
>  	local_irq_disable();
>  	list_splice_init(&per_cpu(blk_cpu_iopoll, cpu),
>  			 this_cpu_ptr(&blk_cpu_iopoll));
>  	__raise_softirq_irqoff(IRQ_POLL_SOFTIRQ);
>  	local_irq_enable();
> +	local_bh_enable();
>  
>  	return 0;
>  }
> -- 
> 2.33.0
> 

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier
  2021-09-30 10:37 [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
  2021-09-30 10:56 ` [RFC] Is lib/irq_poll still considered useful? Sebastian Andrzej Siewior
  2021-10-18  7:45 ` [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
@ 2021-10-18 10:53 ` Christoph Hellwig
  2021-10-18 11:49   ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2021-10-18 10:53 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, linux-scsi, Peter Zijlstra, Thomas Gleixner,
	Christoph Hellwig, Greg Kroah-Hartman

On Thu, Sep 30, 2021 at 12:37:54PM +0200, Sebastian Andrzej Siewior wrote:
> __raise_softirq_irqoff() adds a bit to the pending sofirq mask and this
> is it. The softirq won't be handled in a deterministic way but randomly
> when an interrupt fires and handles softirq in its irq_exit() routine or
> if something randomly checks and handles pending softirqs in the call
> chain before the CPU goes idle.
> 
> Add a local_bh_disable/enable() around the IRQ-off section which will
> handle pending softirqs.

This patch leaves me extremely confused, and it would even more if I was
just reading the code.  local_irq_disable is supposed to disable BHs
as well, so the code looks pretty much nonsensical to me.  But
apparently that isn't the point if I follow your commit message as you
don't care about an extra level of BH disabling but want to force a
side-effect of the re-enabling?  Why not directly call the helper
to schedule the softirq then? 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier
  2021-10-18 10:53 ` Christoph Hellwig
@ 2021-10-18 11:49   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-10-18 11:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-scsi, Peter Zijlstra, Thomas Gleixner,
	Greg Kroah-Hartman

On 2021-10-18 03:53:20 [-0700], Christoph Hellwig wrote:
> On Thu, Sep 30, 2021 at 12:37:54PM +0200, Sebastian Andrzej Siewior wrote:
> > __raise_softirq_irqoff() adds a bit to the pending sofirq mask and this
> > is it. The softirq won't be handled in a deterministic way but randomly
> > when an interrupt fires and handles softirq in its irq_exit() routine or
> > if something randomly checks and handles pending softirqs in the call
> > chain before the CPU goes idle.
> > 
> > Add a local_bh_disable/enable() around the IRQ-off section which will
> > handle pending softirqs.
> 
> This patch leaves me extremely confused, and it would even more if I was
> just reading the code.  local_irq_disable is supposed to disable BHs
> as well, so the code looks pretty much nonsensical to me.  But
> apparently that isn't the point if I follow your commit message as you
> don't care about an extra level of BH disabling but want to force a
> side-effect of the re-enabling?  Why not directly call the helper
> to schedule the softirq then? 

This side-effect is actually the way it works most of the time. There is
one exception in the network stack where do_softirq() is called manually
after checking for pending softirqs. I managed to remove one instance in
commit
   fe420d87bbc23 ("net/core: remove explicit do_softirq() from busy_poll_stop()")

just wasn't so lucky with the other one (yet). Other than that, it is
either local_bh_enable() that ensures that pending softirqs are
processed or __irq_exit_rcu().
Same as preempt_enable() ensure that a context switch happens if the
scheudler decided that one is needed but the CPU was in a
preempt-disabled section at that time.

Anyway. Even with s/__raise_softirq_irqoff/raise_softirq_irqoff/ you are
not much better off. The latter will wake ksoftirqd but it might result
in setting the NEED-resched bit which is not checked by
local_irq_enable(). So you end up waiting until random spin_unlock()
which has the needed check. Nothing here does that here but probably
something before the CPU-HP code is left.

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-18 11:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30 10:37 [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
2021-09-30 10:56 ` [RFC] Is lib/irq_poll still considered useful? Sebastian Andrzej Siewior
2021-10-01  4:24   ` Christoph Hellwig
2021-10-01  6:41     ` Sebastian Andrzej Siewior
2021-10-18  7:45 ` [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier Sebastian Andrzej Siewior
2021-10-18 10:53 ` Christoph Hellwig
2021-10-18 11:49   ` Sebastian Andrzej Siewior

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.