From: Thomas Gleixner <>
To: John Garry <>
Cc: linux-kernel <>,
	Marc Zyngier <>,
	Peter Zijlstra <>,
	Juri Lelli <>,
	Vincent Guittot <>,
	Dietmar Eggemann <>,
	Steven Rostedt <>,
	Ben Segall <>, Mel Gorman <>,
	Daniel Bristot de Oliveira <>,
	Ingo Molnar <>
Subject: Re: Question on threaded handlers for managed interrupts
Date: Fri, 23 Apr 2021 12:50:12 +0200
Message-ID: <>
In-Reply-To: <>


On Thu, Apr 22 2021 at 17:10, John Garry wrote:
> I am finding that I can pretty easily trigger a system hang for certain 
> scenarios with my storage controller.
> So I'm getting something like this when running moderately heavy data 
> throughput:
> Starting 6 processes
> [70.656622] sched: RT throttling activatedB/s][r=356k,w=0 IOPS][eta
> 01h:14m:43s]
> [  207.632161] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:ta
> 01h:12m:26s]
> [  207.638261] rcu:  0-...!: (1 GPs behind)
> idle=312/1/0x4000000000000000 softirq=508/512 fqs=0
> [  207.646777] rcu:  1-...!: (1 GPs behind) idle=694/0/0x0
> It ends pretty badly - see [0].


> The multi-queue storage controller (see [1] for memory refresh, but
> note that I can also trigger on PCI device host controller as well) is
> using managed interrupts and threaded handlers. Since the threaded
> handler uses SCHED_FIFO, aren't we always vulnerable to this situation
> with the managed interrupt and threaded handler combo? Would the
> advice be to just use irq polling here?

This is a really good question. Most interrupt handlers neither run
exceedingly long nor fire at high frequency, but of course this
problem exists.

The network people have solved it with NAPI, which disables the interrupt
in the device and polls it from softirq context (which might then be
delegated to ksoftirqd) until it's drained.
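
To make the pattern concrete, here is a minimal user-space toy model of
it. This is a sketch only, not kernel code and not the NAPI API: struct
toy_dev and the toy_irq(), toy_poll() and toy_run_softirq() helpers are
hypothetical names invented for illustration, and the drain check is
simplified compared to what a real driver does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy device: pending completions plus an interrupt mask. */
struct toy_dev {
	unsigned int pending;	/* completions waiting in the device */
	bool irq_enabled;	/* device is allowed to raise its interrupt */
	bool poll_scheduled;	/* poll routine is queued to run */
	unsigned long handled;	/* total completions processed so far */
};

/* Hard interrupt part: mask the device interrupt and defer the work. */
static void toy_irq(struct toy_dev *dev)
{
	if (!dev->irq_enabled)
		return;		/* masked: the device stays quiet */
	dev->irq_enabled = false;
	dev->poll_scheduled = true;
}

/*
 * Poll part: handle at most 'budget' completions per call and return
 * the number handled. The interrupt is unmasked only once the device
 * has been drained.
 */
static unsigned int toy_poll(struct toy_dev *dev, unsigned int budget)
{
	unsigned int done = 0;

	assert(budget > 0);
	while (done < budget && dev->pending > 0) {
		dev->pending--;
		dev->handled++;
		done++;
	}
	if (dev->pending == 0) {
		dev->poll_scheduled = false;
		dev->irq_enabled = true;
	}
	return done;
}

/*
 * Stand-in for the softirq loop: keep polling while a poll is
 * scheduled and count the passes. Each pass boundary is a point
 * where other work could get a chance to run.
 */
static unsigned int toy_run_softirq(struct toy_dev *dev, unsigned int budget)
{
	unsigned int passes = 0;

	while (dev->poll_scheduled) {
		toy_poll(dev, budget);
		passes++;
	}
	return passes;
}
```

With 10 pending completions and a budget of 4, calling toy_irq() and
then toy_run_softirq() takes three passes (4 + 4 + 2) and leaves the
interrupt unmasked again. The point of the budget is that the work is
bounded per pass instead of running until the device goes idle.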

I'm not familiar enough with the block/multiqueue layer to tell
whether such a concept exists there as well.

OTOH, the way you split the handling into hard/thread context
already provides the base for this.

The missing piece is infrastructure at the irq/scheduler core level to
handle this transparently.

I have some horrible ideas about how to solve that, but I'm sure the
scheduler wizards can come up with a reasonable and generic solution.


