From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>,
	Hannes Reinecke <hare@suse.com>,
	Bart Van Assche <bvanassche@acm.org>,
	linux-scsi@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Long Li <longli@microsoft.com>,
	John Garry <john.garry@huawei.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-nvme@lists.infradead.org, Jens Axboe <axboe@fb.com>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
Date: Fri, 6 Sep 2019 07:14:15 +0200
Message-ID: <ffefcfa0-09b6-9af5-f94e-8e7ddd2eef16@linaro.org>
In-Reply-To: <20190906014819.GB27116@ming.t460p>


Hi,

On 06/09/2019 03:48, Ming Lei wrote:

[ ... ]

>> You did not share yet the analysis of the problem (the kernel warnings
>> give the symptoms) and gave the reasoning for the solution. It is hard
>> to understand what you are looking for exactly and how to connect the dots.
> 
> Let me explain it one more time:
> 
> When an IRQ flood happens on one CPU:
> 
> 1) softirq handling on this CPU can't make progress
> 
> 2) kernel threads bound to this CPU can't make progress
> 
> For example, the network stack may need softirq to xmit packets, an
> irq thread may be handling keyboards/mice or whatever, or rcu_sched
> may depend on that CPU to make progress; the irq flood then stalls
> the whole system.
> 
>>
>> AFAIU, there are fast media where the responses to requests arrive
>> faster than they can be processed, right?
> 
> Usually the medium may not be faster than the CPU, but here we are
> talking about interrupts, which can originate from lots of devices
> concurrently; for example, in Long Li's test, there are 8 NVMe drives
> involved.
> 
>>
>> I don't see how detecting IRQ flooding and using a threaded irq is
>> the solution; can you explain?
> 
> When an IRQ flood is detected, we reserve a little time to give
> softirqs/threads a chance to be scheduled by the scheduler, so the
> above problem can be avoided.
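
To make that concrete, here is a minimal sketch of such a per-CPU
detector; the helper names and thresholds below are made up for
illustration and are not the actual patch:

#include <linux/ktime.h>
#include <linux/percpu.h>
#include <linux/sched/clock.h>

struct irq_flood_state {
	u64	period_start;	/* start of sampling window (ns)      */
	u64	irq_time;	/* ns spent in hardirq in this window */
};

static DEFINE_PER_CPU(struct irq_flood_state, irq_flood);

#define FLOOD_PERIOD_NS		(10 * NSEC_PER_MSEC)
#define FLOOD_THRESHOLD_NS	(FLOOD_PERIOD_NS / 2)	/* >50% in hardirq */

/* Called on hardirq exit with the ns the handler just consumed. */
static bool irq_flood_detected(u64 handler_ns)
{
	struct irq_flood_state *st = this_cpu_ptr(&irq_flood);
	u64 now = sched_clock();

	if (now - st->period_start > FLOOD_PERIOD_NS) {
		st->period_start = now;		/* open a new window */
		st->irq_time = 0;
	}
	st->irq_time += handler_ns;

	/* Flooded: leave some CPU time for softirqs/threads. */
	return st->irq_time > FLOOD_THRESHOLD_NS;
}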
> 
>>
>> If the responses are coming at a very high rate, whatever the solution
>> (interrupts, threaded interrupts, polling), we are still in the same
>> situation.
> 
> When we move the interrupt handling into an irq thread, other
> softirqs/threaded interrupts/threads get a chance to be scheduled, so
> we can avoid stalling the whole system.
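
For reference, this is the standard request_threaded_irq() split: a
minimal hardirq handler that only quiesces the device and wakes a
thread which does the real work. A sketch, with illustrative driver
names:

#include <linux/interrupt.h>

static irqreturn_t my_hardirq(int irq, void *dev_id)
{
	/* Only ack/quiesce the device; keep hardirq time minimal. */
	return IRQ_WAKE_THREAD;		/* defer the real work */
}

static irqreturn_t my_thread_fn(int irq, void *dev_id)
{
	/*
	 * Runs in a schedulable kernel thread, so softirqs and other
	 * threads on this CPU can be interleaved by the scheduler.
	 */
	return IRQ_HANDLED;
}

static int my_setup_irq(int irq, void *my_dev)
{
	return request_threaded_irq(irq, my_hardirq, my_thread_fn,
				    IRQF_ONESHOT, "my-dev", my_dev);
}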

Ok, so the real problem is per-cpu bound tasks.

I share Thomas' opinion about a NAPI-like approach.

I do believe you should also rely on IRQ_TIME_ACCOUNTING (maybe after
optimizing it) to contribute to the CPU load and enforce task migration
at load balancing.
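
The scheduler already has a hook for this when irq time is accounted:
the irq fraction is subtracted from the capacity the load balancer
sees for the CPU, so a CPU busy with interrupts looks full and tasks
migrate away. Roughly (a paraphrase of scale_irq_capacity(), not the
exact source):

/*
 * util: task utilization, irq: time spent in irq context,
 * max: full CPU capacity. The remaining capacity is scaled
 * by the non-irq fraction of the CPU.
 */
static unsigned long scale_irq_capacity(unsigned long util,
					unsigned long irq,
					unsigned long max)
{
	util *= (max - irq);
	util /= max;
	return util;
}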

>> My suggestion was initially to see if the interrupt load could be
>> taken into account in the CPU load and favor task migration, through
>> the scheduler load balance, to a less loaded CPU; thus the CPU
>> processing interrupts would end up doing only that while other CPUs
>> handle the "threaded" side.
>>
>> Besides that, I'm wondering if the block scheduler should somehow be
>> involved in that [1]
> 
> For NVMe or any multi-queue storage, the default scheduler is 'none',
> which basically does nothing except submit IO asap.
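
That is easy to verify from userspace, since the active scheduler
shows up bracketed in the sysfs 'scheduler' attribute. A trivial
check, the device name being only an example:

#include <stdio.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/sys/block/nvme0n1/queue/scheduler", "r");

	if (!f) {
		perror("scheduler attribute");
		return 1;
	}
	if (fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* e.g. "[none] mq-deadline kyber" */
	fclose(f);
	return 0;
}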
> 
> 
> Thanks,
> Ming
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



Thread overview: 105+ messages
2019-08-27  8:53 [PATCH 0/4] genirq/nvme: add IRQF_RESCUE_THREAD for avoiding IRQ flood Ming Lei
2019-08-27  8:53 ` [PATCH 1/4] softirq: implement IRQ flood detection mechanism Ming Lei
2019-08-27 14:42   ` Thomas Gleixner
2019-08-27 16:19     ` Thomas Gleixner
2019-08-27 23:04       ` Ming Lei
2019-08-27 23:12         ` Thomas Gleixner
2019-08-27 22:58     ` Ming Lei
2019-08-27 23:09       ` Thomas Gleixner
2019-08-28 11:06         ` Ming Lei
2019-08-28 11:23           ` Thomas Gleixner
2019-08-28 13:50             ` Ming Lei
2019-08-28 14:07               ` Thomas Gleixner
2019-09-03  3:30                 ` Ming Lei
2019-09-03  5:59                   ` Daniel Lezcano
2019-09-03  6:31                     ` Ming Lei
2019-09-03  6:40                       ` Daniel Lezcano
2019-09-03  7:28                         ` Ming Lei
2019-09-03  7:50                           ` Daniel Lezcano
2019-09-03  9:30                             ` Ming Lei
2019-09-04 17:07                             ` Bart Van Assche
2019-09-04 17:31                               ` Daniel Lezcano
2019-09-04 17:38                                 ` Bart Van Assche
2019-09-04 18:02                                   ` Peter Zijlstra
2019-09-04 19:47                                     ` Bart Van Assche
2019-09-05  9:11                                       ` Ming Lei
2019-09-05  9:06                                 ` Ming Lei
2019-09-05 10:37                                   ` Daniel Lezcano
2019-09-06  1:22                                     ` Long Li
2019-09-06  4:36                                       ` Daniel Lezcano
2019-09-06  4:44                                         ` Long Li
2019-09-06  1:48                                     ` Ming Lei
2019-09-06  5:14                                       ` Daniel Lezcano [this message]
2019-09-06 18:30                                         ` Sagi Grimberg
2019-09-06 18:52                                           ` Keith Busch
2019-09-07  0:01                                           ` Ming Lei
2019-09-10  3:10                                             ` Sagi Grimberg
2019-09-18  0:00                                               ` Long Li
2019-09-20 17:14                                                 ` Sagi Grimberg
2019-09-20 19:12                                                   ` Long Li
2019-09-20 20:45                                                     ` Sagi Grimberg
2019-09-24  0:57                                                       ` Long Li
2019-09-18 14:37                                               ` Ming Lei
2019-09-20 17:09                                                 ` Sagi Grimberg
2019-09-06 14:18                                       ` Keith Busch
2019-09-06 17:50                                         ` Long Li
2019-09-06 22:19                                           ` Ming Lei
2019-09-06 22:25                                             ` Keith Busch
2019-09-06 23:13                                               ` Ming Lei
2019-09-10  0:24                                             ` Ming Lei
2019-09-03  8:09                           ` Thomas Gleixner
2019-09-03  9:24                             ` Ming Lei
2019-08-29  6:15   ` Long Li
2019-08-30  0:55     ` Ming Lei
2019-08-27  8:53 ` [PATCH 2/4] genirq: add IRQF_RESCUE_THREAD Ming Lei
2019-08-27  8:53 ` [PATCH 3/4] nvme: pci: pass IRQF_RESCURE_THREAD to request_threaded_irq Ming Lei
2019-08-27  9:06   ` Johannes Thumshirn
2019-08-27  9:09     ` Ming Lei
2019-08-27  9:12       ` Johannes Thumshirn
2019-08-27 14:34       ` Keith Busch
2019-08-27 14:44         ` Keith Busch
2019-08-27 15:10   ` Bart Van Assche
2019-08-28  1:45     ` Ming Lei
2019-08-27  8:53 ` [PATCH 4/4] genirq: use irq's affinity for threaded irq with IRQF_RESCUE_THREAD Ming Lei
2019-08-27 14:35   ` Keith Busch
2019-09-06  8:50   ` John Garry
