From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>, Hannes Reinecke <hare@suse.com>, Bart Van Assche <bvanassche@acm.org>, linux-scsi@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Long Li <longli@microsoft.com>, John Garry <john.garry@huawei.com>, LKML <linux-kernel@vger.kernel.org>, linux-nvme@lists.infradead.org, Jens Axboe <axboe@fb.com>, Ingo Molnar <mingo@redhat.com>, Thomas Gleixner <tglx@linutronix.de>, Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
Date: Fri, 6 Sep 2019 07:14:15 +0200
Message-ID: <ffefcfa0-09b6-9af5-f94e-8e7ddd2eef16@linaro.org>
In-Reply-To: <20190906014819.GB27116@ming.t460p>

Hi,

On 06/09/2019 03:48, Ming Lei wrote:

[ ... ]

>> You did not share yet the analysis of the problem (the kernel warnings
>> give the symptoms) or the reasoning behind the solution. It is hard
>> to understand what you are looking for exactly and how to connect the dots.
>
> Let me explain it one more time:
>
> When an IRQ flood happens on one CPU:
>
> 1) softirq handling on this CPU can't make progress
>
> 2) kernel threads bound to this CPU can't make progress
>
> For example, the network stack may require softirq to xmit packets, an
> irq thread may handle keyboards/mice or whatever, or rcu_sched may depend
> on that CPU for making progress; the irq flood then stalls the whole
> system.
>
>> AFAIU, there are fast mediums where the responses to requests arrive faster
>> than the time needed to process them, right?
>
> Usually the medium is not faster than the CPU; here we are talking about
> interrupts, which can originate from lots of devices concurrently. For
> example, in Long Li's test, there are 8 NVMe drives involved.
>
>> I don't see how detecting IRQ flooding and using a threaded irq is the
>> solution, can you explain?
>
> When an IRQ flood is detected, we reserve a little time to give
> softirqs/threads a chance to be scheduled by the scheduler, so the above
> problem can be avoided.
>
>> If the responses are coming at a very high rate, whatever the solution
>> (interrupts, threaded interrupts, polling), we are still in the same
>> situation.
>
> When we move the interrupt handling into an irq thread, other softirqs/
> threaded interrupts/threads get a chance to be scheduled, so we can
> avoid stalling the whole system.

Ok, so the real problem is per-cpu bound tasks.

I share Thomas' opinion about a NAPI-like approach.

I believe you should also rely on IRQ_TIME_ACCOUNTING (maybe get it
optimized) to contribute to the CPU load and enforce task migration at
load balance.

>> My suggestion was initially to see if the interrupt load could be taken
>> into account in the cpu load, favoring task migration with the
>> scheduler load balance to a less loaded CPU; the CPU processing
>> interrupts would then end up doing only that while other CPUs handle the
>> "threaded" side.
>>
>> Beside that, I'm wondering if the block scheduler should be somehow
>> involved in that [1]
>
> For NVMe or any multi-queue storage, the default scheduler is 'none',
> which basically does nothing except submit IO asap.
>
> Thanks,
> Ming

--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog