* [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head @ 2021-01-23 20:10 Sebastian Andrzej Siewior 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior ` (4 more replies) 0 siblings, 5 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-23 20:10 UTC (permalink / raw) To: linux-block, linux-kernel Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar Patch 2+3 were applied and then dropped by Jens due to a NOHZ+softirq related warning [0]. Turns out a successful wakeup via set_nr_if_polling() will not process any softirqs and the CPU may go back to idle. This is addressed by patch #1. smpcfd_dying_cpu() will also invoke SMP-functions calls via flush_smp_call_function_queue() but the block layer shouldn't queue anything because the CPU isn't online anymore. The two caller of flush_smp_call_function_from_idle() look fine with opening interrupts from within do_softirq(). [0] https://lkml.kernel.org/r/1ee4b31b-350e-a9f5-4349-cfb34b89829a@kernel.dk Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-01-23 20:10 [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Sebastian Andrzej Siewior @ 2021-01-23 20:10 ` Sebastian Andrzej Siewior 2021-02-01 19:35 ` Sebastian Andrzej Siewior ` (3 more replies) 2021-01-23 20:10 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior ` (3 subsequent siblings) 4 siblings, 4 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-23 20:10 UTC (permalink / raw) To: linux-block, linux-kernel Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Sebastian Andrzej Siewior send_call_function_single_ipi() may wake an idle CPU without sending an IPI. The woken up CPU will process the SMP-functions in flush_smp_call_function_from_idle(). Any raised softirq from within the SMP-function call will not be processed. Should the CPU have no tasks assigned, then it will go back to idle with pending softirqs and the NOHZ will rightfully complain. Process pending softirqs on return from flush_smp_call_function_queue(). Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- kernel/smp.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/smp.c b/kernel/smp.c index 1b6070bf97bb0..aeb0adfa06063 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -14,6 +14,7 @@ #include <linux/export.h> #include <linux/percpu.h> #include <linux/init.h> +#include <linux/interrupt.h> #include <linux/gfp.h> #include <linux/smp.h> #include <linux/cpu.h> @@ -449,6 +450,9 @@ void flush_smp_call_function_from_idle(void) local_irq_save(flags); flush_smp_call_function_queue(true); + if (local_softirq_pending()) + do_softirq(); + local_irq_restore(flags); } -- 2.30.0 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior @ 2021-02-01 19:35 ` Sebastian Andrzej Siewior 2021-02-09 10:02 ` Peter Zijlstra ` (2 subsequent siblings) 3 siblings, 0 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-02-01 19:35 UTC (permalink / raw) To: linux-block, linux-kernel Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 2021-01-23 21:10:25 [+0100], To linux-block@vger.kernel.org wrote: > send_call_function_single_ipi() may wake an idle CPU without sending an > IPI. The woken up CPU will process the SMP-functions in > flush_smp_call_function_from_idle(). Any raised softirq from within the > SMP-function call will not be processed. > Should the CPU have no tasks assigned, then it will go back to idle with > pending softirqs and the NOHZ will rightfully complain. > > Process pending softirqs on return from flush_smp_call_function_queue(). > > Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()") > Reported-by: Jens Axboe <axboe@kernel.dk> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> A gentle ping. This isn't just a requirement for the series: rps_trigger_softirq() is invoked from smp_call_function_single_async() and raises a softirq. Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior 2021-02-01 19:35 ` Sebastian Andrzej Siewior @ 2021-02-09 10:02 ` Peter Zijlstra 2021-02-09 11:35 ` Sebastian Andrzej Siewior 2021-02-10 13:53 ` [tip: sched/core] " tip-bot2 for Sebastian Andrzej Siewior 2021-02-17 13:17 ` tip-bot2 for Sebastian Andrzej Siewior 3 siblings, 1 reply; 51+ messages in thread From: Peter Zijlstra @ 2021-02-09 10:02 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Ingo Molnar On Sat, Jan 23, 2021 at 09:10:25PM +0100, Sebastian Andrzej Siewior wrote: > send_call_function_single_ipi() may wake an idle CPU without sending an > IPI. The woken up CPU will process the SMP-functions in > flush_smp_call_function_from_idle(). Any raised softirq from within the > SMP-function call will not be processed. > Should the CPU have no tasks assigned, then it will go back to idle with > pending softirqs and the NOHZ will rightfully complain. > > Process pending softirqs on return from flush_smp_call_function_queue(). > > Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()") > Reported-by: Jens Axboe <axboe@kernel.dk> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Fair enough. I'll stick this in tip/sched/smp for Jens and merge that into tip/sched/core. Thanks! ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-02-09 10:02 ` Peter Zijlstra @ 2021-02-09 11:35 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-02-09 11:35 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Ingo Molnar On 2021-02-09 11:02:10 [+0100], Peter Zijlstra wrote: > Fair enough. I'll stick this in tip/sched/smp for Jens and merge that > into tip/sched/core. Thank you. > Thanks! Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* [tip: sched/core] smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior 2021-02-01 19:35 ` Sebastian Andrzej Siewior 2021-02-09 10:02 ` Peter Zijlstra @ 2021-02-10 13:53 ` tip-bot2 for Sebastian Andrzej Siewior 2021-02-17 13:17 ` tip-bot2 for Sebastian Andrzej Siewior 3 siblings, 0 replies; 51+ messages in thread From: tip-bot2 for Sebastian Andrzej Siewior @ 2021-02-10 13:53 UTC (permalink / raw) To: linux-tip-commits Cc: Jens Axboe, Sebastian Andrzej Siewior, Peter Zijlstra (Intel), x86, linux-kernel The following commit has been merged into the sched/core branch of tip: Commit-ID: 66040b2d5d41f85cb1a752a75260595344c5ec3b Gitweb: https://git.kernel.org/tip/66040b2d5d41f85cb1a752a75260595344c5ec3b Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de> AuthorDate: Sat, 23 Jan 2021 21:10:25 +01:00 Committer: Peter Zijlstra <peterz@infradead.org> CommitterDate: Wed, 10 Feb 2021 14:44:42 +01:00 smp: Process pending softirqs in flush_smp_call_function_from_idle() send_call_function_single_ipi() may wake an idle CPU without sending an IPI. The woken up CPU will process the SMP-functions in flush_smp_call_function_from_idle(). Any raised softirq from within the SMP-function call will not be processed. Should the CPU have no tasks assigned, then it will go back to idle with pending softirqs and the NOHZ will rightfully complain. Process pending softirqs on return from flush_smp_call_function_queue(). Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210123201027.3262800-2-bigeasy@linutronix.de --- kernel/smp.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/smp.c b/kernel/smp.c index 1b6070b..aeb0adf 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -14,6 +14,7 @@ #include <linux/export.h> #include <linux/percpu.h> #include <linux/init.h> +#include <linux/interrupt.h> #include <linux/gfp.h> #include <linux/smp.h> #include <linux/cpu.h> @@ -449,6 +450,9 @@ void flush_smp_call_function_from_idle(void) local_irq_save(flags); flush_smp_call_function_queue(true); + if (local_softirq_pending()) + do_softirq(); + local_irq_restore(flags); } ^ permalink raw reply related [flat|nested] 51+ messages in thread
* [tip: sched/core] smp: Process pending softirqs in flush_smp_call_function_from_idle() 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior ` (2 preceding siblings ...) 2021-02-10 13:53 ` [tip: sched/core] " tip-bot2 for Sebastian Andrzej Siewior @ 2021-02-17 13:17 ` tip-bot2 for Sebastian Andrzej Siewior 3 siblings, 0 replies; 51+ messages in thread From: tip-bot2 for Sebastian Andrzej Siewior @ 2021-02-17 13:17 UTC (permalink / raw) To: linux-tip-commits Cc: Jens Axboe, Sebastian Andrzej Siewior, Peter Zijlstra (Intel), Ingo Molnar, x86, linux-kernel The following commit has been merged into the sched/core branch of tip: Commit-ID: f9d34595ae4feed38856b88769e2ba5af22d2548 Gitweb: https://git.kernel.org/tip/f9d34595ae4feed38856b88769e2ba5af22d2548 Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de> AuthorDate: Sat, 23 Jan 2021 21:10:25 +01:00 Committer: Ingo Molnar <mingo@kernel.org> CommitterDate: Wed, 17 Feb 2021 14:12:42 +01:00 smp: Process pending softirqs in flush_smp_call_function_from_idle() send_call_function_single_ipi() may wake an idle CPU without sending an IPI. The woken up CPU will process the SMP-functions in flush_smp_call_function_from_idle(). Any raised softirq from within the SMP-function call will not be processed. Should the CPU have no tasks assigned, then it will go back to idle with pending softirqs and the NOHZ will rightfully complain. Process pending softirqs on return from flush_smp_call_function_queue(). Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210123201027.3262800-2-bigeasy@linutronix.de --- kernel/smp.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/smp.c b/kernel/smp.c index 1b6070b..aeb0adf 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -14,6 +14,7 @@ #include <linux/export.h> #include <linux/percpu.h> #include <linux/init.h> +#include <linux/interrupt.h> #include <linux/gfp.h> #include <linux/smp.h> #include <linux/cpu.h> @@ -449,6 +450,9 @@ void flush_smp_call_function_from_idle(void) local_irq_save(flags); flush_smp_call_function_queue(true); + if (local_softirq_pending()) + do_softirq(); + local_irq_restore(flags); } ^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-23 20:10 [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Sebastian Andrzej Siewior 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior @ 2021-01-23 20:10 ` Sebastian Andrzej Siewior 2021-01-25 7:10 ` Hannes Reinecke ` (2 more replies) 2021-01-23 20:10 ` [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done Sebastian Andrzej Siewior ` (2 subsequent siblings) 4 siblings, 3 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-23 20:10 UTC (permalink / raw) To: linux-block, linux-kernel Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Sebastian Andrzej Siewior Controllers with multiple queues have their IRQ-handelers pinned to a CPU. The core shouldn't need to complete the request on a remote CPU. Remove this case and always raise the softirq to complete the request. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- block/blk-mq.c | 14 +------------- 1 file changed, 1 insertion(+), 13 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index f285a9123a8b0..90348ae518461 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -628,19 +628,7 @@ static void __blk_mq_complete_request_remote(void *data) { struct request *rq = data; - /* - * For most of single queue controllers, there is only one irq vector - * for handling I/O completion, and the only irq's affinity is set - * to all possible CPUs. On most of ARCHs, this affinity means the irq - * is handled on one specific CPU. - * - * So complete I/O requests in softirq context in case of single queue - * devices to avoid degrading I/O performance due to irqsoff latency. - */ - if (rq->q->nr_hw_queues == 1) - blk_mq_trigger_softirq(rq); - else - rq->q->mq_ops->complete(rq); + blk_mq_trigger_softirq(rq); } static inline bool blk_mq_complete_need_ipi(struct request *rq) -- 2.30.0 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-23 20:10 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior @ 2021-01-25 7:10 ` Hannes Reinecke 2021-01-25 8:25 ` Christoph Hellwig 2021-01-25 8:22 ` Christoph Hellwig 2021-01-27 11:22 ` Daniel Wagner 2 siblings, 1 reply; 51+ messages in thread From: Hannes Reinecke @ 2021-01-25 7:10 UTC (permalink / raw) To: Sebastian Andrzej Siewior, linux-block, linux-kernel Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 1/23/21 9:10 PM, Sebastian Andrzej Siewior wrote: > Controllers with multiple queues have their IRQ-handelers pinned to a > CPU. The core shouldn't need to complete the request on a remote CPU. > > Remove this case and always raise the softirq to complete the request. > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > block/blk-mq.c | 14 +------------- > 1 file changed, 1 insertion(+), 13 deletions(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index f285a9123a8b0..90348ae518461 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -628,19 +628,7 @@ static void __blk_mq_complete_request_remote(void *data) > { > struct request *rq = data; > > - /* > - * For most of single queue controllers, there is only one irq vector > - * for handling I/O completion, and the only irq's affinity is set > - * to all possible CPUs. On most of ARCHs, this affinity means the irq > - * is handled on one specific CPU. > - * > - * So complete I/O requests in softirq context in case of single queue > - * devices to avoid degrading I/O performance due to irqsoff latency. > - */ > - if (rq->q->nr_hw_queues == 1) > - blk_mq_trigger_softirq(rq); > - else > - rq->q->mq_ops->complete(rq); > + blk_mq_trigger_softirq(rq); > } > > static inline bool blk_mq_complete_need_ipi(struct request *rq) > I don't get this. This code is about _avoiding_ having to raise a softirq if the driver exports more than one hardware queue. So where exactly does the remote CPU case come in here? Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-25 7:10 ` Hannes Reinecke @ 2021-01-25 8:25 ` Christoph Hellwig 2021-01-25 8:30 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 8:25 UTC (permalink / raw) To: Hannes Reinecke Cc: Sebastian Andrzej Siewior, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Mon, Jan 25, 2021 at 08:10:16AM +0100, Hannes Reinecke wrote: > I don't get this. > This code is about _avoiding_ having to raise a softirq if the driver > exports more than one hardware queue. > So where exactly does the remote CPU case come in here? __blk_mq_complete_request_remote is only called for the case where we do not completelky locally. The case that "degrades" here is where the device supports multiple queues, but less than the number of CPUs, and we bounce the completion to another CPU. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-25 8:25 ` Christoph Hellwig @ 2021-01-25 8:30 ` Sebastian Andrzej Siewior 2021-01-25 8:32 ` Christoph Hellwig 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-25 8:30 UTC (permalink / raw) To: Christoph Hellwig Cc: Hannes Reinecke, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 2021-01-25 08:25:42 [+0000], Christoph Hellwig wrote: > On Mon, Jan 25, 2021 at 08:10:16AM +0100, Hannes Reinecke wrote: > > I don't get this. > > This code is about _avoiding_ having to raise a softirq if the driver > > exports more than one hardware queue. > > So where exactly does the remote CPU case come in here? > > __blk_mq_complete_request_remote is only called for the case where we > do not completelky locally. The case that "degrades" here is where > the device supports multiple queues, but less than the number of CPUs, > and we bounce the completion to another CPU. Does it really "degrade" or just use the softirq more often? The usual case is run the softirqs in irq_exit() which is just after IPI. Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-25 8:30 ` Sebastian Andrzej Siewior @ 2021-01-25 8:32 ` Christoph Hellwig 2021-01-25 9:29 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 8:32 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Christoph Hellwig, Hannes Reinecke, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Mon, Jan 25, 2021 at 09:30:29AM +0100, Sebastian Andrzej Siewior wrote: > On 2021-01-25 08:25:42 [+0000], Christoph Hellwig wrote: > > On Mon, Jan 25, 2021 at 08:10:16AM +0100, Hannes Reinecke wrote: > > > I don't get this. > > > This code is about _avoiding_ having to raise a softirq if the driver > > > exports more than one hardware queue. > > > So where exactly does the remote CPU case come in here? > > > > __blk_mq_complete_request_remote is only called for the case where we > > do not completelky locally. The case that "degrades" here is where > > the device supports multiple queues, but less than the number of CPUs, > > and we bounce the completion to another CPU. > > Does it really "degrade" or just use the softirq more often? The usual > case is run the softirqs in irq_exit() which is just after IPI. Well, I put it in quotes because I'm not sure what the exact effect is. But we do delay these completions to the softirq now instead of hardirq context, which at least in theory increases latency. OTOH it might even have positive effects on the rest of the system. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-25 8:32 ` Christoph Hellwig @ 2021-01-25 9:29 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-25 9:29 UTC (permalink / raw) To: Christoph Hellwig Cc: Hannes Reinecke, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 2021-01-25 08:32:48 [+0000], Christoph Hellwig wrote: > Well, I put it in quotes because I'm not sure what the exact effect > is. But we do delay these completions to the softirq now instead of > hardirq context, which at least in theory increases latency. OTOH it > might even have positive effects on the rest of the system. The last part is/was my motivation ;) Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-23 20:10 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior 2021-01-25 7:10 ` Hannes Reinecke @ 2021-01-25 8:22 ` Christoph Hellwig 2021-01-25 8:49 ` Christoph Hellwig 2021-01-27 11:22 ` Daniel Wagner 2 siblings, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 8:22 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Sat, Jan 23, 2021 at 09:10:26PM +0100, Sebastian Andrzej Siewior wrote: > Controllers with multiple queues have their IRQ-handelers pinned to a > CPU. The core shouldn't need to complete the request on a remote CPU. > > Remove this case and always raise the softirq to complete the request. What about changing blk_mq_trigger_softirq to take a void * argument and thus removing __blk_mq_complete_request_remote entirely? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-25 8:22 ` Christoph Hellwig @ 2021-01-25 8:49 ` Christoph Hellwig 0 siblings, 0 replies; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 8:49 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Mon, Jan 25, 2021 at 08:23:03AM +0000, Christoph Hellwig wrote: > On Sat, Jan 23, 2021 at 09:10:26PM +0100, Sebastian Andrzej Siewior wrote: > > Controllers with multiple queues have their IRQ-handelers pinned to a > > CPU. The core shouldn't need to complete the request on a remote CPU. > > > > Remove this case and always raise the softirq to complete the request. > > What about changing blk_mq_trigger_softirq to take a void * argument > and thus removing __blk_mq_complete_request_remote entirely? I'll take this back - that change is in the way of what you do in patch 3. So this looks good as-is: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2021-01-23 20:10 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior 2021-01-25 7:10 ` Hannes Reinecke 2021-01-25 8:22 ` Christoph Hellwig @ 2021-01-27 11:22 ` Daniel Wagner 2 siblings, 0 replies; 51+ messages in thread From: Daniel Wagner @ 2021-01-27 11:22 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Sat, Jan 23, 2021 at 09:10:26PM +0100, Sebastian Andrzej Siewior wrote: > Controllers with multiple queues have their IRQ-handelers pinned to a > CPU. The core shouldn't need to complete the request on a remote CPU. > > Remove this case and always raise the softirq to complete the request. > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done 2021-01-23 20:10 [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Sebastian Andrzej Siewior 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior 2021-01-23 20:10 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior @ 2021-01-23 20:10 ` Sebastian Andrzej Siewior 2021-01-25 8:30 ` Christoph Hellwig 2021-01-25 4:27 ` [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Jens Axboe 2021-02-10 14:43 ` Jens Axboe 4 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-23 20:10 UTC (permalink / raw) To: linux-block, linux-kernel Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Sebastian Andrzej Siewior With llist_head it is possible to avoid the locking (the irq-off region) when items are added. This makes it possible to add items on a remote CPU without additional locking. llist_add() returns true if the list was previously empty. This can be used to invoke the SMP function call / raise sofirq only if the first item was added (otherwise it is already pending). This simplifies the code a little and reduces the IRQ-off regions. blk_mq_raise_softirq() needs a preempt-disable section to ensure the request is enqueued on the same CPU as the softirq is raised. Some callers (USB-storage) invoke this path in preemptible context. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- block/blk-mq.c | 97 ++++++++++++++++++------------------------ include/linux/blkdev.h | 2 +- 2 files changed, 42 insertions(+), 57 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 90348ae518461..463de2981df8a 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -41,7 +41,7 @@ #include "blk-mq-sched.h" #include "blk-rq-qos.h" -static DEFINE_PER_CPU(struct list_head, blk_cpu_done); +static DEFINE_PER_CPU(struct llist_head, blk_cpu_done); static void blk_mq_poll_stats_start(struct request_queue *q); static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); @@ -567,68 +567,29 @@ void blk_mq_end_request(struct request *rq, blk_status_t error) } EXPORT_SYMBOL(blk_mq_end_request); -/* - * Softirq action handler - move entries to local list and loop over them - * while passing them to the queue registered handler. - */ -static __latent_entropy void blk_done_softirq(struct softirq_action *h) +static void blk_complete_reqs(struct llist_head *list) { - struct list_head *cpu_list, local_list; + struct llist_node *entry = llist_reverse_order(llist_del_all(list)); + struct request *rq, *next; - local_irq_disable(); - cpu_list = this_cpu_ptr(&blk_cpu_done); - list_replace_init(cpu_list, &local_list); - local_irq_enable(); - - while (!list_empty(&local_list)) { - struct request *rq; - - rq = list_entry(local_list.next, struct request, ipi_list); - list_del_init(&rq->ipi_list); + llist_for_each_entry_safe(rq, next, entry, ipi_list) rq->q->mq_ops->complete(rq); - } } -static void blk_mq_trigger_softirq(struct request *rq) +static __latent_entropy void blk_done_softirq(struct softirq_action *h) { - struct list_head *list; - unsigned long flags; - - local_irq_save(flags); - list = this_cpu_ptr(&blk_cpu_done); - list_add_tail(&rq->ipi_list, list); - - /* - * If the list only contains our just added request, signal a raise of - * the softirq. If there are already entries there, someone already - * raised the irq but it hasn't run yet. - */ - if (list->next == &rq->ipi_list) - raise_softirq_irqoff(BLOCK_SOFTIRQ); - local_irq_restore(flags); + blk_complete_reqs(this_cpu_ptr(&blk_cpu_done)); } static int blk_softirq_cpu_dead(unsigned int cpu) { - /* - * If a CPU goes away, splice its entries to the current CPU - * and trigger a run of the softirq - */ - local_irq_disable(); - list_splice_init(&per_cpu(blk_cpu_done, cpu), - this_cpu_ptr(&blk_cpu_done)); - raise_softirq_irqoff(BLOCK_SOFTIRQ); - local_irq_enable(); - + blk_complete_reqs(&per_cpu(blk_cpu_done, cpu)); return 0; } - static void __blk_mq_complete_request_remote(void *data) { - struct request *rq = data; - - blk_mq_trigger_softirq(rq); + __raise_softirq_irqoff(BLOCK_SOFTIRQ); } static inline bool blk_mq_complete_need_ipi(struct request *rq) @@ -657,6 +618,30 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq) return cpu_online(rq->mq_ctx->cpu); } +static void blk_mq_complete_send_ipi(struct request *rq) +{ + struct llist_head *list; + unsigned int cpu; + + cpu = rq->mq_ctx->cpu; + list = &per_cpu(blk_cpu_done, cpu); + if (llist_add(&rq->ipi_list, list)) { + INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); + smp_call_function_single_async(cpu, &rq->csd); + } +} + +static void blk_mq_raise_softirq(struct request *rq) +{ + struct llist_head *list; + + preempt_disable(); + list = this_cpu_ptr(&blk_cpu_done); + if (llist_add(&rq->ipi_list, list)) + raise_softirq(BLOCK_SOFTIRQ); + preempt_enable(); +} + bool blk_mq_complete_request_remote(struct request *rq) { WRITE_ONCE(rq->state, MQ_RQ_COMPLETE); @@ -669,15 +654,15 @@ bool blk_mq_complete_request_remote(struct request *rq) return false; if (blk_mq_complete_need_ipi(rq)) { - INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); - smp_call_function_single_async(rq->mq_ctx->cpu, &rq->csd); - } else { - if (rq->q->nr_hw_queues > 1) - return false; - blk_mq_trigger_softirq(rq); + blk_mq_complete_send_ipi(rq); + return true; } - return true; + if (rq->q->nr_hw_queues == 1) { + blk_mq_raise_softirq(rq); + return true; + } + return false; } EXPORT_SYMBOL_GPL(blk_mq_complete_request_remote); @@ -3892,7 +3877,7 @@ static int __init blk_mq_init(void) int i; for_each_possible_cpu(i) - INIT_LIST_HEAD(&per_cpu(blk_cpu_done, i)); + init_llist_head(&per_cpu(blk_cpu_done, i)); open_softirq(BLOCK_SOFTIRQ, blk_done_softirq); cpuhp_setup_state_nocalls(CPUHP_BLOCK_SOFTIRQ_DEAD, diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index f94ee3089e015..89a444c5a5833 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -153,7 +153,7 @@ struct request { */ union { struct hlist_node hash; /* merge hash */ - struct list_head ipi_list; + struct llist_node ipi_list; }; /* -- 2.30.0 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done 2021-01-23 20:10 ` [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done Sebastian Andrzej Siewior @ 2021-01-25 8:30 ` Christoph Hellwig 2021-01-25 8:32 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 8:30 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar > +static void blk_mq_complete_send_ipi(struct request *rq) > +{ > + struct llist_head *list; > + unsigned int cpu; > + > + cpu = rq->mq_ctx->cpu; > + list = &per_cpu(blk_cpu_done, cpu); > + if (llist_add(&rq->ipi_list, list)) { > + INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); > + smp_call_function_single_async(cpu, &rq->csd); > + } > +} Nit: it would be nice to initialize cpu and list in the declaration lines. Otherwise looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done 2021-01-25 8:30 ` Christoph Hellwig @ 2021-01-25 8:32 ` Sebastian Andrzej Siewior 2021-01-25 8:39 ` Christoph Hellwig 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-25 8:32 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 2021-01-25 08:30:12 [+0000], Christoph Hellwig wrote: > > +static void blk_mq_complete_send_ipi(struct request *rq) > > +{ > > + struct llist_head *list; > > + unsigned int cpu; > > + > > + cpu = rq->mq_ctx->cpu; > > + list = &per_cpu(blk_cpu_done, cpu); > > + if (llist_add(&rq->ipi_list, list)) { > > + INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); > > + smp_call_function_single_async(cpu, &rq->csd); > > + } > > +} > > Nit: it would be nice to initialize cpu and list in the declaration > lines. Why? They get initialized later. > Otherwise looks good: > > Reviewed-by: Christoph Hellwig <hch@lst.de> Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done 2021-01-25 8:32 ` Sebastian Andrzej Siewior @ 2021-01-25 8:39 ` Christoph Hellwig 2021-01-25 9:54 ` [PATCH 3/3 v2] " Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 8:39 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Christoph Hellwig, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Mon, Jan 25, 2021 at 09:32:04AM +0100, Sebastian Andrzej Siewior wrote: > On 2021-01-25 08:30:12 [+0000], Christoph Hellwig wrote: > > > +static void blk_mq_complete_send_ipi(struct request *rq) > > > +{ > > > + struct llist_head *list; > > > + unsigned int cpu; > > > + > > > + cpu = rq->mq_ctx->cpu; > > > + list = &per_cpu(blk_cpu_done, cpu); > > > + if (llist_add(&rq->ipi_list, list)) { > > > + INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); > > > + smp_call_function_single_async(cpu, &rq->csd); > > > + } > > > +} > > > > Nit: it would be nice to initialize cpu and list in the declaration > > lines. > > Why? They get initialized later. Because: unsigned int cpu = rq->mq_ctx->cpu; struct llist_head *list = &per_cpu(blk_cpu_done, cpu); is a lot easier to follow than: struct llist_head *list; unsigned int cpu; cpu = rq->mq_ctx->cpu; list = &per_cpu(blk_cpu_done, cpu); ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 3/3 v2] blk-mq: Use llist_head for blk_cpu_done 2021-01-25 8:39 ` Christoph Hellwig @ 2021-01-25 9:54 ` Sebastian Andrzej Siewior 2021-01-25 10:14 ` Christoph Hellwig 2021-01-27 11:23 ` Daniel Wagner 0 siblings, 2 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2021-01-25 9:54 UTC (permalink / raw) To: Christoph Hellwig Cc: linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar With llist_head it is possible to avoid the locking (the irq-off region) when items are added. This makes it possible to add items on a remote CPU without additional locking. llist_add() returns true if the list was previously empty. This can be used to invoke the SMP function call / raise sofirq only if the first item was added (otherwise it is already pending). This simplifies the code a little and reduces the IRQ-off regions. blk_mq_raise_softirq() needs a preempt-disable section to ensure the request is enqueued on the same CPU as the softirq is raised. Some callers (USB-storage) invoke this path in preemptible context. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- v1…v2: Move var initialisation to declaration in blk_mq_complete_send_ipi(). Suggested by hch. block/blk-mq.c | 95 +++++++++++++++++------------------------- include/linux/blkdev.h | 2 +- 2 files changed, 40 insertions(+), 57 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 90348ae518461..8429be0d9b8dd 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -41,7 +41,7 @@ #include "blk-mq-sched.h" #include "blk-rq-qos.h" -static DEFINE_PER_CPU(struct list_head, blk_cpu_done); +static DEFINE_PER_CPU(struct llist_head, blk_cpu_done); static void blk_mq_poll_stats_start(struct request_queue *q); static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); @@ -567,68 +567,29 @@ void blk_mq_end_request(struct request *rq, blk_status_t error) } EXPORT_SYMBOL(blk_mq_end_request); -/* - * Softirq action handler - move entries to local list and loop over them - * while passing them to the queue registered handler. - */ -static __latent_entropy void blk_done_softirq(struct softirq_action *h) +static void blk_complete_reqs(struct llist_head *list) { - struct list_head *cpu_list, local_list; + struct llist_node *entry = llist_reverse_order(llist_del_all(list)); + struct request *rq, *next; - local_irq_disable(); - cpu_list = this_cpu_ptr(&blk_cpu_done); - list_replace_init(cpu_list, &local_list); - local_irq_enable(); - - while (!list_empty(&local_list)) { - struct request *rq; - - rq = list_entry(local_list.next, struct request, ipi_list); - list_del_init(&rq->ipi_list); + llist_for_each_entry_safe(rq, next, entry, ipi_list) rq->q->mq_ops->complete(rq); - } } -static void blk_mq_trigger_softirq(struct request *rq) +static __latent_entropy void blk_done_softirq(struct softirq_action *h) { - struct list_head *list; - unsigned long flags; - - local_irq_save(flags); - list = this_cpu_ptr(&blk_cpu_done); - list_add_tail(&rq->ipi_list, list); - - /* - * If the list only contains our just added request, signal a raise of - * the softirq. If there are already entries there, someone already - * raised the irq but it hasn't run yet. - */ - if (list->next == &rq->ipi_list) - raise_softirq_irqoff(BLOCK_SOFTIRQ); - local_irq_restore(flags); + blk_complete_reqs(this_cpu_ptr(&blk_cpu_done)); } static int blk_softirq_cpu_dead(unsigned int cpu) { - /* - * If a CPU goes away, splice its entries to the current CPU - * and trigger a run of the softirq - */ - local_irq_disable(); - list_splice_init(&per_cpu(blk_cpu_done, cpu), - this_cpu_ptr(&blk_cpu_done)); - raise_softirq_irqoff(BLOCK_SOFTIRQ); - local_irq_enable(); - + blk_complete_reqs(&per_cpu(blk_cpu_done, cpu)); return 0; } - static void __blk_mq_complete_request_remote(void *data) { - struct request *rq = data; - - blk_mq_trigger_softirq(rq); + __raise_softirq_irqoff(BLOCK_SOFTIRQ); } static inline bool blk_mq_complete_need_ipi(struct request *rq) @@ -657,6 +618,28 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq) return cpu_online(rq->mq_ctx->cpu); } +static void blk_mq_complete_send_ipi(struct request *rq) +{ + unsigned int cpu = rq->mq_ctx->cpu; + struct llist_head *list = &per_cpu(blk_cpu_done, cpu); + + if (llist_add(&rq->ipi_list, list)) { + INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); + smp_call_function_single_async(cpu, &rq->csd); + } +} + +static void blk_mq_raise_softirq(struct request *rq) +{ + struct llist_head *list; + + preempt_disable(); + list = this_cpu_ptr(&blk_cpu_done); + if (llist_add(&rq->ipi_list, list)) + raise_softirq(BLOCK_SOFTIRQ); + preempt_enable(); +} + bool blk_mq_complete_request_remote(struct request *rq) { WRITE_ONCE(rq->state, MQ_RQ_COMPLETE); @@ -669,15 +652,15 @@ bool blk_mq_complete_request_remote(struct request *rq) return false; if (blk_mq_complete_need_ipi(rq)) { - INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq); - smp_call_function_single_async(rq->mq_ctx->cpu, &rq->csd); - } else { - if (rq->q->nr_hw_queues > 1) - return false; - blk_mq_trigger_softirq(rq); + blk_mq_complete_send_ipi(rq); + return true; } - return true; + if (rq->q->nr_hw_queues == 1) { + blk_mq_raise_softirq(rq); + return true; + } + return false; } EXPORT_SYMBOL_GPL(blk_mq_complete_request_remote); @@ -3892,7 +3875,7 @@ static int __init blk_mq_init(void) int i; for_each_possible_cpu(i) - INIT_LIST_HEAD(&per_cpu(blk_cpu_done, i)); + init_llist_head(&per_cpu(blk_cpu_done, i)); open_softirq(BLOCK_SOFTIRQ, blk_done_softirq); cpuhp_setup_state_nocalls(CPUHP_BLOCK_SOFTIRQ_DEAD, diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index f94ee3089e015..89a444c5a5833 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -153,7 +153,7 @@ struct request { */ union { struct hlist_node hash; /* merge hash */ - struct list_head ipi_list; + struct llist_node ipi_list; }; /* -- 2.30.0 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 3/3 v2] blk-mq: Use llist_head for blk_cpu_done 2021-01-25 9:54 ` [PATCH 3/3 v2] " Sebastian Andrzej Siewior @ 2021-01-25 10:14 ` Christoph Hellwig 2021-01-27 11:23 ` Daniel Wagner 1 sibling, 0 replies; 51+ messages in thread From: Christoph Hellwig @ 2021-01-25 10:14 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Christoph Hellwig, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 3/3 v2] blk-mq: Use llist_head for blk_cpu_done 2021-01-25 9:54 ` [PATCH 3/3 v2] " Sebastian Andrzej Siewior 2021-01-25 10:14 ` Christoph Hellwig @ 2021-01-27 11:23 ` Daniel Wagner 1 sibling, 0 replies; 51+ messages in thread From: Daniel Wagner @ 2021-01-27 11:23 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Christoph Hellwig, linux-block, linux-kernel, Jens Axboe, Thomas Gleixner, Peter Zijlstra, Ingo Molnar On Mon, Jan 25, 2021 at 10:54:12AM +0100, Sebastian Andrzej Siewior wrote: > With llist_head it is possible to avoid the locking (the irq-off region) > when items are added. This makes it possible to add items on a remote > CPU without additional locking. > llist_add() returns true if the list was previously empty. This can be > used to invoke the SMP function call / raise sofirq only if the first > item was added (otherwise it is already pending). > This simplifies the code a little and reduces the IRQ-off regions. > > blk_mq_raise_softirq() needs a preempt-disable section to ensure the > request is enqueued on the same CPU as the softirq is raised. > Some callers (USB-storage) invoke this path in preemptible context. > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> I did a quick test run with the whole series. Looks good. Reviewed-by: Daniel Wagner <dwagner@suse.de> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head 2021-01-23 20:10 [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Sebastian Andrzej Siewior ` (2 preceding siblings ...) 2021-01-23 20:10 ` [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done Sebastian Andrzej Siewior @ 2021-01-25 4:27 ` Jens Axboe 2021-02-10 14:43 ` Jens Axboe 4 siblings, 0 replies; 51+ messages in thread From: Jens Axboe @ 2021-01-25 4:27 UTC (permalink / raw) To: Sebastian Andrzej Siewior, linux-block, linux-kernel Cc: Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 1/23/21 1:10 PM, Sebastian Andrzej Siewior wrote: > Patch 2+3 were applied and then dropped by Jens due to a NOHZ+softirq > related warning [0]. Turns out a successful wakeup via > set_nr_if_polling() will not process any softirqs and the CPU may go > back to idle. This is addressed by patch #1. > > smpcfd_dying_cpu() will also invoke SMP-functions calls via > flush_smp_call_function_queue() but the block layer shouldn't queue > anything because the CPU isn't online anymore. > The two caller of flush_smp_call_function_from_idle() look fine with > opening interrupts from within do_softirq(). > > [0] https://lkml.kernel.org/r/1ee4b31b-350e-a9f5-4349-cfb34b89829a@kernel.dk I can queue up the block side once the IPI fix is in some stable branch that I can pull in. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head 2021-01-23 20:10 [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Sebastian Andrzej Siewior ` (3 preceding siblings ...) 2021-01-25 4:27 ` [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Jens Axboe @ 2021-02-10 14:43 ` Jens Axboe 4 siblings, 0 replies; 51+ messages in thread From: Jens Axboe @ 2021-02-10 14:43 UTC (permalink / raw) To: Sebastian Andrzej Siewior, linux-block, linux-kernel Cc: Thomas Gleixner, Peter Zijlstra, Ingo Molnar On 1/23/21 1:10 PM, Sebastian Andrzej Siewior wrote: > Patch 2+3 were applied and then dropped by Jens due to a NOHZ+softirq > related warning [0]. Turns out a successful wakeup via > set_nr_if_polling() will not process any softirqs and the CPU may go > back to idle. This is addressed by patch #1. > > smpcfd_dying_cpu() will also invoke SMP-functions calls via > flush_smp_call_function_queue() but the block layer shouldn't queue > anything because the CPU isn't online anymore. > The two caller of flush_smp_call_function_from_idle() look fine with > opening interrupts from within do_softirq(). Applied, thanks. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 0/3 v2] blk-mq: Don't complete in IRQ, use llist_head @ 2020-12-04 19:13 Sebastian Andrzej Siewior 2020-12-04 19:13 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-04 19:13 UTC (permalink / raw) To: linux-block Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Christoph Hellwig, Sagi Grimberg This a repost of the patches in the old thread [0] which died, rebase against -next. The series avoids completing the requests on a remote CPU if booted with threadirqs. It avoids completing requests in hard-IRQ context on remote CPU by deferring it to the the softirq context. One change since the last post: preempt-disable() around llist_add() + raise_softirq() to ensure that request is added on the same CPU where the softirq is raised. Some callers (like usb-storage) invoke this function from preemtible context and this preserves the current "call me from any context" semantic. My understanding is that in a later attempt we may change such callers to complete directly and avoid the softirq ping-pong. [0] https://lkml.kernel.org/r/20201028141251.3608598-1-bigeasy@linutronix.de Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-04 19:13 [PATCH 0/3 v2] " Sebastian Andrzej Siewior @ 2020-12-04 19:13 ` Sebastian Andrzej Siewior 2020-12-07 23:52 ` Jens Axboe 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-04 19:13 UTC (permalink / raw) To: linux-block Cc: Jens Axboe, Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Sebastian Andrzej Siewior Controllers with multiple queues have their IRQ-handelers pinned to a CPU. The core shouldn't need to complete the request on a remote CPU. Remove this case and always raise the softirq to complete the request. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- block/blk-mq.c | 14 +------------- 1 file changed, 1 insertion(+), 13 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 7091bb097c63f..3c0e94913d874 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -628,19 +628,7 @@ static void __blk_mq_complete_request_remote(void *data) { struct request *rq = data; - /* - * For most of single queue controllers, there is only one irq vector - * for handling I/O completion, and the only irq's affinity is set - * to all possible CPUs. On most of ARCHs, this affinity means the irq - * is handled on one specific CPU. - * - * So complete I/O requests in softirq context in case of single queue - * devices to avoid degrading I/O performance due to irqsoff latency. - */ - if (rq->q->nr_hw_queues == 1) - blk_mq_trigger_softirq(rq); - else - rq->q->mq_ops->complete(rq); + blk_mq_trigger_softirq(rq); } static inline bool blk_mq_complete_need_ipi(struct request *rq) -- 2.29.2 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-04 19:13 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior @ 2020-12-07 23:52 ` Jens Axboe 2020-12-08 8:22 ` Sebastian Andrzej Siewior 2020-12-08 13:13 ` Christoph Hellwig 0 siblings, 2 replies; 51+ messages in thread From: Jens Axboe @ 2020-12-07 23:52 UTC (permalink / raw) To: Sebastian Andrzej Siewior, linux-block Cc: Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On 12/4/20 12:13 PM, Sebastian Andrzej Siewior wrote: > Controllers with multiple queues have their IRQ-handelers pinned to a > CPU. The core shouldn't need to complete the request on a remote CPU. > > Remove this case and always raise the softirq to complete the request. I don't like this one at all, it'll add a softirq jump for the fast path for eg nvme devices. Did you run any performance testing with this? I can give it a spin, will do so anyway, but was curious if anything but "this still works" testing was done. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-07 23:52 ` Jens Axboe @ 2020-12-08 8:22 ` Sebastian Andrzej Siewior 2020-12-08 8:44 ` Daniel Wagner 2020-12-08 13:13 ` Christoph Hellwig 1 sibling, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-08 8:22 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On 2020-12-07 16:52:57 [-0700], Jens Axboe wrote: > On 12/4/20 12:13 PM, Sebastian Andrzej Siewior wrote: > > Controllers with multiple queues have their IRQ-handelers pinned to a > > CPU. The core shouldn't need to complete the request on a remote CPU. > > > > Remove this case and always raise the softirq to complete the request. > > I don't like this one at all, it'll add a softirq jump for the fast path > for eg nvme devices. Did you run any performance testing with this? I > can give it a spin, will do so anyway, but was curious if anything but > "this still works" testing was done. I understood that the nvme devices have a queue per CPU and don't need to complete on a remote-CPU in general. Also, they don't jump to softirq but invoke softirq on the return from IRQ. So if softirq is already busy because of network then softirq of the block layer is delayed. Otherwise there should be no significant delay if the CPU is idle (so the IRQ is handled and softirq right after it). Sagi mentioned nvme-tcp as a user of this remote completion and Daniel has been kind to run some nvme-tcp tests. > -- > Jens Axboe > Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 8:22 ` Sebastian Andrzej Siewior @ 2020-12-08 8:44 ` Daniel Wagner 2020-12-08 11:36 ` Daniel Wagner 2020-12-17 16:45 ` Jens Axboe 0 siblings, 2 replies; 51+ messages in thread From: Daniel Wagner @ 2020-12-08 8:44 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On Tue, Dec 08, 2020 at 09:22:20AM +0100, Sebastian Andrzej Siewior wrote: > Sagi mentioned nvme-tcp as a user of this remote completion and Daniel > has been kind to run some nvme-tcp tests. I've started with some benchmarking. The first thing I tried is to find a setup where the remote path is taken. I found a setup with nvme-fc with a workload which results in ca 10% remote completion. Setup: - NVMe over FC - 2x Emulex LPe36000 32Gb PCIe Fibre Channel Adapter - 8 mpaths - 4 E7-4820 v3, 80 cores Workload: - fio --rw=randwrite --name=test --size=50M \ --iodepth=32 --direct=0 --bs=4k --numjobs=40 \ --time_based --runtime=1h --ioengine=libaio \ --group_reporting (I played a bit around with different workloads, most of them wont use the remote completion path) I've annotated the code with a counter and exported it via debugfs. @@ -671,6 +673,8 @@ bool blk_mq_complete_request_remote(struct request *rq) return false; if (blk_mq_complete_need_ipi(rq)) { + ctx->rq_remote++; + rq->csd.func = __blk_mq_complete_request_remote; rq->csd.info = rq; rq->csd.flags = 0; And hacked a small script to collect the data. - Baseline (5.10-rc7) Starting 40 processes Jobs: 40 (f=40): [w(40)][100.0%][r=0KiB/s,w=12.0GiB/s][r=0,w=3405k IOPS][eta 00m:00s] test: (groupid=0, jobs=40): err= 0: pid=14225: Mon Dec 7 20:09:57 2020 write: IOPS=3345k, BW=12.8GiB/s (13.7GB/s)(44.9TiB/3600002msec) slat (usec): min=2, max=90772, avg= 9.43, stdev=10.67 clat (usec): min=2, max=91343, avg=371.79, stdev=119.52 lat (usec): min=5, max=91358, avg=381.31, stdev=122.45 clat percentiles (usec): | 1.00th=[ 231], 5.00th=[ 245], 10.00th=[ 253], 20.00th=[ 273], | 30.00th=[ 293], 40.00th=[ 322], 50.00th=[ 351], 60.00th=[ 388], | 70.00th=[ 420], 80.00th=[ 465], 90.00th=[ 529], 95.00th=[ 570], | 99.00th=[ 644], 99.50th=[ 676], 99.90th=[ 750], 99.95th=[ 783], | 99.99th=[ 873] bw ( KiB/s): min=107333, max=749152, per=2.51%, avg=335200.07, stdev=87628.57, samples=288000 iops : min=26833, max=187286, avg=83799.66, stdev=21907.09, samples=288000 lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01% lat (usec) : 250=8.04%, 500=6.79%, 750=13.75%, 1000=0.09% lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% lat (msec) : 100=0.01% cpu : usr=29.14%, sys=70.83%, ctx=320219, majf=0, minf=13056 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=28.7%, >=64=0.0% submit : 0=0.0%, 4=28.7%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=28.7%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0% issued rwts: total=0,12042333583,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=32 Run status group 0 (all jobs): WRITE: bw=12.8GiB/s (13.7GB/s), 12.8GiB/s-12.8GiB/s (13.7GB/s-13.7GB/s), io=44.9TiB (49.3TB), run=3600002-3600002msec Disk stats (read/write): nvme5n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00% - Patched Jobs: 40 (f=40): [w(40)][100.0%][r=0KiB/s,w=12.9GiB/s][r=0,w=3383k IOPS][eta 00m:00s] test: (groupid=0, jobs=40): err= 0: pid=13413: Mon Dec 7 21:31:01 2020 write: IOPS=3371k, BW=12.9GiB/s (13.8GB/s)(45.2TiB/3600004msec) slat (nsec): min=1984, max=90341k, avg=9308.73, stdev=7068.58 clat (usec): min=2, max=91259, avg=368.94, stdev=118.31 lat (usec): min=5, max=91269, avg=378.34, stdev=121.43 clat percentiles (usec): | 1.00th=[ 231], 5.00th=[ 245], 10.00th=[ 255], 20.00th=[ 277], | 30.00th=[ 302], 40.00th=[ 318], 50.00th=[ 334], 60.00th=[ 359], | 70.00th=[ 392], 80.00th=[ 433], 90.00th=[ 562], 95.00th=[ 635], | 99.00th=[ 693], 99.50th=[ 709], 99.90th=[ 766], 99.95th=[ 816], | 99.99th=[ 914] bw ( KiB/s): min=124304, max=770204, per=2.50%, avg=337559.07, stdev=91383.66, samples=287995 iops : min=31076, max=192551, avg=84389.45, stdev=22845.91, samples=287995 lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01% lat (usec) : 250=7.44%, 500=7.84%, 750=13.79%, 1000=0.15% lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01% lat (msec) : 100=0.01% cpu : usr=30.30%, sys=69.69%, ctx=179950, majf=0, minf=7403 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=29.2%, >=64=0.0% submit : 0=0.0%, 4=29.2%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=29.2%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0% issued rwts: total=0,12135617715,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=32 Run status group 0 (all jobs): WRITE: bw=12.9GiB/s (13.8GB/s), 12.9GiB/s-12.9GiB/s (13.8GB/s-13.8GB/s), io=45.2TiB (49.7TB), run=3600004-3600004msec Disk stats (read/write): nvme1n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00% - Baseline nvme5c0n1 completed 411218 remote 38777 9.43% nvme5c0n2 completed 0 remote 0 0.00% nvme5c1n1 completed 411270 remote 38770 9.43% nvme5c1n2 completed 50 remote 0 0.00% nvme5c2n1 completed 0 remote 0 0.00% nvme5c2n2 completed 0 remote 0 0.00% nvme5c3n1 completed 411220 remote 38751 9.42% nvme5c3n2 completed 0 remote 0 0.00% nvme5c4n1 completed 0 remote 0 0.00% nvme5c4n2 completed 0 remote 0 0.00% nvme5c5n1 completed 0 remote 0 0.00% nvme5c5n2 completed 0 remote 0 0.00% nvme5c6n1 completed 411216 remote 38759 9.43% nvme5c6n2 completed 0 remote 0 0.00% nvme5c7n1 completed 0 remote 0 0.00% nvme5c7n2 completed 0 remote 0 0.00% - Patched nvme1c0n1 completed 0 remote 0 0.00% nvme1c0n2 completed 0 remote 0 0.00% nvme1c1n1 completed 172202 remote 17813 10.34% nvme1c1n2 completed 50 remote 0 0.00% nvme1c2n1 completed 172147 remote 17831 10.36% nvme1c2n2 completed 0 remote 0 0.00% nvme1c3n1 completed 0 remote 0 0.00% nvme1c3n2 completed 0 remote 0 0.00% nvme1c4n1 completed 172159 remote 17825 10.35% nvme1c4n2 completed 0 remote 0 0.00% nvme1c5n1 completed 0 remote 0 0.00% nvme1c5n2 completed 0 remote 0 0.00% nvme1c6n1 completed 0 remote 0 0.00% nvme1c6n2 completed 0 remote 0 0.00% nvme1c7n1 completed 172156 remote 17781 10.33% nvme1c7n2 completed 0 remote 0 0.00% It looks like the patched version show tiny bit better numbers for this workload. slat seems to be one of the major difference between the two runs. But that is the only thing I really spotted to be really off. I keep going with some more testing. Let what kind of tests you would also want to see. I'll do a few plain NVMe tests next. Thanks, Daniel ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 8:44 ` Daniel Wagner @ 2020-12-08 11:36 ` Daniel Wagner 2020-12-08 11:49 ` Sebastian Andrzej Siewior 2020-12-17 16:45 ` Jens Axboe 1 sibling, 1 reply; 51+ messages in thread From: Daniel Wagner @ 2020-12-08 11:36 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke On Tue, Dec 08, 2020 at 09:44:10AM +0100, Daniel Wagner wrote: > On Tue, Dec 08, 2020 at 09:22:20AM +0100, Sebastian Andrzej Siewior wrote: > > Sagi mentioned nvme-tcp as a user of this remote completion and Daniel > > has been kind to run some nvme-tcp tests. > > I've started with some benchmarking. The first thing I tried is to find > a setup where the remote path is taken. I found a setup with nvme-fc > with a workload which results in ca 10% remote completion. Setup: - Dell Express Flash NVMe PM1725 800GB SFF - 2 Gold 6130, 64 cores Workload: - fio --rw=randread --name=test --filename=/dev/nvme0n1 \ --iodepth=64 --direct=1 --bs=4k --numjobs=32 \ --time_based --runtime=5m --ioengine=libaio --group_reporting (Searched for a workload with the highest IOPs which seems to be randread) Obvious in this configuration there are no remote completions (verified it). - baseline 5.10-rc7 Jobs: 32 (f=32): [r(32)][100.0%][r=2544MiB/s][r=651k IOPS][eta 00m:00s] test: (groupid=0, jobs=32): err= 0: pid=24118: Tue Dec 8 11:33:21 2020 read: IOPS=636k, BW=2485MiB/s (2605MB/s)(728GiB/300006msec) slat (nsec): min=1502, max=450956, avg=5576.99, stdev=1475.94 clat (usec): min=195, max=59296, avg=3212.51, stdev=1640.48 lat (usec): min=201, max=59302, avg=3218.23, stdev=1640.58 clat percentiles (usec): | 1.00th=[ 2573], 5.00th=[ 2671], 10.00th=[ 2769], 20.00th=[ 2868], | 30.00th=[ 2933], 40.00th=[ 2999], 50.00th=[ 3064], 60.00th=[ 3163], | 70.00th=[ 3261], 80.00th=[ 3359], 90.00th=[ 3589], 95.00th=[ 3818], | 99.00th=[ 4948], 99.50th=[ 5669], 99.90th=[40633], 99.95th=[44303], | 99.99th=[49021] bw ( MiB/s): min= 444, max= 2598, per=99.99%, avg=2484.33, stdev= 8.36, samples=19200 iops : min=113782, max=665312, avg=635988.04, stdev=2139.63, samples=19200 lat (usec) : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01% lat (msec) : 2=0.01%, 4=96.85%, 10=2.96%, 20=0.01%, 50=0.16% lat (msec) : 100=0.01% cpu : usr=9.11%, sys=14.58%, ctx=131930047, majf=0, minf=28434 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0% issued rwts: total=190817510,0,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): READ: bw=2485MiB/s (2605MB/s), 2485MiB/s-2485MiB/s (2605MB/s-2605MB/s), io=728GiB (782GB), run=300006-300006msec Disk stats (read/write): nvme0n1: ios=190707084/0, merge=0/0, ticks=611781701/0, in_queue=611781702, util=100.00% - patched Jobs: 32 (f=32): [r(32)][100.0%][r=2548MiB/s][r=652k IOPS][eta 00m:00s] test: (groupid=0, jobs=32): err= 0: pid=3059: Tue Dec 8 12:11:25 2020 read: IOPS=637k, BW=2489MiB/s (2610MB/s)(729GiB/300006msec) slat (nsec): min=1453, max=4793.6k, avg=5662.01, stdev=1960.75 clat (usec): min=77, max=59685, avg=3207.13, stdev=1633.85 lat (usec): min=82, max=59696, avg=3212.92, stdev=1633.95 clat percentiles (usec): | 1.00th=[ 2573], 5.00th=[ 2671], 10.00th=[ 2737], 20.00th=[ 2835], | 30.00th=[ 2933], 40.00th=[ 2999], 50.00th=[ 3064], 60.00th=[ 3163], | 70.00th=[ 3228], 80.00th=[ 3359], 90.00th=[ 3556], 95.00th=[ 3785], | 99.00th=[ 4948], 99.50th=[ 5669], 99.90th=[40633], 99.95th=[43779], | 99.99th=[49021] bw ( MiB/s): min= 560, max= 2617, per=99.99%, avg=2488.34, stdev= 8.39, samples=19199 iops : min=143452, max=670006, avg=637013.93, stdev=2148.64, samples=19199 lat (usec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01% lat (msec) : 2=0.01%, 4=96.92%, 10=2.89%, 20=0.01%, 50=0.16% lat (msec) : 100=0.01% cpu : usr=9.32%, sys=14.88%, ctx=130862793, majf=0, minf=38825 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0% issued rwts: total=191130719,0,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): READ: bw=2489MiB/s (2610MB/s), 2489MiB/s-2489MiB/s (2610MB/s-2610MB/s), io=729GiB (783GB), run=300006-300006msec Disk stats (read/write): nvme0n1: ios=191019060/0, merge=0/0, ticks=611718395/0, in_queue=611718395, util=100.00% Again, the numbers look very alike. Thanks, Daniel ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 11:36 ` Daniel Wagner @ 2020-12-08 11:49 ` Sebastian Andrzej Siewior 2020-12-08 12:41 ` Daniel Wagner 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-08 11:49 UTC (permalink / raw) To: Daniel Wagner Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke On 2020-12-08 12:36:53 [+0100], Daniel Wagner wrote: > Obvious in this configuration there are no remote completions (verified > it). do you complete on a remote CPU if you limit the queues to one (this is untested of course)? diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 3be352403839a..f35224a64a56e 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2126,7 +2126,7 @@ static int nvme_setup_io_queues(struct nvme_dev *dev) * If tags are shared with admin queue (Apple bug), then * make sure we only use one IO queue. */ - if (dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS) + if (1 || dev->ctrl.quirks & NVME_QUIRK_SHARED_TAGS) nr_io_queues = 1; else nr_io_queues = min(nvme_max_io_queues(dev), > Thanks, > Daniel Sebastian ^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 11:49 ` Sebastian Andrzej Siewior @ 2020-12-08 12:41 ` Daniel Wagner 2020-12-08 12:52 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Daniel Wagner @ 2020-12-08 12:41 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke On Tue, Dec 08, 2020 at 12:49:36PM +0100, Sebastian Andrzej Siewior wrote: > On 2020-12-08 12:36:53 [+0100], Daniel Wagner wrote: > > Obvious in this configuration there are no remote completions (verified > > it). > > do you complete on a remote CPU if you limit the queues to one (this is > untested of course)? nvme0n1/ completed 11913011 remote 6718563 56.40% yes, but how is this relevant? I thought Jens complain was about the additional indirection via the softirq context - rq->q->mq_ops->complete(rq); + blk_mq_trigger_softirq(rq); and not the remote completion path. I can benchmark it out but I don't know if it's really helping in the discussion. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 12:41 ` Daniel Wagner @ 2020-12-08 12:52 ` Sebastian Andrzej Siewior 2020-12-08 12:57 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-08 12:52 UTC (permalink / raw) To: Daniel Wagner Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke On 2020-12-08 13:41:48 [+0100], Daniel Wagner wrote: > On Tue, Dec 08, 2020 at 12:49:36PM +0100, Sebastian Andrzej Siewior wrote: > > On 2020-12-08 12:36:53 [+0100], Daniel Wagner wrote: > > > Obvious in this configuration there are no remote completions (verified > > > it). > > > > do you complete on a remote CPU if you limit the queues to one (this is > > untested of course)? > > nvme0n1/ completed 11913011 remote 6718563 56.40% > > yes, but how is this relevant? I thought Jens complain was about the > additional indirection via the softirq context > > - rq->q->mq_ops->complete(rq); > + blk_mq_trigger_softirq(rq); > > and not the remote completion path. I can benchmark it out but I don't > know if it's really helping in the discussion. The only additional softirq path is for cross-CPU completion. If I understood you correctly then your NVME device always completes locally because the queue interrupt fires on the correct CPU. If you take away the queues then you should have cross-CPU completion since you have only one queue and this will now complete on the remote CPU in softirq context (and not in IRQ as it used to). If this single queue NVME device, which may complete on another CPU, is not an issue / interesting because it is already limited then ignore this. Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 12:52 ` Sebastian Andrzej Siewior @ 2020-12-08 12:57 ` Sebastian Andrzej Siewior 2020-12-08 13:27 ` Daniel Wagner 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-08 12:57 UTC (permalink / raw) To: Daniel Wagner Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke On 2020-12-08 13:52:25 [+0100], To Daniel Wagner wrote: > On 2020-12-08 13:41:48 [+0100], Daniel Wagner wrote: > > On Tue, Dec 08, 2020 at 12:49:36PM +0100, Sebastian Andrzej Siewior wrote: > > > On 2020-12-08 12:36:53 [+0100], Daniel Wagner wrote: > > > > Obvious in this configuration there are no remote completions (verified > > > > it). > > > > > > do you complete on a remote CPU if you limit the queues to one (this is > > > untested of course)? > > > > nvme0n1/ completed 11913011 remote 6718563 56.40% > > > > yes, but how is this relevant? I thought Jens complain was about the > > additional indirection via the softirq context > > > > - rq->q->mq_ops->complete(rq); > > + blk_mq_trigger_softirq(rq); > > > > and not the remote completion path. I can benchmark it out but I don't > > know if it's really helping in the discussion. … blurp Yes, you are right. Even cross-CPU completion for single-queue was already completing in softirq. So the only change is for multiqueue devices which you just demonstrated that it does not happen. Thank you! Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 12:57 ` Sebastian Andrzej Siewior @ 2020-12-08 13:27 ` Daniel Wagner 0 siblings, 0 replies; 51+ messages in thread From: Daniel Wagner @ 2020-12-08 13:27 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg, Hannes Reinecke On Tue, Dec 08, 2020 at 01:57:09PM +0100, Sebastian Andrzej Siewior wrote: > Yes, you are right. Even cross-CPU completion for single-queue was > already completing in softirq. So the only change is for multiqueue > devices which you just demonstrated that it does not happen. It can happen if there are less hardware queues than CPUs. But I'd bet that application which care about performance are well aware of this and try to keep the work vertical aligned. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 8:44 ` Daniel Wagner 2020-12-08 11:36 ` Daniel Wagner @ 2020-12-17 16:45 ` Jens Axboe 2020-12-17 16:49 ` Daniel Wagner 1 sibling, 1 reply; 51+ messages in thread From: Jens Axboe @ 2020-12-17 16:45 UTC (permalink / raw) To: Daniel Wagner, Sebastian Andrzej Siewior Cc: linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On 12/8/20 1:44 AM, Daniel Wagner wrote: > It looks like the patched version show tiny bit better numbers for this > workload. slat seems to be one of the major difference between the two > runs. But that is the only thing I really spotted to be really off. slat is the same, just one is in nsec and the other is in usec. > I keep going with some more testing. Let what kind of tests you would > also want to see. I'll do a few plain NVMe tests next. This is a good test, thanks. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 16:45 ` Jens Axboe @ 2020-12-17 16:49 ` Daniel Wagner 2020-12-17 16:54 ` Jens Axboe 0 siblings, 1 reply; 51+ messages in thread From: Daniel Wagner @ 2020-12-17 16:49 UTC (permalink / raw) To: Jens Axboe Cc: Sebastian Andrzej Siewior, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On Thu, Dec 17, 2020 at 09:45:47AM -0700, Jens Axboe wrote: > On 12/8/20 1:44 AM, Daniel Wagner wrote: > > It looks like the patched version show tiny bit better numbers for this > > workload. slat seems to be one of the major difference between the two > > runs. But that is the only thing I really spotted to be really off. > > slat is the same, just one is in nsec and the other is in usec. Ah, good eyes. Need to remember this :) > > I keep going with some more testing. Let what kind of tests you would > > also want to see. I'll do a few plain NVMe tests next. > > This is a good test, thanks. Got sidetracked. Haven't started yet with these tests. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 16:49 ` Daniel Wagner @ 2020-12-17 16:54 ` Jens Axboe 0 siblings, 0 replies; 51+ messages in thread From: Jens Axboe @ 2020-12-17 16:54 UTC (permalink / raw) To: Daniel Wagner Cc: Sebastian Andrzej Siewior, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On 12/17/20 9:49 AM, Daniel Wagner wrote: > On Thu, Dec 17, 2020 at 09:45:47AM -0700, Jens Axboe wrote: >> On 12/8/20 1:44 AM, Daniel Wagner wrote: >>> It looks like the patched version show tiny bit better numbers for this >>> workload. slat seems to be one of the major difference between the two >>> runs. But that is the only thing I really spotted to be really off. >> >> slat is the same, just one is in nsec and the other is in usec. > > Ah, good eyes. Need to remember this :) > >>> I keep going with some more testing. Let what kind of tests you would >>> also want to see. I'll do a few plain NVMe tests next. >> >> This is a good test, thanks. > > Got sidetracked. Haven't started yet with these tests. I just ran some here, and as expected, didn't see much change. Single core IOPS at 2.8M in both cases, and not expecting any remote IPI for that test case. So I'd say that's good enough, wasn't expecting a change, and we don't see one even for your case with remote IPI. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-07 23:52 ` Jens Axboe 2020-12-08 8:22 ` Sebastian Andrzej Siewior @ 2020-12-08 13:13 ` Christoph Hellwig 2020-12-17 16:43 ` Sebastian Andrzej Siewior 1 sibling, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2020-12-08 13:13 UTC (permalink / raw) To: Jens Axboe Cc: Sebastian Andrzej Siewior, linux-block, Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Christoph Hellwig, Sagi Grimberg On Mon, Dec 07, 2020 at 04:52:57PM -0700, Jens Axboe wrote: > On 12/4/20 12:13 PM, Sebastian Andrzej Siewior wrote: > > Controllers with multiple queues have their IRQ-handelers pinned to a > > CPU. The core shouldn't need to complete the request on a remote CPU. > > > > Remove this case and always raise the softirq to complete the request. > > I don't like this one at all, it'll add a softirq jump for the fast path > for eg nvme devices. For the real fast path, that is either a polled queue or irq driven queues that only map to a single CPU we are never reaching this code, so I'm not too worried. Not that I'd complain about numbers, preferably in the commit log. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-08 13:13 ` Christoph Hellwig @ 2020-12-17 16:43 ` Sebastian Andrzej Siewior 2020-12-17 16:55 ` Jens Axboe 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-17 16:43 UTC (permalink / raw) To: Christoph Hellwig Cc: Jens Axboe, linux-block, Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Sagi Grimberg On 2020-12-08 13:13:19 [+0000], Christoph Hellwig wrote: > On Mon, Dec 07, 2020 at 04:52:57PM -0700, Jens Axboe wrote: > > On 12/4/20 12:13 PM, Sebastian Andrzej Siewior wrote: > > > Controllers with multiple queues have their IRQ-handelers pinned to a > > > CPU. The core shouldn't need to complete the request on a remote CPU. > > > > > > Remove this case and always raise the softirq to complete the request. > > > > I don't like this one at all, it'll add a softirq jump for the fast path > > for eg nvme devices. > > For the real fast path, that is either a polled queue or irq driven > queues that only map to a single CPU we are never reaching this code, > so I'm not too worried. Not that I'd complain about numbers, preferably > in the commit log. Did Daniel provide all the numbers you/Jens were looking for or do we still wait for some? Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 16:43 ` Sebastian Andrzej Siewior @ 2020-12-17 16:55 ` Jens Axboe 2020-12-17 16:58 ` Sebastian Andrzej Siewior 2020-12-17 18:16 ` Daniel Wagner 0 siblings, 2 replies; 51+ messages in thread From: Jens Axboe @ 2020-12-17 16:55 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Christoph Hellwig Cc: linux-block, Thomas Gleixner, Peter Zijlstra, Daniel Wagner, Mike Galbraith, Sagi Grimberg On 12/17/20 9:43 AM, Sebastian Andrzej Siewior wrote: > On 2020-12-08 13:13:19 [+0000], Christoph Hellwig wrote: >> On Mon, Dec 07, 2020 at 04:52:57PM -0700, Jens Axboe wrote: >>> On 12/4/20 12:13 PM, Sebastian Andrzej Siewior wrote: >>>> Controllers with multiple queues have their IRQ-handelers pinned to a >>>> CPU. The core shouldn't need to complete the request on a remote CPU. >>>> >>>> Remove this case and always raise the softirq to complete the request. >>> >>> I don't like this one at all, it'll add a softirq jump for the fast path >>> for eg nvme devices. >> >> For the real fast path, that is either a polled queue or irq driven >> queues that only map to a single CPU we are never reaching this code, >> so I'm not too worried. Not that I'd complain about numbers, preferably >> in the commit log. > > Did Daniel provide all the numbers you/Jens were looking for or do we > still wait for some? Yeah, I think we're good at this point. I'll queue this up for 5.11. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 16:55 ` Jens Axboe @ 2020-12-17 16:58 ` Sebastian Andrzej Siewior 2020-12-17 17:05 ` Daniel Wagner 2020-12-17 18:16 ` Daniel Wagner 1 sibling, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-12-17 16:58 UTC (permalink / raw) To: Jens Axboe, Daniel Wagner Cc: Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On 2020-12-17 09:55:08 [-0700], Jens Axboe wrote: > > Yeah, I think we're good at this point. I'll queue this up for 5.11. Thank you very much. Thank you Daniel for running all these tests. Sebastian ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 16:58 ` Sebastian Andrzej Siewior @ 2020-12-17 17:05 ` Daniel Wagner 0 siblings, 0 replies; 51+ messages in thread From: Daniel Wagner @ 2020-12-17 17:05 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Jens Axboe, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On Thu, Dec 17, 2020 at 05:58:44PM +0100, Sebastian Andrzej Siewior wrote: > On 2020-12-17 09:55:08 [-0700], Jens Axboe wrote: > > > > Yeah, I think we're good at this point. I'll queue this up for 5.11. > Thank you very much. > Thank you Daniel for running all these tests. np. Glad I can contribute to the -rt project :) ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 16:55 ` Jens Axboe 2020-12-17 16:58 ` Sebastian Andrzej Siewior @ 2020-12-17 18:16 ` Daniel Wagner 2020-12-17 18:22 ` Jens Axboe 1 sibling, 1 reply; 51+ messages in thread From: Daniel Wagner @ 2020-12-17 18:16 UTC (permalink / raw) To: Jens Axboe Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On Thu, Dec 17, 2020 at 09:55:08AM -0700, Jens Axboe wrote: > Yeah, I think we're good at this point. I'll queue this up for 5.11. If I am not complete mistaken you queued v2 of patch 3. Sebastian sent out a v3 (slightly hidden): https://lore.kernel.org/linux-block/20201214202030.izhm4byeznfjoobe@linutronix.de/ which includes the changes Christoph asked for. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 18:16 ` Daniel Wagner @ 2020-12-17 18:22 ` Jens Axboe 2020-12-17 18:41 ` Daniel Wagner 0 siblings, 1 reply; 51+ messages in thread From: Jens Axboe @ 2020-12-17 18:22 UTC (permalink / raw) To: Daniel Wagner Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On 12/17/20 11:16 AM, Daniel Wagner wrote: > On Thu, Dec 17, 2020 at 09:55:08AM -0700, Jens Axboe wrote: >> Yeah, I think we're good at this point. I'll queue this up for 5.11. > > If I am not complete mistaken you queued v2 of patch 3. Sebastian sent out > a v3 (slightly hidden): > > https://lore.kernel.org/linux-block/20201214202030.izhm4byeznfjoobe@linutronix.de/ > > which includes the changes Christoph asked for. Not only slightly hidden, b4 gets me v2 as well. Which isn't surprising, since it's just a patch in a reply. I'll fix it up, but would've been great if a v3 series had been posted, or at least a v3 of patch 3 in that thread sent properly. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 18:22 ` Jens Axboe @ 2020-12-17 18:41 ` Daniel Wagner 2020-12-17 18:46 ` Jens Axboe 0 siblings, 1 reply; 51+ messages in thread From: Daniel Wagner @ 2020-12-17 18:41 UTC (permalink / raw) To: Jens Axboe Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On Thu, Dec 17, 2020 at 11:22:58AM -0700, Jens Axboe wrote: > Not only slightly hidden, b4 gets me v2 as well. Which isn't surprising, > since it's just a patch in a reply. I'll fix it up, but would've been > great if a v3 series had been posted, or at least a v3 of patch 3 in > that thread sent properly. Yep. BTW, if you like you can add my Reviewed-by: Daniel Wagner <dwagner@suse.de> ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 18:41 ` Daniel Wagner @ 2020-12-17 18:46 ` Jens Axboe 2020-12-17 19:07 ` Daniel Wagner 0 siblings, 1 reply; 51+ messages in thread From: Jens Axboe @ 2020-12-17 18:46 UTC (permalink / raw) To: Daniel Wagner Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On 12/17/20 11:41 AM, Daniel Wagner wrote: > On Thu, Dec 17, 2020 at 11:22:58AM -0700, Jens Axboe wrote: >> Not only slightly hidden, b4 gets me v2 as well. Which isn't surprising, >> since it's just a patch in a reply. I'll fix it up, but would've been >> great if a v3 series had been posted, or at least a v3 of patch 3 in >> that thread sent properly. > > Yep. > > BTW, if you like you can add my > > Reviewed-by: Daniel Wagner <dwagner@suse.de> To the series, or just that one patch? -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 18:46 ` Jens Axboe @ 2020-12-17 19:07 ` Daniel Wagner 2020-12-17 19:13 ` Jens Axboe 0 siblings, 1 reply; 51+ messages in thread From: Daniel Wagner @ 2020-12-17 19:07 UTC (permalink / raw) To: Jens Axboe Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On 17.12.20 19:46, Jens Axboe wrote: > On 12/17/20 11:41 AM, Daniel Wagner wrote: >> On Thu, Dec 17, 2020 at 11:22:58AM -0700, Jens Axboe wrote: >>> Not only slightly hidden, b4 gets me v2 as well. Which isn't surprising, >>> since it's just a patch in a reply. I'll fix it up, but would've been >>> great if a v3 series had been posted, or at least a v3 of patch 3 in >>> that thread sent properly. >> >> Yep. >> >> BTW, if you like you can add my >> >> Reviewed-by: Daniel Wagner <dwagner@suse.de> > > To the series, or just that one patch? to the series but if I am late for this, it's also okau if I miss out to the party :) ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 19:07 ` Daniel Wagner @ 2020-12-17 19:13 ` Jens Axboe 2020-12-17 19:15 ` Daniel Wagner 0 siblings, 1 reply; 51+ messages in thread From: Jens Axboe @ 2020-12-17 19:13 UTC (permalink / raw) To: Daniel Wagner Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg On 12/17/20 12:07 PM, Daniel Wagner wrote: > On 17.12.20 19:46, Jens Axboe wrote: >> On 12/17/20 11:41 AM, Daniel Wagner wrote: >>> On Thu, Dec 17, 2020 at 11:22:58AM -0700, Jens Axboe wrote: >>>> Not only slightly hidden, b4 gets me v2 as well. Which isn't surprising, >>>> since it's just a patch in a reply. I'll fix it up, but would've been >>>> great if a v3 series had been posted, or at least a v3 of patch 3 in >>>> that thread sent properly. >>> >>> Yep. >>> >>> BTW, if you like you can add my >>> >>> Reviewed-by: Daniel Wagner <dwagner@suse.de> >> >> To the series, or just that one patch? > > to the series but if I am late for this, it's also okau if I miss out to > the party :) Well, had to update #3 anyway, so done. -- Jens Axboe ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-12-17 19:13 ` Jens Axboe @ 2020-12-17 19:15 ` Daniel Wagner 0 siblings, 0 replies; 51+ messages in thread From: Daniel Wagner @ 2020-12-17 19:15 UTC (permalink / raw) To: Jens Axboe Cc: Sebastian Andrzej Siewior, Christoph Hellwig, linux-block, Thomas Gleixner, Peter Zijlstra, Mike Galbraith, Sagi Grimberg > Well, had to update #3 anyway, so done. Thanks! ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH RFC] blk-mq: Don't IPI requests on PREEMPT_RT @ 2020-10-28 6:56 Christoph Hellwig 2020-10-28 14:12 ` [PATCH 1/3] blk-mq: Don't complete on a remote CPU in force threaded mode Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Christoph Hellwig @ 2020-10-28 6:56 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Christoph Hellwig, Thomas Gleixner, David Runge, linux-rt-users, Jens Axboe, linux-block, linux-kernel, Peter Zijlstra, Daniel Wagner > The remaining part is a switch to llist which avoids locking (IRQ > off/on) and it allows invoke the IPI/raise softirq only if something was > added. The entries are now processed in the reverse order but this > shouldn't matter right? For correctness it should not matter, but I think it could have performance implications. I think you'll have to throw in a llist_reverse_order. > I would split this into two patches (the blk_mq_complete_need_ipi() hunk > and the llist part) unless there are objections. Yes, please do. ^ permalink raw reply [flat|nested] 51+ messages in thread
* [PATCH 1/3] blk-mq: Don't complete on a remote CPU in force threaded mode 2020-10-28 6:56 [PATCH RFC] blk-mq: Don't IPI requests on PREEMPT_RT Christoph Hellwig @ 2020-10-28 14:12 ` Sebastian Andrzej Siewior 2020-10-28 14:12 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior 0 siblings, 1 reply; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-10-28 14:12 UTC (permalink / raw) To: linux-block Cc: Christoph Hellwig, Thomas Gleixner, David Runge, linux-rt-users, Jens Axboe, linux-kernel, Peter Zijlstra, Daniel Wagner, Sebastian Andrzej Siewior With force threaded interrupts enabled, raising softirq from an SMP function call will always result in waking the ksoftirqd thread. This is not optimal given that the thread runs at SCHED_OTHER priority. Completing the request in hard IRQ-context on PREEMPT_RT (which enforces the force threaded mode) is bad because the completion handler may acquire sleeping locks which violate the locking context. Disable request completing on a remote CPU in force threaded mode. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- block/blk-mq.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/block/blk-mq.c b/block/blk-mq.c index 55bcee5dc0320..421a40968c9ff 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -648,6 +648,14 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq) if (!IS_ENABLED(CONFIG_SMP) || !test_bit(QUEUE_FLAG_SAME_COMP, &rq->q->queue_flags)) return false; + /* + * With force threaded interrupts enabled, raising softirq from an SMP + * function call will always result in waking the ksoftirqd thread. + * This is probably worse than completing the request on a different + * cache domain. + */ + if (force_irqthreads) + return false; /* same CPU or cache domain? Complete locally */ if (cpu == rq->mq_ctx->cpu || -- 2.28.0 ^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq 2020-10-28 14:12 ` [PATCH 1/3] blk-mq: Don't complete on a remote CPU in force threaded mode Sebastian Andrzej Siewior @ 2020-10-28 14:12 ` Sebastian Andrzej Siewior 0 siblings, 0 replies; 51+ messages in thread From: Sebastian Andrzej Siewior @ 2020-10-28 14:12 UTC (permalink / raw) To: linux-block Cc: Christoph Hellwig, Thomas Gleixner, David Runge, linux-rt-users, Jens Axboe, linux-kernel, Peter Zijlstra, Daniel Wagner, Sebastian Andrzej Siewior Controllers with multiple queues have their IRQ-handelers pinned to a CPU. The core shouldn't need to complete the request on a remote CPU. Remove this case and always raise the softirq to complete the request. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- block/blk-mq.c | 14 +------------- 1 file changed, 1 insertion(+), 13 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 421a40968c9ff..769d2d532a825 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -626,19 +626,7 @@ static void __blk_mq_complete_request_remote(void *data) { struct request *rq = data; - /* - * For most of single queue controllers, there is only one irq vector - * for handling I/O completion, and the only irq's affinity is set - * to all possible CPUs. On most of ARCHs, this affinity means the irq - * is handled on one specific CPU. - * - * So complete I/O requests in softirq context in case of single queue - * devices to avoid degrading I/O performance due to irqsoff latency. - */ - if (rq->q->nr_hw_queues == 1) - blk_mq_trigger_softirq(rq); - else - rq->q->mq_ops->complete(rq); + blk_mq_trigger_softirq(rq); } static inline bool blk_mq_complete_need_ipi(struct request *rq) -- 2.28.0 ^ permalink raw reply related [flat|nested] 51+ messages in thread
end of thread, other threads:[~2021-02-17 13:20 UTC | newest] Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-01-23 20:10 [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Sebastian Andrzej Siewior 2021-01-23 20:10 ` [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle() Sebastian Andrzej Siewior 2021-02-01 19:35 ` Sebastian Andrzej Siewior 2021-02-09 10:02 ` Peter Zijlstra 2021-02-09 11:35 ` Sebastian Andrzej Siewior 2021-02-10 13:53 ` [tip: sched/core] " tip-bot2 for Sebastian Andrzej Siewior 2021-02-17 13:17 ` tip-bot2 for Sebastian Andrzej Siewior 2021-01-23 20:10 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior 2021-01-25 7:10 ` Hannes Reinecke 2021-01-25 8:25 ` Christoph Hellwig 2021-01-25 8:30 ` Sebastian Andrzej Siewior 2021-01-25 8:32 ` Christoph Hellwig 2021-01-25 9:29 ` Sebastian Andrzej Siewior 2021-01-25 8:22 ` Christoph Hellwig 2021-01-25 8:49 ` Christoph Hellwig 2021-01-27 11:22 ` Daniel Wagner 2021-01-23 20:10 ` [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done Sebastian Andrzej Siewior 2021-01-25 8:30 ` Christoph Hellwig 2021-01-25 8:32 ` Sebastian Andrzej Siewior 2021-01-25 8:39 ` Christoph Hellwig 2021-01-25 9:54 ` [PATCH 3/3 v2] " Sebastian Andrzej Siewior 2021-01-25 10:14 ` Christoph Hellwig 2021-01-27 11:23 ` Daniel Wagner 2021-01-25 4:27 ` [PATCH v3 0/3] blk-mq: Don't complete in IRQ, use llist_head Jens Axboe 2021-02-10 14:43 ` Jens Axboe -- strict thread matches above, loose matches on Subject: below -- 2020-12-04 19:13 [PATCH 0/3 v2] " Sebastian Andrzej Siewior 2020-12-04 19:13 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior 2020-12-07 23:52 ` Jens Axboe 2020-12-08 8:22 ` Sebastian Andrzej Siewior 2020-12-08 8:44 ` Daniel Wagner 2020-12-08 11:36 ` Daniel Wagner 2020-12-08 11:49 ` Sebastian Andrzej Siewior 2020-12-08 12:41 ` Daniel Wagner 2020-12-08 12:52 ` Sebastian Andrzej Siewior 2020-12-08 12:57 ` Sebastian Andrzej Siewior 2020-12-08 13:27 ` Daniel Wagner 2020-12-17 16:45 ` Jens Axboe 2020-12-17 16:49 ` Daniel Wagner 2020-12-17 16:54 ` Jens Axboe 2020-12-08 13:13 ` Christoph Hellwig 2020-12-17 16:43 ` Sebastian Andrzej Siewior 2020-12-17 16:55 ` Jens Axboe 2020-12-17 16:58 ` Sebastian Andrzej Siewior 2020-12-17 17:05 ` Daniel Wagner 2020-12-17 18:16 ` Daniel Wagner 2020-12-17 18:22 ` Jens Axboe 2020-12-17 18:41 ` Daniel Wagner 2020-12-17 18:46 ` Jens Axboe 2020-12-17 19:07 ` Daniel Wagner 2020-12-17 19:13 ` Jens Axboe 2020-12-17 19:15 ` Daniel Wagner 2020-10-28 6:56 [PATCH RFC] blk-mq: Don't IPI requests on PREEMPT_RT Christoph Hellwig 2020-10-28 14:12 ` [PATCH 1/3] blk-mq: Don't complete on a remote CPU in force threaded mode Sebastian Andrzej Siewior 2020-10-28 14:12 ` [PATCH 2/3] blk-mq: Always complete remote completions requests in softirq Sebastian Andrzej Siewior
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.