On 03/31/2015 09:33 AM, Frederic Weisbecker wrote:
> On Tue, Mar 31, 2015 at 09:07:11AM -0600, Jens Axboe wrote:
>> On 03/31/2015 08:27 AM, Rik van Riel wrote:
>>> CPUs with nohz_full do not want disruption from timer interrupts,
>>> or other random system things. This includes block mq work.
>>>
>>> There is another issue with block mq vs. realtime tasks that run
>>> 100% of the time, which is not uncommon on systems that have CPUs
>>> dedicated to real time use with isolcpus= and nohz_full=
>>>
>>> Specifically, on systems like that, a block work item may never
>>> get to run, which could lead to filesystems getting stuck forever.
>>>
>>> We can avoid both issues by not scheduling blk-mq workqueues on
>>> cpus in nohz_full mode.
>>>
>>> Question for Jens: should we try to spread out the load for
>>> currently offline and nohz CPUs across the remaining CPUs in
>>> the system, to get the full benefit of blk-mq in these situations?
>>>
>>> If so, do you have any preference on how I should implement that?
>>>
>>> Cc: Frederic Weisbecker
>>> Cc: Ingo Molnar
>>> Cc: Jens Axboe
>>> Signed-off-by: Rik van Riel
>>> ---
>>>  block/blk-mq.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 4f4bea21052e..1004d6817fa4 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -21,6 +21,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>
>>>  #include
>>>
>>> @@ -1760,6 +1761,10 @@ static void blk_mq_init_cpu_queues(struct request_queue *q,
>>>  		if (!cpu_online(i))
>>>  			continue;
>>>
>>> +		/* Do not schedule work on nohz full dedicated CPUs. */
>>> +		if (tick_nohz_full_cpu(i))
>>> +			continue;
>>
>> Is this CPU ever going to queue IO? If yes, then it needs to be mapped. If
>> userspace never runs on it and submits IO, then we'll never run completions
>> on it nor schedule the associated workqueue. So I really don't see how it
>> doesn't already work, as-is.
>
> Well, it's fairly possible that full dynticks CPUs do IO of any sort. Is it possible
> to affine these asynchronous works to specific CPU? The usual scheme of full dynticks
> is to have CPU 0 handling any kind of housekeeping and other CPUs doing latency or performance
> sensitive works that don't want to be disturbed.

That'd be easy enough to do, that's how blk-mq handles offline CPUs as well.
The attached patch is completely untested, but will handle offline or nohz
CPUs in the same fashion - they will punt to hardware queue 0, which is
mapped to CPU0 (and others, depending on the queue vs CPU ratio).

-- 
Jens Axboe
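
For illustration only, the punting scheme described in the reply (offline or nohz_full CPUs all sharing hardware queue 0) can be sketched as a small userspace C model. This is not the attached patch, which is not reproduced in this message. The names struct cpu_state and map_cpus_to_queues are hypothetical, and the round-robin spreading of the remaining CPUs is an assumption made for the sketch, not how blk-mq necessarily distributes them. In the kernel the two flags would come from cpu_online() and tick_nohz_full_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for per-CPU state. In the kernel these two
 * facts would come from cpu_online(i) and tick_nohz_full_cpu(i).
 */
struct cpu_state {
	bool online;
	bool nohz_full;
};

/*
 * Fill map[cpu] with a hardware queue index for every CPU.
 *
 * Offline and nohz_full CPUs are punted to hardware queue 0, so any IO
 * they submit is serviced through a queue that a housekeeping CPU is
 * also mapped to. The remaining CPUs are spread round-robin over all
 * nr_queues queues (an assumption of this sketch).
 */
static void map_cpus_to_queues(const struct cpu_state *cpus, size_t nr_cpus,
			       unsigned int nr_queues, unsigned int *map)
{
	unsigned int next = 0;

	assert(nr_queues > 0);

	for (size_t i = 0; i < nr_cpus; i++) {
		if (!cpus[i].online || cpus[i].nohz_full) {
			/* Punt to queue 0 instead of skipping the CPU. */
			map[i] = 0;
			continue;
		}
		map[i] = next;
		next = (next + 1) % nr_queues;
	}
}
```

With four CPUs where CPU 1 is nohz_full and CPU 2 is offline, and two hardware queues, this model maps CPUs 0, 1 and 2 to queue 0 and CPU 3 to queue 1. The point of the design is that an isolated CPU stays mapped, so IO it submits still has a queue, while the deferred work for that queue can run on a housekeeping CPU.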