Date: Mon, 25 Mar 2019 14:27:23 +0100 (CET)
From: Thomas Gleixner <tglx@linutronix.de>
To: Peter Xu
Cc: Ming Lei, Christoph Hellwig, Jason Wang, Luiz Capitulino, Linux Kernel Mailing List, "Michael S. Tsirkin", minlei@redhat.com
Subject: Re: Virtio-scsi multiqueue irq affinity
In-Reply-To: <20190325094340.GJ9149@xz-x1>
References: <20190318062150.GC6654@xz-x1> <20190325050213.GH9149@xz-x1> <20190325070616.GA9642@ming.t460p> <20190325094340.GJ9149@xz-x1>

Peter,

On Mon, 25 Mar 2019, Peter Xu wrote:
> Now I understand it can be guaranteed, so it should not break the
> determinism of the real-time applications. But again, I'm curious
> whether we can specify how to spread the hardware queues of a block
> controller (as I asked in my previous post) instead of using the
> default policy (which spreads the queues across all cores). Let me
> give a detailed example this time: assume a host with 2 nodes and 8
> cores (node 0 with CPUs 0-3, node 1 with CPUs 4-7), and a SCSI
> controller with 4 queues. We want to dedicate the 2nd node to the
> real-time applications, so we boot with isolcpus=4-7. By default,
> IIUC, the hardware queues will be allocated like this:
>
>   - queue 1: CPUs 0,1
>   - queue 2: CPUs 2,3
>   - queue 3: CPUs 4,5
>   - queue 4: CPUs 6,7
>
> And the IRQ of each queue will be bound to the same cpuset that the
> queue is bound to.
>
> So my previous question was: since we know that CPUs 4-7 won't
> generate any I/O after all (and they shouldn't), could it be possible
> to configure the system to reflect a mapping like this instead:
>
>   - queue 1: CPU 0
>   - queue 2: CPU 1
>   - queue 3: CPU 2
>   - queue 4: CPU 3
>
> We would then disallow CPUs 4-7 from generating I/O, and return a
> failure if they try to.
>
> Again, I'm pretty uncertain whether this case would be anything close
> to useful... It just came out of my pure curiosity.
> I think it at least has some benefits: we would guarantee that the
> real-time CPUs never send block I/O requests (which is good, because
> such requests could simply break real-time determinism), and we would
> save two queues from sitting completely idle (so if we run
> non-real-time block applications on cores 0-3 we still get the
> throughput of 4 hardware queues rather than 2).

If that _IS_ useful, then the affinity spreading logic can be changed
to accommodate it. It's not really hard to do, but we'd need a proper
use case for justification.

Thanks,

	tglx
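For illustration only, the two spreading policies discussed above can be
modeled with a toy sketch. This is not the kernel's actual affinity-spreading
code (which lives in kernel/irq/affinity.c and also accounts for NUMA nodes
and pre/post vectors); it merely partitions a given CPU list into contiguous
per-queue groups, which is enough to reproduce both mappings from the example.
The function name `spread_queues` is made up for this sketch.

```python
def spread_queues(num_queues, cpus):
    """Toy model of queue spreading: partition the CPU list into
    num_queues contiguous groups of (nearly) equal size."""
    groups = [[] for _ in range(num_queues)]
    for i, cpu in enumerate(cpus):
        # Map the i-th CPU to a queue proportionally, so the
        # groups come out contiguous and balanced.
        groups[i * num_queues // len(cpus)].append(cpu)
    return groups

# Default policy: 4 queues spread over all 8 CPUs, including the
# isolated ones (queues 3 and 4 land on the isolcpus set 4-7).
default = spread_queues(4, list(range(8)))   # [[0,1],[2,3],[4,5],[6,7]]

# Proposed policy: spread the same 4 queues over the housekeeping
# CPUs 0-3 only, so no queue IRQ is bound to an isolated CPU.
restricted = spread_queues(4, list(range(4)))  # [[0],[1],[2],[3]]
```

Under this model, restricting the input CPU list to the housekeeping set is
all it takes to reach the second mapping, which matches the point that the
spreading logic itself would be straightforward to adjust.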