Subject: Re: Virtio-scsi multiqueue irq affinity
From: "liaochang (A)"
To: Thomas Gleixner, xuyihang, Ming Lei
CC: Peter Xu, Christoph Hellwig, Jason Wang, Luiz Capitulino,
    Linux Kernel Mailing List, Michael S. Tsirkin
Date: Mon, 10 May 2021 11:19:43 +0800
Message-ID: <9903df53-8a84-fe89-7ae0-aac8e6d3f42f@huawei.com>
In-Reply-To: <87zgx5l8ck.ffs@nanos.tec.linutronix.de>
References: <20190318062150.GC6654@xz-x1> <20190325050213.GH9149@xz-x1>
    <20190325070616.GA9642@ming.t460p> <20190325095011.GA23225@ming.t460p>
    <0f6c8a5f-ad33-1199-f313-53fe9187a672@huawei.com>
    <87zgx5l8ck.ffs@nanos.tec.linutronix.de>

Hi Thomas,

On 2021/5/8 20:26, Thomas Gleixner wrote:
> Yihang,
>
> On Sat, May 08 2021 at 15:52, xuyihang wrote:
>>
>> We are dealing with a scenario which may need to assign a default
>> irqaffinity for managed IRQ.
>>
>> Assume we have a full CPU usage RT thread running binded to a specific
>> CPU.
>>
>> In the mean while, interrupt handler registered by a device which is
>> ksoftirqd may never have a chance to run. (And we don't want to use
>> isolate CPU)
>
> A device cannot register and interrupt handler in ksoftirqd.

I learned more about the scenario after talking with Yihang offline:

1. We have a machine with 36 CPUs, and several RT threads are assigned
   to the last two CPUs (CPU-34 and CPU-35).
2. The I/O device driver creates a single managed IRQ whose affinity
   includes CPU-34 and CPU-35.
3. A regular application issues I/O from CPUs other than the ones the
   RT threads use. CPU-34/35 then receive the hardware interrupt and
   wake up ksoftirqd to do the actual I/O processing.
4. Because the priority and scheduling policy of the RT threads
   override those of the per-CPU ksoftirqd, ksoftirqd apparently never
   gets a chance to run on CPU-34/35. As a result, I/O processing
   cannot finish in time and the application gets stuck.
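
To check whether a given machine matches steps 1-4, one could compare
the IRQ's affinity with the CPUs the RT threads are pinned to. The
following is only a minimal Python sketch and was not part of the
original discussion: the function names and the example CPU lists are
illustrative assumptions, and the parser only handles plain "a-b,c"
lists as printed in /proc/irq/<N>/smp_affinity_list.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,34-35" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def irq_affinity(irq, proc_root="/proc"):
    """Read the affinity of one IRQ from /proc/irq/<irq>/smp_affinity_list.

    The IRQ number must be looked up by the caller (for example in
    /proc/interrupts); nothing here guesses it.
    """
    path = Path(proc_root) / "irq" / str(irq) / "smp_affinity_list"
    return parse_cpu_list(path.read_text())


def cpus_at_risk(irq_cpus, rt_cpus):
    """CPUs that can receive the interrupt and also run pinned RT threads."""
    return irq_cpus & rt_cpus


if __name__ == "__main__":
    # Assumed values mirroring the scenario: RT threads on CPU-34/35 and
    # a managed IRQ whose affinity covers the same two CPUs.
    rt_cpus = parse_cpu_list("34-35")
    irq_cpus = parse_cpu_list("34-35")
    print(sorted(cpus_at_risk(irq_cpus, rt_cpus)))
```

A non-empty result only says that the interrupt can land on a CPU
occupied by RT threads. Whether ksoftirqd is actually starved there
still depends on the RT threads' CPU usage and scheduling policy.
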
>
>> There could be a couple way to deal with this problem:
>>
>> 1. Adjust priority of ksoftirqd or RT thread, so the interrupt handler
>> could preempt RT thread. However, I am not sure whether it could have
>> some side effects or not.
>>
>> 2. Adjust interrupt CPU affinity or RT thread affinity. But managed IRQ
>> seems design to forbid user from manipulating interrupt affinity.
>>
>> It seems managed IRQ is coupled with user side application to me.
>>
>> Would you share your thoughts about this issue please?
>
> Can you please provide a more detailed description of your system?
>
>     - Number of CPUs
>
>     - Kernel version
>     - Is NOHZ full enabled?
>     - Any isolation mechanisms enabled, and if so how are they
>       configured (e.g. on the kernel command line)?
>
>     - Number of queues in the multiqueue device
>
>     - Is the RT thread issuing I/O to the multiqueue device?
>
> Thanks,
>
>         tglx
>

BR,
Liao Chang