Date: Tue, 19 Mar 2019 16:57:55 +0000
Message-ID: <86h8byztrg.wl-marc.zyngier@arm.com>
From: Marc Zyngier
To: Zenghui Yu
Cc: "Raslan, KarimAllah"
Subject: Re: [RFC PATCH] KVM: arm/arm64: Enable direct irqfd MSI injection
In-Reply-To: <4fedabbe-b2d0-c04c-e8ce-a1adbf419f8a@huawei.com>
References: <1552833373-19828-1-git-send-email-yuzenghui@huawei.com>
 <86o969z42z.wl-marc.zyngier@arm.com>
 <428b2aac-5a0f-e9da-8d74-8045f99a8c74@huawei.com>
 <20190319100141.69821f8b@why.wild-wind.fr.eu.org>
 <4fedabbe-b2d0-c04c-e8ce-a1adbf419f8a@huawei.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/26
 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO)
Organization: ARM Ltd
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 19 Mar 2019 15:59:00 +0000,
Zenghui Yu wrote:
>
> Hi Marc,
>
> On 2019/3/19 18:01, Marc Zyngier wrote:
> > On Tue, 19 Mar 2019 09:09:43 +0800
> > Zenghui Yu wrote:
> >
> >> Hi all,
> >>
> >> On 2019/3/18 3:35, Marc Zyngier wrote:
> >>> A first approach would be to keep a small cache of the last few
> >>> successful translations for this ITS, cache that could be looked-up by
> >>> holding a spinlock instead. A hit in this cache could directly be
> >>> injected. Any command that invalidates or changes anything (DISCARD,
> >>> INV, INVALL, MAPC with V=0, MAPD with V=0, MOVALL, MOVI) should nuke
> >>> the cache altogether.
> >>>
> >>> Of course, all of that needs to be quantified.
> >>
> >> Thanks for all of your explanations, especially for Marc's suggestions!
> >> It took me long time to figure out my mistakes, since I am not very
> >> familiar with the locking stuff. Now I have to apologize for my noise.
> >
> > No need to apologize. The whole point of this list is to have
> > discussions. Although your approach wasn't working, you did
> > identify potential room for improvement.
> >
> >> As for the its-translation-cache code (a really good news to us), we
> >> have a rough look at it and start testing now!
> >
> > Please let me know about your findings. My initial test doesn't show
> > any improvement, but that could easily be attributed to the system I
> > running this on (a tiny and slightly broken dual A53 system). The sizing
> > of the cache is also important: too small, and you have the overhead of
> > the lookup for no benefit; too big, and you waste memory.
>
> Not smoothly as expected.
> With below config (in the form of XML):

The good news is that nothing was expected at all.

> ---8<---
>
>
>
>
>
> ---8<---

Sorry, I don't read XML, and I have zero idea what this represents.

>
> VM can't even get to boot successfully!
>
>
> Kernel version is -stable 4.19.28. And *dmesg* on host shows:

Please don't test on anything other than mainline. The only thing I'm
interested in at the moment is 5.1-rc1.

>
> ---8<---
> [ 507.908330] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 507.908338] rcu: 35-...0: (0 ticks this GP)
> idle=d06/1/0x4000000000000000 softirq=72150/72150 fqs=6269
> [ 507.908341] rcu: 41-...0: (0 ticks this GP)
> idle=dee/1/0x4000000000000000 softirq=68144/68144 fqs=6269
> [ 507.908342] rcu: (detected by 23, t=15002 jiffies, g=68929, q=408641)
> [ 507.908350] Task dump for CPU 35:
> [ 507.908351] qemu-kvm R running task 0 66789 1
> 0x00000002
> [ 507.908354] Call trace:
> [ 507.908360] __switch_to+0x94/0xe8
> [ 507.908363] _cond_resched+0x24/0x68
> [ 507.908366] __flush_work+0x58/0x280
> [ 507.908369] free_unref_page_commit+0xc4/0x198
> [ 507.908370] free_unref_page+0x84/0xa0
> [ 507.908371] __free_pages+0x58/0x68
> [ 507.908372] free_pages.part.21+0x34/0x40
> [ 507.908373] free_pages+0x2c/0x38
> [ 507.908375] poll_freewait+0xa8/0xd0
> [ 507.908377] do_sys_poll+0x3d0/0x560
> [ 507.908378] __arm64_sys_ppoll+0x180/0x1e8
> [ 507.908380] 0xa48990
> [ 507.908381] Task dump for CPU 41:
> [ 507.908382] kworker/41:1 R running task 0 647 2
> 0x0000002a
> [ 507.908387] Workqueue: events irqfd_inject
> [ 507.908389] Call trace:
> [ 507.908391] __switch_to+0x94/0xe8
> [ 507.908392] 0x200000131
> [... ...]
> [ 687.928330] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 687.928339] rcu: 35-...0: (0 ticks this GP)
> idle=d06/1/0x4000000000000000 softirq=72150/72150 fqs=25034
> [ 687.928341] rcu: 41-...0: (0 ticks this GP)
> idle=dee/1/0x4000000000000000 softirq=68144/68144 fqs=25034
> [ 687.928343] rcu: (detected by 16, t=60007 jiffies, g=68929,
> q=1601093)
> [ 687.928351] Task dump for CPU 35:
> [ 687.928352] qemu-kvm R running task 0 66789 1
> 0x00000002
> [ 687.928355] Call trace:
> [ 687.928360] __switch_to+0x94/0xe8
> [ 687.928364] _cond_resched+0x24/0x68
> [ 687.928367] __flush_work+0x58/0x280
> [ 687.928369] free_unref_page_commit+0xc4/0x198
> [ 687.928370] free_unref_page+0x84/0xa0
> [ 687.928372] __free_pages+0x58/0x68
> [ 687.928373] free_pages.part.21+0x34/0x40
> [ 687.928374] free_pages+0x2c/0x38
> [ 687.928376] poll_freewait+0xa8/0xd0
> [ 687.928378] do_sys_poll+0x3d0/0x560
> [ 687.928379] __arm64_sys_ppoll+0x180/0x1e8
> [ 687.928381] 0xa48990
> [ 687.928382] Task dump for CPU 41:
> [ 687.928383] kworker/41:1 R running task 0 647 2
> 0x0000002a
> [ 687.928389] Workqueue: events irqfd_inject
> [ 687.928391] Call trace:
> [ 687.928392] __switch_to+0x94/0xe8
> [ 687.928394] 0x200000131
> [...]
> ---8<--- endlessly ...
>
> It seems that we've suffered from some locking related issues. Any
> suggestions for debugging?

None at the moment. And this doesn't seem quite related to the problem
at hand, does it?

> And could you please provide your test steps ? So that I can run
> some tests on my HW to see improvement hopefully.
Here you go:

qemu-system-aarch64 -m 512M -smp 2 -cpu host,aarch64=on \
  -machine virt,accel=kvm,gic_version=3,its -nographic \
  -drive if=pflash,format=raw,readonly,file=/usr/share/AAVMF/AAVMF_CODE.fd \
  -drive if=pflash,format=raw,file=buster/GXnkZdHqG4e7o4pC.fd \
  -netdev tap,fds=128:129,id=hostnet0,vhost=on,vhostfds=130:131 \
  -device virtio-net-pci,mac=5a:fe:00:e5:b1:30,netdev=hostnet0,mq=on,vectors=6 \
  -drive if=none,format=raw,file=buster/GXnkZdHqG4e7o4pC.img,id=disk0 \
  -device virtio-blk-pci,drive=disk0 \
  -drive file=debian-testing-arm64-DVD-1-preseed.iso,id=cdrom,if=none,media=cdrom \
  -device virtio-scsi-pci -device scsi-cd,drive=cdrom \
  128<>/dev/tap7 129<>/dev/tap7 130<>/dev/vhost-net 131<>/dev/vhost-net

> > Having thought about it a bit more, I think we can drop the
> > invalidation on MOVI/MOVALL, as the LPI is still perfectly valid, and
> > we don't cache the target vcpu. On the other hand, the cache must be
> > nuked when the ITS is turned off.
>
> All of these are valuable. But it might be early for me to consider
> about them (I have to get the above problem solved first ...)

I'm not asking you to consider them. I jumped into this thread to
explain what could be done instead. These are ideas on top of what I've
already offered.

	M.

-- 
Jazz is not dead, it just smell funny.