From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] arm64: Make CONFIG_ARM64_PSEUDO_NMI macro wrap all the pseudo-NMI code
To: Marc Zyngier
From: He Ying
Date: Mon, 10 Jan 2022 11:20:39 +0800
References: <20220107085536.214501-1-heying24@huawei.com> <87pmp2tmpg.wl-maz@kernel.org>
In-Reply-To: <87pmp2tmpg.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Marc,

I'm just back from the weekend; sorry for the delayed reply.

On 2022/1/8 20:51, Marc Zyngier wrote:
> On Fri, 07 Jan 2022 08:55:36 +0000,
> He Ying wrote:
>> Our product has been updating its kernel from 4.4 to 5.10 recently and
>> found a performance issue. We do a bussiness test called ARP test, which
>> tests the latency for a ping-pong packets traffic with a certain payload.
>> The result is as following.
>>
>>   - 4.4 kernel: avg = ~20s
>>   - 5.10 kernel (CONFIG_ARM64_PSEUDO_NMI is not set): avg = ~40s
>>
>> I have been just learning arm64 pseudo-NMI code and have a question,
>> why is the related code not wrapped by CONFIG_ARM64_PSEUDO_NMI?
>> I wonder if this brings some performance regression.
>>
>> First, I make this patch and then do the test again. Here's the result.
>>
>>   - 5.10 kernel with this patch not applied: avg = ~40s
>>   - 5.10 kernel with this patch applied: avg = ~23s
>>
>> Amazing! Note that all kernel is built with CONFIG_ARM64_PSEUDO_NMI not
>> set. It seems the pseudo-NMI feature actually brings some overhead to
>> performance event if CONFIG_ARM64_PSEUDO_NMI is not set.
>>
>> Furthermore, I find the feature also brings some overhead to vmlinux size.
>> I build 5.10 kernel with this patch applied or not while
>> CONFIG_ARM64_PSEUDO_NMI is not set.
>>
>>   - 5.10 kernel with this patch not applied: vmlinux size is 384060600 Bytes.
>>   - 5.10 kernel with this patch applied: vmlinux size is 383842936 Bytes.
>>
>> That means arm64 pseudo-NMI feature may bring ~200KB overhead to
>> vmlinux size.
>>
>> Above all, arm64 pseudo-NMI feature brings some overhead to vmlinux size
>> and performance even if config is not set. To avoid it, add macro control
>> all around the related code.
> This obviously attracted my attention, and I took this patch for a
> ride on 5.16-rc8 on a machine that doesn't support GICv3 NMIs to make
> sure that any extra code would only result in pure overhead.
>
> There was no measurable difference with this patch applied or not,
> with CONFIG_ARM64_PSEUDO_NMI selected or not for the workloads I tried
> (I/O heavy virtual machines, hackbench).

Our test is a network latency test (the ARP ping-pong test described
above), so it differs from these workloads.

>
> Mark already asked a number of questions (test case, implementation,
> test on a modern kernel). Please provide as many detail as you
> possibly can, because such a regression really isn't expected, and
> doesn't show up on the systems I have at hand. Some profiling numbers
> could also be interesting, in case this is a result of a particular
> resource being thrashed (TLB, cache...).

I replied to Mark a few moments ago and provided as many details as I can.

You mentioned that the TLB or cache could be thrashed. How can we check
this? By using the perf tools?

>
> Thanks,
>
> 	M.
>