From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC PATCH 1/3] irqchip/gic-v3: Make use of ICC_SGI1R IRM bit
To: Marc Zyngier
References: <20210329085210.11524-1-wangjingyi11@huawei.com> <20210329085210.11524-2-wangjingyi11@huawei.com> <87wntqqo6s.wl-maz@kernel.org>
From: Jingyi Wang
Message-ID: <7e44b7a1-4a12-86bf-4651-aa6a03c4f832@huawei.com>
Date: Mon, 29 Mar 2021 18:38:04 +0800
In-Reply-To: <87wntqqo6s.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/29/2021 5:55 PM, Marc Zyngier wrote:
> On Mon, 29 Mar 2021 09:52:08 +0100,
> Jingyi Wang wrote:
>>
>> IRM, bit[40] in ICC_SGI1R, determines how the generated SGIs
>> are distributed to PEs. If the bit is set, interrupts are routed
>> to all PEs in the system excluding "self". We use cpumask to
>> determine if this bit should be set and make use of that.
>>
>> This will reduce vm trap when broadcast IPIs are sent.
>
> I remember writing similar code about 4 years ago, only to realise
> that:
>
> - the cost of computing the resulting mask is pretty high for large
>   machines
> - Linux almost never sends broadcast IPIs, so the complexity was all
>   in vain
>
> What changed? Please provide supporting data showing how many IPIs we
> actually save, and for which workload.

Maybe we can implement send_IPI_allbutself hooks, as some other archs
do, instead of computing the cpumask here?

>>
>> Signed-off-by: Jingyi Wang
>> ---
>>  drivers/irqchip/irq-gic-v3.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index eb0ee356a629..8ecc1b274ea8 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -1127,6 +1127,7 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
>>  static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>>  {
>>  	int cpu;
>> +	cpumask_t tmp;
>>
>>  	if (WARN_ON(d->hwirq >= 16))
>>  		return;
>> @@ -1137,6 +1138,17 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>>  	 */
>>  	wmb();
>>
>> +	if (!cpumask_and(&tmp, mask, cpumask_of(smp_processor_id()))) {
>
> Are you sure this does the right thing? This is checking that the
> current CPU is not part of the mask.
> But it is not checking that the mask
> is actually "all but self".
>
> This means you are potentially sending IPIs to CPUs that are not part
> of the mask, making performance potentially worse.
>
> Thanks,
>
> 	M.
>

I will fix that, thanks.

>> +		/* Set Interrupt Routing Mode bit */
>> +		u64 val;
>> +		val = (d->hwirq) << ICC_SGI1R_SGI_ID_SHIFT;
>> +		val |= BIT_ULL(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
>> +		gic_write_sgi1r(val);
>> +
>> +		isb();
>> +		return;
>> +	}
>> +
>>  	for_each_cpu(cpu, mask) {
>>  		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(cpu_logical_map(cpu));
>>  		u16 tlist;
>> --
>> 2.19.1
>>
>>
>
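
For illustration only, the difference between the two checks discussed
above can be sketched in a small userspace model. This is not kernel code
and not the eventual fix: the type model_mask_t and both helper names are
hypothetical, one bit stands for one CPU (at most 64), and "all PEs in the
system" is assumed here to mean the set of online CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint64_t model_mask_t;

/* What the posted hunk tests: the sending CPU is absent from the mask. */
static bool self_not_in_mask(model_mask_t mask, unsigned int self)
{
	assert(self < 64);
	return (mask & (UINT64_C(1) << self)) == 0;
}

/*
 * What a broadcast with IRM set would need: the mask is exactly the
 * online set with the sending CPU removed (assumption: "all PEs" is
 * modelled as the online CPUs).
 */
static bool mask_is_all_but_self(model_mask_t mask, model_mask_t online,
				 unsigned int self)
{
	assert(self < 64);
	return mask == (online & ~(UINT64_C(1) << self));
}
```

With four online CPUs (online = 0xF) and CPU 0 sending, a mask of 0x2
satisfies the first check but not the second. Broadcasting in that case
would also interrupt CPUs 2 and 3, which is the concern raised in the
review.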