Subject: Re: [PATCH] iommu/arm-smmu-v3: add nr_ats_masters to avoid unnecessary operations
To: Zhen Lei, Jean-Philippe Brucker, Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel
References: <20190801122040.26024-1-thunder.leizhen@huawei.com>
From: John Garry
Date: Mon, 12 Aug 2019 11:42:17 +0100
In-Reply-To: <20190801122040.26024-1-thunder.leizhen@huawei.com>
List-Id: Development issues for Linux IOMMU support

On 01/08/2019 13:20, Zhen Lei wrote:
> When (smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS) is true, even if a
> smmu domain does not contain any ats master, the operations of
> arm_smmu_atc_inv_to_cmd() and lock protection in arm_smmu_atc_inv_domain()
> are always executed. This will impact performance, especially in
> multi-core and stress scenarios. For my FIO test scenario, about 8%
> performance reduced.
>
> In fact, we can use a atomic member to record how many ats masters the
> smmu contains. And check that without traverse the list and check all
> masters one by one in the lock protection.
>

Hi Will, Robin, Jean-Philippe,

Can you kindly check this issue? We have seen a significant performance
regression here.

Thanks!

> Fixes: 9ce27afc0830 ("iommu/arm-smmu-v3: Add support for PCI ATS")
> Signed-off-by: Zhen Lei
> ---
>  drivers/iommu/arm-smmu-v3.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index a9a9fabd396804a..1b370d9aca95f94 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -631,6 +631,7 @@ struct arm_smmu_domain {
>
>  	struct io_pgtable_ops		*pgtbl_ops;
>  	bool				non_strict;
> +	atomic_t			nr_ats_masters;

It's not ideal to keep a separate count of ats masters...hmmm

>
>  	enum arm_smmu_domain_stage	stage;
>  	union {
> @@ -1531,7 +1532,7 @@ static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
>  	struct arm_smmu_cmdq_ent cmd;
>  	struct arm_smmu_master *master;
>
> -	if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
> +	if (!atomic_read(&smmu_domain->nr_ats_masters))
>  		return 0;

The rest of the code is here:

	arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);

	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
	list_for_each_entry(master, &smmu_domain->devices, domain_head)
		ret |= arm_smmu_atc_inv_master(master, &cmd);
	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

	return ret ? -ETIMEDOUT : 0;
}

Not directly related to leizhen's issue:
Could RCU protection be used for this list iteration? I can't imagine
that the devices list changes often. And also we already protect the
cmdq in arm_smmu_atc_inv_master().

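For anyone following along, the counter-based early exit in the hunk above
can be modelled in user space. This is only a rough sketch under stated
assumptions: it uses C11 atomics in place of the kernel's atomic_t, it
replaces the locked list walk with a simple call counter, and every name
starting with model_ is hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct arm_smmu_domain. */
struct model_domain {
	atomic_int nr_ats_masters;	/* masters with ATS currently enabled */
	int slow_path_calls;		/* times the locked list walk would run */
};

/* Hypothetical stand-in for struct arm_smmu_master. */
struct model_master {
	struct model_domain *domain;
	bool ats_enabled;
};

static void model_domain_init(struct model_domain *d)
{
	atomic_init(&d->nr_ats_masters, 0);
	d->slow_path_calls = 0;
}

/* Mirrors the atomic_inc() the patch adds to the enable path. */
static void model_enable_ats(struct model_master *m)
{
	if (m->ats_enabled)
		return;
	m->ats_enabled = true;
	atomic_fetch_add(&m->domain->nr_ats_masters, 1);
}

/* Mirrors the atomic_dec(); m->domain must still be valid here. */
static void model_disable_ats(struct model_master *m)
{
	if (!m->ats_enabled)
		return;
	m->ats_enabled = false;
	atomic_fetch_sub(&m->domain->nr_ats_masters, 1);
}

/* Early exit when no master has ATS enabled; otherwise take the slow path. */
static int model_atc_inv_domain(struct model_domain *d)
{
	if (!atomic_load(&d->nr_ats_masters))
		return 0;
	d->slow_path_calls++;	/* stands in for lock + list walk */
	return 0;
}

/* Usage example: returns how many invalidations reached the slow path. */
int model_demo(void)
{
	struct model_domain d;
	struct model_master m = { .domain = &d, .ats_enabled = false };

	model_domain_init(&d);
	model_atc_inv_domain(&d);	/* no ATS masters: skipped */
	model_enable_ats(&m);
	model_atc_inv_domain(&d);	/* one ATS master: slow path runs */
	model_disable_ats(&m);
	model_atc_inv_domain(&d);	/* back to zero: skipped again */
	return d.slow_path_calls;
}
```

In this model the decrement dereferences the master's domain pointer, which
is presumably why the patch moves "master->domain = NULL" to after
arm_smmu_disable_ats() in arm_smmu_detach_dev().
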
>
>  	arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
> @@ -1869,6 +1870,7 @@ static int arm_smmu_enable_ats(struct arm_smmu_master *master)
>  	size_t stu;
>  	struct pci_dev *pdev;
>  	struct arm_smmu_device *smmu = master->smmu;
> +	struct arm_smmu_domain *smmu_domain = master->domain;
>  	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
>
>  	if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev) ||
> @@ -1887,12 +1889,15 @@ static int arm_smmu_enable_ats(struct arm_smmu_master *master)
>  		return ret;
>
>  	master->ats_enabled = true;
> +	atomic_inc(&smmu_domain->nr_ats_masters);
> +
>  	return 0;
>  }
>
>  static void arm_smmu_disable_ats(struct arm_smmu_master *master)
>  {
>  	struct arm_smmu_cmdq_ent cmd;
> +	struct arm_smmu_domain *smmu_domain = master->domain;
>
>  	if (!master->ats_enabled || !dev_is_pci(master->dev))
>  		return;
> @@ -1901,6 +1906,7 @@ static void arm_smmu_disable_ats(struct arm_smmu_master *master)
>  	arm_smmu_atc_inv_master(master, &cmd);
>  	pci_disable_ats(to_pci_dev(master->dev));
>  	master->ats_enabled = false;
> +	atomic_dec(&smmu_domain->nr_ats_masters);
>  }
>
>  static void arm_smmu_detach_dev(struct arm_smmu_master *master)
> @@ -1915,10 +1921,10 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>  	list_del(&master->domain_head);
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>
> -	master->domain = NULL;
>  	arm_smmu_install_ste_for_dev(master);
>
>  	arm_smmu_disable_ats(master);
> +	master->domain = NULL;
>  }
>
>  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu