From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F10DBC433E1 for ; Tue, 18 Aug 2020 23:41:34 +0000 (UTC) Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CC78120738 for ; Tue, 18 Aug 2020 23:41:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CC78120738 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=hisilicon.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=iommu-bounces@lists.linux-foundation.org Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id AA0368780D; Tue, 18 Aug 2020 23:41:34 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gVGOUHuyH4bu; Tue, 18 Aug 2020 23:41:34 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by hemlock.osuosl.org (Postfix) with ESMTP id 1C64C855EA; Tue, 18 Aug 2020 23:41:34 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 098CBC07FF; Tue, 18 Aug 2020 23:41:34 +0000 (UTC) Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by lists.linuxfoundation.org (Postfix) with ESMTP id D530AC0051 for ; Tue, 18 Aug 2020 23:41:32 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id C4CA085C4C for ; Tue, 18 Aug 2020 23:41:32 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8cshpTbscTOt for ; Tue, 18 Aug 2020 23:41:32 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from huawei.com (szxga04-in.huawei.com [45.249.212.190]) by fraxinus.osuosl.org (Postfix) with ESMTPS id 2600D85BFB for ; Tue, 18 Aug 2020 23:41:32 +0000 (UTC) Received: from DGGEMS403-HUB.china.huawei.com (unknown [172.30.72.59]) by Forcepoint Email with ESMTP id 8192B60E87E67DB91656; Wed, 19 Aug 2020 07:41:30 +0800 (CST) Received: from SWX921481.china.huawei.com (10.126.203.140) by DGGEMS403-HUB.china.huawei.com (10.3.19.203) with Microsoft SMTP Server id 14.3.487.0; Wed, 19 Aug 2020 07:41:21 +0800 From: Barry Song To: , , Subject: [PATCH v4 3/3] iommu/arm-smmu-v3: permit users to disable msi polling Date: Wed, 19 Aug 2020 11:38:27 +1200 Message-ID: <20200818233827.21452-4-song.bao.hua@hisilicon.com> X-Mailer: git-send-email 2.21.0.windows.1 In-Reply-To: <20200818233827.21452-1-song.bao.hua@hisilicon.com> References: <20200818233827.21452-1-song.bao.hua@hisilicon.com> MIME-Version: 1.0 X-Originating-IP: [10.126.203.140] X-CFilter-Loop: Reflected Cc: iommu@lists.linux-foundation.org, linux-arm-kernel@lists.infradead.org, prime.zeng@hisilicon.com, linuxarm@huawei.com X-BeenThere: iommu@lists.linux-foundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Development issues for Linux IOMMU support List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: iommu-bounces@lists.linux-foundation.org Sender: "iommu" Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for single thread. The reason for the throughput improvement is that the latency to poll the completion of CMD_SYNC becomes smaller. After sending a CMD_SYNC in an empty cmd queue, typically we need to wait for 280ns using MSI polling. But we only need around 190ns after disabling MSI polling. This patch provides a command line option so that users can decide to use MSI polling or not based on their tests. Signed-off-by: Barry Song --- -v4: add ARM_SMMU_OPT_MSIPOLL flag with respect to Robin's comment drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 5b40d535a7c8..7332251dd8cd 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -418,6 +418,11 @@ module_param(disable_bypass, bool, 0444); MODULE_PARM_DESC(disable_bypass, "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); +static bool disable_msipolling; +module_param(disable_msipolling, bool, 0444); +MODULE_PARM_DESC(disable_msipolling, + "Disable MSI-based polling for CMD_SYNC completion."); + enum pri_resp { PRI_RESP_DENY = 0, PRI_RESP_FAIL = 1, @@ -652,6 +657,7 @@ struct arm_smmu_device { #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) #define ARM_SMMU_OPT_PAGE0_REGS_ONLY (1 << 1) +#define ARM_SMMU_OPT_MSIPOLL (1 << 2) u32 options; struct arm_smmu_cmdq cmdq; @@ -992,8 +998,7 @@ static void arm_smmu_cmdq_build_sync_cmd(u64 *cmd, struct arm_smmu_device *smmu, * Beware that Hi16xx adds an extra 32 bits of goodness to its MSI * payload, so the write will zero the entire command on that platform. */ - if (smmu->features & ARM_SMMU_FEAT_MSI && - smmu->features & ARM_SMMU_FEAT_COHERENCY) { + if (smmu->options & ARM_SMMU_OPT_MSIPOLL) { ent.sync.msiaddr = q->base_dma + Q_IDX(&q->llq, prod) * q->ent_dwords * 8; } @@ -1332,8 +1337,7 @@ static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu, static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu, struct arm_smmu_ll_queue *llq) { - if (smmu->features & ARM_SMMU_FEAT_MSI && - smmu->features & ARM_SMMU_FEAT_COHERENCY) + if (smmu->options & ARM_SMMU_OPT_MSIPOLL) return __arm_smmu_cmdq_poll_until_msi(smmu, llq); return __arm_smmu_cmdq_poll_until_consumed(smmu, llq); @@ -3741,8 +3745,11 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR0_SEV) smmu->features |= ARM_SMMU_FEAT_SEV; - if (reg & IDR0_MSI) + if (reg & IDR0_MSI) { smmu->features |= ARM_SMMU_FEAT_MSI; + if (coherent && !disable_msipolling) + smmu->options |= ARM_SMMU_OPT_MSIPOLL; + } if (reg & IDR0_HYP) smmu->features |= ARM_SMMU_FEAT_HYP; -- 2.27.0 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu