From: John Garry <john.garry@huawei.com>
Subject: [PATCH v4 2/6] iova: Allow rcache range upper limit to be flexible
Date: Wed, 14 Jul 2021 18:36:39 +0800
Message-ID: <1626259003-201303-3-git-send-email-john.garry@huawei.com>
In-Reply-To: <1626259003-201303-1-git-send-email-john.garry@huawei.com>
References: <1626259003-201303-1-git-send-email-john.garry@huawei.com>
X-Mailer: git-send-email 2.8.1
X-Mailing-List: linux-kernel@vger.kernel.org

Some LLDs may request DMA mappings whose IOVA length exceeds the current
rcache upper limit. This means that allocations for those IOVAs will never
be cached, and must always be allocated from and freed back to the RB tree
on every DMA mapping cycle. This has a significant effect on performance,
more so since commit 4e89dce72521 ("iommu/iova: Retry from last rb tree
node if iova search fails"), as discussed at [0].

As a first step towards allowing the rcache range upper limit to be
configured, hold this value in the IOVA rcache structure, and allocate
the rcaches separately.
[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/iommu/dma-iommu.c |  2 +-
 drivers/iommu/iova.c      | 23 +++++++++++++++++------
 include/linux/iova.h      |  4 ++--
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 98ba927aee1a..4772278aa5da 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -434,7 +434,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 	 * rounding up anything cacheable to make sure that can't happen. The
 	 * order of the unadjusted size will still match upon freeing.
 	 */
-	if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
+	if (iova_len < (1 << (iovad->rcache_max_size - 1)))
 		iova_len = roundup_pow_of_two(iova_len);
 
 	dma_limit = min_not_zero(dma_limit, dev->bus_dma_limit);
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b6cf5f16123b..07ce73fdd8c1 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -15,6 +15,8 @@
 /* The anchor node sits above the top of the usable address space */
 #define IOVA_ANCHOR	~0UL
 
+#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
+
 static bool iova_rcache_insert(struct iova_domain *iovad,
 			       unsigned long pfn,
 			       unsigned long size);
@@ -881,7 +883,14 @@ static void init_iova_rcaches(struct iova_domain *iovad)
 	unsigned int cpu;
 	int i;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	iovad->rcache_max_size = IOVA_RANGE_CACHE_MAX_SIZE;
+
+	iovad->rcaches = kcalloc(iovad->rcache_max_size,
+				 sizeof(*iovad->rcaches), GFP_KERNEL);
+	if (!iovad->rcaches)
+		return;
+
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		spin_lock_init(&rcache->lock);
 		rcache->depot_size = 0;
@@ -956,7 +965,7 @@ static bool iova_rcache_insert(struct iova_domain *iovad, unsigned long pfn,
 {
 	unsigned int log_size = order_base_2(size);
 
-	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE)
+	if (log_size >= iovad->rcache_max_size)
 		return false;
 
 	return __iova_rcache_insert(iovad, &iovad->rcaches[log_size], pfn);
@@ -1012,7 +1021,7 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 {
 	unsigned int log_size = order_base_2(size);
 
-	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE)
+	if (log_size >= iovad->rcache_max_size)
 		return 0;
 
 	return __iova_rcache_get(&iovad->rcaches[log_size], limit_pfn - size);
@@ -1028,7 +1037,7 @@ static void free_iova_rcaches(struct iova_domain *iovad)
 	unsigned int cpu;
 	int i, j;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		for_each_possible_cpu(cpu) {
 			cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
@@ -1039,6 +1048,8 @@ static void free_iova_rcaches(struct iova_domain *iovad)
 		for (j = 0; j < rcache->depot_size; ++j)
 			iova_magazine_free(rcache->depot[j]);
 	}
+
+	kfree(iovad->rcaches);
 }
 
 /*
@@ -1051,7 +1062,7 @@ static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 	unsigned long flags;
 	int i;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
 		spin_lock_irqsave(&cpu_rcache->lock, flags);
@@ -1070,7 +1081,7 @@ static void free_global_cached_iovas(struct iova_domain *iovad)
 	unsigned long flags;
 	int i, j;
 
-	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+	for (i = 0; i < iovad->rcache_max_size; ++i) {
 		rcache = &iovad->rcaches[i];
 		spin_lock_irqsave(&rcache->lock, flags);
 		for (j = 0; j < rcache->depot_size; ++j) {
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 71d8a2de6635..9974e1d3e2bc 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -25,7 +25,6 @@ struct iova {
 struct iova_magazine;
 struct iova_cpu_rcache;
 
-#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
 #define MAX_GLOBAL_MAGS 32	/* magazines per bin */
 
 struct iova_rcache {
@@ -74,6 +73,7 @@ struct iova_domain {
 	unsigned long	start_pfn;	/* Lower limit for this domain */
 	unsigned long	dma_32bit_pfn;
 	unsigned long	max32_alloc_size; /* Size of last failed allocation */
+	unsigned long	rcache_max_size; /* Upper limit of cached IOVA range */
 	struct iova_fq __percpu *fq;	/* Flush Queue */
 
 	atomic64_t	fq_flush_start_cnt;	/* Number of TLB flushes that
@@ -83,7 +83,6 @@ struct iova_domain {
 						   have been finished */
 
 	struct iova	anchor;		/* rbtree lookup anchor */
-	struct iova_rcache rcaches[IOVA_RANGE_CACHE_MAX_SIZE];	/* IOVA range caches */
 
 	iova_flush_cb	flush_cb;	/* Call-Back function to flush IOMMU
 					   TLBs */
@@ -96,6 +95,7 @@ struct iova_domain {
 	atomic_t fq_timer_on;			/* 1 when timer is active, 0
 						   when not */
 	struct hlist_node	cpuhp_dead;
+	struct iova_rcache	*rcaches;	/* IOVA range caches */
 };
 
 static inline unsigned long iova_size(struct iova *iova)
-- 
2.26.2
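
To make the size-class scheme above concrete, here is a minimal standalone
userspace sketch, not part of the patch: order_base_2() is a local stand-in
mimicking the kernel helper of the same name, and iova_len_is_cacheable() is
a hypothetical wrapper around the check performed by iova_rcache_insert()
and iova_rcache_get(). It shows how an IOVA length in pages maps to an
rcache size class, and how the upper limit decides whether an allocation
can be cached at all:

/*
 * Standalone illustration only -- not kernel code. order_base_2() is a
 * userspace stand-in for the kernel helper of the same name, and
 * iova_len_is_cacheable() is a hypothetical wrapper around the
 * log_size >= rcache_max_size check in the patch above.
 */
#include <stdio.h>
#include <stdbool.h>

#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached range size (pages) */

/* ceil(log2(n)) for n >= 1, mirroring the kernel's order_base_2() */
static unsigned int order_base_2(unsigned long n)
{
	unsigned int order = 0;

	while ((1UL << order) < n)
		order++;
	return order;
}

/* Would an allocation of 'size' pages be served from the rcaches? */
static bool iova_len_is_cacheable(unsigned long size, unsigned long max)
{
	return order_base_2(size) < max;
}

int main(void)
{
	unsigned long sizes[] = { 1, 8, 32, 33, 64, 128 };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%3lu pages -> size class %u: %s\n",
		       sizes[i], order_base_2(sizes[i]),
		       iova_len_is_cacheable(sizes[i],
					     IOVA_RANGE_CACHE_MAX_SIZE) ?
		       "rcache" : "RB tree only");
	return 0;
}

With the default limit of 6, size classes 0-5 are cacheable, i.e. anything
up to 2^5 = 32 pages (128K with 4K pages); a 33-page mapping lands in class
6 and always falls back to the RB tree, which is the behaviour this series
sets out to relax.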