From: Barry Song
Cc: Barry Song, john.garry@huawei.com, linux-kernel@vger.kernel.org, linuxarm@huawei.com, iommu@lists.linux-foundation.org, prime.zeng@hisilicon.com, Jonathan.Cameron@huawei.com, linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/3] support per-numa CMA for ARM server
Date: Wed, 3 Jun 2020 14:42:28 +1200
Message-ID: <20200603024231.61748-1-song.bao.hua@hisilicon.com>
X-Mailer: git-send-email 2.21.0.windows.1
X-Mailing-List: linux-kernel@vger.kernel.org

Right now, the SMMU driver uses dma_alloc_coherent() to allocate memory for its queues and tables.
Typically, on an ARM64 server, there is a default CMA area located at node0, which could be far away from node2, node3, etc. Storing queues and tables remotely increases the latency of the ARM SMMU significantly. For example, when the SMMU is at node2 and the default global CMA area is at node0, after sending a CMD_SYNC to an empty command queue, we have to wait more than 550ns for the CMD_SYNC to complete. However, if the queues and tables are stored locally, we only need to wait 240ns.

With per-NUMA CMA, the SMMU will get memory from its local NUMA node for its command queues and page tables, which means dma_unmap latency will shrink considerably.

Meanwhile, when iommu.passthrough is on, device drivers which call dma_alloc_coherent() will also get local memory and avoid the traffic between NUMA nodes.

Barry Song (3):
  dma-direct: provide the ability to reserve per-numa CMA
  arm64: mm: reserve hugetlb CMA after numa_init
  arm64: mm: reserve per-numa CMA after numa_init

 arch/arm64/mm/init.c           | 12 ++++++----
 include/linux/dma-contiguous.h |  4 ++++
 kernel/dma/Kconfig             | 10 ++++++++
 kernel/dma/contiguous.c        | 43 +++++++++++++++++++++++++++++++++-
 4 files changed, 63 insertions(+), 6 deletions(-)

--
2.23.0
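As an illustration of the allocation policy this series aims for, here is a minimal userspace sketch in C. It is not the code from the patches: struct cma_area, per_node_cma[], default_cma, register_node_cma() and pick_cma_area() are hypothetical names invented for this example, and the sizes are arbitrary. It only models the idea described above, which is to prefer a CMA area reserved on the device's own NUMA node and to fall back to the default global area when nothing was reserved there.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4
#define NO_NODE   (-1)  /* stand-in for "device has no NUMA affinity" */

/* Hypothetical model of a reserved contiguous-memory area. */
struct cma_area {
    int node;           /* NUMA node the area was reserved on */
    size_t size;        /* bytes reserved (arbitrary in this sketch) */
};

/* Default global area; the cover letter places it on node0. */
struct cma_area default_cma = { .node = 0, .size = 64u << 20 };

/* Per-node areas; a NULL slot means nothing was reserved for that node. */
struct cma_area *per_node_cma[MAX_NODES];

/* Record an area reserved for one node (models boot-time reservation). */
void register_node_cma(int node, struct cma_area *area)
{
    assert(node >= 0 && node < MAX_NODES);
    per_node_cma[node] = area;
}

/* Prefer the area local to dev_node, else fall back to the default area. */
struct cma_area *pick_cma_area(int dev_node)
{
    if (dev_node >= 0 && dev_node < MAX_NODES &&
        per_node_cma[dev_node] != NULL)
        return per_node_cma[dev_node];
    return &default_cma;
}
```

With an area registered for node2 only, pick_cma_area(2) returns the node2 area, while pick_cma_area(3) and pick_cma_area(NO_NODE) both fall back to the default area on node0. In the terms of the example above, that is the difference between the SMMU on node2 keeping its queues local and reaching across to node0.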