From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55B93C43603 for ; Thu, 19 Dec 2019 03:17:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3484421D7D for ; Thu, 19 Dec 2019 03:17:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726840AbfLSDRe (ORCPT ); Wed, 18 Dec 2019 22:17:34 -0500 Received: from mga05.intel.com ([192.55.52.43]:29385 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726463AbfLSDRe (ORCPT ); Wed, 18 Dec 2019 22:17:34 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Dec 2019 19:17:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,330,1571727600"; d="scan'208";a="222160398" Received: from allen-box.sh.intel.com ([10.239.159.136]) by fmsmga001.fm.intel.com with ESMTP; 18 Dec 2019 19:17:30 -0800 From: Lu Baolu To: Joerg Roedel , David Woodhouse , Alex Williamson Cc: ashok.raj@intel.com, sanjay.k.kumar@intel.com, jacob.jun.pan@linux.intel.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, Peter Xu , iommu@lists.linux-foundation.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v4 0/7] Use 1st-level for IOVA translation Date: Thu, 19 Dec 2019 11:16:27 +0800 Message-Id: <20191219031634.15168-1-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.17.1 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Intel VT-d in scalable mode supports two types of page tables for DMA translation: the first level page table and the second level page table. The first level page table uses the same format as the CPU page table, while the second level page table keeps compatible with previous formats. The software is able to choose any one of them for DMA remapping according to the use case. This patchset aims to move IOVA (I/O Virtual Address) translation to 1st-level page table in scalable mode. This will simplify vIOMMU (IOMMU simulated by VM hypervisor) design by using the two-stage translation, a.k.a. nested mode translation. As Intel VT-d architecture offers caching mode, guest IOVA (GIOVA) support is currently implemented in a shadow page manner. The device simulation software, like QEMU, has to figure out GIOVA->GPA mappings and write them to a shadowed page table, which will be used by the physical IOMMU. Each time when mappings are created or destroyed in vIOMMU, the simulation software has to intervene. Hence, the changes on GIOVA->GPA could be shadowed to host. .-----------. | vIOMMU | |-----------| .--------------------. | |IOTLB flush trap | QEMU | .-----------. (map/unmap) |--------------------| |GIOVA->GPA |---------------->| .------------. | '-----------' | | GIOVA->HPA | | | | | '------------' | '-----------' | | | | '--------------------' | <------------------------------------ | v VFIO/IOMMU API .-----------. | pIOMMU | |-----------| | | .-----------. |GIOVA->HPA | '-----------' | | '-----------' In VT-d 3.0, scalable mode is introduced, which offers two-level translation page tables and nested translation mode. Regards to GIOVA support, it can be simplified by 1) moving the GIOVA support over 1st-level page table to store GIOVA->GPA mapping in vIOMMU, 2) binding vIOMMU 1st level page table to the pIOMMU, 3) using pIOMMU second level for GPA->HPA translation, and 4) enable nested (a.k.a. dual-stage) translation in host. Compared with current shadow GIOVA support, the new approach makes the vIOMMU design simpler and more efficient as we only need to flush the pIOMMU IOTLB and possible device-IOTLB when an IOVA mapping in vIOMMU is torn down. .-----------. | vIOMMU | |-----------| .-----------. | |IOTLB flush trap | QEMU | .-----------. (unmap) |-----------| |GIOVA->GPA |---------------->| | '-----------' '-----------' | | | '-----------' | <------------------------------ | VFIO/IOMMU | cache invalidation and | guest gpd bind interfaces v .-----------. | pIOMMU | |-----------| .-----------. |GIOVA->GPA |<---First level '-----------' | GPA->HPA |<---Scond level '-----------' '-----------' This patch applies the first level page table for IOVA translation unless the DOMAIN_ATTR_NESTING domain attribution has been set. Setting of this attribution means the second level will be used to map gPA (guest physical address) to hPA (host physical address), and the mappings between gVA (guest virtual address) and gPA will be maintained by the guest with the page table address binding to host's first level. Based-on-idea-by: Ashok Raj Based-on-idea-by: Kevin Tian Based-on-idea-by: Liu Yi L Based-on-idea-by: Jacob Pan Based-on-idea-by: Sanjay Kumar Based-on-idea-by: Lu Baolu Change log: v3->v4: - The previous version was posted here https://lkml.org/lkml/2019/12/10/2126 - Set Execute Disable (bit 63) in first level table entries. - Enhance pasid-based iotlb invalidation for both default domain and auxiliary domain. - Add debugfs file to expose page table internals. v2->v3: - The previous version was posted here https://lkml.org/lkml/2019/11/27/1831 - Accept Jacob's suggestion on merging two page tables. v1->v2 - The first series was posted here https://lkml.org/lkml/2019/9/23/297 - Use per domain page table ops to handle different page tables. - Use first level for DMA remapping by default on both bare metal and vm guest. - Code refine according to code review comments for v1. Lu Baolu (7): iommu/vt-d: Identify domains using first level page table iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup iommu/vt-d: Setup pasid entries for iova over first level iommu/vt-d: Flush PASID-based iotlb for iova over first level iommu/vt-d: Use iova over first level iommu/vt-d: debugfs: Add support to show page table internals drivers/iommu/dmar.c | 41 ++++++ drivers/iommu/intel-iommu-debugfs.c | 75 +++++++++++ drivers/iommu/intel-iommu.c | 201 +++++++++++++++++++++++++--- drivers/iommu/intel-pasid.c | 7 +- drivers/iommu/intel-pasid.h | 6 + drivers/iommu/intel-svm.c | 8 +- include/linux/intel-iommu.h | 20 ++- 7 files changed, 326 insertions(+), 32 deletions(-) -- 2.17.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5875FC2D0D1 for ; Thu, 19 Dec 2019 03:17:44 +0000 (UTC) Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 36CFD21D7D for ; Thu, 19 Dec 2019 03:17:44 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 36CFD21D7D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=iommu-bounces@lists.linux-foundation.org Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id 2680A84B0F; Thu, 19 Dec 2019 03:17:44 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id MDlsIqNzwPUl; Thu, 19 Dec 2019 03:17:42 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by whitealder.osuosl.org (Postfix) with ESMTP id 4899180CF3; Thu, 19 Dec 2019 03:17:42 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 317F9C1AE8; Thu, 19 Dec 2019 03:17:42 +0000 (UTC) Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id A551CC077D for ; Thu, 19 Dec 2019 03:17:38 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id A1E9284B0F for ; Thu, 19 Dec 2019 03:17:38 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Fo84rnFeHMlA for ; Thu, 19 Dec 2019 03:17:36 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by whitealder.osuosl.org (Postfix) with ESMTPS id AB4B280CF3 for ; Thu, 19 Dec 2019 03:17:35 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Dec 2019 19:17:33 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,330,1571727600"; d="scan'208";a="222160398" Received: from allen-box.sh.intel.com ([10.239.159.136]) by fmsmga001.fm.intel.com with ESMTP; 18 Dec 2019 19:17:30 -0800 From: Lu Baolu To: Joerg Roedel , David Woodhouse , Alex Williamson Subject: [PATCH v4 0/7] Use 1st-level for IOVA translation Date: Thu, 19 Dec 2019 11:16:27 +0800 Message-Id: <20191219031634.15168-1-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.17.1 Cc: kevin.tian@intel.com, ashok.raj@intel.com, kvm@vger.kernel.org, sanjay.k.kumar@intel.com, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, yi.y.sun@intel.com X-BeenThere: iommu@lists.linux-foundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Development issues for Linux IOMMU support List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: iommu-bounces@lists.linux-foundation.org Sender: "iommu" Intel VT-d in scalable mode supports two types of page tables for DMA translation: the first level page table and the second level page table. The first level page table uses the same format as the CPU page table, while the second level page table keeps compatible with previous formats. The software is able to choose any one of them for DMA remapping according to the use case. This patchset aims to move IOVA (I/O Virtual Address) translation to 1st-level page table in scalable mode. This will simplify vIOMMU (IOMMU simulated by VM hypervisor) design by using the two-stage translation, a.k.a. nested mode translation. As Intel VT-d architecture offers caching mode, guest IOVA (GIOVA) support is currently implemented in a shadow page manner. The device simulation software, like QEMU, has to figure out GIOVA->GPA mappings and write them to a shadowed page table, which will be used by the physical IOMMU. Each time when mappings are created or destroyed in vIOMMU, the simulation software has to intervene. Hence, the changes on GIOVA->GPA could be shadowed to host. .-----------. | vIOMMU | |-----------| .--------------------. | |IOTLB flush trap | QEMU | .-----------. (map/unmap) |--------------------| |GIOVA->GPA |---------------->| .------------. | '-----------' | | GIOVA->HPA | | | | | '------------' | '-----------' | | | | '--------------------' | <------------------------------------ | v VFIO/IOMMU API .-----------. | pIOMMU | |-----------| | | .-----------. |GIOVA->HPA | '-----------' | | '-----------' In VT-d 3.0, scalable mode is introduced, which offers two-level translation page tables and nested translation mode. Regards to GIOVA support, it can be simplified by 1) moving the GIOVA support over 1st-level page table to store GIOVA->GPA mapping in vIOMMU, 2) binding vIOMMU 1st level page table to the pIOMMU, 3) using pIOMMU second level for GPA->HPA translation, and 4) enable nested (a.k.a. dual-stage) translation in host. Compared with current shadow GIOVA support, the new approach makes the vIOMMU design simpler and more efficient as we only need to flush the pIOMMU IOTLB and possible device-IOTLB when an IOVA mapping in vIOMMU is torn down. .-----------. | vIOMMU | |-----------| .-----------. | |IOTLB flush trap | QEMU | .-----------. (unmap) |-----------| |GIOVA->GPA |---------------->| | '-----------' '-----------' | | | '-----------' | <------------------------------ | VFIO/IOMMU | cache invalidation and | guest gpd bind interfaces v .-----------. | pIOMMU | |-----------| .-----------. |GIOVA->GPA |<---First level '-----------' | GPA->HPA |<---Scond level '-----------' '-----------' This patch applies the first level page table for IOVA translation unless the DOMAIN_ATTR_NESTING domain attribution has been set. Setting of this attribution means the second level will be used to map gPA (guest physical address) to hPA (host physical address), and the mappings between gVA (guest virtual address) and gPA will be maintained by the guest with the page table address binding to host's first level. Based-on-idea-by: Ashok Raj Based-on-idea-by: Kevin Tian Based-on-idea-by: Liu Yi L Based-on-idea-by: Jacob Pan Based-on-idea-by: Sanjay Kumar Based-on-idea-by: Lu Baolu Change log: v3->v4: - The previous version was posted here https://lkml.org/lkml/2019/12/10/2126 - Set Execute Disable (bit 63) in first level table entries. - Enhance pasid-based iotlb invalidation for both default domain and auxiliary domain. - Add debugfs file to expose page table internals. v2->v3: - The previous version was posted here https://lkml.org/lkml/2019/11/27/1831 - Accept Jacob's suggestion on merging two page tables. v1->v2 - The first series was posted here https://lkml.org/lkml/2019/9/23/297 - Use per domain page table ops to handle different page tables. - Use first level for DMA remapping by default on both bare metal and vm guest. - Code refine according to code review comments for v1. Lu Baolu (7): iommu/vt-d: Identify domains using first level page table iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr iommu/vt-d: Add PASID_FLAG_FL5LP for first-level pasid setup iommu/vt-d: Setup pasid entries for iova over first level iommu/vt-d: Flush PASID-based iotlb for iova over first level iommu/vt-d: Use iova over first level iommu/vt-d: debugfs: Add support to show page table internals drivers/iommu/dmar.c | 41 ++++++ drivers/iommu/intel-iommu-debugfs.c | 75 +++++++++++ drivers/iommu/intel-iommu.c | 201 +++++++++++++++++++++++++--- drivers/iommu/intel-pasid.c | 7 +- drivers/iommu/intel-pasid.h | 6 + drivers/iommu/intel-svm.c | 8 +- include/linux/intel-iommu.h | 20 ++- 7 files changed, 326 insertions(+), 32 deletions(-) -- 2.17.1 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu