IOMMU Archive on lore.kernel.org
 help / color / Atom feed
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Lu Baolu <baolu.lu@linux.intel.com>
Cc: kevin.tian@intel.com, ashok.raj@intel.com, kvm@vger.kernel.org,
	sanjay.k.kumar@intel.com, yi.y.sun@intel.com,
	linux-kernel@vger.kernel.org,
	Alex Williamson <alex.williamson@redhat.com>,
	iommu@lists.linux-foundation.org,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: [PATCH v2 0/8] Use 1st-level for DMA remapping
Date: Mon, 2 Dec 2019 12:19:27 -0800
Message-ID: <20191202121927.2fef85ba@jacob-builder> (raw)
In-Reply-To: <20191128022550.9832-1-baolu.lu@linux.intel.com>

On Thu, 28 Nov 2019 10:25:42 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:

> Intel VT-d in scalable mode supports two types of page talbes
tables
> for DMA translation: the first level page table and the second
> level page table. The first level page table uses the same
> format as the CPU page table, while the second level page table
> keeps compatible with previous formats. The software is able
> to choose any one of them for DMA remapping according to the use
> case.
> 
> This patchset aims to move IOVA (I/O Virtual Address) translation
move guest IOVA only, right?
> to 1st-level page table in scalable mode. This will simplify vIOMMU
> (IOMMU simulated by VM hypervisor) design by using the two-stage
> translation, a.k.a. nested mode translation.
> 
> As Intel VT-d architecture offers caching mode, guest IOVA (GIOVA)
> support is now implemented in a shadow page manner. The device
> simulation software, like QEMU, has to figure out GIOVA->GPA mappings
> and write them to a shadowed page table, which will be used by the
> physical IOMMU. Each time when mappings are created or destroyed in
> vIOMMU, the simulation software has to intervene. Hence, the changes
> on GIOVA->GPA could be shadowed to host.
> 
> 
>      .-----------.
>      |  vIOMMU   |
>      |-----------|                 .--------------------.
>      |           |IOTLB flush trap |        QEMU        |
>      .-----------. (map/unmap)     |--------------------|
>      |GIOVA->GPA |---------------->|    .------------.  |
>      '-----------'                 |    | GIOVA->HPA |  |
>      |           |                 |    '------------'  |
>      '-----------'                 |                    |
>                                    |                    |
>                                    '--------------------'
>                                                 |
>             <------------------------------------
>             |
>             v VFIO/IOMMU API
>       .-----------.
>       |  pIOMMU   |
>       |-----------|
>       |           |
>       .-----------.
>       |GIOVA->HPA |
>       '-----------'
>       |           |
>       '-----------'
> 
> In VT-d 3.0, scalable mode is introduced, which offers two-level
> translation page tables and nested translation mode. Regards to
> GIOVA support, it can be simplified by 1) moving the GIOVA support
> over 1st-level page table to store GIOVA->GPA mapping in vIOMMU,
> 2) binding vIOMMU 1st level page table to the pIOMMU, 3) using pIOMMU
> second level for GPA->HPA translation, and 4) enable nested (a.k.a.
> dual-stage) translation in host. Compared with current shadow GIOVA
> support, the new approach makes the vIOMMU design simpler and more
> efficient as we only need to flush the pIOMMU IOTLB and possible
> device-IOTLB when an IOVA mapping in vIOMMU is torn down.
> 
>      .-----------.
>      |  vIOMMU   |
>      |-----------|                 .-----------.
>      |           |IOTLB flush trap |   QEMU    |
>      .-----------.    (unmap)      |-----------|
>      |GIOVA->GPA |---------------->|           |
>      '-----------'                 '-----------'
>      |           |                       |
>      '-----------'                       |
>            <------------------------------
>            |      VFIO/IOMMU          
>            |  cache invalidation and  
>            | guest gpd bind interfaces
>            v
>      .-----------.
>      |  pIOMMU   |
>      |-----------|
>      .-----------.
>      |GIOVA->GPA |<---First level
>      '-----------'
>      | GPA->HPA  |<---Scond level
>      '-----------'
>      '-----------'
> 
> This patch set includes two parts. The former part implements the
> per-domain page table abstraction, which makes the page table
> difference transparent to various map/unmap APIs. The later part
s/later/latter/
> applies the first level page table for IOVA translation unless the
> DOMAIN_ATTR_NESTING domain attribution has been set, which indicates
> nested mode in use.
> 
Maybe I am reading this wrong, but shouldn't it be the opposite?
i.e. Use FL page table for IOVA if it is a nesting domain?

> Based-on-idea-by: Ashok Raj <ashok.raj@intel.com>
> Based-on-idea-by: Kevin Tian <kevin.tian@intel.com>
> Based-on-idea-by: Liu Yi L <yi.l.liu@intel.com>
> Based-on-idea-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Based-on-idea-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
> Based-on-idea-by: Lu Baolu <baolu.lu@linux.intel.com>
> 
> Change log:
> 
>  v1->v2
>  - The first series was posted here
>    https://lkml.org/lkml/2019/9/23/297
>  - Use per domain page table ops to handle different page tables.
>  - Use first level for DMA remapping by default on both bare metal
>    and vm guest.
>  - Code refine according to code review comments for v1.
> 
> Lu Baolu (8):
>   iommu/vt-d: Add per domain page table ops
>   iommu/vt-d: Move domain_flush_cache helper into header
>   iommu/vt-d: Implement second level page table ops
>   iommu/vt-d: Apply per domain second level page table ops
>   iommu/vt-d: Add first level page table interfaces
>   iommu/vt-d: Implement first level page table ops
>   iommu/vt-d: Identify domains using first level page table
>   iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr
> 
>  drivers/iommu/Makefile             |   2 +-
>  drivers/iommu/intel-iommu.c        | 412
> +++++++++++++++++++++++------ drivers/iommu/intel-pgtable.c      |
> 376 ++++++++++++++++++++++++++ include/linux/intel-iommu.h        |
> 64 ++++- include/trace/events/intel_iommu.h |  60 +++++
>  5 files changed, 837 insertions(+), 77 deletions(-)
>  create mode 100644 drivers/iommu/intel-pgtable.c
> 

[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  parent reply index

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-28  2:25 Lu Baolu
2019-11-28  2:25 ` [PATCH v2 1/8] iommu/vt-d: Add per domain page table ops Lu Baolu
2019-11-28  2:25 ` [PATCH v2 2/8] iommu/vt-d: Move domain_flush_cache helper into header Lu Baolu
2019-11-28  2:25 ` [PATCH v2 3/8] iommu/vt-d: Implement second level page table ops Lu Baolu
2019-11-28  2:25 ` [PATCH v2 4/8] iommu/vt-d: Apply per domain " Lu Baolu
2019-11-28  2:25 ` [PATCH v2 5/8] iommu/vt-d: Add first level page table interfaces Lu Baolu
2019-12-02 23:27   ` Jacob Pan
2019-12-03  2:36     ` Lu Baolu
2019-12-11  1:56     ` Lu Baolu
2019-11-28  2:25 ` [PATCH v2 6/8] iommu/vt-d: Implement first level page table ops Lu Baolu
2019-11-28  2:25 ` [PATCH v2 7/8] iommu/vt-d: Identify domains using first level page table Lu Baolu
2019-11-28  2:25 ` [PATCH v2 8/8] iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr Lu Baolu
2019-12-02 20:19 ` Jacob Pan [this message]
2019-12-03  2:19   ` [PATCH v2 0/8] Use 1st-level for DMA remapping Lu Baolu

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191202121927.2fef85ba@jacob-builder \
    --to=jacob.jun.pan@linux.intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sanjay.k.kumar@intel.com \
    --cc=yi.y.sun@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git