kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Liu, Yi L" <yi.l.liu@intel.com>
To: qemu-devel@nongnu.org, david@gibson.dropbear.id.au,
	pbonzini@redhat.com, alex.williamson@redhat.com,
	peterx@redhat.com
Cc: mst@redhat.com, eric.auger@redhat.com, kevin.tian@intel.com,
	yi.l.liu@intel.com, jun.j.tian@intel.com, yi.y.sun@intel.com,
	kvm@vger.kernel.org, hao.wu@intel.com,
	Jacob Pan <jacob.jun.pan@linux.intel.com>,
	Yi Sun <yi.y.sun@linux.intel.com>
Subject: [RFC v3 02/25] hw/iommu: introduce DualStageIOMMUObject
Date: Wed, 29 Jan 2020 04:16:33 -0800	[thread overview]
Message-ID: <1580300216-86172-3-git-send-email-yi.l.liu@intel.com> (raw)
In-Reply-To: <1580300216-86172-1-git-send-email-yi.l.liu@intel.com>

From: Liu Yi L <yi.l.liu@intel.com>

Currently, many platform vendors provide the capability of dual stage
DMA address translation in hardware. For example, nested translation
on Intel VT-d scalable mode, nested stage translation on ARM SMMUv3,
and etc. In dual stage DMA address translation, there are two stages
address translation, stage-1 (a.k.a first-level) and stage-2 (a.k.a
second-level) translation structures. Stage-1 translation results are
also subjected to stage-2 translation structures. Take vSVA (Virtual
Shared Virtual Addressing) as an example, guest IOMMU driver owns
stage-1 translation structures (covers GVA->GPA translation), and host
IOMMU driver owns stage-2 translation structures (covers GPA->HPA
translation). VMM is responsible to bind stage-1 translation structures
to host, thus hardware could achieve GVA->GPA and then GPA->HPA
translation. For more background on SVA, refer the below links.
 - https://www.youtube.com/watch?v=Kq_nfGK5MwQ
 - https://events19.lfasiallc.com/wp-content/uploads/2017/11/\
Shared-Virtual-Memory-in-KVM_Yi-Liu.pdf

As above, dual stage DMA translation offers two stage address mappings,
which could have better DMA address translation support for passthru
devices. This is also what vIOMMU developers are doing so far. Efforts
includes vSVA enabling from Yi Liu and SMMUv3 Nested Stage Setup from
Eric Auger.
https://www.spinics.net/lists/kvm/msg198556.html
https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg02842.html

Both efforts are aiming to expose a vIOMMU with dual stage hardware
backed. As so, QEMU needs to have an explicit object to stand for
the dual stage capability from hardware. Such object offers abstract
for the dual stage DMA translation related operations, like:

 1) PASID allocation (allow host to intercept in PASID allocation)
 2) bind stage-1 translation structures to host
 3) propagate stage-1 cache invalidation to host
 4) DMA address translation fault (I/O page fault) servicing etc.

This patch introduces DualStageIOMMUObject to stand for the hardware
dual stage DMA translation capability. PASID allocation/free are the
first operation included in it, in future, there will be more operations
like bind_stage1_pgtbl and invalidate_stage1_cache and etc.

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Yi Sun <yi.y.sun@linux.intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
---
 hw/Makefile.objs                    |  1 +
 hw/iommu/Makefile.objs              |  1 +
 hw/iommu/dual_stage_iommu.c         | 59 +++++++++++++++++++++++++++++++++++++
 include/hw/iommu/dual_stage_iommu.h | 59 +++++++++++++++++++++++++++++++++++++
 4 files changed, 120 insertions(+)
 create mode 100644 hw/iommu/Makefile.objs
 create mode 100644 hw/iommu/dual_stage_iommu.c
 create mode 100644 include/hw/iommu/dual_stage_iommu.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 660e2b4..cab83fe 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -40,6 +40,7 @@ devices-dirs-$(CONFIG_MEM_DEVICE) += mem/
 devices-dirs-$(CONFIG_NUBUS) += nubus/
 devices-dirs-y += semihosting/
 devices-dirs-y += smbios/
+devices-dirs-y += iommu/
 endif
 
 common-obj-y += $(devices-dirs-y)
diff --git a/hw/iommu/Makefile.objs b/hw/iommu/Makefile.objs
new file mode 100644
index 0000000..d4f3b39
--- /dev/null
+++ b/hw/iommu/Makefile.objs
@@ -0,0 +1 @@
+obj-y += dual_stage_iommu.o
diff --git a/hw/iommu/dual_stage_iommu.c b/hw/iommu/dual_stage_iommu.c
new file mode 100644
index 0000000..be4179d
--- /dev/null
+++ b/hw/iommu/dual_stage_iommu.c
@@ -0,0 +1,59 @@
+/*
+ * QEMU abstract of Hardware Dual Stage DMA translation capability
+ *
+ * Copyright (C) 2020 Intel Corporation.
+ *
+ * Authors: Liu Yi L <yi.l.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/iommu/dual_stage_iommu.h"
+
+int ds_iommu_pasid_alloc(DualStageIOMMUObject *dsi_obj, uint32_t min,
+                         uint32_t max, uint32_t *pasid)
+{
+    if (!dsi_obj) {
+        return -ENOENT;
+    }
+
+    if (dsi_obj->ops && dsi_obj->ops->pasid_alloc) {
+        return dsi_obj->ops->pasid_alloc(dsi_obj, min, max, pasid);
+    }
+    return -ENOENT;
+}
+
+int ds_iommu_pasid_free(DualStageIOMMUObject *dsi_obj, uint32_t pasid)
+{
+    if (!dsi_obj) {
+        return -ENOENT;
+    }
+
+    if (dsi_obj->ops && dsi_obj->ops->pasid_free) {
+        return dsi_obj->ops->pasid_free(dsi_obj, pasid);
+    }
+    return -ENOENT;
+}
+
+void ds_iommu_object_init(DualStageIOMMUObject *dsi_obj,
+                          DualStageIOMMUOps *ops)
+{
+    dsi_obj->ops = ops;
+}
+
+void ds_iommu_object_destroy(DualStageIOMMUObject *dsi_obj)
+{
+    dsi_obj->ops = NULL;
+}
diff --git a/include/hw/iommu/dual_stage_iommu.h b/include/hw/iommu/dual_stage_iommu.h
new file mode 100644
index 0000000..e9891e3
--- /dev/null
+++ b/include/hw/iommu/dual_stage_iommu.h
@@ -0,0 +1,59 @@
+/*
+ * QEMU abstraction of IOMMU Context
+ *
+ * Copyright (C) 2020 Red Hat Inc.
+ *
+ * Authors: Liu, Yi L <yi.l.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_DS_IOMMU_H
+#define HW_DS_IOMMU_H
+
+#include "qemu/queue.h"
+#ifndef CONFIG_USER_ONLY
+#include "exec/hwaddr.h"
+#endif
+
+typedef struct DualStageIOMMUObject DualStageIOMMUObject;
+typedef struct DualStageIOMMUOps DualStageIOMMUOps;
+
+struct DualStageIOMMUOps {
+    /* Allocate pasid from DualStageIOMMU (a.k.a. host IOMMU) */
+    int (*pasid_alloc)(DualStageIOMMUObject *dsi_obj,
+                       uint32_t min,
+                       uint32_t max,
+                       uint32_t *pasid);
+    /* Reclaim a pasid from DualStageIOMMU (a.k.a. host IOMMU) */
+    int (*pasid_free)(DualStageIOMMUObject *dsi_obj,
+                      uint32_t pasid);
+};
+
+/*
+ * This is an abstraction of Dual-stage IOMMU.
+ */
+struct DualStageIOMMUObject {
+    DualStageIOMMUOps *ops;
+};
+
+int ds_iommu_pasid_alloc(DualStageIOMMUObject *dsi_obj, uint32_t min,
+                         uint32_t max, uint32_t *pasid);
+int ds_iommu_pasid_free(DualStageIOMMUObject *dsi_obj, uint32_t pasid);
+
+void ds_iommu_object_init(DualStageIOMMUObject *dsi_obj,
+                          DualStageIOMMUOps *ops);
+void ds_iommu_object_destroy(DualStageIOMMUObject *dsi_obj);
+
+#endif
-- 
2.7.4


  parent reply	other threads:[~2020-01-29 12:11 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-29 12:16 [RFC v3 00/25] intel_iommu: expose Shared Virtual Addressing to VMs Liu, Yi L
2020-01-29 12:16 ` [RFC v3 01/25] hw/pci: modify pci_setup_iommu() to set PCIIOMMUOps Liu, Yi L
2020-01-29 12:16 ` Liu, Yi L [this message]
2020-01-31  3:59   ` [RFC v3 02/25] hw/iommu: introduce DualStageIOMMUObject David Gibson
2020-01-31 11:42     ` Liu, Yi L
2020-02-12  6:32       ` David Gibson
2020-01-29 12:16 ` [RFC v3 03/25] hw/iommu: introduce IOMMUContext Liu, Yi L
2020-01-31  4:06   ` David Gibson
2020-01-31 11:42     ` Liu, Yi L
2020-02-11 16:58       ` Peter Xu
2020-02-12  7:15         ` Liu, Yi L
2020-02-12 15:59           ` Peter Xu
2020-02-13  2:46             ` Liu, Yi L
2020-02-14  5:36           ` David Gibson
2020-02-15  6:25             ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 04/25] hw/pci: introduce pci_device_iommu_context() Liu, Yi L
2020-01-29 12:16 ` [RFC v3 05/25] intel_iommu: provide get_iommu_context() callback Liu, Yi L
2020-01-29 12:16 ` [RFC v3 06/25] scripts/update-linux-headers: Import iommu.h Liu, Yi L
2020-01-29 12:25   ` Cornelia Huck
2020-01-31 11:40     ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 07/25] header file update VFIO/IOMMU vSVA APIs Liu, Yi L
2020-01-29 12:28   ` Cornelia Huck
2020-01-31 11:41     ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 08/25] vfio: pass IOMMUContext into vfio_get_group() Liu, Yi L
2020-01-29 12:16 ` [RFC v3 09/25] vfio: check VFIO_TYPE1_NESTING_IOMMU support Liu, Yi L
2020-02-11 19:08   ` Peter Xu
2020-02-12  7:16     ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 10/25] vfio: register DualStageIOMMUObject to vIOMMU Liu, Yi L
2020-01-29 12:16 ` [RFC v3 11/25] vfio: get stage-1 pasid formats from Kernel Liu, Yi L
2020-02-11 19:30   ` Peter Xu
2020-02-12  7:19     ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 12/25] vfio/common: add pasid_alloc/free support Liu, Yi L
2020-02-11 19:31   ` Peter Xu
2020-02-12  7:20     ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 13/25] intel_iommu: modify x-scalable-mode to be string option Liu, Yi L
2020-02-11 19:43   ` Peter Xu
2020-02-12  7:28     ` Liu, Yi L
2020-02-12 16:05       ` Peter Xu
2020-02-13  2:44         ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 14/25] intel_iommu: add virtual command capability support Liu, Yi L
2020-02-11 20:16   ` Peter Xu
2020-02-12  7:32     ` Liu, Yi L
2020-02-11 21:56   ` Peter Xu
2020-02-13  2:40     ` Liu, Yi L
2020-02-13 14:31       ` Peter Xu
2020-02-13 15:08         ` Peter Xu
2020-02-15  8:49           ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 15/25] intel_iommu: process pasid cache invalidation Liu, Yi L
2020-02-11 20:17   ` Peter Xu
2020-02-12  7:33     ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 16/25] intel_iommu: add PASID cache management infrastructure Liu, Yi L
2020-02-11 23:35   ` Peter Xu
2020-02-12  8:37     ` Liu, Yi L
2020-02-12 15:26       ` Peter Xu
2020-02-13  2:59         ` Liu, Yi L
2020-02-13 15:14           ` Peter Xu
2020-02-15  8:50             ` Liu, Yi L
2020-01-29 12:16 ` [RFC v3 17/25] vfio: add bind stage-1 page table support Liu, Yi L
2020-01-29 12:16 ` [RFC v3 18/25] intel_iommu: bind/unbind guest page table to host Liu, Yi L
2020-01-29 12:16 ` [RFC v3 19/25] intel_iommu: replay guest pasid bindings " Liu, Yi L
2020-01-29 12:16 ` [RFC v3 20/25] intel_iommu: replay pasid binds after context cache invalidation Liu, Yi L
2020-01-29 12:16 ` [RFC v3 21/25] intel_iommu: do not pass down pasid bind for PASID #0 Liu, Yi L
2020-01-29 12:16 ` [RFC v3 22/25] vfio: add support for flush iommu stage-1 cache Liu, Yi L
2020-01-29 12:16 ` [RFC v3 23/25] intel_iommu: process PASID-based iotlb invalidation Liu, Yi L
2020-01-29 12:16 ` [RFC v3 24/25] intel_iommu: propagate PASID-based iotlb invalidation to host Liu, Yi L
2020-01-29 12:16 ` [RFC v3 25/25] intel_iommu: process PASID-based Device-TLB invalidation Liu, Yi L
2020-01-29 13:44 ` [RFC v3 00/25] intel_iommu: expose Shared Virtual Addressing to VMs no-reply
2020-01-29 13:48 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1580300216-86172-3-git-send-email-yi.l.liu@intel.com \
    --to=yi.l.liu@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=eric.auger@redhat.com \
    --cc=hao.wu@intel.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jun.j.tian@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yi.y.sun@intel.com \
    --cc=yi.y.sun@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).