iommu.lists.linux-foundation.org archive mirror
* [PATCH v4 0/9] iommu: Bounce page for untrusted devices
@ 2019-06-03  1:16 Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 1/9] PCI: Add dev_is_untrusted helper Lu Baolu
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

The Thunderbolt vulnerabilities are now public and known as
Thunderclap [1] [3]. This patch series aims to mitigate those
concerns.

An external PCI device is a PCI peripheral device connected
to the system through an external bus, such as Thunderbolt.
What makes it different is that it can't be trusted to the
same degree as the devices built into the system. Generally,
a trusted PCIe device will DMA into the designated buffers
and not overrun or otherwise write outside the specified
bounds, but an external device can't be relied on to do so.

The minimum IOMMU mapping granularity is one page (4k), so
for DMA transfers smaller than that a malicious PCIe device
can access the whole page of memory even if it does not
belong to the driver in question. This opens the possibility
of a DMA attack. For more information about DMA attacks
mounted by an untrusted PCI/PCIe device, please refer to [2].
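
As a purely illustrative sketch (not part of the series; the
constant and helper below are assumptions), this is how much memory
a sub-page mapping exposes once it is rounded up to the 4k IOMMU
granule:

	#include <linux/kernel.h>	/* ALIGN() */
	#include <linux/types.h>	/* phys_addr_t */

	#define IOMMU_PAGE_SIZE	4096UL	/* minimum IOMMU mapping granularity */

	/* Bytes the device can reach beyond the driver's buffer once the
	 * mapping is rounded up to whole IOMMU pages. */
	static unsigned long exposed_bytes(phys_addr_t paddr, size_t size)
	{
		unsigned long offset = paddr & (IOMMU_PAGE_SIZE - 1);

		return ALIGN(offset + size, IOMMU_PAGE_SIZE) - size;
	}

	/* e.g. a 512-byte network buffer typically leaves 3584 bytes of
	 * its page reachable by the device. */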

This series implements bounce buffers for untrusted external
devices. The transfers are confined to isolated pages so the
IOMMU window does not cover memory outside of what the driver
expects. Previously (v3 and before), we proposed an optimisation
to only copy the head and tail of the buffer if it spans multiple
pages, and to map the pages in the middle directly. Figure 1
gives an overview of that solution.

                                swiotlb             System
                IOVA          bounce page           Memory
             .---------.      .---------.        .---------.
             |         |      |         |        |         |
             |         |      |         |        |         |
buffer_start .---------.      .---------.        .---------.
             |         |----->|         |*******>|         |
             |         |      |         | swiotlb|         |
             |         |      |         | mapping|         |
 IOMMU Page  '---------'      '---------'        '---------'
  Boundary   |         |                         |         |
             |         |                         |         |
             |         |                         |         |
             |         |------------------------>|         |
             |         |    IOMMU mapping        |         |
             |         |                         |         |
 IOMMU Page  .---------.                         .---------.
  Boundary   |         |                         |         |
             |         |                         |         |
             |         |------------------------>|         |
             |         |     IOMMU mapping       |         |
             |         |                         |         |
             |         |                         |         |
 IOMMU Page  .---------.      .---------.        .---------.
  Boundary   |         |      |         |        |         |
             |         |      |         |        |         |
             |         |----->|         |*******>|         |
  buffer_end '---------'      '---------' swiotlb'---------'
             |         |      |         | mapping|         |
             |         |      |         |        |         |
             '---------'      '---------'        '---------'
          Figure 1: Overview of the IOMMU bounce page approach (v3 and earlier)

As Robin Murphy pointed out, this ties us to using strict mode for
TLB maintenance, which may not be an overall win depending on the
balance between invalidation bandwidth and memcpy bandwidth. If we
use the standard SWIOTLB logic to always copy the whole thing, we
should be able to release the bounce pages via the flush queue to
allow 'safe' lazy unmaps. So, as of v4, we use the standard swiotlb
logic, as shown in Figure 2.

                                swiotlb             System
                IOVA          bounce page           Memory
buffer_start .---------.      .---------.        .---------.
             |         |      |         |        |         |
             |         |      |         |        |         |
             |         |      |         |        .---------.physical
             |         |----->|         | ------>|         |_start  
             |         |iommu |         | swiotlb|         |
             |         | map  |         |   map  |         |
 IOMMU Page  .---------.      .---------.        '---------'
  Boundary   |         |      |         |        |         |
             |         |      |         |        |         |
             |         |----->|         |        |         |
             |         |iommu |         |        |         |
             |         | map  |         |        |         |
             |         |      |         |        |         |
 IOMMU Page  .---------.      .---------.        .---------.
  Boundary   |         |      |         |        |         |
             |         |----->|         |        |         |
             |         |iommu |         |        |         |
             |         | map  |         |        |         |
             |         |      |         |        |         |
 IOMMU Page  |         |      |         |        |         |
  Boundary   .---------.      .---------.        .---------.
             |         |      |         |------->|         |
  buffer_end '---------'      '---------' swiotlb|         |
             |         |----->|         |   map  |         |
             |         |iommu |         |        |         |
             |         | map  |         |        '---------' physical
             |         |      |         |        |         | _end    
             '---------'      '---------'        '---------'
          Figure 2: Overview of the simplified IOMMU bounce page approach (v4)

The implementation of bounce buffers for untrusted devices
adds a small performance overhead, but we did not see any
user-visible problems. Users who trust their devices enough
can use the kernel parameter defined in the IOMMU driver to
remove the performance overhead.

This series introduces the following bounce page APIs (a short
usage sketch follows the list):

 * iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
   - Map a buffer starting at DMA address @addr in bounce page
     manner. For a buffer that doesn't fill whole minimal IOMMU
     pages, the bounce buffer policy is applied: a bounce page
     mapped by swiotlb will be used as the DMA target in the
     IOMMU page table.

 * iommu_bounce_unmap(dev, addr, size, dir, attrs)
   - Unmap the buffer mapped with iommu_bounce_map(). The bounce
     page will be torn down after the bounced data has been synced.

 * iommu_bounce_sync(dev, addr, size, dir, target)
   - Sync the bounced data in case the bounce mapped buffer is
     reused.
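
A minimal usage sketch (assuming the caller has already allocated
the IOVA range and knows the buffer's physical address, size, DMA
direction and attrs; error handling is elided):

	dma_addr_t dma;

	dma = iommu_bounce_map(dev, iova, paddr, size, dir, attrs);
	if (dma == DMA_MAPPING_ERROR)
		return DMA_MAPPING_ERROR;

	/* ... device performs DMA to/from the bounce page ... */

	/* Make bounced DMA_FROM_DEVICE data visible to the CPU before
	 * the buffer is reused. */
	iommu_bounce_sync(dev, dma, size, dir, SYNC_FOR_CPU);

	/* Copy back if needed, then tear down the bounce page and the
	 * IOMMU mapping. */
	iommu_bounce_unmap(dev, dma, size, dir, attrs);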

The bounce page idea:
Based-on-idea-by: Mika Westerberg <mika.westerberg@intel.com>
Based-on-idea-by: Ashok Raj <ashok.raj@intel.com>
Based-on-idea-by: Alan Cox <alan.cox@intel.com>
Based-on-idea-by: Kevin Tian <kevin.tian@intel.com>
Based-on-idea-by: Robin Murphy <robin.murphy@arm.com>

The patch series has been tested by:
Tested-by: Xu Pengfei <pengfei.xu@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@intel.com>

Reference:
[1] https://thunderclap.io/
[2] https://thunderclap.io/thunderclap-paper-ndss2019.pdf
[3] https://christian.kellner.me/2019/02/27/thunderclap-and-linux/
[4] https://lkml.org/lkml/2019/3/4/644

Best regards,
Baolu

Change log:
  v3->v4:
  - The previous v3 was posted here:
    https://lkml.org/lkml/2019/4/20/213
  - Discard the optimization of only mapping the head and tail
    partial pages; use the standard swiotlb code in order to
    achieve iotlb flush efficiency.
  - This patch series is based on top of the vt-d branch of
    Joerg's iommu tree.

  v2->v3:
  - The previous v2 was posted here:
    https://lkml.org/lkml/2019/3/27/157
  - Reuse the existing swiotlb APIs for the bounce buffer by
    extending them to support bounce pages.
  - Move the bounce page APIs into the generic IOMMU layer.
  - This patch series is based on 5.1-rc1.

  v1->v2:
  - The previous v1 was posted here:
    https://lkml.org/lkml/2019/3/12/66
  - Refactor the code to remove struct bounce_param;
  - During the v1 review cycle, we discussed the possibility
    of reusing the swiotlb code to avoid code duplication, but
    we found the swiotlb implementation was not ready for use
    as a bounce page pool.
    https://lkml.org/lkml/2019/3/19/259
  - This patch series has been rebased to v5.1-rc2.

Lu Baolu (9):
  PCI: Add dev_is_untrusted helper
  swiotlb: Split size parameter to map/unmap APIs
  swiotlb: Zero out bounce buffer for untrusted device
  iommu: Add bounce page APIs
  iommu/vt-d: Don't switch off swiotlb if use direct dma
  iommu/vt-d: Check whether device requires bounce buffer
  iommu/vt-d: Add trace events for domain map/unmap
  iommu/vt-d: Code refactoring for bounce map and unmap
  iommu/vt-d: Use bounce buffer for untrusted devices

 .../admin-guide/kernel-parameters.txt         |   5 +
 drivers/iommu/Kconfig                         |  14 ++
 drivers/iommu/Makefile                        |   1 +
 drivers/iommu/intel-iommu.c                   | 225 +++++++++++++-----
 drivers/iommu/intel-trace.c                   |  14 ++
 drivers/iommu/iommu.c                         | 119 +++++++++
 drivers/xen/swiotlb-xen.c                     |   8 +-
 include/linux/iommu.h                         |  35 +++
 include/linux/pci.h                           |   2 +
 include/linux/swiotlb.h                       |   8 +-
 include/trace/events/intel_iommu.h            | 132 ++++++++++
 kernel/dma/direct.c                           |   2 +-
 kernel/dma/swiotlb.c                          |  30 ++-
 13 files changed, 522 insertions(+), 73 deletions(-)
 create mode 100644 drivers/iommu/intel-trace.c
 create mode 100644 include/trace/events/intel_iommu.h

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* [PATCH v4 1/9] PCI: Add dev_is_untrusted helper
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 2/9] swiotlb: Split size parameter to map/unmap APIs Lu Baolu
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

There are several places in the kernel where it is necessary to
check whether a device is an untrusted PCI device. Add a helper
to simplify the callers.
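
For example, a hypothetical policy helper in a DMA mapping path
could use it like the sketch below (patch 6 adds the real
equivalent, device_needs_bounce(), to the Intel IOMMU driver):

	#include <linux/pci.h>	/* dev_is_untrusted() */

	/* Hypothetical policy helper: only untrusted (e.g. externally
	 * connected) devices pay the bounce buffer cost. */
	static bool need_bounce_buffer(struct device *dev)
	{
		return dev_is_untrusted(dev);
	}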

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/pci.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4a5a84d7bdd4..0582d869923d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -979,6 +979,7 @@ void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
 void pci_sort_breadthfirst(void);
 #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
 #define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
+#define dev_is_untrusted(d) ((dev_is_pci(d) ? to_pci_dev(d)->untrusted : false))
 
 /* Generic PCI functions exported to card drivers */
 
@@ -1707,6 +1708,7 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev) { return NULL; }
 
 #define dev_is_pci(d) (false)
 #define dev_is_pf(d) (false)
+#define dev_is_untrusted(d) (false)
 static inline bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags)
 { return false; }
 static inline int pci_irqd_intx_xlate(struct irq_domain *d,
-- 
2.17.1


* [PATCH v4 2/9] swiotlb: Split size parameter to map/unmap APIs
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 1/9] PCI: Add dev_is_untrusted helper Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device Lu Baolu
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

This splits the size parameter to swiotlb_tbl_map_single() and
swiotlb_tbl_unmap_single() into a mapping_size and an alloc_size
parameter. mapping_size is the number of bytes actually bounced,
while alloc_size is the size of the bounce slot to allocate, which
the bounce page code added later in this series rounds up to the
IOMMU page size.
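
A sketch of the intended calling convention (the aligned_size
computation mirrors the bounce page code added later in this
series; the remaining variables are assumed caller context):

	/*
	 * mapping_size: bytes actually copied to/from the bounce buffer.
	 * alloc_size:   size of the bounce slot, rounded up to the IOMMU
	 *               page size.
	 */
	aligned_size = ALIGN((paddr & (iommu_page_size - 1)) + size,
			     iommu_page_size);

	tlb_addr = swiotlb_tbl_map_single(dev,
			__phys_to_dma(dev, io_tlb_start),
			paddr, size, aligned_size, dir, attrs);

	/* ... later, at unmap time ... */

	swiotlb_tbl_unmap_single(dev, tlb_addr, size, aligned_size,
				 dir, attrs);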

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/xen/swiotlb-xen.c |  8 ++++----
 include/linux/swiotlb.h   |  8 ++++++--
 kernel/dma/direct.c       |  2 +-
 kernel/dma/swiotlb.c      | 24 +++++++++++++-----------
 4 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 5dcb06fe9667..bb734f851d87 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -399,8 +399,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 	 */
 	trace_swiotlb_bounced(dev, dev_addr, size, swiotlb_force);
 
-	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir,
-				     attrs);
+	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys,
+				     size, size, dir, attrs);
 	if (map == DMA_MAPPING_ERROR)
 		return DMA_MAPPING_ERROR;
 
@@ -410,7 +410,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 	 * Ensure that the address returned is DMA'ble
 	 */
 	if (unlikely(!dma_capable(dev, dev_addr, size))) {
-		swiotlb_tbl_unmap_single(dev, map, size, dir,
+		swiotlb_tbl_unmap_single(dev, map, size, size, dir,
 				attrs | DMA_ATTR_SKIP_CPU_SYNC);
 		return DMA_MAPPING_ERROR;
 	}
@@ -446,7 +446,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
 
 	/* NOTE: We use dev_addr here, not paddr! */
 	if (is_xen_swiotlb_buffer(dev_addr))
-		swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
+		swiotlb_tbl_unmap_single(hwdev, paddr, size, size, dir, attrs);
 }
 
 static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 361f62bb4a8e..cde3dc18e21a 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -46,13 +46,17 @@ enum dma_sync_target {
 
 extern phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 					  dma_addr_t tbl_dma_addr,
-					  phys_addr_t phys, size_t size,
+					  phys_addr_t phys,
+					  size_t mapping_size,
+					  size_t alloc_size,
 					  enum dma_data_direction dir,
 					  unsigned long attrs);
 
 extern void swiotlb_tbl_unmap_single(struct device *hwdev,
 				     phys_addr_t tlb_addr,
-				     size_t size, enum dma_data_direction dir,
+				     size_t mapping_size,
+				     size_t alloc_size,
+				     enum dma_data_direction dir,
 				     unsigned long attrs);
 
 extern void swiotlb_tbl_sync_single(struct device *hwdev,
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 2c2772e9702a..c892394bbe51 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -290,7 +290,7 @@ void dma_direct_unmap_page(struct device *dev, dma_addr_t addr,
 		dma_direct_sync_single_for_cpu(dev, addr, size, dir);
 
 	if (unlikely(is_swiotlb_buffer(phys)))
-		swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
+		swiotlb_tbl_unmap_single(dev, phys, size, size, dir, attrs);
 }
 EXPORT_SYMBOL(dma_direct_unmap_page);
 
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 13f0cb080a4d..f956f785645a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -442,7 +442,9 @@ static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
 
 phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 				   dma_addr_t tbl_dma_addr,
-				   phys_addr_t orig_addr, size_t size,
+				   phys_addr_t orig_addr,
+				   size_t mapping_size,
+				   size_t alloc_size,
 				   enum dma_data_direction dir,
 				   unsigned long attrs)
 {
@@ -479,8 +481,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 	 * For mappings greater than or equal to a page, we limit the stride
 	 * (and hence alignment) to a page size.
 	 */
-	nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
-	if (size >= PAGE_SIZE)
+	nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
+	if (alloc_size >= PAGE_SIZE)
 		stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT));
 	else
 		stride = 1;
@@ -545,7 +547,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 	spin_unlock_irqrestore(&io_tlb_lock, flags);
 	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
 		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
-			 size, io_tlb_nslabs, tmp_io_tlb_used);
+			 alloc_size, io_tlb_nslabs, tmp_io_tlb_used);
 	return DMA_MAPPING_ERROR;
 found:
 	io_tlb_used += nslots;
@@ -560,7 +562,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
-		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE);
+		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
 
 	return tlb_addr;
 }
@@ -569,11 +571,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
  * tlb_addr is the physical address of the bounce buffer to unmap.
  */
 void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
-			      size_t size, enum dma_data_direction dir,
-			      unsigned long attrs)
+			      size_t mapping_size, size_t alloc_size,
+			      enum dma_data_direction dir, unsigned long attrs)
 {
 	unsigned long flags;
-	int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
+	int i, count, nslots = ALIGN(alloc_size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
 	int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
 	phys_addr_t orig_addr = io_tlb_orig_addr[index];
 
@@ -583,7 +585,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	if (orig_addr != INVALID_PHYS_ADDR &&
 	    !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
-		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE);
+		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_FROM_DEVICE);
 
 	/*
 	 * Return the buffer to the free list by setting the corresponding
@@ -663,14 +665,14 @@ bool swiotlb_map(struct device *dev, phys_addr_t *phys, dma_addr_t *dma_addr,
 
 	/* Oh well, have to allocate and map a bounce buffer. */
 	*phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
-			*phys, size, dir, attrs);
+			*phys, size, size, dir, attrs);
 	if (*phys == DMA_MAPPING_ERROR)
 		return false;
 
 	/* Ensure that the address returned is DMA'ble */
 	*dma_addr = __phys_to_dma(dev, *phys);
 	if (unlikely(!dma_capable(dev, *dma_addr, size))) {
-		swiotlb_tbl_unmap_single(dev, *phys, size, dir,
+		swiotlb_tbl_unmap_single(dev, *phys, size, size, dir,
 			attrs | DMA_ATTR_SKIP_CPU_SYNC);
 		return false;
 	}
-- 
2.17.1


* [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 1/9] PCI: Add dev_is_untrusted helper Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 2/9] swiotlb: Split size parameter to map/unmap APIs Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-10 15:45   ` Konrad Rzeszutek Wilk
  2019-06-03  1:16 ` [PATCH v4 4/9] iommu: Add bounce page APIs Lu Baolu
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

Zero out the bounce buffer when it is mapped for an untrusted
device. This is necessary to avoid exposing valid kernel data
to a malicious device.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 kernel/dma/swiotlb.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f956f785645a..ed41eb7f6131 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -35,6 +35,7 @@
 #include <linux/scatterlist.h>
 #include <linux/mem_encrypt.h>
 #include <linux/set_memory.h>
+#include <linux/pci.h>
 #ifdef CONFIG_DEBUG_FS
 #include <linux/debugfs.h>
 #endif
@@ -560,6 +561,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 	 */
 	for (i = 0; i < nslots; i++)
 		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
+
+	/* Zero out the bounce buffer if the consumer is untrusted. */
+	if (dev_is_untrusted(hwdev))
+		memset(phys_to_virt(tlb_addr), 0, alloc_size);
+
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
 	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
 		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
-- 
2.17.1


* [PATCH v4 4/9] iommu: Add bounce page APIs
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (2 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-10 15:56   ` Konrad Rzeszutek Wilk
  2019-06-11 12:10   ` Pavel Begunkov
  2019-06-03  1:16 ` [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma Lu Baolu
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: alan.cox, Stefano Stabellini, ashok.raj, Jonathan Corbet,
	pengfei.xu, Ingo Molnar, kevin.tian, Konrad Rzeszutek Wilk,
	Steven Rostedt, Boris Ostrovsky, mika.westerberg, Alan Cox,
	Juergen Gross, Greg Kroah-Hartman, Mika Westerberg, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

IOMMU hardware always uses paging for DMA remapping. The
minimum mapped window is a page size. Device drivers may map
buffers that don't fill a whole IOMMU window. This allows the
device to access possibly unrelated memory, and a malicious
device can exploit this to perform a DMA attack.

This introduces a bounce buffer mechanism for DMA buffers
which don't fill a minimal IOMMU page. It can be used by
various vendor-specific IOMMU drivers as long as the DMA
domain is managed by the generic IOMMU layer. The following
APIs are added:

* iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
  - Map a buffer starting at DMA address @addr in bounce page
    manner. For a buffer that isn't aligned to whole minimal
    IOMMU pages, the bounce page policy is applied: a bounce
    page mapped by swiotlb will be used as the DMA target in
    the IOMMU page table. Otherwise, the physical address
    @paddr is mapped directly.

* iommu_bounce_unmap(dev, addr, size, dir, attrs)
  - Unmap the buffer mapped with iommu_bounce_map(). The bounce
    page will be torn down after the bounced data has been synced.

* iommu_bounce_sync(dev, addr, size, dir, target)
  - Sync the bounced data in case the bounce mapped buffer is
    reused.

All of these APIs are guarded by the kernel option
IOMMU_BOUNCE_PAGE. This is useful for cases where bounce pages
are not needed, for example, embedded systems.
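
A usage note (hypothetical example, not part of this patch): a
vendor IOMMU driver that wants these helpers only needs to add
"select IOMMU_BOUNCE_PAGE" to its own Kconfig entry. When the
option is disabled, the inline stubs in include/linux/iommu.h
make iommu_bounce_map() return DMA_MAPPING_ERROR and turn the
unmap/sync calls into no-ops, so callers don't need any #ifdef.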

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/Kconfig |  14 +++++
 drivers/iommu/iommu.c | 119 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h |  35 +++++++++++++
 3 files changed, 168 insertions(+)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 83664db5221d..d837ec3f359b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -86,6 +86,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
 
 	  If unsure, say N here.
 
+config IOMMU_BOUNCE_PAGE
+	bool "Use bounce page for untrusted devices"
+	depends on IOMMU_API
+	select SWIOTLB
+	help
+	  IOMMU hardware always use paging for DMA remapping. The minimum
+	  mapped window is a page size. The device drivers may map buffers
+	  not filling whole IOMMU window. This allows device to access to
+	  possibly unrelated memory and malicious device can exploit this
+	  to perform a DMA attack. Select this to use a bounce page for the
+	  buffer which doesn't fill a whole IOMMU page.
+
+	  If unsure, say N here.
+
 config OF_IOMMU
        def_bool y
        depends on OF && IOMMU_API
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2a906386bb8e..fa44f681a82b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2246,3 +2246,122 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
 	return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+#ifdef CONFIG_IOMMU_BOUNCE_PAGE
+
+/*
+ * Bounce buffer support for external devices:
+ *
+ * IOMMU hardware always use paging for DMA remapping. The minimum mapped
+ * window is a page size. The device drivers may map buffers not filling
+ * whole IOMMU window. This allows device to access to possibly unrelated
+ * memory and malicious device can exploit this to perform a DMA attack.
+ * Use bounce pages for the buffer which doesn't fill whole IOMMU pages.
+ */
+
+static inline size_t
+get_aligned_size(struct iommu_domain *domain, dma_addr_t addr, size_t size)
+{
+	unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
+	unsigned long offset = page_size - 1;
+
+	return ALIGN((addr & offset) + size, page_size);
+}
+
+dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
+			    phys_addr_t paddr, size_t size,
+			    enum dma_data_direction dir,
+			    unsigned long attrs)
+{
+	struct iommu_domain *domain;
+	unsigned int min_pagesz;
+	phys_addr_t tlb_addr;
+	size_t aligned_size;
+	int prot = 0;
+	int ret;
+
+	domain = iommu_get_dma_domain(dev);
+	if (!domain)
+		return DMA_MAPPING_ERROR;
+
+	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
+		prot |= IOMMU_READ;
+	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
+		prot |= IOMMU_WRITE;
+
+	aligned_size = get_aligned_size(domain, paddr, size);
+	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
+
+	/*
+	 * If both the physical buffer start address and size are
+	 * page aligned, we don't need to use a bounce page.
+	 */
+	if (!IS_ALIGNED(paddr | size, min_pagesz)) {
+		tlb_addr = swiotlb_tbl_map_single(dev,
+				__phys_to_dma(dev, io_tlb_start),
+				paddr, size, aligned_size, dir, attrs);
+		if (tlb_addr == DMA_MAPPING_ERROR)
+			return DMA_MAPPING_ERROR;
+	} else {
+		tlb_addr = paddr;
+	}
+
+	ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
+	if (ret) {
+		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
+					 aligned_size, dir, attrs);
+		return DMA_MAPPING_ERROR;
+	}
+
+	return iova;
+}
+EXPORT_SYMBOL_GPL(iommu_bounce_map);
+
+static inline phys_addr_t
+iova_to_tlb_addr(struct iommu_domain *domain, dma_addr_t addr)
+{
+	if (unlikely(!domain->ops || !domain->ops->iova_to_phys))
+		return 0;
+
+	return domain->ops->iova_to_phys(domain, addr);
+}
+
+void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
+			enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain;
+	phys_addr_t tlb_addr;
+	size_t aligned_size;
+
+	domain = iommu_get_dma_domain(dev);
+	if (WARN_ON(!domain))
+		return;
+
+	aligned_size = get_aligned_size(domain, iova, size);
+	tlb_addr = iova_to_tlb_addr(domain, iova);
+	if (WARN_ON(!tlb_addr))
+		return;
+
+	iommu_unmap(domain, iova, aligned_size);
+	if (is_swiotlb_buffer(tlb_addr))
+		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
+					 aligned_size, dir, attrs);
+}
+EXPORT_SYMBOL_GPL(iommu_bounce_unmap);
+
+void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
+		       enum dma_data_direction dir, enum dma_sync_target target)
+{
+	struct iommu_domain *domain;
+	phys_addr_t tlb_addr;
+
+	domain = iommu_get_dma_domain(dev);
+	if (WARN_ON(!domain))
+		return;
+
+	tlb_addr = iova_to_tlb_addr(domain, addr);
+	if (is_swiotlb_buffer(tlb_addr))
+		swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
+}
+EXPORT_SYMBOL_GPL(iommu_bounce_sync);
+#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 91af22a344e2..814c0da64692 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -25,6 +25,8 @@
 #include <linux/errno.h>
 #include <linux/err.h>
 #include <linux/of.h>
+#include <linux/swiotlb.h>
+#include <linux/dma-direct.h>
 
 #define IOMMU_READ	(1 << 0)
 #define IOMMU_WRITE	(1 << 1)
@@ -499,6 +501,39 @@ int iommu_sva_set_ops(struct iommu_sva *handle,
 		      const struct iommu_sva_ops *ops);
 int iommu_sva_get_pasid(struct iommu_sva *handle);
 
+#ifdef CONFIG_IOMMU_BOUNCE_PAGE
+dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
+			    phys_addr_t paddr, size_t size,
+			    enum dma_data_direction dir,
+			    unsigned long attrs);
+void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
+			enum dma_data_direction dir, unsigned long attrs);
+void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
+		       enum dma_data_direction dir,
+		       enum dma_sync_target target);
+#else
+static inline
+dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
+			    phys_addr_t paddr, size_t size,
+			    enum dma_data_direction dir,
+			    unsigned long attrs)
+{
+	return DMA_MAPPING_ERROR;
+}
+
+static inline
+void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
+			enum dma_data_direction dir, unsigned long attrs)
+{
+}
+
+static inline
+void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
+		       enum dma_data_direction dir, enum dma_sync_target target)
+{
+}
+#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
-- 
2.17.1


* [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (3 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 4/9] iommu: Add bounce page APIs Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-10 15:54   ` Konrad Rzeszutek Wilk
  2019-06-03  1:16 ` [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer Lu Baolu
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg, Mika Westerberg

The direct DMA implementation depends on swiotlb. Hence, don't
switch off swiotlb, since the direct DMA interfaces are used in
this driver.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d5a6c8064c56..235837c50719 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4625,9 +4625,6 @@ static int __init platform_optin_force_iommu(void)
 		iommu_identity_mapping |= IDENTMAP_ALL;
 
 	dmar_disabled = 0;
-#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
-	swiotlb = 0;
-#endif
 	no_iommu = 0;
 
 	return 1;
@@ -4765,9 +4762,6 @@ int __init intel_iommu_init(void)
 	}
 	up_write(&dmar_global_lock);
 
-#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
-	swiotlb = 0;
-#endif
 	dma_ops = &intel_dma_ops;
 
 	init_iommu_pm_ops();
-- 
2.17.1


* [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (4 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-10 16:08   ` Konrad Rzeszutek Wilk
  2019-06-03  1:16 ` [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap Lu Baolu
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

This adds a helper to check whether a device needs to use a
bounce buffer. It also provides a boot time option to disable
the bounce buffer. Users can use this to prevent the IOMMU
driver from using the bounce buffer in exchange for a
performance gain.
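
As a usage example, the new option is parsed together with the
other intel_iommu= switches, so booting with "intel_iommu=nobounce"
(or combined with other options, e.g. "intel_iommu=on,nobounce")
keeps even untrusted devices on the normal, non-bounced mapping
path.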

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Xu Pengfei <pengfei.xu@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 5 +++++
 drivers/iommu/intel-iommu.c                     | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 138f6664b2e2..65685c6e53e4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1728,6 +1728,11 @@
 			Note that using this option lowers the security
 			provided by tboot because it makes the system
 			vulnerable to DMA attacks.
+		nobounce [Default off]
+			Do not use the bounce buffer for untrusted devices like
+			the Thunderbolt devices. This will treat the untrusted
+			devices as the trusted ones, hence might expose security
+			risks of DMA attacks.
 
 	intel_idle.max_cstate=	[KNL,HW,ACPI,X86]
 			0	disables intel_idle and fall back on acpi_idle.
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 235837c50719..41439647f75d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -371,6 +371,7 @@ static int dmar_forcedac;
 static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
 static int iommu_identity_mapping;
+static int intel_no_bounce;
 
 #define IDENTMAP_ALL		1
 #define IDENTMAP_GFX		2
@@ -384,6 +385,8 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
 static DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
+#define device_needs_bounce(d) (!intel_no_bounce && dev_is_untrusted(d))
+
 /*
  * Iterate over elements in device_domain_list and call the specified
  * callback @fn against each element.
@@ -466,6 +469,9 @@ static int __init intel_iommu_setup(char *str)
 			printk(KERN_INFO
 				"Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n");
 			intel_iommu_tboot_noforce = 1;
+		} else if (!strncmp(str, "nobounce", 8)) {
+			pr_info("Intel-IOMMU: No bounce buffer. This could expose security risks of DMA attacks\n");
+			intel_no_bounce = 1;
 		}
 
 		str += strcspn(str, ",");
-- 
2.17.1


* [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (5 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-04  9:01   ` Steven Rostedt
  2019-06-10 16:08   ` Konrad Rzeszutek Wilk
  2019-06-03  1:16 ` [PATCH v4 8/9] iommu/vt-d: Code refactoring for bounce map and unmap Lu Baolu
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

This adds trace support for the Intel IOMMU driver. It also
declares some events which can be used to trace when an IOVA
is being mapped or unmapped in a domain.
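
Once applied, the events land under the "intel_iommu" trace system
and can be enabled at run time in the usual way, e.g. with
"echo 1 > /sys/kernel/debug/tracing/events/intel_iommu/enable"
(the path assumes the classic debugfs tracing mount), and then read
back from trace_pipe while exercising an untrusted device.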

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/Makefile             |   1 +
 drivers/iommu/intel-trace.c        |  14 +++
 include/trace/events/intel_iommu.h | 132 +++++++++++++++++++++++++++++
 3 files changed, 147 insertions(+)
 create mode 100644 drivers/iommu/intel-trace.c
 create mode 100644 include/trace/events/intel_iommu.h

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 8c71a15e986b..8b5fb8051281 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
+obj-$(CONFIG_INTEL_IOMMU) += intel-trace.o
 obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += intel-iommu-debugfs.o
 obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
 obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
diff --git a/drivers/iommu/intel-trace.c b/drivers/iommu/intel-trace.c
new file mode 100644
index 000000000000..bfb6a6e37a88
--- /dev/null
+++ b/drivers/iommu/intel-trace.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel IOMMU trace support
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author: Lu Baolu <baolu.lu@linux.intel.com>
+ */
+
+#include <linux/string.h>
+#include <linux/types.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/intel_iommu.h>
diff --git a/include/trace/events/intel_iommu.h b/include/trace/events/intel_iommu.h
new file mode 100644
index 000000000000..49ca57a90079
--- /dev/null
+++ b/include/trace/events/intel_iommu.h
@@ -0,0 +1,132 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Intel IOMMU trace support
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author: Lu Baolu <baolu.lu@linux.intel.com>
+ */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM intel_iommu
+
+#if !defined(_TRACE_INTEL_IOMMU_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_INTEL_IOMMU_H
+
+#include <linux/tracepoint.h>
+#include <linux/intel-iommu.h>
+
+TRACE_EVENT(bounce_map_single,
+	TP_PROTO(struct device *dev, dma_addr_t dev_addr, phys_addr_t phys_addr,
+		 size_t size),
+
+	TP_ARGS(dev, dev_addr, phys_addr, size),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(dma_addr_t, dev_addr)
+		__field(phys_addr_t, phys_addr)
+		__field(size_t,	size)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->dev_addr = dev_addr;
+		__entry->phys_addr = phys_addr;
+		__entry->size = size;
+	),
+
+	TP_printk("dev=%s dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+		  __get_str(dev_name),
+		  (unsigned long long)__entry->dev_addr,
+		  (unsigned long long)__entry->phys_addr,
+		  __entry->size)
+);
+
+TRACE_EVENT(bounce_unmap_single,
+	TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
+
+	TP_ARGS(dev, dev_addr, size),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(dma_addr_t, dev_addr)
+		__field(size_t,	size)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->dev_addr = dev_addr;
+		__entry->size = size;
+	),
+
+	TP_printk("dev=%s dev_addr=0x%llx size=%zu",
+		  __get_str(dev_name),
+		  (unsigned long long)__entry->dev_addr,
+		  __entry->size)
+);
+
+TRACE_EVENT(bounce_map_sg,
+	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
+		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
+
+	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(unsigned int, i)
+		__field(unsigned int, last)
+		__field(dma_addr_t, dev_addr)
+		__field(phys_addr_t, phys_addr)
+		__field(size_t,	size)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->i = i;
+		__entry->last = nelems - 1;
+		__entry->dev_addr = dev_addr;
+		__entry->phys_addr = phys_addr;
+		__entry->size = size;
+	),
+
+	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+		  __get_str(dev_name), __entry->i, __entry->last,
+		  (unsigned long long)__entry->dev_addr,
+		  (unsigned long long)__entry->phys_addr,
+		  __entry->size)
+);
+
+TRACE_EVENT(bounce_unmap_sg,
+	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
+		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
+
+	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(unsigned int, i)
+		__field(unsigned int, last)
+		__field(dma_addr_t, dev_addr)
+		__field(phys_addr_t, phys_addr)
+		__field(size_t,	size)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->i = i;
+		__entry->last = nelems - 1;
+		__entry->dev_addr = dev_addr;
+		__entry->phys_addr = phys_addr;
+		__entry->size = size;
+	),
+
+	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
+		  __get_str(dev_name), __entry->i, __entry->last,
+		  (unsigned long long)__entry->dev_addr,
+		  (unsigned long long)__entry->phys_addr,
+		  __entry->size)
+);
+#endif /* _TRACE_INTEL_IOMMU_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.17.1


* [PATCH v4 8/9] iommu/vt-d: Code refactoring for bounce map and unmap
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (6 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-03  1:16 ` [PATCH v4 9/9] iommu/vt-d: Use bounce buffer for untrusted devices Lu Baolu
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

In order to make it ready for calling iommu_bounce_map() and
iommu_bounce_unmap() in __intel_map_single() and intel_unmap(),
we need to do some code refactoring.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel-iommu.c | 89 ++++++++++++++++++++-----------------
 1 file changed, 49 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 41439647f75d..2f54734d1c43 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3510,16 +3510,19 @@ static bool iommu_need_mapping(struct device *dev)
 	return true;
 }
 
-static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
-				     size_t size, int dir, u64 dma_mask)
+static dma_addr_t
+__intel_map_single(struct device *dev, phys_addr_t paddr, size_t size,
+		   enum dma_data_direction dir, unsigned long attrs,
+		   u64 dma_mask)
 {
+	unsigned long paddr_pfn = paddr >> PAGE_SHIFT;
 	struct dmar_domain *domain;
+	struct intel_iommu *iommu;
 	phys_addr_t start_paddr;
 	unsigned long iova_pfn;
+	unsigned long nrpages;
 	int prot = 0;
 	int ret;
-	struct intel_iommu *iommu;
-	unsigned long paddr_pfn = paddr >> PAGE_SHIFT;
 
 	BUG_ON(dir == DMA_NONE);
 
@@ -3528,9 +3531,9 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 		return DMA_MAPPING_ERROR;
 
 	iommu = domain_get_iommu(domain);
-	size = aligned_nrpages(paddr, size);
-
-	iova_pfn = intel_alloc_iova(dev, domain, dma_to_mm_pfn(size), dma_mask);
+	nrpages = aligned_nrpages(paddr, size);
+	iova_pfn = intel_alloc_iova(dev, domain,
+				    dma_to_mm_pfn(nrpages), dma_mask);
 	if (!iova_pfn)
 		goto error;
 
@@ -3550,7 +3553,7 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 	 * is not a big problem
 	 */
 	ret = domain_pfn_mapping(domain, mm_to_dma_pfn(iova_pfn),
-				 mm_to_dma_pfn(paddr_pfn), size, prot);
+				 mm_to_dma_pfn(paddr_pfn), nrpages, prot);
 	if (ret)
 		goto error;
 
@@ -3560,7 +3563,7 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 
 error:
 	if (iova_pfn)
-		free_iova_fast(&domain->iovad, iova_pfn, dma_to_mm_pfn(size));
+		free_iova_fast(&domain->iovad, iova_pfn, dma_to_mm_pfn(nrpages));
 	dev_err(dev, "Device request: %zx@%llx dir %d --- failed\n",
 		size, (unsigned long long)paddr, dir);
 	return DMA_MAPPING_ERROR;
@@ -3573,7 +3576,7 @@ static dma_addr_t intel_map_page(struct device *dev, struct page *page,
 {
 	if (iommu_need_mapping(dev))
 		return __intel_map_single(dev, page_to_phys(page) + offset,
-				size, dir, *dev->dma_mask);
+					  size, dir, attrs, *dev->dma_mask);
 	return dma_direct_map_page(dev, page, offset, size, dir, attrs);
 }
 
@@ -3582,38 +3585,53 @@ static dma_addr_t intel_map_resource(struct device *dev, phys_addr_t phys_addr,
 				     unsigned long attrs)
 {
 	if (iommu_need_mapping(dev))
-		return __intel_map_single(dev, phys_addr, size, dir,
-				*dev->dma_mask);
+		return __intel_map_single(dev, phys_addr,
+					  size, dir, attrs, *dev->dma_mask);
 	return dma_direct_map_resource(dev, phys_addr, size, dir, attrs);
 }
 
-static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
+static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size,
+			struct scatterlist *sglist, int nelems,
+			enum dma_data_direction dir, unsigned long attrs)
 {
-	struct dmar_domain *domain;
 	unsigned long start_pfn, last_pfn;
-	unsigned long nrpages;
-	unsigned long iova_pfn;
-	struct intel_iommu *iommu;
-	struct page *freelist;
+	struct page *freelist = NULL;
 	struct pci_dev *pdev = NULL;
+	struct dmar_domain *domain;
+	unsigned long nrpages = 0;
+	struct intel_iommu *iommu;
+	unsigned long iova_pfn;
 
 	domain = find_domain(dev);
 	BUG_ON(!domain);
 
 	iommu = domain_get_iommu(domain);
 
-	iova_pfn = IOVA_PFN(dev_addr);
+	if (sglist) {
+		struct scatterlist *sg;
+		int i;
 
-	nrpages = aligned_nrpages(dev_addr, size);
-	start_pfn = mm_to_dma_pfn(iova_pfn);
-	last_pfn = start_pfn + nrpages - 1;
+		dev_addr = sg_dma_address(sglist) & PAGE_MASK;
+		iova_pfn = IOVA_PFN(dev_addr);
+		for_each_sg(sglist, sg, nelems, i) {
+			nrpages += aligned_nrpages(sg_dma_address(sg),
+			sg_dma_len(sg));
+		}
+		start_pfn = mm_to_dma_pfn(iova_pfn);
+		last_pfn = start_pfn + nrpages - 1;
 
-	if (dev_is_pci(dev))
-		pdev = to_pci_dev(dev);
+		freelist = domain_unmap(domain, start_pfn, last_pfn);
+	} else {
+		iova_pfn = IOVA_PFN(dev_addr);
+		nrpages = aligned_nrpages(dev_addr, size);
+		start_pfn = mm_to_dma_pfn(iova_pfn);
+		last_pfn = start_pfn + nrpages - 1;
 
-	dev_dbg(dev, "Device unmapping: pfn %lx-%lx\n", start_pfn, last_pfn);
+		freelist = domain_unmap(domain, start_pfn, last_pfn);
+	}
 
-	freelist = domain_unmap(domain, start_pfn, last_pfn);
+	if (dev_is_pci(dev))
+		pdev = to_pci_dev(dev);
 
 	if (intel_iommu_strict || (pdev && pdev->untrusted)) {
 		iommu_flush_iotlb_psi(iommu, domain, start_pfn,
@@ -3636,7 +3654,7 @@ static void intel_unmap_page(struct device *dev, dma_addr_t dev_addr,
 			     unsigned long attrs)
 {
 	if (iommu_need_mapping(dev))
-		intel_unmap(dev, dev_addr, size);
+		intel_unmap(dev, dev_addr, size, NULL, 0, dir, attrs);
 	else
 		dma_direct_unmap_page(dev, dev_addr, size, dir, attrs);
 }
@@ -3645,7 +3663,7 @@ static void intel_unmap_resource(struct device *dev, dma_addr_t dev_addr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	if (iommu_need_mapping(dev))
-		intel_unmap(dev, dev_addr, size);
+		intel_unmap(dev, dev_addr, size, NULL, 0, dir, attrs);
 }
 
 static void *intel_alloc_coherent(struct device *dev, size_t size,
@@ -3675,7 +3693,7 @@ static void *intel_alloc_coherent(struct device *dev, size_t size,
 	memset(page_address(page), 0, size);
 
 	*dma_handle = __intel_map_single(dev, page_to_phys(page), size,
-					 DMA_BIDIRECTIONAL,
+					 DMA_BIDIRECTIONAL, attrs,
 					 dev->coherent_dma_mask);
 	if (*dma_handle != DMA_MAPPING_ERROR)
 		return page_address(page);
@@ -3697,7 +3715,7 @@ static void intel_free_coherent(struct device *dev, size_t size, void *vaddr,
 	size = PAGE_ALIGN(size);
 	order = get_order(size);
 
-	intel_unmap(dev, dma_handle, size);
+	intel_unmap(dev, dma_handle, size, NULL, 0, 0, attrs);
 	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
 		__free_pages(page, order);
 }
@@ -3706,19 +3724,10 @@ static void intel_unmap_sg(struct device *dev, struct scatterlist *sglist,
 			   int nelems, enum dma_data_direction dir,
 			   unsigned long attrs)
 {
-	dma_addr_t startaddr = sg_dma_address(sglist) & PAGE_MASK;
-	unsigned long nrpages = 0;
-	struct scatterlist *sg;
-	int i;
-
 	if (!iommu_need_mapping(dev))
 		return dma_direct_unmap_sg(dev, sglist, nelems, dir, attrs);
 
-	for_each_sg(sglist, sg, nelems, i) {
-		nrpages += aligned_nrpages(sg_dma_address(sg), sg_dma_len(sg));
-	}
-
-	intel_unmap(dev, startaddr, nrpages << VTD_PAGE_SHIFT);
+	intel_unmap(dev, 0, 0, sglist, nelems, dir, attrs);
 }
 
 static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nelems,
-- 
2.17.1


* [PATCH v4 9/9] iommu/vt-d: Use bounce buffer for untrusted devices
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (7 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 8/9] iommu/vt-d: Code refactoring for bounce map and unmap Lu Baolu
@ 2019-06-03  1:16 ` Lu Baolu
  2019-06-10 15:42 ` [PATCH v4 0/9] iommu: Bounce page " Konrad Rzeszutek Wilk
  2019-06-10 16:10 ` Konrad Rzeszutek Wilk
  10 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-03  1:16 UTC (permalink / raw)
  To: David Woodhouse, Joerg Roedel, Bjorn Helgaas, Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky,
	mika.westerberg

The Intel VT-d hardware uses paging for DMA remapping.
The minimum mapped window is a page size. The device
drivers may map buffers that do not fill the whole IOMMU
window. This allows the device to access possibly
unrelated memory and a malicious device could exploit
this to perform DMA attacks. To address this, the
Intel IOMMU driver will use bounce pages for those
buffers which don't fill whole IOMMU pages.
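
One way to observe the new behaviour (assuming the trace events
from patch 7 are enabled) is that every DMA mapping for an
untrusted device now goes through iommu_bounce_map() and
iommu_bounce_unmap() and fires the bounce_map_*/bounce_unmap_*
events; only buffers that aren't aligned to whole IOMMU pages are
actually redirected to swiotlb bounce pages, while page-aligned
buffers are still mapped to their original physical memory.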

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Xu Pengfei <pengfei.xu@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@intel.com>
---
 drivers/iommu/intel-iommu.c | 128 ++++++++++++++++++++++++++++++++----
 1 file changed, 117 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2f54734d1c43..4bf744a1c239 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -52,6 +52,7 @@
 #include <asm/irq_remapping.h>
 #include <asm/cacheflush.h>
 #include <asm/iommu.h>
+#include <trace/events/intel_iommu.h>
 
 #include "irq_remapping.h"
 #include "intel-pasid.h"
@@ -3537,6 +3538,19 @@ __intel_map_single(struct device *dev, phys_addr_t paddr, size_t size,
 	if (!iova_pfn)
 		goto error;
 
+	if (device_needs_bounce(dev)) {
+		dma_addr_t ret_addr;
+
+		ret_addr = iommu_bounce_map(dev, iova_pfn << PAGE_SHIFT,
+					    paddr, size, dir, attrs);
+		if (ret_addr == DMA_MAPPING_ERROR)
+			goto error;
+		trace_bounce_map_single(dev, iova_pfn << PAGE_SHIFT,
+					paddr, size);
+
+		return ret_addr;
+	}
+
 	/*
 	 * Check if DMAR supports zero-length reads on write only
 	 * mappings..
@@ -3620,14 +3634,28 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size,
 		start_pfn = mm_to_dma_pfn(iova_pfn);
 		last_pfn = start_pfn + nrpages - 1;
 
-		freelist = domain_unmap(domain, start_pfn, last_pfn);
+		if (device_needs_bounce(dev))
+			for_each_sg(sglist, sg, nelems, i) {
+				iommu_bounce_unmap(dev, sg_dma_address(sg),
+						   sg->length, dir, attrs);
+				trace_bounce_unmap_sg(dev, i, nelems,
+						      sg_dma_address(sg),
+						      sg_phys(sg), sg->length);
+			}
+		else
+			freelist = domain_unmap(domain, start_pfn, last_pfn);
 	} else {
 		iova_pfn = IOVA_PFN(dev_addr);
 		nrpages = aligned_nrpages(dev_addr, size);
 		start_pfn = mm_to_dma_pfn(iova_pfn);
 		last_pfn = start_pfn + nrpages - 1;
 
-		freelist = domain_unmap(domain, start_pfn, last_pfn);
+		if (device_needs_bounce(dev)) {
+			iommu_bounce_unmap(dev, dev_addr, size, dir, attrs);
+			trace_bounce_unmap_single(dev, dev_addr, size);
+		} else {
+			freelist = domain_unmap(domain, start_pfn, last_pfn);
+		}
 	}
 
 	if (dev_is_pci(dev))
@@ -3774,6 +3802,26 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 		prot |= DMA_PTE_WRITE;
 
 	start_vpfn = mm_to_dma_pfn(iova_pfn);
+	if (device_needs_bounce(dev)) {
+		for_each_sg(sglist, sg, nelems, i) {
+			dma_addr_t ret_addr;
+
+			ret_addr = iommu_bounce_map(dev,
+					start_vpfn << VTD_PAGE_SHIFT,
+					sg_phys(sg), sg->length, dir, attrs);
+			if (ret_addr == DMA_MAPPING_ERROR)
+				break;
+
+			trace_bounce_map_sg(dev, i, nelems, ret_addr,
+					    sg_phys(sg), sg->length);
+
+			sg->dma_address = ret_addr;
+			sg->dma_length = sg->length;
+			start_vpfn += aligned_nrpages(sg->offset, sg->length);
+		}
+
+		return i;
+	}
 
 	ret = domain_sg_mapping(domain, start_vpfn, sglist, size, prot);
 	if (unlikely(ret)) {
@@ -3787,16 +3835,74 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
 	return nelems;
 }
 
+static void
+intel_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
+			  size_t size, enum dma_data_direction dir)
+{
+	if (!iommu_need_mapping(dev))
+		dma_direct_sync_single_for_cpu(dev, addr, size, dir);
+
+	if (device_needs_bounce(dev))
+		iommu_bounce_sync(dev, addr, size, dir, SYNC_FOR_CPU);
+}
+
+static void
+intel_sync_single_for_device(struct device *dev, dma_addr_t addr,
+			     size_t size, enum dma_data_direction dir)
+{
+	if (!iommu_need_mapping(dev))
+		dma_direct_sync_single_for_device(dev, addr, size, dir);
+
+	if (device_needs_bounce(dev))
+		iommu_bounce_sync(dev, addr, size, dir, SYNC_FOR_DEVICE);
+}
+
+static void
+intel_sync_sg_for_cpu(struct device *dev, struct scatterlist *sglist,
+		      int nelems, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (!iommu_need_mapping(dev))
+		dma_direct_sync_sg_for_cpu(dev, sglist, nelems, dir);
+
+	if (device_needs_bounce(dev))
+		for_each_sg(sglist, sg, nelems, i)
+			iommu_bounce_sync(dev, sg_dma_address(sg),
+					  sg_dma_len(sg), dir, SYNC_FOR_CPU);
+}
+
+static void
+intel_sync_sg_for_device(struct device *dev, struct scatterlist *sglist,
+			 int nelems, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (!iommu_need_mapping(dev))
+		dma_direct_sync_sg_for_device(dev, sglist, nelems, dir);
+
+	if (device_needs_bounce(dev))
+		for_each_sg(sglist, sg, nelems, i)
+			iommu_bounce_sync(dev, sg_dma_address(sg),
+					  sg_dma_len(sg), dir, SYNC_FOR_DEVICE);
+}
+
 static const struct dma_map_ops intel_dma_ops = {
-	.alloc = intel_alloc_coherent,
-	.free = intel_free_coherent,
-	.map_sg = intel_map_sg,
-	.unmap_sg = intel_unmap_sg,
-	.map_page = intel_map_page,
-	.unmap_page = intel_unmap_page,
-	.map_resource = intel_map_resource,
-	.unmap_resource = intel_unmap_resource,
-	.dma_supported = dma_direct_supported,
+	.alloc			= intel_alloc_coherent,
+	.free			= intel_free_coherent,
+	.map_sg			= intel_map_sg,
+	.unmap_sg		= intel_unmap_sg,
+	.map_page		= intel_map_page,
+	.unmap_page		= intel_unmap_page,
+	.sync_single_for_cpu	= intel_sync_single_for_cpu,
+	.sync_single_for_device	= intel_sync_single_for_device,
+	.sync_sg_for_cpu	= intel_sync_sg_for_cpu,
+	.sync_sg_for_device	= intel_sync_sg_for_device,
+	.map_resource		= intel_map_resource,
+	.unmap_resource		= intel_unmap_resource,
+	.dma_supported		= dma_direct_supported,
 };
 
 static inline int iommu_domain_cache_init(void)
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap
  2019-06-03  1:16 ` [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap Lu Baolu
@ 2019-06-04  9:01   ` Steven Rostedt
  2019-06-05  6:48     ` Lu Baolu
  2019-06-10 16:08   ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 31+ messages in thread
From: Steven Rostedt @ 2019-06-04  9:01 UTC (permalink / raw)
  To: Lu Baolu
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Konrad Rzeszutek Wilk, Bjorn Helgaas,
	Boris Ostrovsky, mika.westerberg, Juergen Gross,
	Greg Kroah-Hartman, linux-kernel, iommu, jacob.jun.pan,
	Robin Murphy

On Mon,  3 Jun 2019 09:16:18 +0800
Lu Baolu <baolu.lu@linux.intel.com> wrote:


> +TRACE_EVENT(bounce_unmap_single,
> +	TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
> +
> +	TP_ARGS(dev, dev_addr, size),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name(dev))
> +		__field(dma_addr_t, dev_addr)
> +		__field(size_t,	size)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name(dev));
> +		__entry->dev_addr = dev_addr;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("dev=%s dev_addr=0x%llx size=%zu",
> +		  __get_str(dev_name),
> +		  (unsigned long long)__entry->dev_addr,
> +		  __entry->size)
> +);
> +
> +TRACE_EVENT(bounce_map_sg,
> +	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
> +		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
> +
> +	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name(dev))
> +		__field(unsigned int, i)
> +		__field(unsigned int, last)
> +		__field(dma_addr_t, dev_addr)
> +		__field(phys_addr_t, phys_addr)
> +		__field(size_t,	size)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name(dev));
> +		__entry->i = i;
> +		__entry->last = nelems - 1;
> +		__entry->dev_addr = dev_addr;
> +		__entry->phys_addr = phys_addr;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
> +		  __get_str(dev_name), __entry->i, __entry->last,
> +		  (unsigned long long)__entry->dev_addr,
> +		  (unsigned long long)__entry->phys_addr,
> +		  __entry->size)
> +);
> +
> +TRACE_EVENT(bounce_unmap_sg,
> +	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
> +		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
> +
> +	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name(dev))
> +		__field(unsigned int, i)
> +		__field(unsigned int, last)
> +		__field(dma_addr_t, dev_addr)
> +		__field(phys_addr_t, phys_addr)
> +		__field(size_t,	size)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name(dev));
> +		__entry->i = i;
> +		__entry->last = nelems - 1;
> +		__entry->dev_addr = dev_addr;
> +		__entry->phys_addr = phys_addr;
> +		__entry->size = size;
> +	),
> +
> +	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
> +		  __get_str(dev_name), __entry->i, __entry->last,
> +		  (unsigned long long)__entry->dev_addr,
> +		  (unsigned long long)__entry->phys_addr,
> +		  __entry->size)
> +);

These last two events look identical. Please use the
DECLARE_EVENT_CLASS() to describe the event and then DEFINE_EVENT() for
the two events.

Each TRACE_EVENT() can add up to 5k of data/text, where as a
DEFINE_EVENT() just adds around 250 bytes.

(Note, a TRACE_EVENT() is defined as a
DECLARE_EVENT_CLASS()/DEFINE_EVENT() pair)
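Roughly like the below for the two sg events (just a sketch; 'bounce_sg_evt'
is an arbitrary class name, and the class body is the same one you already
have for bounce_map_sg):

DECLARE_EVENT_CLASS(bounce_sg_evt,
	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),

	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),

	TP_STRUCT__entry(
		__string(dev_name, dev_name(dev))
		__field(unsigned int, i)
		__field(unsigned int, last)
		__field(dma_addr_t, dev_addr)
		__field(phys_addr_t, phys_addr)
		__field(size_t, size)
	),

	TP_fast_assign(
		__assign_str(dev_name, dev_name(dev));
		__entry->i = i;
		__entry->last = nelems - 1;
		__entry->dev_addr = dev_addr;
		__entry->phys_addr = phys_addr;
		__entry->size = size;
	),

	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
		  __get_str(dev_name), __entry->i, __entry->last,
		  (unsigned long long)__entry->dev_addr,
		  (unsigned long long)__entry->phys_addr,
		  __entry->size)
);

DEFINE_EVENT(bounce_sg_evt, bounce_map_sg,
	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size)
);

DEFINE_EVENT(bounce_sg_evt, bounce_unmap_sg,
	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size)
);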

-- Steve


> +#endif /* _TRACE_INTEL_IOMMU_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap
  2019-06-04  9:01   ` Steven Rostedt
@ 2019-06-05  6:48     ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-05  6:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Konrad Rzeszutek Wilk, Bjorn Helgaas,
	Boris Ostrovsky, mika.westerberg, Juergen Gross,
	Greg Kroah-Hartman, linux-kernel, iommu, jacob.jun.pan,
	Robin Murphy

Hi Steve,

On 6/4/19 5:01 PM, Steven Rostedt wrote:
> On Mon,  3 Jun 2019 09:16:18 +0800
> Lu Baolu <baolu.lu@linux.intel.com> wrote:
> 
> 
>> +TRACE_EVENT(bounce_unmap_single,
>> +	TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),
>> +
>> +	TP_ARGS(dev, dev_addr, size),
>> +
>> +	TP_STRUCT__entry(
>> +		__string(dev_name, dev_name(dev))
>> +		__field(dma_addr_t, dev_addr)
>> +		__field(size_t,	size)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__assign_str(dev_name, dev_name(dev));
>> +		__entry->dev_addr = dev_addr;
>> +		__entry->size = size;
>> +	),
>> +
>> +	TP_printk("dev=%s dev_addr=0x%llx size=%zu",
>> +		  __get_str(dev_name),
>> +		  (unsigned long long)__entry->dev_addr,
>> +		  __entry->size)
>> +);
>> +
>> +TRACE_EVENT(bounce_map_sg,
>> +	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
>> +		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
>> +
>> +	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),
>> +
>> +	TP_STRUCT__entry(
>> +		__string(dev_name, dev_name(dev))
>> +		__field(unsigned int, i)
>> +		__field(unsigned int, last)
>> +		__field(dma_addr_t, dev_addr)
>> +		__field(phys_addr_t, phys_addr)
>> +		__field(size_t,	size)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__assign_str(dev_name, dev_name(dev));
>> +		__entry->i = i;
>> +		__entry->last = nelems - 1;
>> +		__entry->dev_addr = dev_addr;
>> +		__entry->phys_addr = phys_addr;
>> +		__entry->size = size;
>> +	),
>> +
>> +	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
>> +		  __get_str(dev_name), __entry->i, __entry->last,
>> +		  (unsigned long long)__entry->dev_addr,
>> +		  (unsigned long long)__entry->phys_addr,
>> +		  __entry->size)
>> +);
>> +
>> +TRACE_EVENT(bounce_unmap_sg,
>> +	TP_PROTO(struct device *dev, unsigned int i, unsigned int nelems,
>> +		 dma_addr_t dev_addr, phys_addr_t phys_addr, size_t size),
>> +
>> +	TP_ARGS(dev, i, nelems, dev_addr, phys_addr, size),
>> +
>> +	TP_STRUCT__entry(
>> +		__string(dev_name, dev_name(dev))
>> +		__field(unsigned int, i)
>> +		__field(unsigned int, last)
>> +		__field(dma_addr_t, dev_addr)
>> +		__field(phys_addr_t, phys_addr)
>> +		__field(size_t,	size)
>> +	),
>> +
>> +	TP_fast_assign(
>> +		__assign_str(dev_name, dev_name(dev));
>> +		__entry->i = i;
>> +		__entry->last = nelems - 1;
>> +		__entry->dev_addr = dev_addr;
>> +		__entry->phys_addr = phys_addr;
>> +		__entry->size = size;
>> +	),
>> +
>> +	TP_printk("dev=%s elem=%u/%u dev_addr=0x%llx phys_addr=0x%llx size=%zu",
>> +		  __get_str(dev_name), __entry->i, __entry->last,
>> +		  (unsigned long long)__entry->dev_addr,
>> +		  (unsigned long long)__entry->phys_addr,
>> +		  __entry->size)
>> +);
> 
> These last two events look identical. Please use the
> DECLARE_EVENT_CLASS() to describe the event and then DEFINE_EVENT() for
> the two events.
> 
> Each TRACE_EVENT() can add up to 5k of data/text, where as a
> DEFINE_EVENT() just adds around 250 bytes.
> 
> (Note, a TRACE_EVENT() is defined as a
> DECLARE_EVENT_CLASS()/DEFINE_EVENT() pair)

Thanks for reviewing. I will rework this patch according to your
comments here.

> 
> -- Steve
> 

Best regards,
Baolu

> 
>> +#endif /* _TRACE_INTEL_IOMMU_H */
>> +
>> +/* This part must be outside protection */
>> +#include <trace/define_trace.h>
> 
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (8 preceding siblings ...)
  2019-06-03  1:16 ` [PATCH v4 9/9] iommu/vt-d: Use bounce buffer for untrusted devices Lu Baolu
@ 2019-06-10 15:42 ` Konrad Rzeszutek Wilk
  2019-06-12  3:00   ` Lu Baolu
  2019-06-10 16:10 ` Konrad Rzeszutek Wilk
  10 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 15:42 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, mika.westerberg,
	ashok.raj, Jonathan Corbet, alan.cox, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Bjorn Helgaas,
	Boris Ostrovsky, David Woodhouse, Christoph Hellwig

On Mon, Jun 03, 2019 at 09:16:11AM +0800, Lu Baolu wrote:
> The Thunderbolt vulnerabilities are public and have a nice
> name as Thunderclap [1] [3] nowadays. This patch series aims
> to mitigate those concerns.
> 
> An external PCI device is a PCI peripheral device connected
> to the system through an external bus, such as Thunderbolt.
> What makes it different is that it can't be trusted to the
> same degree as the devices build into the system. Generally,
> a trusted PCIe device will DMA into the designated buffers
> and not overrun or otherwise write outside the specified
> bounds. But it's different for an external device.
> 
> The minimum IOMMU mapping granularity is one page (4k), so
> for DMA transfers smaller than that a malicious PCIe device
> can access the whole page of memory even if it does not
> belong to the driver in question. This opens a possibility
> for DMA attack. For more information about DMA attacks
> imposed by an untrusted PCI/PCIe device, please refer to [2].
> 
> This implements bounce buffer for the untrusted external
> devices. The transfers should be limited in isolated pages
> so the IOMMU window does not cover memory outside of what
> the driver expects. Previously (v3 and before), we proposed
> an optimisation to only copy the head and tail of the buffer
> if it spans multiple pages, and directly map the ones in the
> middle. Figure 1 gives a big picture about this solution.
> 
>                                 swiotlb             System
>                 IOVA          bounce page           Memory
>              .---------.      .---------.        .---------.
>              |         |      |         |        |         |
>              |         |      |         |        |         |
> buffer_start .---------.      .---------.        .---------.
>              |         |----->|         |*******>|         |
>              |         |      |         | swiotlb|         |
>              |         |      |         | mapping|         |
>  IOMMU Page  '---------'      '---------'        '---------'
>   Boundary   |         |                         |         |
>              |         |                         |         |
>              |         |                         |         |
>              |         |------------------------>|         |
>              |         |    IOMMU mapping        |         |
>              |         |                         |         |
>  IOMMU Page  .---------.                         .---------.
>   Boundary   |         |                         |         |
>              |         |                         |         |
>              |         |------------------------>|         |
>              |         |     IOMMU mapping       |         |
>              |         |                         |         |
>              |         |                         |         |
>  IOMMU Page  .---------.      .---------.        .---------.
>   Boundary   |         |      |         |        |         |
>              |         |      |         |        |         |
>              |         |----->|         |*******>|         |
>   buffer_end '---------'      '---------' swiotlb'---------'
>              |         |      |         | mapping|         |
>              |         |      |         |        |         |
>              '---------'      '---------'        '---------'
>           Figure 1: A big view of iommu bounce page 
> 
> As Robin Murphy pointed out, this ties us to using strict mode for
> TLB maintenance, which may not be an overall win depending on the
> balance between invalidation bandwidth vs. memcpy bandwidth. If we
> use standard SWIOTLB logic to always copy the whole thing, we should
> be able to release the bounce pages via the flush queue to allow
> 'safe' lazy unmaps. So since v4 we start to use the standard swiotlb
> logic.
> 
>                                 swiotlb             System
>                 IOVA          bounce page           Memory
> buffer_start .---------.      .---------.        .---------.
>              |         |      |         |        |         |
>              |         |      |         |        |         |
>              |         |      |         |        .---------.physical
>              |         |----->|         | ------>|         |_start  
>              |         |iommu |         | swiotlb|         |
>              |         | map  |         |   map  |         |
>  IOMMU Page  .---------.      .---------.        '---------'

The prior picture had 'buffer_start' at an offset in the page. I am
assuming you meant that here as well?

Meaning it starts at the same offset as 'physical_start' in the right
side box?

>   Boundary   |         |      |         |        |         |
>              |         |      |         |        |         |
>              |         |----->|         |        |         |
>              |         |iommu |         |        |         |
>              |         | map  |         |        |         |
>              |         |      |         |        |         |
>  IOMMU Page  .---------.      .---------.        .---------.
>   Boundary   |         |      |         |        |         |
>              |         |----->|         |        |         |
>              |         |iommu |         |        |         |
>              |         | map  |         |        |         |
>              |         |      |         |        |         |
>  IOMMU Page  |         |      |         |        |         |
>   Boundary   .---------.      .---------.        .---------.
>              |         |      |         |------->|         |
>   buffer_end '---------'      '---------' swiotlb|         |
>              |         |----->|         |   map  |         |
>              |         |iommu |         |        |         |
>              |         | map  |         |        '---------' physical
>              |         |      |         |        |         | _end    
>              '---------'      '---------'        '---------'
>           Figure 2: A big view of simplified iommu bounce page 
> 
> The implementation of bounce buffers for untrusted devices
> will cause a little performance overhead, but we didn't see
> any user experience problems. The users could use the kernel

What kind of devices did you test it with?

Thank you for making this awesome cover letter btw!
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device
  2019-06-03  1:16 ` [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device Lu Baolu
@ 2019-06-10 15:45   ` Konrad Rzeszutek Wilk
  2019-06-12  0:43     ` Lu Baolu
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 15:45 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, mika.westerberg,
	ashok.raj, Jonathan Corbet, alan.cox, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Bjorn Helgaas,
	Boris Ostrovsky, David Woodhouse, Christoph Hellwig

On Mon, Jun 03, 2019 at 09:16:14AM +0800, Lu Baolu wrote:
> This is necessary to avoid exposing valid kernel data to any
> milicious device.

malicious 

> 
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  kernel/dma/swiotlb.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index f956f785645a..ed41eb7f6131 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -35,6 +35,7 @@
>  #include <linux/scatterlist.h>
>  #include <linux/mem_encrypt.h>
>  #include <linux/set_memory.h>
> +#include <linux/pci.h>
>  #ifdef CONFIG_DEBUG_FS
>  #include <linux/debugfs.h>
>  #endif
> @@ -560,6 +561,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>  	 */
>  	for (i = 0; i < nslots; i++)
>  		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
> +
> +	/* Zero out the bounce buffer if the consumer is untrusted. */
> +	if (dev_is_untrusted(hwdev))
> +		memset(phys_to_virt(tlb_addr), 0, alloc_size);

What if the alloc_size is less than a PAGE? Should this at least have ALIGN or such?
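I.e. something along these lines, if that is the concern (just a sketch;
1 << IO_TLB_SHIFT is the swiotlb slot size, so this would clear the whole
set of allocated slots rather than only alloc_size bytes):

	if (dev_is_untrusted(hwdev))
		memset(phys_to_virt(tlb_addr), 0,
		       ALIGN(alloc_size, 1 << IO_TLB_SHIFT));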

> +
>  	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
>  	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
>  		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
> -- 
> 2.17.1
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma
  2019-06-03  1:16 ` [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma Lu Baolu
@ 2019-06-10 15:54   ` Konrad Rzeszutek Wilk
  2019-06-12  2:03     ` Lu Baolu
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 15:54 UTC (permalink / raw)
  To: Lu Baolu
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman,
	Mika Westerberg, linux-kernel, iommu, jacob.jun.pan,
	Robin Murphy

On Mon, Jun 03, 2019 at 09:16:16AM +0800, Lu Baolu wrote:
> The direct dma implementation depends on swiotlb. Hence, don't
> switch of swiotlb since direct dma interfaces are used in this

s/of/off/

> driver.

But I think you really want to leave the code as is but change
the #ifdef to check for IOMMU_BOUNCE_PAGE and not CONFIG_SWIOTLB.

As one could disable IOMMU_BOUNCE_PAGE.
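I.e. keep the 'swiotlb = 0' hunks but guard them like this (one reading of
the above, untested):

#if defined(CONFIG_X86) && !defined(CONFIG_IOMMU_BOUNCE_PAGE)
	swiotlb = 0;
#endif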
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Mika Westerberg <mika.westerberg@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/intel-iommu.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index d5a6c8064c56..235837c50719 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4625,9 +4625,6 @@ static int __init platform_optin_force_iommu(void)
>  		iommu_identity_mapping |= IDENTMAP_ALL;
>  
>  	dmar_disabled = 0;
> -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
> -	swiotlb = 0;
> -#endif
>  	no_iommu = 0;
>  
>  	return 1;
> @@ -4765,9 +4762,6 @@ int __init intel_iommu_init(void)
>  	}
>  	up_write(&dmar_global_lock);
>  
> -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
> -	swiotlb = 0;
> -#endif
>  	dma_ops = &intel_dma_ops;
>  
>  	init_iommu_pm_ops();
> -- 
> 2.17.1
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/9] iommu: Add bounce page APIs
  2019-06-03  1:16 ` [PATCH v4 4/9] iommu: Add bounce page APIs Lu Baolu
@ 2019-06-10 15:56   ` Konrad Rzeszutek Wilk
  2019-06-12  0:45     ` Lu Baolu
  2019-06-11 12:10   ` Pavel Begunkov
  1 sibling, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 15:56 UTC (permalink / raw)
  To: Lu Baolu
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Alan Cox, Juergen Gross, Greg Kroah-Hartman,
	Mika Westerberg, linux-kernel, iommu, jacob.jun.pan,
	Robin Murphy

On Mon, Jun 03, 2019 at 09:16:15AM +0800, Lu Baolu wrote:
> IOMMU hardware always use paging for DMA remapping.  The
> minimum mapped window is a page size. The device drivers
> may map buffers not filling whole IOMMU window. It allows
> device to access to possibly unrelated memory and various
> malicious devices can exploit this to perform DMA attack.
> 
> This introduces the bouce buffer mechanism for DMA buffers
> which doesn't fill a minimal IOMMU page. It could be used
> by various vendor specific IOMMU drivers as long as the
> DMA domain is managed by the generic IOMMU layer. Below
> APIs are added:
> 
> * iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
>   - Map a buffer start at DMA address @addr in bounce page
>     manner. For buffer parts that doesn't cross a whole
>     minimal IOMMU page, the bounce page policy is applied.
>     A bounce page mapped by swiotlb will be used as the DMA
>     target in the IOMMU page table. Otherwise, the physical
>     address @paddr is mapped instead.
> 
> * iommu_bounce_unmap(dev, addr, size, dir, attrs)
>   - Unmap the buffer mapped with iommu_bounce_map(). The bounce
>     page will be torn down after the bounced data get synced.
> 
> * iommu_bounce_sync(dev, addr, size, dir, target)
>   - Synce the bounced data in case the bounce mapped buffer is
>     reused.
> 
> The whole APIs are included within a kernel option IOMMU_BOUNCE_PAGE.
> It's useful for cases where bounce page doesn't needed, for example,
> embedded cases.
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Mika Westerberg <mika.westerberg@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/Kconfig |  14 +++++
>  drivers/iommu/iommu.c | 119 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h |  35 +++++++++++++
>  3 files changed, 168 insertions(+)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 83664db5221d..d837ec3f359b 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -86,6 +86,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
>  
>  	  If unsure, say N here.
>  
> +config IOMMU_BOUNCE_PAGE
> +	bool "Use bounce page for untrusted devices"
> +	depends on IOMMU_API
> +	select SWIOTLB

I think you want:
	depends on IOMMU_API && SWIOTLB

As people may want to have IOMMU and SWIOTLB, and not IOMMU_BOUNCE_PAGE enabled.
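I.e. (sketch):

config IOMMU_BOUNCE_PAGE
	bool "Use bounce page for untrusted devices"
	depends on IOMMU_API && SWIOTLB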

> +	help
> +	  IOMMU hardware always use paging for DMA remapping. The minimum
> +	  mapped window is a page size. The device drivers may map buffers
> +	  not filling whole IOMMU window. This allows device to access to
> +	  possibly unrelated memory and malicious device can exploit this
> +	  to perform a DMA attack. Select this to use a bounce page for the
> +	  buffer which doesn't fill a whole IOMU page.
> +
> +	  If unsure, say N here.
> +
>  config OF_IOMMU
>         def_bool y
>         depends on OF && IOMMU_API
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 2a906386bb8e..fa44f681a82b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2246,3 +2246,122 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>  	return ops->sva_get_pasid(handle);
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> +
> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
> +
> +/*
> + * Bounce buffer support for external devices:
> + *
> + * IOMMU hardware always use paging for DMA remapping. The minimum mapped
> + * window is a page size. The device drivers may map buffers not filling
> + * whole IOMMU window. This allows device to access to possibly unrelated
> + * memory and malicious device can exploit this to perform a DMA attack.
> + * Use bounce pages for the buffer which doesn't fill whole IOMMU pages.
> + */
> +
> +static inline size_t
> +get_aligned_size(struct iommu_domain *domain, dma_addr_t addr, size_t size)
> +{
> +	unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
> +	unsigned long offset = page_size - 1;
> +
> +	return ALIGN((addr & offset) + size, page_size);
> +}
> +
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> +			    phys_addr_t paddr, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs)
> +{
> +	struct iommu_domain *domain;
> +	unsigned int min_pagesz;
> +	phys_addr_t tlb_addr;
> +	size_t aligned_size;
> +	int prot = 0;
> +	int ret;
> +
> +	domain = iommu_get_dma_domain(dev);
> +	if (!domain)
> +		return DMA_MAPPING_ERROR;
> +
> +	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
> +		prot |= IOMMU_READ;
> +	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
> +		prot |= IOMMU_WRITE;
> +
> +	aligned_size = get_aligned_size(domain, paddr, size);
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +
> +	/*
> +	 * If both the physical buffer start address and size are
> +	 * page aligned, we don't need to use a bounce page.
> +	 */
> +	if (!IS_ALIGNED(paddr | size, min_pagesz)) {
> +		tlb_addr = swiotlb_tbl_map_single(dev,
> +				__phys_to_dma(dev, io_tlb_start),
> +				paddr, size, aligned_size, dir, attrs);
> +		if (tlb_addr == DMA_MAPPING_ERROR)
> +			return DMA_MAPPING_ERROR;
> +	} else {
> +		tlb_addr = paddr;
> +	}
> +
> +	ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
> +	if (ret) {
> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
> +					 aligned_size, dir, attrs);
> +		return DMA_MAPPING_ERROR;
> +	}
> +
> +	return iova;
> +}
> +EXPORT_SYMBOL_GPL(iommu_bounce_map);
> +
> +static inline phys_addr_t
> +iova_to_tlb_addr(struct iommu_domain *domain, dma_addr_t addr)
> +{
> +	if (unlikely(!domain->ops || !domain->ops->iova_to_phys))
> +		return 0;
> +
> +	return domain->ops->iova_to_phys(domain, addr);
> +}
> +
> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
> +			enum dma_data_direction dir, unsigned long attrs)
> +{
> +	struct iommu_domain *domain;
> +	phys_addr_t tlb_addr;
> +	size_t aligned_size;
> +
> +	domain = iommu_get_dma_domain(dev);
> +	if (WARN_ON(!domain))
> +		return;
> +
> +	aligned_size = get_aligned_size(domain, iova, size);
> +	tlb_addr = iova_to_tlb_addr(domain, iova);
> +	if (WARN_ON(!tlb_addr))
> +		return;
> +
> +	iommu_unmap(domain, iova, aligned_size);
> +	if (is_swiotlb_buffer(tlb_addr))
> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
> +					 aligned_size, dir, attrs);
> +}
> +EXPORT_SYMBOL_GPL(iommu_bounce_unmap);
> +
> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
> +		       enum dma_data_direction dir, enum dma_sync_target target)
> +{
> +	struct iommu_domain *domain;
> +	phys_addr_t tlb_addr;
> +
> +	domain = iommu_get_dma_domain(dev);
> +	if (WARN_ON(!domain))
> +		return;
> +
> +	tlb_addr = iova_to_tlb_addr(domain, addr);
> +	if (is_swiotlb_buffer(tlb_addr))
> +		swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
> +}
> +EXPORT_SYMBOL_GPL(iommu_bounce_sync);
> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 91af22a344e2..814c0da64692 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -25,6 +25,8 @@
>  #include <linux/errno.h>
>  #include <linux/err.h>
>  #include <linux/of.h>
> +#include <linux/swiotlb.h>
> +#include <linux/dma-direct.h>
>  
>  #define IOMMU_READ	(1 << 0)
>  #define IOMMU_WRITE	(1 << 1)
> @@ -499,6 +501,39 @@ int iommu_sva_set_ops(struct iommu_sva *handle,
>  		      const struct iommu_sva_ops *ops);
>  int iommu_sva_get_pasid(struct iommu_sva *handle);
>  
> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> +			    phys_addr_t paddr, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs);
> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
> +			enum dma_data_direction dir, unsigned long attrs);
> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
> +		       enum dma_data_direction dir,
> +		       enum dma_sync_target target);
> +#else
> +static inline
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> +			    phys_addr_t paddr, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs)
> +{
> +	return DMA_MAPPING_ERROR;
> +}
> +
> +static inline
> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
> +			enum dma_data_direction dir, unsigned long attrs)
> +{
> +}
> +
> +static inline
> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
> +		       enum dma_data_direction dir, enum dma_sync_target target)
> +{
> +}
> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
> +
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> -- 
> 2.17.1
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer
  2019-06-03  1:16 ` [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer Lu Baolu
@ 2019-06-10 16:08   ` Konrad Rzeszutek Wilk
  2019-06-12  2:22     ` Lu Baolu
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 16:08 UTC (permalink / raw)
  To: Lu Baolu
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

On Mon, Jun 03, 2019 at 09:16:17AM +0800, Lu Baolu wrote:
> This adds a helper to check whether a device needs to
> use bounce buffer. It also provides a boot time option
> to disable the bounce buffer. Users can use this to
> prevent the iommu driver from using the bounce buffer
> for performance gain.
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Tested-by: Xu Pengfei <pengfei.xu@intel.com>
> Tested-by: Mika Westerberg <mika.westerberg@intel.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 5 +++++
>  drivers/iommu/intel-iommu.c                     | 6 ++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 138f6664b2e2..65685c6e53e4 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1728,6 +1728,11 @@
>  			Note that using this option lowers the security
>  			provided by tboot because it makes the system
>  			vulnerable to DMA attacks.
> +		nobounce [Default off]
> +			Do not use the bounce buffer for untrusted devices like
> +			the Thunderbolt devices. This will treat the untrusted

My brain sometimes has a hard time parsing 'Not' and 'un'. Could this be:

	Disable the bounce buffer for untrusted devices ..?


And perhaps call it 'noswiotlb'? Not everyone knows that SWIOTLB = bounce buffer.

> +			devices as the trusted ones, hence might expose security
> +			risks of DMA attacks.
>  
>  	intel_idle.max_cstate=	[KNL,HW,ACPI,X86]
>  			0	disables intel_idle and fall back on acpi_idle.
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 235837c50719..41439647f75d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -371,6 +371,7 @@ static int dmar_forcedac;
>  static int intel_iommu_strict;
>  static int intel_iommu_superpage = 1;
>  static int iommu_identity_mapping;
> +static int intel_no_bounce;

intel_swiotlb_on = 1 ?

>  
>  #define IDENTMAP_ALL		1
>  #define IDENTMAP_GFX		2
> @@ -384,6 +385,8 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
>  static DEFINE_SPINLOCK(device_domain_lock);
>  static LIST_HEAD(device_domain_list);
>  
> +#define device_needs_bounce(d) (!intel_no_bounce && dev_is_untrusted(d))
> +
>  /*
>   * Iterate over elements in device_domain_list and call the specified
>   * callback @fn against each element.
> @@ -466,6 +469,9 @@ static int __init intel_iommu_setup(char *str)
>  			printk(KERN_INFO
>  				"Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n");
>  			intel_iommu_tboot_noforce = 1;
> +		} else if (!strncmp(str, "nobounce", 8)) {
> +			pr_info("Intel-IOMMU: No bounce buffer. This could expose security risks of DMA attacks\n");

Again, Intel-IOMMU: No SWIOTLB. T.. blah blah'

Asking for this because doing 'dmesg | grep SWIOTLB' will nicely expose all
the SWIOTLB invocations.

> +			intel_no_bounce = 1;
>  		}
>  
>  		str += strcspn(str, ",");
> -- 
> 2.17.1
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap
  2019-06-03  1:16 ` [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap Lu Baolu
  2019-06-04  9:01   ` Steven Rostedt
@ 2019-06-10 16:08   ` Konrad Rzeszutek Wilk
  2019-06-12  2:31     ` Lu Baolu
  1 sibling, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 16:08 UTC (permalink / raw)
  To: Lu Baolu
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

On Mon, Jun 03, 2019 at 09:16:18AM +0800, Lu Baolu wrote:
> This adds trace support for the Intel IOMMU driver. It
> also declares some events which could be used to trace
> the events when an IOVA is being mapped or unmapped in
> a domain.

Is that even needed considering SWIOTLB also has tracing events?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices
  2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
                   ` (9 preceding siblings ...)
  2019-06-10 15:42 ` [PATCH v4 0/9] iommu: Bounce page " Konrad Rzeszutek Wilk
@ 2019-06-10 16:10 ` Konrad Rzeszutek Wilk
  2019-06-12  3:04   ` Lu Baolu
  10 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-10 16:10 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, mika.westerberg,
	ashok.raj, Jonathan Corbet, alan.cox, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Bjorn Helgaas,
	Boris Ostrovsky, David Woodhouse, Christoph Hellwig

On Mon, Jun 03, 2019 at 09:16:11AM +0800, Lu Baolu wrote:
> The Thunderbolt vulnerabilities are public and have a nice
> name as Thunderclap [1] [3] nowadays. This patch series aims
> to mitigate those concerns.

.. Forgot to ask but should the patches also include the CVE number?
Or at least the last one that enables this?

Thanks.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/9] iommu: Add bounce page APIs
  2019-06-03  1:16 ` [PATCH v4 4/9] iommu: Add bounce page APIs Lu Baolu
  2019-06-10 15:56   ` Konrad Rzeszutek Wilk
@ 2019-06-11 12:10   ` Pavel Begunkov
  2019-06-12  0:52     ` Lu Baolu
  1 sibling, 1 reply; 31+ messages in thread
From: Pavel Begunkov @ 2019-06-11 12:10 UTC (permalink / raw)
  To: Lu Baolu, David Woodhouse, Joerg Roedel, Bjorn Helgaas,
	Christoph Hellwig
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Boris Ostrovsky, Alan Cox,
	mika.westerberg, Mika Westerberg


On 03/06/2019 04:16, Lu Baolu wrote:
> IOMMU hardware always use paging for DMA remapping.  The
> minimum mapped window is a page size. The device drivers
> may map buffers not filling whole IOMMU window. It allows
> device to access to possibly unrelated memory and various
> malicious devices can exploit this to perform DMA attack.
> 
> This introduces the bouce buffer mechanism for DMA buffers
> which doesn't fill a minimal IOMMU page. It could be used
> by various vendor specific IOMMU drivers as long as the
> DMA domain is managed by the generic IOMMU layer. Below
> APIs are added:
> 
> * iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
>   - Map a buffer start at DMA address @addr in bounce page
>     manner. For buffer parts that doesn't cross a whole
>     minimal IOMMU page, the bounce page policy is applied.
>     A bounce page mapped by swiotlb will be used as the DMA
>     target in the IOMMU page table. Otherwise, the physical
>     address @paddr is mapped instead.
> 
> * iommu_bounce_unmap(dev, addr, size, dir, attrs)
>   - Unmap the buffer mapped with iommu_bounce_map(). The bounce
>     page will be torn down after the bounced data get synced.
> 
> * iommu_bounce_sync(dev, addr, size, dir, target)
>   - Synce the bounced data in case the bounce mapped buffer is
>     reused.
> 
> The whole APIs are included within a kernel option IOMMU_BOUNCE_PAGE.
> It's useful for cases where bounce page doesn't needed, for example,
> embedded cases.
> 
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Mika Westerberg <mika.westerberg@intel.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/Kconfig |  14 +++++
>  drivers/iommu/iommu.c | 119 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iommu.h |  35 +++++++++++++
>  3 files changed, 168 insertions(+)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 83664db5221d..d837ec3f359b 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -86,6 +86,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
>  
>  	  If unsure, say N here.
>  
> +config IOMMU_BOUNCE_PAGE
> +	bool "Use bounce page for untrusted devices"
> +	depends on IOMMU_API
> +	select SWIOTLB
> +	help
> +	  IOMMU hardware always use paging for DMA remapping. The minimum
> +	  mapped window is a page size. The device drivers may map buffers
> +	  not filling whole IOMMU window. This allows device to access to
> +	  possibly unrelated memory and malicious device can exploit this
> +	  to perform a DMA attack. Select this to use a bounce page for the
> +	  buffer which doesn't fill a whole IOMU page.
> +
> +	  If unsure, say N here.
> +
>  config OF_IOMMU
>         def_bool y
>         depends on OF && IOMMU_API
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 2a906386bb8e..fa44f681a82b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2246,3 +2246,122 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>  	return ops->sva_get_pasid(handle);
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> +
> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
> +
> +/*
> + * Bounce buffer support for external devices:
> + *
> + * IOMMU hardware always use paging for DMA remapping. The minimum mapped
> + * window is a page size. The device drivers may map buffers not filling
> + * whole IOMMU window. This allows device to access to possibly unrelated
> + * memory and malicious device can exploit this to perform a DMA attack.
> + * Use bounce pages for the buffer which doesn't fill whole IOMMU pages.
> + */
> +
> +static inline size_t
> +get_aligned_size(struct iommu_domain *domain, dma_addr_t addr, size_t size)
> +{
> +	unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
> +	unsigned long offset = page_size - 1;
> +
> +	return ALIGN((addr & offset) + size, page_size);
> +}
> +
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> +			    phys_addr_t paddr, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs)
> +{
> +	struct iommu_domain *domain;
> +	unsigned int min_pagesz;
> +	phys_addr_t tlb_addr;
> +	size_t aligned_size;
> +	int prot = 0;
> +	int ret;
> +
> +	domain = iommu_get_dma_domain(dev);
> +	if (!domain)
> +		return DMA_MAPPING_ERROR;
> +
> +	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
> +		prot |= IOMMU_READ;
> +	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
> +		prot |= IOMMU_WRITE;
> +
> +	aligned_size = get_aligned_size(domain, paddr, size);
> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +
> +	/*
> +	 * If both the physical buffer start address and size are
> +	 * page aligned, we don't need to use a bounce page.
> +	 */
> +	if (!IS_ALIGNED(paddr | size, min_pagesz)) {
> +		tlb_addr = swiotlb_tbl_map_single(dev,
> +				__phys_to_dma(dev, io_tlb_start),
> +				paddr, size, aligned_size, dir, attrs);
> +		if (tlb_addr == DMA_MAPPING_ERROR)
> +			return DMA_MAPPING_ERROR;
> +	} else {
> +		tlb_addr = paddr;
> +	}
> +
> +	ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
> +	if (ret) {
> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
> +					 aligned_size, dir, attrs);
You would probably want to check whether @tlb_addr came from
swiotlb_tbl_map_single() (is_swiotlb_buffer(), or reuse the predicate above).
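Something like this in the error path (untested sketch):

	ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
	if (ret) {
		if (is_swiotlb_buffer(tlb_addr))
			swiotlb_tbl_unmap_single(dev, tlb_addr, size,
						 aligned_size, dir, attrs);
		return DMA_MAPPING_ERROR;
	}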


> +		return DMA_MAPPING_ERROR;
> +	}
> +
> +	return iova;
> +}
> +EXPORT_SYMBOL_GPL(iommu_bounce_map);
> +
> +static inline phys_addr_t
> +iova_to_tlb_addr(struct iommu_domain *domain, dma_addr_t addr)
> +{
> +	if (unlikely(!domain->ops || !domain->ops->iova_to_phys))
> +		return 0;
> +
> +	return domain->ops->iova_to_phys(domain, addr);
> +}
> +
> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
> +			enum dma_data_direction dir, unsigned long attrs)
> +{
> +	struct iommu_domain *domain;
> +	phys_addr_t tlb_addr;
> +	size_t aligned_size;
> +
> +	domain = iommu_get_dma_domain(dev);
> +	if (WARN_ON(!domain))
> +		return;
> +
> +	aligned_size = get_aligned_size(domain, iova, size);
> +	tlb_addr = iova_to_tlb_addr(domain, iova);
> +	if (WARN_ON(!tlb_addr))
> +		return;
> +
> +	iommu_unmap(domain, iova, aligned_size);
> +	if (is_swiotlb_buffer(tlb_addr))

Is there any chance that this @tlb_addr is a swiotlb buffer, but one owned
by an API user? I mean something like
iommu_bounce_map(swiotlb_tbl_map_single()).

Then, to retain the ownership semantics, we shouldn't unmap it. Maybe
check for that and fail iommu_bounce_map() to be sure?
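E.g. something like this early in iommu_bounce_map() (sketch):

	/* Refuse buffers the caller already bounced itself. */
	if (is_swiotlb_buffer(paddr))
		return DMA_MAPPING_ERROR;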


> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
> +					 aligned_size, dir, attrs);
> +}
> +EXPORT_SYMBOL_GPL(iommu_bounce_unmap);
> +
> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
> +		       enum dma_data_direction dir, enum dma_sync_target target)
> +{
> +	struct iommu_domain *domain;
> +	phys_addr_t tlb_addr;
> +
> +	domain = iommu_get_dma_domain(dev);
> +	if (WARN_ON(!domain))
> +		return;
> +
> +	tlb_addr = iova_to_tlb_addr(domain, addr);
> +	if (is_swiotlb_buffer(tlb_addr))
> +		swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
> +}
> +EXPORT_SYMBOL_GPL(iommu_bounce_sync);
> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 91af22a344e2..814c0da64692 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -25,6 +25,8 @@
>  #include <linux/errno.h>
>  #include <linux/err.h>
>  #include <linux/of.h>
> +#include <linux/swiotlb.h>
> +#include <linux/dma-direct.h>
>  
>  #define IOMMU_READ	(1 << 0)
>  #define IOMMU_WRITE	(1 << 1)
> @@ -499,6 +501,39 @@ int iommu_sva_set_ops(struct iommu_sva *handle,
>  		      const struct iommu_sva_ops *ops);
>  int iommu_sva_get_pasid(struct iommu_sva *handle);
>  
> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> +			    phys_addr_t paddr, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs);
> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
> +			enum dma_data_direction dir, unsigned long attrs);
> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
> +		       enum dma_data_direction dir,
> +		       enum dma_sync_target target);
> +#else
> +static inline
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> +			    phys_addr_t paddr, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs)
> +{
> +	return DMA_MAPPING_ERROR;
> +}
> +
> +static inline
> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
> +			enum dma_data_direction dir, unsigned long attrs)
> +{
> +}
> +
> +static inline
> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
> +		       enum dma_data_direction dir, enum dma_sync_target target)
> +{
> +}
> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
> +
>  #else /* CONFIG_IOMMU_API */
>  
>  struct iommu_ops {};
> 

-- 
Yours sincerely,
Pavel Begunkov



_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device
  2019-06-10 15:45   ` Konrad Rzeszutek Wilk
@ 2019-06-12  0:43     ` Lu Baolu
  2019-06-12  1:05       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  0:43 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi Konrad,

Thanks a lot for your review.

On 6/10/19 11:45 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:14AM +0800, Lu Baolu wrote:
>> This is necessary to avoid exposing valid kernel data to any
>> milicious device.
> 
> malicious

Yes, thanks.

> 
>>
>> Suggested-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   kernel/dma/swiotlb.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>> index f956f785645a..ed41eb7f6131 100644
>> --- a/kernel/dma/swiotlb.c
>> +++ b/kernel/dma/swiotlb.c
>> @@ -35,6 +35,7 @@
>>   #include <linux/scatterlist.h>
>>   #include <linux/mem_encrypt.h>
>>   #include <linux/set_memory.h>
>> +#include <linux/pci.h>
>>   #ifdef CONFIG_DEBUG_FS
>>   #include <linux/debugfs.h>
>>   #endif
>> @@ -560,6 +561,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>>   	 */
>>   	for (i = 0; i < nslots; i++)
>>   		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
>> +
>> +	/* Zero out the bounce buffer if the consumer is untrusted. */
>> +	if (dev_is_untrusted(hwdev))
>> +		memset(phys_to_virt(tlb_addr), 0, alloc_size);
> 
> What if the alloc_size is less than a PAGE? Should this at least have ALIGN or such?

It's the consumer (the IOMMU subsystem) that requires this to be page
aligned. As far as swiotlb is concerned, it just clears out all of the
data in the allocated bounce buffer.
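
For reference, the alloc_size the IOMMU code passes in is the aligned
size computed in patch 4/9:

	unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
	unsigned long offset = page_size - 1;

	return ALIGN((addr & offset) + size, page_size);

so the memset() covers the whole bounce page(s) backing the mapping.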

Best regards,
Baolu

> 
>> +
>>   	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
>>   	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
>>   		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
>> -- 
>> 2.17.1
>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/9] iommu: Add bounce page APIs
  2019-06-10 15:56   ` Konrad Rzeszutek Wilk
@ 2019-06-12  0:45     ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  0:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Alan Cox, Juergen Gross, Greg Kroah-Hartman,
	Mika Westerberg, linux-kernel, iommu, jacob.jun.pan,
	Robin Murphy

Hi,

On 6/10/19 11:56 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:15AM +0800, Lu Baolu wrote:
>> IOMMU hardware always use paging for DMA remapping.  The
>> minimum mapped window is a page size. The device drivers
>> may map buffers not filling whole IOMMU window. It allows
>> device to access to possibly unrelated memory and various
>> malicious devices can exploit this to perform DMA attack.
>>
>> This introduces the bouce buffer mechanism for DMA buffers
>> which doesn't fill a minimal IOMMU page. It could be used
>> by various vendor specific IOMMU drivers as long as the
>> DMA domain is managed by the generic IOMMU layer. Below
>> APIs are added:
>>
>> * iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
>>    - Map a buffer start at DMA address @addr in bounce page
>>      manner. For buffer parts that doesn't cross a whole
>>      minimal IOMMU page, the bounce page policy is applied.
>>      A bounce page mapped by swiotlb will be used as the DMA
>>      target in the IOMMU page table. Otherwise, the physical
>>      address @paddr is mapped instead.
>>
>> * iommu_bounce_unmap(dev, addr, size, dir, attrs)
>>    - Unmap the buffer mapped with iommu_bounce_map(). The bounce
>>      page will be torn down after the bounced data get synced.
>>
>> * iommu_bounce_sync(dev, addr, size, dir, target)
>>    - Synce the bounced data in case the bounce mapped buffer is
>>      reused.
>>
>> The whole APIs are included within a kernel option IOMMU_BOUNCE_PAGE.
>> It's useful for cases where bounce page doesn't needed, for example,
>> embedded cases.
>>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Mika Westerberg <mika.westerberg@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/Kconfig |  14 +++++
>>   drivers/iommu/iommu.c | 119 ++++++++++++++++++++++++++++++++++++++++++
>>   include/linux/iommu.h |  35 +++++++++++++
>>   3 files changed, 168 insertions(+)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 83664db5221d..d837ec3f359b 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -86,6 +86,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
>>   
>>   	  If unsure, say N here.
>>   
>> +config IOMMU_BOUNCE_PAGE
>> +	bool "Use bounce page for untrusted devices"
>> +	depends on IOMMU_API
>> +	select SWIOTLB
> 
> I think you want:
> 	depends on IOMMU_API && SWIOTLB
> 
> As people may want to have IOMMU and SWIOTLB, and not IOMMU_BOUNCE_PAGE enabled.

Yes. Yours makes more sense.

Best regards,
Baolu

> 
>> +	help
>> +	  IOMMU hardware always use paging for DMA remapping. The minimum
>> +	  mapped window is a page size. The device drivers may map buffers
>> +	  not filling whole IOMMU window. This allows device to access to
>> +	  possibly unrelated memory and malicious device can exploit this
>> +	  to perform a DMA attack. Select this to use a bounce page for the
>> +	  buffer which doesn't fill a whole IOMU page.
>> +
>> +	  If unsure, say N here.
>> +
>>   config OF_IOMMU
>>          def_bool y
>>          depends on OF && IOMMU_API
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 2a906386bb8e..fa44f681a82b 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -2246,3 +2246,122 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>>   	return ops->sva_get_pasid(handle);
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
>> +
>> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
>> +
>> +/*
>> + * Bounce buffer support for external devices:
>> + *
>> + * IOMMU hardware always uses paging for DMA remapping. The minimum mapped
>> + * window is a page size. The device drivers may map buffers that do not
>> + * fill a whole IOMMU window. This allows a device to access possibly
>> + * unrelated memory and a malicious device can exploit this to perform a
>> + * DMA attack. Use bounce pages for buffers which don't fill whole IOMMU pages.
>> + */
>> +
>> +static inline size_t
>> +get_aligned_size(struct iommu_domain *domain, dma_addr_t addr, size_t size)
>> +{
>> +	unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
>> +	unsigned long offset = page_size - 1;
>> +
>> +	return ALIGN((addr & offset) + size, page_size);
>> +}
>> +
>> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
>> +			    phys_addr_t paddr, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs)
>> +{
>> +	struct iommu_domain *domain;
>> +	unsigned int min_pagesz;
>> +	phys_addr_t tlb_addr;
>> +	size_t aligned_size;
>> +	int prot = 0;
>> +	int ret;
>> +
>> +	domain = iommu_get_dma_domain(dev);
>> +	if (!domain)
>> +		return DMA_MAPPING_ERROR;
>> +
>> +	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
>> +		prot |= IOMMU_READ;
>> +	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
>> +		prot |= IOMMU_WRITE;
>> +
>> +	aligned_size = get_aligned_size(domain, paddr, size);
>> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
>> +
>> +	/*
>> +	 * If both the physical buffer start address and size are
>> +	 * page aligned, we don't need to use a bounce page.
>> +	 */
>> +	if (!IS_ALIGNED(paddr | size, min_pagesz)) {
>> +		tlb_addr = swiotlb_tbl_map_single(dev,
>> +				__phys_to_dma(dev, io_tlb_start),
>> +				paddr, size, aligned_size, dir, attrs);
>> +		if (tlb_addr == DMA_MAPPING_ERROR)
>> +			return DMA_MAPPING_ERROR;
>> +	} else {
>> +		tlb_addr = paddr;
>> +	}
>> +
>> +	ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
>> +	if (ret) {
>> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
>> +					 aligned_size, dir, attrs);
>> +		return DMA_MAPPING_ERROR;
>> +	}
>> +
>> +	return iova;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bounce_map);
>> +
>> +static inline phys_addr_t
>> +iova_to_tlb_addr(struct iommu_domain *domain, dma_addr_t addr)
>> +{
>> +	if (unlikely(!domain->ops || !domain->ops->iova_to_phys))
>> +		return 0;
>> +
>> +	return domain->ops->iova_to_phys(domain, addr);
>> +}
>> +
>> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
>> +			enum dma_data_direction dir, unsigned long attrs)
>> +{
>> +	struct iommu_domain *domain;
>> +	phys_addr_t tlb_addr;
>> +	size_t aligned_size;
>> +
>> +	domain = iommu_get_dma_domain(dev);
>> +	if (WARN_ON(!domain))
>> +		return;
>> +
>> +	aligned_size = get_aligned_size(domain, iova, size);
>> +	tlb_addr = iova_to_tlb_addr(domain, iova);
>> +	if (WARN_ON(!tlb_addr))
>> +		return;
>> +
>> +	iommu_unmap(domain, iova, aligned_size);
>> +	if (is_swiotlb_buffer(tlb_addr))
>> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
>> +					 aligned_size, dir, attrs);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bounce_unmap);
>> +
>> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
>> +		       enum dma_data_direction dir, enum dma_sync_target target)
>> +{
>> +	struct iommu_domain *domain;
>> +	phys_addr_t tlb_addr;
>> +
>> +	domain = iommu_get_dma_domain(dev);
>> +	if (WARN_ON(!domain))
>> +		return;
>> +
>> +	tlb_addr = iova_to_tlb_addr(domain, addr);
>> +	if (is_swiotlb_buffer(tlb_addr))
>> +		swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bounce_sync);
>> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 91af22a344e2..814c0da64692 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -25,6 +25,8 @@
>>   #include <linux/errno.h>
>>   #include <linux/err.h>
>>   #include <linux/of.h>
>> +#include <linux/swiotlb.h>
>> +#include <linux/dma-direct.h>
>>   
>>   #define IOMMU_READ	(1 << 0)
>>   #define IOMMU_WRITE	(1 << 1)
>> @@ -499,6 +501,39 @@ int iommu_sva_set_ops(struct iommu_sva *handle,
>>   		      const struct iommu_sva_ops *ops);
>>   int iommu_sva_get_pasid(struct iommu_sva *handle);
>>   
>> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
>> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
>> +			    phys_addr_t paddr, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs);
>> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
>> +			enum dma_data_direction dir, unsigned long attrs);
>> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
>> +		       enum dma_data_direction dir,
>> +		       enum dma_sync_target target);
>> +#else
>> +static inline
>> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
>> +			    phys_addr_t paddr, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs)
>> +{
>> +	return DMA_MAPPING_ERROR;
>> +}
>> +
>> +static inline
>> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
>> +			enum dma_data_direction dir, unsigned long attrs)
>> +{
>> +}
>> +
>> +static inline
>> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
>> +		       enum dma_data_direction dir, enum dma_sync_target target)
>> +{
>> +}
>> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
>> +
>>   #else /* CONFIG_IOMMU_API */
>>   
>>   struct iommu_ops {};
>> -- 
>> 2.17.1
>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/9] iommu: Add bounce page APIs
  2019-06-11 12:10   ` Pavel Begunkov
@ 2019-06-12  0:52     ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  0:52 UTC (permalink / raw)
  To: Pavel Begunkov, David Woodhouse, Joerg Roedel, Bjorn Helgaas,
	Christoph Hellwig
  Cc: alan.cox, Stefano Stabellini, ashok.raj, Jonathan Corbet,
	pengfei.xu, Ingo Molnar, kevin.tian, Konrad Rzeszutek Wilk,
	Steven Rostedt, Boris Ostrovsky, mika.westerberg, Alan Cox,
	Juergen Gross, Greg Kroah-Hartman, Mika Westerberg, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi Pavel,

Thanks a lot for your reviewing.

On 6/11/19 8:10 PM, Pavel Begunkov wrote:
> 
> 
> On 03/06/2019 04:16, Lu Baolu wrote:
>> IOMMU hardware always uses paging for DMA remapping.  The
>> minimum mapped window is a page size. Device drivers may
>> map buffers that do not fill a whole IOMMU window, which
>> allows the device to access possibly unrelated memory; a
>> malicious device can exploit this to perform a DMA attack.
>>
>> This introduces the bounce buffer mechanism for DMA buffers
>> which don't fill a minimal IOMMU page. It could be used
>> by various vendor-specific IOMMU drivers as long as the
>> DMA domain is managed by the generic IOMMU layer. The
>> following APIs are added:
>>
>> * iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
>>    - Map a buffer starting at DMA address @addr in bounce page
>>      manner. For buffer parts that don't cover a whole
>>      minimal IOMMU page, the bounce page policy is applied:
>>      a bounce page mapped by swiotlb will be used as the DMA
>>      target in the IOMMU page table. Otherwise, the physical
>>      address @paddr is mapped instead.
>>
>> * iommu_bounce_unmap(dev, addr, size, dir, attrs)
>>    - Unmap the buffer mapped with iommu_bounce_map(). The bounce
>>      page will be torn down after the bounced data gets synced.
>>
>> * iommu_bounce_sync(dev, addr, size, dir, target)
>>    - Sync the bounced data in case the bounce mapped buffer is
>>      reused.
>>
>> These APIs are guarded by a kernel option, IOMMU_BOUNCE_PAGE,
>> so they can be left out for cases where bounce pages aren't
>> needed, for example, embedded systems.
>>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Alan Cox <alan@linux.intel.com>
>> Cc: Mika Westerberg <mika.westerberg@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/Kconfig |  14 +++++
>>   drivers/iommu/iommu.c | 119 ++++++++++++++++++++++++++++++++++++++++++
>>   include/linux/iommu.h |  35 +++++++++++++
>>   3 files changed, 168 insertions(+)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 83664db5221d..d837ec3f359b 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -86,6 +86,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
>>   
>>   	  If unsure, say N here.
>>   
>> +config IOMMU_BOUNCE_PAGE
>> +	bool "Use bounce page for untrusted devices"
>> +	depends on IOMMU_API
>> +	select SWIOTLB
>> +	help
>> +	  IOMMU hardware always uses paging for DMA remapping. The minimum
>> +	  mapped window is a page size. The device drivers may map buffers
>> +	  that do not fill a whole IOMMU window. This allows a device to
>> +	  access possibly unrelated memory and a malicious device can exploit
>> +	  this to perform a DMA attack. Select this to use a bounce page for
>> +	  buffers which don't fill a whole IOMMU page.
>> +
>> +	  If unsure, say N here.
>> +
>>   config OF_IOMMU
>>          def_bool y
>>          depends on OF && IOMMU_API
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 2a906386bb8e..fa44f681a82b 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -2246,3 +2246,122 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>>   	return ops->sva_get_pasid(handle);
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
>> +
>> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
>> +
>> +/*
>> + * Bounce buffer support for external devices:
>> + *
>> + * IOMMU hardware always uses paging for DMA remapping. The minimum mapped
>> + * window is a page size. The device drivers may map buffers that do not
>> + * fill a whole IOMMU window. This allows a device to access possibly
>> + * unrelated memory and a malicious device can exploit this to perform a
>> + * DMA attack. Use bounce pages for buffers which don't fill whole IOMMU pages.
>> + */
>> +
>> +static inline size_t
>> +get_aligned_size(struct iommu_domain *domain, dma_addr_t addr, size_t size)
>> +{
>> +	unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
>> +	unsigned long offset = page_size - 1;
>> +
>> +	return ALIGN((addr & offset) + size, page_size);
>> +}
>> +
>> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
>> +			    phys_addr_t paddr, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs)
>> +{
>> +	struct iommu_domain *domain;
>> +	unsigned int min_pagesz;
>> +	phys_addr_t tlb_addr;
>> +	size_t aligned_size;
>> +	int prot = 0;
>> +	int ret;
>> +
>> +	domain = iommu_get_dma_domain(dev);
>> +	if (!domain)
>> +		return DMA_MAPPING_ERROR;
>> +
>> +	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
>> +		prot |= IOMMU_READ;
>> +	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
>> +		prot |= IOMMU_WRITE;
>> +
>> +	aligned_size = get_aligned_size(domain, paddr, size);
>> +	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
>> +
>> +	/*
>> +	 * If both the physical buffer start address and size are
>> +	 * page aligned, we don't need to use a bounce page.
>> +	 */
>> +	if (!IS_ALIGNED(paddr | size, min_pagesz)) {
>> +		tlb_addr = swiotlb_tbl_map_single(dev,
>> +				__phys_to_dma(dev, io_tlb_start),
>> +				paddr, size, aligned_size, dir, attrs);
>> +		if (tlb_addr == DMA_MAPPING_ERROR)
>> +			return DMA_MAPPING_ERROR;
>> +	} else {
>> +		tlb_addr = paddr;
>> +	}
>> +
>> +	ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
>> +	if (ret) {
>> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
>> +					 aligned_size, dir, attrs);
> You would probably want to check whether @tlb_addr came from
> swiotlb_tbl_map_single() (is_swiotlb_buffer() or reuse the predicate above).

Right. Good catch.
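
For clarity, a sketch of the error path with that check folded in (the
shape of the fix, not the final code):

ret = iommu_map(domain, iova, tlb_addr, aligned_size, prot);
if (ret) {
	/* Only tear down the bounce slot if one was actually used. */
	if (is_swiotlb_buffer(tlb_addr))
		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
					 aligned_size, dir, attrs);
	return DMA_MAPPING_ERROR;
}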

> 
> 
>> +		return DMA_MAPPING_ERROR;
>> +	}
>> +
>> +	return iova;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bounce_map);
>> +
>> +static inline phys_addr_t
>> +iova_to_tlb_addr(struct iommu_domain *domain, dma_addr_t addr)
>> +{
>> +	if (unlikely(!domain->ops || !domain->ops->iova_to_phys))
>> +		return 0;
>> +
>> +	return domain->ops->iova_to_phys(domain, addr);
>> +}
>> +
>> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
>> +			enum dma_data_direction dir, unsigned long attrs)
>> +{
>> +	struct iommu_domain *domain;
>> +	phys_addr_t tlb_addr;
>> +	size_t aligned_size;
>> +
>> +	domain = iommu_get_dma_domain(dev);
>> +	if (WARN_ON(!domain))
>> +		return;
>> +
>> +	aligned_size = get_aligned_size(domain, iova, size);
>> +	tlb_addr = iova_to_tlb_addr(domain, iova);
>> +	if (WARN_ON(!tlb_addr))
>> +		return;
>> +
>> +	iommu_unmap(domain, iova, aligned_size);
>> +	if (is_swiotlb_buffer(tlb_addr))
> 
> Is there any chance this @tlb_addr is a swiotlb buffer, but one owned
> by an API user? I mean something like
> iommu_bounce_map(swiotlb_tbl_map_single()).
> 
> Then, to retain the ownership semantics, we shouldn't unmap it. Maybe
> check and fail iommu_bounce_map() to be sure?

For untrusted devices, we always let the IOMMU enforce DMA isolation.
As far as I can see, there is no case where @tlb_addr is a swiotlb
buffer owned by a device driver.
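
For completeness, the check being discussed would be a one-line guard
near the top of iommu_bounce_map(); a sketch only, not something this
series adds:

/* Refuse buffers that already live in the swiotlb pool, so ownership
 * of a caller-mapped bounce slot is never silently taken over. */
if (is_swiotlb_buffer(paddr))
	return DMA_MAPPING_ERROR;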

Best regards,
Baolu

> 
> 
>> +		swiotlb_tbl_unmap_single(dev, tlb_addr, size,
>> +					 aligned_size, dir, attrs);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bounce_unmap);
>> +
>> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
>> +		       enum dma_data_direction dir, enum dma_sync_target target)
>> +{
>> +	struct iommu_domain *domain;
>> +	phys_addr_t tlb_addr;
>> +
>> +	domain = iommu_get_dma_domain(dev);
>> +	if (WARN_ON(!domain))
>> +		return;
>> +
>> +	tlb_addr = iova_to_tlb_addr(domain, addr);
>> +	if (is_swiotlb_buffer(tlb_addr))
>> +		swiotlb_tbl_sync_single(dev, tlb_addr, size, dir, target);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bounce_sync);
>> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 91af22a344e2..814c0da64692 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -25,6 +25,8 @@
>>   #include <linux/errno.h>
>>   #include <linux/err.h>
>>   #include <linux/of.h>
>> +#include <linux/swiotlb.h>
>> +#include <linux/dma-direct.h>
>>   
>>   #define IOMMU_READ	(1 << 0)
>>   #define IOMMU_WRITE	(1 << 1)
>> @@ -499,6 +501,39 @@ int iommu_sva_set_ops(struct iommu_sva *handle,
>>   		      const struct iommu_sva_ops *ops);
>>   int iommu_sva_get_pasid(struct iommu_sva *handle);
>>   
>> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
>> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
>> +			    phys_addr_t paddr, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs);
>> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
>> +			enum dma_data_direction dir, unsigned long attrs);
>> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
>> +		       enum dma_data_direction dir,
>> +		       enum dma_sync_target target);
>> +#else
>> +static inline
>> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
>> +			    phys_addr_t paddr, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs)
>> +{
>> +	return DMA_MAPPING_ERROR;
>> +}
>> +
>> +static inline
>> +void iommu_bounce_unmap(struct device *dev, dma_addr_t iova, size_t size,
>> +			enum dma_data_direction dir, unsigned long attrs)
>> +{
>> +}
>> +
>> +static inline
>> +void iommu_bounce_sync(struct device *dev, dma_addr_t addr, size_t size,
>> +		       enum dma_data_direction dir, enum dma_sync_target target)
>> +{
>> +}
>> +#endif /* CONFIG_IOMMU_BOUNCE_PAGE */
>> +
>>   #else /* CONFIG_IOMMU_API */
>>   
>>   struct iommu_ops {};
>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device
  2019-06-12  0:43     ` Lu Baolu
@ 2019-06-12  1:05       ` Konrad Rzeszutek Wilk
  2019-06-12  3:08         ` Lu Baolu
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-06-12  1:05 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, mika.westerberg,
	ashok.raj, Jonathan Corbet, alan.cox, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Bjorn Helgaas,
	Boris Ostrovsky, David Woodhouse, Christoph Hellwig

On Wed, Jun 12, 2019 at 08:43:40AM +0800, Lu Baolu wrote:
> Hi Konrad,
> 
> Thanks a lot for your reviewing.
> 
> On 6/10/19 11:45 PM, Konrad Rzeszutek Wilk wrote:
> > On Mon, Jun 03, 2019 at 09:16:14AM +0800, Lu Baolu wrote:
> > > This is necessary to avoid exposing valid kernel data to any
> > > milicious device.
> > 
> > malicious
> 
> Yes, thanks.
> 
> > 
> > > 
> > > Suggested-by: Christoph Hellwig <hch@lst.de>
> > > Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> > > ---
> > >   kernel/dma/swiotlb.c | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > > index f956f785645a..ed41eb7f6131 100644
> > > --- a/kernel/dma/swiotlb.c
> > > +++ b/kernel/dma/swiotlb.c
> > > @@ -35,6 +35,7 @@
> > >   #include <linux/scatterlist.h>
> > >   #include <linux/mem_encrypt.h>
> > >   #include <linux/set_memory.h>
> > > +#include <linux/pci.h>
> > >   #ifdef CONFIG_DEBUG_FS
> > >   #include <linux/debugfs.h>
> > >   #endif
> > > @@ -560,6 +561,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
> > >   	 */
> > >   	for (i = 0; i < nslots; i++)
> > >   		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
> > > +
> > > +	/* Zero out the bounce buffer if the consumer is untrusted. */
> > > +	if (dev_is_untrusted(hwdev))
> > > +		memset(phys_to_virt(tlb_addr), 0, alloc_size);
> > 
> > What if the alloc_size is less than a PAGE? Should this at least have ALIGN or such?
> 
> It's the consumer (iommu subsystem) who requires this to be page
> aligned. For swiotlb, it just clears out all data in the allocated
> bounce buffer.

I am thinking that if you don't memset the full page, the malicious hardware could read stale data from the rest of the page
that hasn't been cleared?
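
A minimal sketch of the ALIGN variant being asked about (illustrative
only; it assumes @tlb_addr is page aligned, which holds for the
bounce-page caller but not for swiotlb users in general):

/* Zero the full page(s) backing the slot, not just alloc_size. */
if (dev_is_untrusted(hwdev))
	memset(phys_to_virt(tlb_addr), 0,
	       ALIGN(alloc_size, PAGE_SIZE));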

> 
> Best regards,
> Baolu
> 
> > 
> > > +
> > >   	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> > >   	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
> > >   		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
> > > -- 
> > > 2.17.1
> > > 
> > 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma
  2019-06-10 15:54   ` Konrad Rzeszutek Wilk
@ 2019-06-12  2:03     ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  2:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman,
	Mika Westerberg, linux-kernel, iommu, jacob.jun.pan,
	Robin Murphy

Hi,

On 6/10/19 11:54 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:16AM +0800, Lu Baolu wrote:
>> The direct dma implementation depends on swiotlb. Hence, don't
>> switch of swiotlb since direct dma interfaces are used in this
> 
> s/of/off/

Yes.

> 
>> driver.
> 
> But I think you really want to leave the code as is but change
> the #ifdef to check for IOMMU_BOUNCE_PAGE and not CONFIG_SWIOTLB.
> 
> As one could disable IOMMU_BOUNCE_PAGE.

SWIOTLB is not only used when IOMMU_BOUNCE_PAGE is enabled.

Please check this code:

static dma_addr_t intel_map_page(struct device *dev, struct page *page,
                                  unsigned long offset, size_t size,
                                  enum dma_data_direction dir,
                                  unsigned long attrs)
{
         if (iommu_need_mapping(dev))
                 return __intel_map_single(dev, page_to_phys(page) + offset,
                                 size, dir, *dev->dma_mask);
         return dma_direct_map_page(dev, page, offset, size, dir, attrs);
}

The dma_direct_map_page() will eventually call swiotlb_map() if the
buffer is beyond the address capability of the device.
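
A simplified sketch of that fallback (not the actual dma-direct code;
swiotlb_bounce_map_sketch() is a stand-in for the real swiotlb mapping
helper):

static dma_addr_t direct_map_sketch(struct device *dev, phys_addr_t phys,
				    size_t size, enum dma_data_direction dir,
				    unsigned long attrs)
{
	dma_addr_t dma_addr = __phys_to_dma(dev, phys);

	/* Physical address not reachable by the device: bounce via swiotlb. */
	if (!dma_capable(dev, dma_addr, size))
		return swiotlb_bounce_map_sketch(dev, phys, size, dir, attrs);

	return dma_addr;
}

So even with bounce pages disabled, SWIOTLB has to stay initialized for
devices with a narrow DMA mask.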

Best regards,
Baolu

>>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Mika Westerberg <mika.westerberg@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/intel-iommu.c | 6 ------
>>   1 file changed, 6 deletions(-)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index d5a6c8064c56..235837c50719 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -4625,9 +4625,6 @@ static int __init platform_optin_force_iommu(void)
>>   		iommu_identity_mapping |= IDENTMAP_ALL;
>>   
>>   	dmar_disabled = 0;
>> -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
>> -	swiotlb = 0;
>> -#endif
>>   	no_iommu = 0;
>>   
>>   	return 1;
>> @@ -4765,9 +4762,6 @@ int __init intel_iommu_init(void)
>>   	}
>>   	up_write(&dmar_global_lock);
>>   
>> -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
>> -	swiotlb = 0;
>> -#endif
>>   	dma_ops = &intel_dma_ops;
>>   
>>   	init_iommu_pm_ops();
>> -- 
>> 2.17.1
>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer
  2019-06-10 16:08   ` Konrad Rzeszutek Wilk
@ 2019-06-12  2:22     ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  2:22 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi,

On 6/11/19 12:08 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:17AM +0800, Lu Baolu wrote:
>> This adds a helper to check whether a device needs to use a bounce
>> buffer. It also provides a boot-time option to disable the bounce
>> buffer. Users can use this to prevent the iommu driver from using
>> bounce buffers, trading isolation for a performance gain.
>> 
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> Tested-by: Xu Pengfei <pengfei.xu@intel.com>
>> Tested-by: Mika Westerberg <mika.westerberg@intel.com>
>> ---
>>  Documentation/admin-guide/kernel-parameters.txt | 5 +++++
>>  drivers/iommu/intel-iommu.c                     | 6 ++++++
>>  2 files changed, 11 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 138f6664b2e2..65685c6e53e4 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1728,6 +1728,11 @@
>>  			Note that using this option lowers the security
>>  			provided by tboot because it makes the system
>>  			vulnerable to DMA attacks.
>>
>> +		nobounce [Default off]
>> +			Do not use the bounce buffer for untrusted devices like
>> +			the Thunderbolt devices. This will treat the untrusted
> 
> My brain sometimes has a hard time parsing 'Not' and 'un'. Could this
> be:
> 
> Disable bounce buffer for untrusted devices ..?
> 

Fair enough.

> 
> And perhaps call it 'noswiotlb' ? Not everyone knows that SWIOTLB =
> bounce buffer.

As I said in the previous thread, swiotlb is not only used for the
BOUNCE_PAGE case, but also by the direct DMA APIs. Would 'noswiotlb'
cause confusion there?

Anyway, I have no strong feeling about using 'nobounce' vs. 'noswiotlb'.
It's a driver-specific switch for debugging purposes. People suggested
that we should move this switch into the PCI module, but I heard that
it's more helpful to implement a per-device switch for "trusted" or
"untrusted". So I kept this untouched in this version.

> 
>> +			devices as the trusted ones, hence might expose security
>> +			risks of DMA attacks.
>>
>>  	intel_idle.max_cstate=	[KNL,HW,ACPI,X86]
>>  			0	disables intel_idle and fall back on acpi_idle.
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 235837c50719..41439647f75d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -371,6 +371,7 @@ static int dmar_forcedac;
>>  static int intel_iommu_strict;
>>  static int intel_iommu_superpage = 1;
>>  static int iommu_identity_mapping;
>> +static int intel_no_bounce;
> 
> intel_swiotlb_on = 1 ?
> 
>> 
>>  #define IDENTMAP_ALL		1
>>  #define IDENTMAP_GFX		2
>> @@ -384,6 +385,8 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
>>  static DEFINE_SPINLOCK(device_domain_lock);
>>  static LIST_HEAD(device_domain_list);
>>  
>> +#define device_needs_bounce(d) (!intel_no_bounce && dev_is_untrusted(d))
>> +
>>  /*
>>   * Iterate over elements in device_domain_list and call the specified
>>   * callback @fn against each element.
>> @@ -466,6 +469,9 @@ static int __init intel_iommu_setup(char *str)
>>  			printk(KERN_INFO
>>  				"Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n");
>>  			intel_iommu_tboot_noforce = 1;
>> +		} else if (!strncmp(str, "nobounce", 8)) {
>> +			pr_info("Intel-IOMMU: No bounce buffer. This could expose security risks of DMA attacks\n");
> 
> Again, Intel-IOMMU: No SWIOTLB. T.. blah blah'
> 
> Asking for this as doing 'dmesg | grep SWIOTLB' will expose nicely
> all the SWIOTLB invocations..

Yes. Will refine this.

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap
  2019-06-10 16:08   ` Konrad Rzeszutek Wilk
@ 2019-06-12  2:31     ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  2:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi,

On 6/11/19 12:08 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:18AM +0800, Lu Baolu wrote:
>> This adds trace support for the Intel IOMMU driver. It
>> also declares some events which can be used to trace
>> when an IOVA is being mapped or unmapped in a domain.
> 
> Is that even needed considering SWIOTLB also has tracing events?
> 

Currently there isn't any trace point in swiotlb_tbl_map_single().
If we add one there, I hope we can distinguish the bounce page events
from other use cases (such as bounce buffers for direct DMA), so that
we can calculate what percentage of the DMA buffers used by a specific
device driver needs a bounce page.
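
For illustration, a sketch of what a bounce-specific trace event could
look like; the event and field names are hypothetical, not from this
series:

TRACE_EVENT(bounce_map_single,
	TP_PROTO(struct device *dev, dma_addr_t dev_addr, size_t size),

	TP_ARGS(dev, dev_addr, size),

	TP_STRUCT__entry(
		__string(dev_name, dev_name(dev))
		__field(dma_addr_t, dev_addr)
		__field(size_t, size)
	),

	TP_fast_assign(
		__assign_str(dev_name, dev_name(dev));
		__entry->dev_addr = dev_addr;
		__entry->size = size;
	),

	TP_printk("dev=%s dev_addr=0x%llx size=%zu",
		  __get_str(dev_name),
		  (unsigned long long)__entry->dev_addr,
		  __entry->size)
);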

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices
  2019-06-10 15:42 ` [PATCH v4 0/9] iommu: Bounce page " Konrad Rzeszutek Wilk
@ 2019-06-12  3:00   ` Lu Baolu
  2019-06-12  6:22     ` Mika Westerberg
  0 siblings, 1 reply; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  3:00 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi,

On 6/10/19 11:42 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:11AM +0800, Lu Baolu wrote:
>> The Thunderbolt vulnerabilities are public and have a nice
>> name as Thunderclap [1] [3] nowadays. This patch series aims
>> to mitigate those concerns.
>>
>> An external PCI device is a PCI peripheral device connected
>> to the system through an external bus, such as Thunderbolt.
>> What makes it different is that it can't be trusted to the
>> same degree as the devices build into the system. Generally,
>> a trusted PCIe device will DMA into the designated buffers
>> and not overrun or otherwise write outside the specified
>> bounds. But it's different for an external device.
>>
>> The minimum IOMMU mapping granularity is one page (4k), so
>> for DMA transfers smaller than that a malicious PCIe device
>> can access the whole page of memory even if it does not
>> belong to the driver in question. This opens a possibility
>> for DMA attack. For more information about DMA attacks
>> imposed by an untrusted PCI/PCIe device, please refer to [2].
>>
>> This implements bounce buffer for the untrusted external
>> devices. The transfers should be limited in isolated pages
>> so the IOMMU window does not cover memory outside of what
>> the driver expects. Previously (v3 and before), we proposed
>> an optimisation to only copy the head and tail of the buffer
>> if it spans multiple pages, and directly map the ones in the
>> middle. Figure 1 gives a big picture about this solution.
>>
>>                                  swiotlb             System
>>                  IOVA          bounce page           Memory
>>               .---------.      .---------.        .---------.
>>               |         |      |         |        |         |
>>               |         |      |         |        |         |
>> buffer_start .---------.      .---------.        .---------.
>>               |         |----->|         |*******>|         |
>>               |         |      |         | swiotlb|         |
>>               |         |      |         | mapping|         |
>>   IOMMU Page  '---------'      '---------'        '---------'
>>    Boundary   |         |                         |         |
>>               |         |                         |         |
>>               |         |                         |         |
>>               |         |------------------------>|         |
>>               |         |    IOMMU mapping        |         |
>>               |         |                         |         |
>>   IOMMU Page  .---------.                         .---------.
>>    Boundary   |         |                         |         |
>>               |         |                         |         |
>>               |         |------------------------>|         |
>>               |         |     IOMMU mapping       |         |
>>               |         |                         |         |
>>               |         |                         |         |
>>   IOMMU Page  .---------.      .---------.        .---------.
>>    Boundary   |         |      |         |        |         |
>>               |         |      |         |        |         |
>>               |         |----->|         |*******>|         |
>>    buffer_end '---------'      '---------' swiotlb'---------'
>>               |         |      |         | mapping|         |
>>               |         |      |         |        |         |
>>               '---------'      '---------'        '---------'
>>            Figure 1: A big view of iommu bounce page
>>
>> As Robin Murphy pointed out, this ties us to using strict mode for
>> TLB maintenance, which may not be an overall win depending on the
>> balance between invalidation bandwidth vs. memcpy bandwidth. If we
>> use standard SWIOTLB logic to always copy the whole thing, we should
>> be able to release the bounce pages via the flush queue to allow
>> 'safe' lazy unmaps. So since v4 we start to use the standard swiotlb
>> logic.
>>
>>                                  swiotlb             System
>>                  IOVA          bounce page           Memory
>> buffer_start .---------.      .---------.        .---------.
>>               |         |      |         |        |         |
>>               |         |      |         |        |         |
>>               |         |      |         |        .---------.physical
>>               |         |----->|         | ------>|         |_start
>>               |         |iommu |         | swiotlb|         |
>>               |         | map  |         |   map  |         |
>>   IOMMU Page  .---------.      .---------.        '---------'
> 
> The prior picture had 'buffer_start' at an offset in the page. I am
> assuming you meant that here as well?

In the prior picture, since we only used bounce buffers for the head
and tail partial-page pieces, we needed to return buffer_start at the
same offset as the physical buffer.

Here, the whole buffer goes through a swiotlb bounce buffer, so
buffer_start uses the offset of the bounce buffer itself (i.e., offset = 0).
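
A worked example may help (illustrative numbers only, using the same
rounding as get_aligned_size() in patch 4 and a 4 KiB minimal IOMMU
page):

static size_t bounce_offset_example(void)
{
	phys_addr_t paddr = 0x12340200;		/* sub-page offset 0x200 */
	size_t size = 0x1800;			/* 6 KiB transfer */
	unsigned long page_size = 0x1000;

	size_t aligned_size = ALIGN((paddr & (page_size - 1)) + size,
				    page_size);

	/*
	 * aligned_size == 0x2000: two whole IOMMU pages are bounced.
	 * The data is copied starting at offset 0 of the bounce buffer,
	 * so the returned IOVA is page aligned instead of carrying the
	 * original 0x200 offset, unlike the v3 scheme.
	 */
	return aligned_size;
}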

> 
> Meaning it starts at the same offset as 'physical_start' in the right
> side box?
> 
>>    Boundary   |         |      |         |        |         |
>>               |         |      |         |        |         |
>>               |         |----->|         |        |         |
>>               |         |iommu |         |        |         |
>>               |         | map  |         |        |         |
>>               |         |      |         |        |         |
>>   IOMMU Page  .---------.      .---------.        .---------.
>>    Boundary   |         |      |         |        |         |
>>               |         |----->|         |        |         |
>>               |         |iommu |         |        |         |
>>               |         | map  |         |        |         |
>>               |         |      |         |        |         |
>>   IOMMU Page  |         |      |         |        |         |
>>    Boundary   .---------.      .---------.        .---------.
>>               |         |      |         |------->|         |
>>    buffer_end '---------'      '---------' swiotlb|         |
>>               |         |----->|         |   map  |         |
>>               |         |iommu |         |        |         |
>>               |         | map  |         |        '---------' physical
>>               |         |      |         |        |         | _end
>>               '---------'      '---------'        '---------'
>>            Figure 2: A big view of simplified iommu bounce page
>>
>> The implementation of bounce buffers for untrusted devices
>> will cause a little performance overhead, but we didn't see
>> any user experience problems. The users could use the kernel
> 
> What kind of devices did you test it with?

Most test work was done by Xu Pengfei (cc'ed). He has run the code
on real platforms with various thunderbolt peripherals (usb, disk,
network, etc.).

> 
> Thank you for making this awesome cover letter btw!
> 

You are welcome.

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices
  2019-06-10 16:10 ` Konrad Rzeszutek Wilk
@ 2019-06-12  3:04   ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  3:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi,

On 6/11/19 12:10 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 03, 2019 at 09:16:11AM +0800, Lu Baolu wrote:
>> The Thunderbolt vulnerabilities are public and have a nice
>> name as Thunderclap [1] [3] nowadays. This patch series aims
>> to mitigate those concerns.
> 
> .. Forgot to ask but should the patches also include the CVE number?
> Or at least the last one that enables this?

I am sorry, but what's a CVE number and where could I get one?

Best regards,
Baolu
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device
  2019-06-12  1:05       ` Konrad Rzeszutek Wilk
@ 2019-06-12  3:08         ` Lu Baolu
  0 siblings, 0 replies; 31+ messages in thread
From: Lu Baolu @ 2019-06-12  3:08 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: alan.cox, Christoph Hellwig, Stefano Stabellini, ashok.raj,
	Jonathan Corbet, pengfei.xu, Ingo Molnar, David Woodhouse,
	kevin.tian, Steven Rostedt, Bjorn Helgaas, Boris Ostrovsky,
	mika.westerberg, Juergen Gross, Greg Kroah-Hartman, linux-kernel,
	iommu, jacob.jun.pan, Robin Murphy

Hi,

On 6/12/19 9:05 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Jun 12, 2019 at 08:43:40AM +0800, Lu Baolu wrote:
>> Hi Konrad,
>>
>> Thanks a lot for your reviewing.
>>
>> On 6/10/19 11:45 PM, Konrad Rzeszutek Wilk wrote:
>>> On Mon, Jun 03, 2019 at 09:16:14AM +0800, Lu Baolu wrote:
>>>> This is necessary to avoid exposing valid kernel data to any
>>>> milicious device.
>>>
>>> malicious
>>
>> Yes, thanks.
>>
>>>
>>>>
>>>> Suggested-by: Christoph Hellwig <hch@lst.de>
>>>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>>>> ---
>>>>    kernel/dma/swiotlb.c | 6 ++++++
>>>>    1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>>>> index f956f785645a..ed41eb7f6131 100644
>>>> --- a/kernel/dma/swiotlb.c
>>>> +++ b/kernel/dma/swiotlb.c
>>>> @@ -35,6 +35,7 @@
>>>>    #include <linux/scatterlist.h>
>>>>    #include <linux/mem_encrypt.h>
>>>>    #include <linux/set_memory.h>
>>>> +#include <linux/pci.h>
>>>>    #ifdef CONFIG_DEBUG_FS
>>>>    #include <linux/debugfs.h>
>>>>    #endif
>>>> @@ -560,6 +561,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>>>>    	 */
>>>>    	for (i = 0; i < nslots; i++)
>>>>    		io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
>>>> +
>>>> +	/* Zero out the bounce buffer if the consumer is untrusted. */
>>>> +	if (dev_is_untrusted(hwdev))
>>>> +		memset(phys_to_virt(tlb_addr), 0, alloc_size);
>>>
>>> What if the alloc_size is less than a PAGE? Should this at least have ALIGN or such?
>>
>> It's the consumer (iommu subsystem) who requires this to be page
>> aligned. For swiotlb, it just clears out all data in the allocated
>> bounce buffer.
> 
> I am thinking that the if you don't memset the full page the malicious hardware could read stale date from the rest of the page
> that hasn't been cleared?

Yes. My point is that this should be guaranteed by the bounce page
implementation in iommu.
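
In other words, a recap of get_aligned_size() from patch 4: the iommu
layer always rounds the allocation up to whole IOMMU pages before
calling swiotlb_tbl_map_single(), so clearing alloc_size already covers
the entire page(s) exposed to the device:

unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);

/* Always a multiple of the minimal IOMMU page size. */
size_t alloc_size = ALIGN((paddr & (page_size - 1)) + size, page_size);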

Best regards,
Baolu

> 
>>
>> Best regards,
>> Baolu
>>
>>>
>>>> +
>>>>    	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
>>>>    	    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
>>>>    		swiotlb_bounce(orig_addr, tlb_addr, mapping_size, DMA_TO_DEVICE);
>>>> -- 
>>>> 2.17.1
>>>>
>>>
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices
  2019-06-12  3:00   ` Lu Baolu
@ 2019-06-12  6:22     ` Mika Westerberg
  0 siblings, 0 replies; 31+ messages in thread
From: Mika Westerberg @ 2019-06-12  6:22 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Juergen Gross, kevin.tian, Stefano Stabellini, ashok.raj,
	Konrad Rzeszutek Wilk, alan.cox, Jonathan Corbet, Robin Murphy,
	Steven Rostedt, linux-kernel, iommu, pengfei.xu, Ingo Molnar,
	jacob.jun.pan, Greg Kroah-Hartman, Bjorn Helgaas,
	Boris Ostrovsky, David Woodhouse, Christoph Hellwig

On Wed, Jun 12, 2019 at 11:00:06AM +0800, Lu Baolu wrote:
> > What kind of devices did you test it with?
> 
> Most test work was done by Xu Pengfei (cc'ed). He has run the code
> on real platforms with various thunderbolt peripherals (usb, disk,
> network, etc.).

In addition to that, we are also working on building a real Thunderclap
platform to verify that it can only access the bounce buffer when the
DMA transfer targets memory that does not fill a whole page.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2019-06-12  6:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-03  1:16 [PATCH v4 0/9] iommu: Bounce page for untrusted devices Lu Baolu
2019-06-03  1:16 ` [PATCH v4 1/9] PCI: Add dev_is_untrusted helper Lu Baolu
2019-06-03  1:16 ` [PATCH v4 2/9] swiotlb: Split size parameter to map/unmap APIs Lu Baolu
2019-06-03  1:16 ` [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device Lu Baolu
2019-06-10 15:45   ` Konrad Rzeszutek Wilk
2019-06-12  0:43     ` Lu Baolu
2019-06-12  1:05       ` Konrad Rzeszutek Wilk
2019-06-12  3:08         ` Lu Baolu
2019-06-03  1:16 ` [PATCH v4 4/9] iommu: Add bounce page APIs Lu Baolu
2019-06-10 15:56   ` Konrad Rzeszutek Wilk
2019-06-12  0:45     ` Lu Baolu
2019-06-11 12:10   ` Pavel Begunkov
2019-06-12  0:52     ` Lu Baolu
2019-06-03  1:16 ` [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma Lu Baolu
2019-06-10 15:54   ` Konrad Rzeszutek Wilk
2019-06-12  2:03     ` Lu Baolu
2019-06-03  1:16 ` [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer Lu Baolu
2019-06-10 16:08   ` Konrad Rzeszutek Wilk
2019-06-12  2:22     ` Lu Baolu
2019-06-03  1:16 ` [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap Lu Baolu
2019-06-04  9:01   ` Steven Rostedt
2019-06-05  6:48     ` Lu Baolu
2019-06-10 16:08   ` Konrad Rzeszutek Wilk
2019-06-12  2:31     ` Lu Baolu
2019-06-03  1:16 ` [PATCH v4 8/9] iommu/vt-d: Code refactoring for bounce map and unmap Lu Baolu
2019-06-03  1:16 ` [PATCH v4 9/9] iommu/vt-d: Use bounce buffer for untrusted devices Lu Baolu
2019-06-10 15:42 ` [PATCH v4 0/9] iommu: Bounce page " Konrad Rzeszutek Wilk
2019-06-12  3:00   ` Lu Baolu
2019-06-12  6:22     ` Mika Westerberg
2019-06-10 16:10 ` Konrad Rzeszutek Wilk
2019-06-12  3:04   ` Lu Baolu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).