* [PATCH v2 1/1] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
@ 2022-03-05 17:07 ` Halil Pasic
  0 siblings, 0 replies; 10+ messages in thread
From: Halil Pasic @ 2022-03-05 17:07 UTC (permalink / raw)
  To: Christoph Hellwig, Marek Szyprowski, Robin Murphy
  Cc: Halil Pasic, stable, Doug Gilbert, James E.J. Bottomley,
	Martin K. Petersen, Christian Borntraeger, Anshuman Khandual,
	Thiago Jung Bauermann, Tom Lendacky, Michael S. Tsirkin,
	Linus Torvalds, iommu, linux-doc

Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph,
the swiotlb maintainer, asked me to create an incremental fix after I
pointed out the mix-up and asked him for guidance. So here we go.

The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
  must take precedence over DMA_ATTR_SKIP_CPU_SYNC

Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.

Let me note that if the size used with the dma_sync_* API is less than
the size used with dma_[un]map_*, under certain circumstances we may
still end up with swiotlb not being transparent. In that sense, this is
not a perfect fix either.

To make this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable@vger.kernel.org

---

I just realized that there are still scenarios where swiotlb may produce
some strange effects, so I don't think we have discussed the
dma_sync_* part in sufficient detail.

v1 -> v2:
* single patch instead of revert + right version
---
 Documentation/core-api/dma-attributes.rst |  8 --------
 include/linux/dma-mapping.h               |  8 --------
 kernel/dma/swiotlb.c                      | 23 +++++++++++++++--------
 3 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 17706dc91ec9..1887d92e8e92 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -130,11 +130,3 @@ accesses to DMA buffers in both privileged "supervisor" and unprivileged
 subsystem that the buffer is fully accessible at the elevated privilege
 level (and ideally inaccessible or at least read-only at the
 lesser-privileged levels).
-
-DMA_ATTR_OVERWRITE
-------------------
-
-This is a hint to the DMA-mapping subsystem that the device is expected to
-overwrite the entire mapped size, thus the caller does not require any of the
-previous buffer contents to be preserved. This allows bounce-buffering
-implementations to optimise DMA_FROM_DEVICE transfers.
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6150d11a607e..dca2b1355bb1 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -61,14 +61,6 @@
  */
 #define DMA_ATTR_PRIVILEGED		(1UL << 9)
 
-/*
- * This is a hint to the DMA-mapping subsystem that the device is expected
- * to overwrite the entire mapped size, thus the caller does not require any
- * of the previous buffer contents to be preserved. This allows
- * bounce-buffering implementations to optimise DMA_FROM_DEVICE transfers.
- */
-#define DMA_ATTR_OVERWRITE		(1UL << 10)
-
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform.  It can
  * be given to a device to use as a DMA source or target.  It is specific to a
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index bfc56cb21705..6db1c475ec82 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -627,10 +627,14 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 	for (i = 0; i < nr_slots(alloc_size + offset); i++)
 		mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
 	tlb_addr = slot_addr(mem->start, index) + offset;
-	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-	    (!(attrs & DMA_ATTR_OVERWRITE) || dir == DMA_TO_DEVICE ||
-	    dir == DMA_BIDIRECTIONAL))
-		swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
+	/*
+	 * When dir == DMA_FROM_DEVICE we could omit the copy from the orig
+	 * to the tlb buffer, if we knew for sure the device will
+	 * overwrite the entire current content. But we don't. Thus
+	 * unconditional bounce may prevent leaking swiotlb content (i.e.
+	 * kernel memory) to user-space.
+	 */
+	swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
 	return tlb_addr;
 }
 
@@ -697,10 +701,13 @@ void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
 void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
 		size_t size, enum dma_data_direction dir)
 {
-	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
-		swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
-	else
-		BUG_ON(dir != DMA_FROM_DEVICE);
+	/*
+	 * Unconditional bounce is necessary to avoid corruption on
+	 * sync_*_for_cpu or dma_unmap_* when the device didn't overwrite
+	 * the whole length of the bounce buffer.
+	 */
+	swiotlb_bounce(dev, tlb_addr, size, DMA_TO_DEVICE);
+	BUG_ON(!valid_dma_direction(dir));
 }
 
 void swiotlb_sync_single_for_cpu(struct device *dev, phys_addr_t tlb_addr,

base-commit: 38f80f42147ff658aff218edb0a88c37e58bf44f
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/1] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
  2022-03-05 17:07 ` Halil Pasic
@ 2022-03-05 19:44   ` Linus Torvalds
  -1 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2022-03-05 19:44 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, stable,
	Doug Gilbert, James E.J. Bottomley, Martin K. Petersen,
	Christian Borntraeger, Anshuman Khandual, Thiago Jung Bauermann,
	Tom Lendacky, Michael S. Tsirkin, iommu, open list:DOCUMENTATION

On Sat, Mar 5, 2022 at 9:07 AM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> Unfortunately, we ended up merging an old version of the patch "fix info
> leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph,
> the swiotlb maintainer, asked me to create an incremental fix after I
> pointed out the mix-up and asked him for guidance. So here we go.
 [...]
 [...]

Christoph, I am assuming I'll get this from you, but if you have
nothing else pending, just holler and I can take it directly.

That said, it seems sad to bounce the buffer just in case the device
doesn't do what we expect it to do. Wouldn't it be better to just
clear the buffer instead of copying the old data into it?

Also, possibly stupid question - when is swiotlb used in practice
these days? What are the performance implications of this? Will this
mean completely unnecessary copies for all normal IO that will be
overwritten by the DMA? Can't we limit it to just SG_IO (or is it
already for some reason)?

                  Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/1] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
  2022-03-05 19:44   ` Linus Torvalds
@ 2022-03-06  7:58     ` Christoph Hellwig
  -1 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2022-03-06  7:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Halil Pasic, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	stable, Doug Gilbert, James E.J. Bottomley, Martin K. Petersen,
	Christian Borntraeger, Anshuman Khandual, Thiago Jung Bauermann,
	Tom Lendacky, Michael S. Tsirkin, iommu, open list:DOCUMENTATION

On Sat, Mar 05, 2022 at 11:44:51AM -0800, Linus Torvalds wrote:
> Christoph, I am assuming I'll get this from you, but if you have
> nothing else pending, just holler and I can take it directly.

I have nothing else pending, so feel free to take it directly:

Reviewed-by: Christoph Hellwig <hch@lst.de>

> That said, it seems sad to bounce the buffer just in case the device
> doesn't do what we expect it to do. Wouldn't it be better to just
> clear the buffer instead of copying the old data into it?

That unfortunately does not always work, as we have interfaces
where we expect data that is not written to by the device to remain
unchanged, and there are many cases where the device does not write
the whole buffer.  In various places like SCSI the data transfer
can be smaller, and even more common is the case of no data transfer
at all in error cases.

> Also, possibly stupid question - when is swiotlb used in practice
> these days? What are the performance implications of this? Will this
> mean completely unnecessary copies for all normal IO that will be
> overwritten by the DMA? Can't we limit it to just SG_IO (or is it
> already for some reason)?

There are three use cases:
 - bounce buffering due to address limitations when no IOMMU is present
 - bounce buffering for non-page-aligned I/O on IOMMUs if the device
   is considered untrusted (which right now means thunderbolt-attached)
 - unconditional bounce buffering for trusted hypervisor schemes

All of these apply to all I/O.  SG_IO is just a vector here as a
particularly badly designed userspace interface that happens to get
decent test coverage.  There unfortunately are plenty of other ways to
expose this issue as well.  We're thinking of building interfaces to
avoid the overhead for well designed interfaces, but it will take some
time.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/1] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
  2022-03-06  7:58     ` Christoph Hellwig
@ 2022-03-06 20:30       ` Linus Torvalds
  -1 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2022-03-06 20:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Halil Pasic, Marek Szyprowski, Robin Murphy, stable,
	Doug Gilbert, James E.J. Bottomley, Martin K. Petersen,
	Christian Borntraeger, Anshuman Khandual, Thiago Jung Bauermann,
	Tom Lendacky, Michael S. Tsirkin, iommu, open list:DOCUMENTATION

On Sat, Mar 5, 2022 at 11:58 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> All of these apply to all I/O.  SG_IO is just a vector here as a
> particularly badly designed userspace interface that happens to get
> decent test coverage.

I phrased that badly. I didn't mean SG_IO in particular, I meant any
of the "generate special commands" interfaces.

In particular, I meant it as "not *regular* read/write commands".

It seems extremely wasteful to do a completely unnecessary bounce
buffer operation for READ/WRITE commands.

Because honestly, if *those* are broken on some DMA device and they
don't fill the buffer completely - despite claiming they do - then
that device is so terminally broken that it's not even worth worrying
about.

So I would expect that

 (a) READ/WRITE actually fills the whole buffer

 (b) READ/WRITE are the only ones where we care about performance at a
bounce-buffer level

so it boils down to "do we still do this horrible memcpy even for
regular IO commands"? Because that would, in my opinion, just be
stupid.

                 Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/1] swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
  2022-03-06 20:30       ` Linus Torvalds
@ 2022-03-07  7:58         ` Christoph Hellwig
  -1 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2022-03-07  7:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Tom Lendacky, Martin K. Petersen,
	Michael S. Tsirkin, James E.J. Bottomley,
	open list:DOCUMENTATION, stable, Halil Pasic,
	Christian Borntraeger, iommu, Doug Gilbert, Anshuman Khandual,
	Robin Murphy, Thiago Jung Bauermann

On Sun, Mar 06, 2022 at 12:30:55PM -0800, Linus Torvalds wrote:
> So I would expect that
> 
>  (a) READ/WRITE actually fills the whole buffer
> 
>  (b) READ/WRITE are the only ones where we care about performance at a
> bounce-buffer level
> 
> so it boils down to "do we still do this horrible memcpy even for
> regular IO commands"? Because that would, in my opinion, just be
> stupid.

For one thing this is not just for block I/O, but for all DMA.
Second, "normal" I/O can always fail, including after partial
transfers.  SCSI even considers that normal.  Network devices consider
it normal to not fill the entire receive buffer, etc.

In short:  anything that operates directly on user memory is a trivial
reproducer here.  The CVE uses SG_IO, but staying in block land direct
I/O will work just the same because swiotlb will copy back the
uninitialized data to user memory after an I/O failure.

What we've been thinking of is a version of the dma map calls where
the unmap gets passed how much data actually was transferred and only
copies that much out.  That seems like the only sane interface.

Now IFF we knew that the buffer is never looked at on I/O failure
or short I/O we could do away with all this.  But without adding new
interfaces where the caller guarantees that, we can't know that.  For
userspace memory it is guaranteed to be not true.  For kernel memory
it most likely is true, but there are some amazingly awful pieces of
code that probably still get it wrong.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-03-07  7:58 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-05 17:07 [PATCH v2 1/1] swiotlb: rework "fix info leak with DMA_FROM_DEVICE" Halil Pasic
2022-03-05 19:44 ` Linus Torvalds
2022-03-06  7:58   ` Christoph Hellwig
2022-03-06 20:30     ` Linus Torvalds
2022-03-07  7:58       ` Christoph Hellwig
