* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Christoph Hellwig
Date: 2021-01-13 11:30 UTC
To: Martin Radev
Cc: konrad.wilk, hch, m.szyprowski, robin.murphy, iommu, linux-kernel,
    joro, kirill.shutemov, thomas.lendacky, robert.buhren, file,
    mathias.morbitzer, virtualization, kvm

On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
> The size of the buffer being bounced is not checked if it happens
> to be larger than the size of the mapped buffer. Because the size
> can be controlled by a device, as is the case with virtio devices,
> this can lead to memory corruption.

I'm really worried about all these hodge-podge hacks for untrusted
hypervisors in the I/O stack. Instead of trying to harden protocols
that are fundamentally not designed for this, how about instead coming
up with a new paravirtualized I/O interface that is specifically
designed for use with an untrusted hypervisor from the start?
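The check Martin's patch proposes can be sketched as follows. This is a
simplified illustration, not the actual patch: the io_tlb_orig_size[]
bookkeeping array and the helper name are assumed here, standing in for
the equivalent per-slot state swiotlb would have to keep.

/* Assumed bookkeeping: the size each bounce slot was mapped with. */
extern phys_addr_t io_tlb_start;
extern size_t io_tlb_orig_size[];

/*
 * Clamp a (possibly device-influenced) size to what was originally
 * mapped, before copying between the original buffer and the bounce
 * slot in the sync/unmap path.
 */
static size_t swiotlb_validate_size(phys_addr_t tlb_addr, size_t size)
{
	int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
	size_t orig_size = io_tlb_orig_size[index];

	if (unlikely(size > orig_size)) {
		WARN_ONCE(1, "swiotlb: bounce size %zu exceeds mapping %zu\n",
			  size, orig_size);
		size = orig_size;	/* never copy past the mapping */
	}
	return size;
}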
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Martin Radev
Date: 2021-01-18 11:44 UTC
To: Christoph Hellwig
Cc: konrad.wilk, m.szyprowski, robin.murphy, iommu, linux-kernel, joro,
    kirill.shutemov, thomas.lendacky, robert.buhren, file,
    mathias.morbitzer, virtualization, kvm

On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
> On Tue, Jan 12, 2021 at 04:07:29PM +0100, Martin Radev wrote:
> > The size of the buffer being bounced is not checked if it happens
> > to be larger than the size of the mapped buffer. Because the size
> > can be controlled by a device, as is the case with virtio devices,
> > this can lead to memory corruption.
>
> I'm really worried about all these hodge-podge hacks for untrusted
> hypervisors in the I/O stack. Instead of trying to harden protocols
> that are fundamentally not designed for this, how about instead coming
> up with a new paravirtualized I/O interface that is specifically
> designed for use with an untrusted hypervisor from the start?

Your comment makes sense, but that would require the cooperation of
these vendors and the cloud providers to agree on something meaningful.
I am also not sure whether the end result would be better than
hardening this interface to catch corruption. There is already some
validation in the unmap path anyway.

Another possibility is to move this hardening to the common virtio
code, but I think the code may become more complicated there, since it
would require tracking both the dma_addr and the length for each
descriptor.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Konrad Rzeszutek Wilk
Date: 2021-01-18 15:14 UTC
To: Martin Radev
Cc: Christoph Hellwig, thomas.lendacky, file, robert.buhren, kvm,
    konrad.wilk, mathias.morbitzer, linux-kernel, virtualization, iommu,
    robin.murphy, kirill.shutemov

On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
> On Wed, Jan 13, 2021 at 12:30:17PM +0100, Christoph Hellwig wrote:
> > I'm really worried about all these hodge-podge hacks for untrusted
> > hypervisors in the I/O stack. Instead of trying to harden protocols
> > that are fundamentally not designed for this, how about instead
> > coming up with a new paravirtualized I/O interface that is
> > specifically designed for use with an untrusted hypervisor from the
> > start?
>
> Your comment makes sense, but that would require the cooperation of
> these vendors and the cloud providers to agree on something meaningful.
> I am also not sure whether the end result would be better than
> hardening this interface to catch corruption. There is already some
> validation in the unmap path anyway.
>
> Another possibility is to move this hardening to the common virtio
> code, but I think the code may become more complicated there, since it
> would require tracking both the dma_addr and the length for each
> descriptor.

Christoph,

I've been wrestling with the same thing - this is specific to busted
drivers. And in reality you could do the same thing with a hardware
virtio device (see the example at http://thunderclap.io/) - where the
mitigation is "enable the IOMMU to do its job".

The AMD SEV documents speak about utilizing the IOMMU to do this (AMD
SEV-SNP), and while that is great for the future, SEV without an IOMMU
is here now.

Coming full circle: this issue can be exploited with virtio, but you
could do the same with real hardware too if you hacked the firmware. So
if, say, a compromised Intel SR-IOV NIC were plumbed into a guest on an
AMD SEV machine, the IOMMU inside the guest would be the SWIOTLB code -
the last line of defense against bad firmware, so to speak.

As such I am leaning towards taking this code, but I am worried about
the performance hit. Then again, perhaps I shouldn't be: if you are
using swiotlb=force you are already taking a performance hit?
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Martin Radev
Date: 2021-01-25 18:33 UTC
To: Konrad Rzeszutek Wilk
Cc: Christoph Hellwig, thomas.lendacky, file, robert.buhren, kvm,
    konrad.wilk, mathias.morbitzer, linux-kernel, virtualization, iommu,
    robin.murphy, kirill.shutemov

On Mon, Jan 18, 2021 at 10:14:28AM -0500, Konrad Rzeszutek Wilk wrote:
> [...]
> As such I am leaning towards taking this code, but I am worried about
> the performance hit. Then again, perhaps I shouldn't be: if you are
> using swiotlb=force you are already taking a performance hit?

I have not measured the performance degradation. This will hit all AMD
SEV, Intel TDX and IBM Protected Virtualization VMs. I don't expect the
hit to be large, since there are only a few added operations per
hundreds of copied bytes. I could try to measure the performance hit by
running some benchmark with virtio-net/virtio-blk/virtio-rng.

Earlier I said:
> Another possibility is to move this hardening to the common virtio
> code, but I think the code may become more complicated there, since it
> would require tracking both the dma_addr and the length for each
> descriptor.

Unfortunately, this doesn't make sense. Even if there were validation
for the size in the common virtio layer, there would still be some
other device which controls a dma_addr and length passed to dma_unmap*
in the corresponding driver. The device can target a specific
dma-mapped private buffer by changing the dma_addr and setting a length
large enough to overwrite the buffers following it.

So, instead of doing the check in every driver and paying the
performance cost even when swiotlb is not used, it's probably better to
fix it in swiotlb.

@Tom Lendacky, do you think that it makes sense to harden swiotlb, or
might some other approach be better for the SEV features?
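To make the failure mode concrete, a hypothetical completion path in a
virtio-style driver might look like the sketch below. The request
struct and function are invented for illustration, and endianness
handling is simplified; the point is only that 'len' comes from
device-writable memory before being handed to dma_unmap_single().

#include <linux/dma-mapping.h>
#include <linux/virtio_ring.h>

struct my_req {			/* hypothetical driver bookkeeping */
	dma_addr_t dma_addr;
	void *buf;
};

static void my_complete_one(struct device *dev, struct my_req *req,
			    struct vring_used_elem *used)
{
	/* 'used' lives in the ring, which the device can write. */
	u32 len = le32_to_cpu(used->len);

	/*
	 * If the device inflates 'len', the swiotlb sync done in the
	 * unmap path copies that many bytes out of the bounce slot and
	 * over whatever follows the original buffer - which is why the
	 * check belongs in swiotlb rather than in each driver.
	 */
	dma_unmap_single(dev, req->dma_addr, len, DMA_FROM_DEVICE);
}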
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Konrad Rzeszutek Wilk
Date: 2021-02-02 16:37 UTC
To: Martin Radev
Cc: Konrad Rzeszutek Wilk, Christoph Hellwig, thomas.lendacky, file,
    robert.buhren, kvm, mathias.morbitzer, linux-kernel, virtualization,
    iommu, robin.murphy, kirill.shutemov

On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
> [...]
> So, instead of doing the check in every driver and paying the
> performance cost even when swiotlb is not used, it's probably better
> to fix it in swiotlb.
>
> @Tom Lendacky, do you think that it makes sense to harden swiotlb, or
> might some other approach be better for the SEV features?

I am not Tom, but this change seems the right way forward regardless of
whether it is TDX, AMD SEV, or any other architecture that encrypts
memory and uses SWIOTLB.

Let me queue it up in the development branch and do some regression
testing.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Tom Lendacky
Date: 2021-02-02 22:34 UTC
To: Konrad Rzeszutek Wilk, Martin Radev
Cc: Konrad Rzeszutek Wilk, Christoph Hellwig, file, robert.buhren, kvm,
    mathias.morbitzer, linux-kernel, virtualization, iommu,
    robin.murphy, kirill.shutemov

On 2/2/21 10:37 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 25, 2021 at 07:33:35PM +0100, Martin Radev wrote:
> > [...]
> > @Tom Lendacky, do you think that it makes sense to harden swiotlb,
> > or might some other approach be better for the SEV features?
>
> I am not Tom, but this change seems the right way forward regardless
> of whether it is TDX, AMD SEV, or any other architecture that encrypts
> memory and uses SWIOTLB.

Sorry, I missed the @Tom before. I'm with Konrad and believe it makes
sense to add these checks.

I'm not sure whether there would be a better approach for all
confidential computing technologies. SWIOTLB works nicely, but is
limited because of the 32-bit compatible memory location. Being able to
have buffers above the 32-bit limit would alleviate that, but that is
support that would have to be developed.

Thanks,
Tom

> Let me queue it up in the development branch and do some regression
> testing.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Konrad Rzeszutek Wilk
Date: 2021-02-02 23:13 UTC
To: Tom Lendacky, joe.jin, dongli.zhang
Cc: Martin Radev, Konrad Rzeszutek Wilk, Christoph Hellwig, file,
    robert.buhren, kvm, mathias.morbitzer, linux-kernel, virtualization,
    iommu, robin.murphy, kirill.shutemov

On Tue, Feb 02, 2021 at 04:34:09PM -0600, Tom Lendacky wrote:
> [...]
> I'm not sure whether there would be a better approach for all
> confidential computing technologies. SWIOTLB works nicely, but is
> limited because of the 32-bit compatible memory location. Being able
> to have buffers above the 32-bit limit would alleviate that, but that
> is support that would have to be developed.

Funny you should mention that... Dongli (CCed) is working on exactly
that and should be posting his patches in the next couple of days.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Christoph Hellwig
Date: 2021-02-03 12:49 UTC
To: Martin Radev
Cc: Christoph Hellwig, konrad.wilk, m.szyprowski, robin.murphy, iommu,
    linux-kernel, joro, kirill.shutemov, thomas.lendacky, robert.buhren,
    file, mathias.morbitzer, virtualization, kvm

On Mon, Jan 18, 2021 at 12:44:58PM +0100, Martin Radev wrote:
> Your comment makes sense, but that would require the cooperation of
> these vendors and the cloud providers to agree on something meaningful.
> I am also not sure whether the end result would be better than
> hardening this interface to catch corruption. There is already some
> validation in the unmap path anyway.

So what? If you guys want to provide a new capability you'll have to do
the work. And designing a new protocol around the fact that the
hardware/hypervisor is not trusted and a copy is always required makes
a lot more sense than throwing in band-aids all over the place.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Konrad Rzeszutek Wilk
Date: 2021-02-03 19:36 UTC
To: Christoph Hellwig
Cc: Martin Radev, m.szyprowski, robin.murphy, iommu, linux-kernel,
    joro, kirill.shutemov, thomas.lendacky, robert.buhren, file,
    mathias.morbitzer, virtualization, kvm

On Wed, Feb 03, 2021 at 01:49:22PM +0100, Christoph Hellwig wrote:
> So what? If you guys want to provide a new capability you'll have to
> do the work. And designing a new protocol around the fact that the
> hardware/hypervisor is not trusted and a copy is always required makes
> a lot more sense than throwing in band-aids all over the place.

If you don't trust the hypervisor, what would this capability live in?

I suppose you mean it would need to be in the guest kernel, and this
protocol would depend on... not the hypervisor, and most certainly not
virtio or any SR-IOV device. That removes a lot of options.

The one sensible option is the IOMMU (since folks will trust OEM
vendors like Intel or AMD to provide the memory encryption, they will
also trust the IOMMU - I hope?) - and they do have plans for that with
their IOMMU frameworks, which will remove the need for SWIOTLB (I
hope). But that is not now; that is in the future.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Christoph Hellwig
Date: 2021-02-05 17:58 UTC
To: Konrad Rzeszutek Wilk
Cc: Christoph Hellwig, Martin Radev, m.szyprowski, robin.murphy, iommu,
    linux-kernel, joro, kirill.shutemov, thomas.lendacky, robert.buhren,
    file, mathias.morbitzer, virtualization, kvm

On Wed, Feb 03, 2021 at 02:36:38PM -0500, Konrad Rzeszutek Wilk wrote:
> > So what? If you guys want to provide a new capability you'll have to
> > do the work. And designing a new protocol around the fact that the
> > hardware/hypervisor is not trusted and a copy is always required
> > makes a lot more sense than throwing in band-aids all over the place.
>
> If you don't trust the hypervisor, what would this capability live in?

Well, they don't trust the hypervisor not to attack the guest somehow,
except through the data read. I never really understood the concept, as
it leaves too many holes.

But the point is that these schemes want to force bounce buffering
because they think it is more secure. And if that is what you want, you
had better build the protocol around the fact that each I/O needs to
use bounce buffers: make those buffers the actual shared memory used
for communication, and build the protocol around that.

E.g. you don't force the ridiculous NVMe PRP offset rules on the block
layer, just to make a complicated swiotlb allocation that needs to
preserve the alignment, just to do I/O. Instead you have a trivial ring
buffer or whatever, because you know the I/O will be copied anyway, and
none of the hard work the higher layers do to make the I/O suitable for
a normal device applies.
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Konrad Rzeszutek Wilk
Date: 2021-02-08 17:14 UTC
To: Christoph Hellwig
Cc: Martin Radev, m.szyprowski, robin.murphy, iommu, linux-kernel,
    joro, kirill.shutemov, thomas.lendacky, robert.buhren, file,
    mathias.morbitzer, virtualization, kvm

On Fri, Feb 05, 2021 at 06:58:52PM +0100, Christoph Hellwig wrote:
> But the point is that these schemes want to force bounce buffering
> because they think it is more secure. And if that is what you want,
> you had better build the protocol around the fact that each I/O needs
> to use bounce buffers: make those buffers the actual shared memory
> used for communication, and build the protocol around that.

Right. That is what the SWIOTLB pool ends up being, as it is allocated
at boot, where the guest tells the hypervisor: these pages are shared
and clear-text.

> E.g. you don't force the ridiculous NVMe PRP offset rules on the block
> layer, just to make a complicated swiotlb allocation that needs to
> preserve the alignment, just to do I/O. Instead you have a trivial

I agree that NVMe is being silly. It could have allocated a coherent
pool and used that, doing its own offsets within it. That would in
essence carve out a static pool within the static SWIOTLB one. TTM does
that - it has its own DMA machinery on top of the DMA API to deal with
"passing" buffers from one application to another, and the fun of
keeping track of all that.

> ring buffer or whatever, because you know the I/O will be copied
> anyway, and none of the hard work the higher layers do to make the I/O
> suitable for a normal device applies.

I lost you here. Sorry, are you saying we should have a simple ring
protocol (like NVMe has), where the ring entries (SG or DMA phys) are
statically allocated, and whenever the NVMe driver gets data from
user space it would copy it in there?
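Konrad's "shared and clear-text" point corresponds to a boot-time step
like the sketch below, modelled on the kernel's
swiotlb_update_mem_attributes() (details and error handling elided):
the guest clears the encryption bit on the swiotlb pool's pages so the
hypervisor and devices can read them.

#include <linux/set_memory.h>
#include <linux/string.h>
#include <asm/io.h>

/* Mark the swiotlb pool shared (decrypted) with the hypervisor. */
static void __init swiotlb_mark_decrypted(phys_addr_t start, size_t bytes)
{
	void *vaddr = phys_to_virt(start);

	/* Clear the encryption bit in the guest page tables. */
	set_memory_decrypted((unsigned long)vaddr,
			     PAGE_ALIGN(bytes) >> PAGE_SHIFT);

	/* Wipe the pool so no stale, formerly-encrypted data leaks. */
	memset(vaddr, 0, bytes);
}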
* Re: [PATCH] swiotlb: Validate bounce size in the sync/unmap path

From: Christoph Hellwig
Date: 2021-02-09 8:26 UTC
To: Konrad Rzeszutek Wilk
Cc: Christoph Hellwig, Martin Radev, m.szyprowski, robin.murphy, iommu,
    linux-kernel, joro, kirill.shutemov, thomas.lendacky, robert.buhren,
    file, mathias.morbitzer, virtualization, kvm

On Mon, Feb 08, 2021 at 12:14:49PM -0500, Konrad Rzeszutek Wilk wrote:
> I lost you here. Sorry, are you saying we should have a simple ring
> protocol (like NVMe has), where the ring entries (SG or DMA phys) are
> statically allocated, and whenever the NVMe driver gets data from
> user space it would copy it in there?

Yes. Basically, extend the virtio or NVMe ring/queue concept to cover
not just commands and completions, but also the data transfers.
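As a rough idea of the direction Christoph is pointing at - purely a
conceptual sketch, since no such protocol exists - each ring slot would
carry its payload inline, so the one unavoidable bounce copy is the
protocol's only data path and there is no separately mapped buffer
whose size a device could lie about:

#include <linux/types.h>

#define SLOT_DATA_SIZE	4096	/* assumed fixed payload size */

struct shared_ring_slot {
	__le16	opcode;		/* command or completion type */
	__le16	flags;
	__le32	data_len;	/* guest-validated: <= SLOT_DATA_SIZE */
	__le64	tag;		/* guest-chosen completion cookie */
	u8	data[SLOT_DATA_SIZE];	/* payload copied in/out by guest */
};

/*
 * The whole ring lives in memory that is shared (decrypted) by design:
 * the guest copies data into data[] on submit and out of it on
 * completion, and never hands the device a guest-private address.
 */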