From: Halil Pasic <pasic@linux.ibm.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
	"Oleksandr Natalenko" <oleksandr@natalenko.name>,
	"Christoph Hellwig" <hch@lst.de>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Toke Høiland-Jørgensen" <toke@toke.dk>,
	"Kalle Valo" <kvalo@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Olha Cherevyk" <olha.cherevyk@gmail.com>,
	iommu <iommu@lists.linux-foundation.org>,
	linux-wireless <linux-wireless@vger.kernel.org>,
	Netdev <netdev@vger.kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable <stable@vger.kernel.org>,
	"Halil Pasic" <pasic@linux.ibm.com>
Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP
Date: Thu, 24 Mar 2022 19:02:16 +0100
Message-ID: <20220324190216.0efa067f.pasic@linux.ibm.com>
In-Reply-To: <f88ca616-96d1-82dc-1bc8-b17480e937dd@arm.com>

On Wed, 23 Mar 2022 20:54:08 +0000
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2022-03-23 19:16, Linus Torvalds wrote:
> > On Wed, Mar 23, 2022 at 12:06 PM Robin Murphy <robin.murphy@arm.com> wrote:  
> >>
> >> On 2022-03-23 17:27, Linus Torvalds wrote:  
> >>>
> >>> I'm assuming that the ath9k issue is that it gives DMA mapping a big
> >>> enough area to handle any possible packet size, and just expects -
> >>> quite reasonably - smaller packets to only fill the part they need.
> >>>
> >>> Which that "info leak" patch obviously breaks entirely.  
> >>
> >> Except that's the exact case which the new patch is addressing  
> > 
> > Not "addressing". Breaking.
> > 
> > Which is why it will almost certainly get reverted.
> > 
> > Not doing DMA to the whole area seems to be quite the sane thing to do
> > for things like network packets, and overwriting the part that didn't
> > get DMA'd with zeroes seems to be exactly the wrong thing here.
> > 
> > So the SG_IO - and other random untrusted block command sources - data
> > leak will almost certainly have to be addressed differently. Possibly
> > by simply allocating the area with GFP_ZERO to begin with.  
> 
> Er, the point of the block layer case is that whole area *is* zeroed to 
> begin with, and a latent memory corruption problem in SWIOTLB itself 
> replaces those zeros with random other kernel data unexpectedly. Let me 
> try illustrating some sequences for clarity...
> 
> Expected behaviour/without SWIOTLB:
>                               Memory
> ---------------------------------------------------
> start                        12345678
> dma_map(DMA_FROM_DEVICE)      no-op
> device writes partial data   12ABC678 <- ABC
> dma_unmap(DMA_FROM_DEVICE)   12ABC678
> 
> 
> SWIOTLB previously:
>                               Memory      Bounce buffer
> ---------------------------------------------------
> start                        12345678    xxxxxxxx
> dma_map(DMA_FROM_DEVICE)             no-op
> device writes partial data   12345678    xxABCxxx <- ABC
> dma_unmap(DMA_FROM_DEVICE)   xxABCxxx <- xxABCxxx
> 
> 
> SWIOTLB Now:
>                               Memory      Bounce buffer
> ---------------------------------------------------
> start                        12345678    xxxxxxxx
> dma_map(DMA_FROM_DEVICE)     12345678 -> 12345678
> device writes partial data   12345678    12ABC678 <- ABC
> dma_unmap(DMA_FROM_DEVICE)   12ABC678 <- 12ABC678
> 
> 
> Now, sure we can prevent any actual information leakage by initialising 
> the bounce buffer slot with zeros, but then we're just corrupting the 
> not-written-to parts of the mapping with zeros instead of anyone else's 
> old data. That's still fundamentally not OK. The only thing SWIOTLB can 
> do to be correct is treat DMA_FROM_DEVICE as a read-modify-write of the 
> entire mapping, because it has no way to know how much of it is actually 
> going to be modified.
> 

Very nice explanation! Thanks!
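
For anyone who prefers code to diagrams, here is a minimal user-space
sketch of the three sequences quoted above. It is a toy model, not kernel
code: the function name run(), the mode strings, and the two bytearrays
are made up for illustration, and "x" stands for whatever stale data
happens to sit in the bounce slot.

```python
# Toy model of the three quoted sequences; nothing here is a kernel API.
# memory is the driver's buffer, bounce is the SWIOTLB slot (hypothetical names).

def run(mode):
    """mode: 'direct' (no SWIOTLB), 'old' (no copy on map), 'new' (copy on map)."""
    memory = bytearray(b"12345678")
    if mode == "direct":
        memory[2:5] = b"ABC"          # device writes straight into memory
        return memory.decode()

    bounce = bytearray(b"xxxxxxxx")   # slot still holds someone else's old data
    if mode == "new":
        bounce[:] = memory            # dma_map(DMA_FROM_DEVICE) copies memory in
    bounce[2:5] = b"ABC"              # device writes partial data to the slot
    memory[:] = bounce                # dma_unmap(DMA_FROM_DEVICE) copies slot out
    return memory.decode()

for mode in ("direct", "old", "new"):
    print(mode, run(mode))
```

Only the 'old' mode ends with the stale slot contents in memory; 'direct'
and 'new' agree, which is the read-modify-write behaviour described above.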

> I'll admit I still never quite grasped the reason for also adding the 
> override to swiotlb_sync_single_for_device() in aa6f8dcbab47, but I 
> think by that point we were increasingly tired and confused and starting 
> to second-guess ourselves (well, I was, at least).

I raised the question of whether we need to do the same for
swiotlb_sync_single_for_device(). I did that based on my understanding of
the DMA API documentation. I had the following scenario in mind:

SWIOTLB without the sync_single:
                                  Memory      Bounce buffer      Owner
--------------------------------------------------------------------------
start                             12345678    xxxxxxxx             C
dma_map(DMA_FROM_DEVICE)          12345678 -> 12345678             C->D
device writes partial data        12345678    12ABC678 <- ABC      D
sync_for_cpu(DMA_FROM_DEVICE)     12ABC678 <- 12ABC678             D->C
cpu modifies buffer               66666666    12ABC678             C
sync_for_device(DMA_FROM_DEVICE)  66666666    12ABC678             C->D
device writes partial data        66666666    1EFGC678 <- EFG      D
dma_unmap(DMA_FROM_DEVICE)        1EFGC678 <- 1EFGC678             D->C

Legend: in Owner column C stands for cpu and D for device.

Without swiotlb, I believe we should have arrived at 6EFG6666. To get the
same result, IMHO, we need to do a sync in sync_for_device().
And aa6f8dcbab47 is an imperfect solution to that (because of size).
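
The scenario above can be replayed with a small user-space sketch. Again
this is only a toy model under my own assumptions: scenario() and its
sync_for_device_copies flag are hypothetical names, and the model copies
the whole buffer, so it says nothing about the size problem mentioned
above.

```python
# Toy replay of the ownership scenario; not kernel code.
# memory is the driver's buffer, bounce is the SWIOTLB slot (hypothetical names).

def scenario(sync_for_device_copies):
    memory = bytearray(b"12345678")
    bounce = bytearray(b"xxxxxxxx")

    bounce[:] = memory            # dma_map(DMA_FROM_DEVICE): C->D
    bounce[2:5] = b"ABC"          # device writes partial data
    memory[:] = bounce            # sync_for_cpu(DMA_FROM_DEVICE): D->C
    memory[:] = b"66666666"       # cpu modifies buffer
    if sync_for_device_copies:
        bounce[:] = memory        # sync_for_device(DMA_FROM_DEVICE): C->D
    bounce[1:4] = b"EFG"          # device writes partial data again
    memory[:] = bounce            # dma_unmap(DMA_FROM_DEVICE): D->C
    return memory.decode()

print(scenario(False))  # bounce slot keeps stale data from the first transfer
print(scenario(True))   # matches what a device writing directly would leave
```

With the flag off the model ends at 1EFGC678, as in the table; with it on
the result is 6EFG6666, the same as without swiotlb.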


> I don't think it's 
> wrong per se, but as I said I do think it can bite anyone who's been 
> doing dma_sync_*() wrong but getting away with it until now. 

I fully agree.

> If 
> ddbd89deb7d3 alone turns out to work OK then I'd be inclined to try a 
> partial revert of just that one hunk.
>

I'm not against being pragmatic and doing the partial revert. But as
explained above, I do believe for correctness of swiotlb we ultimately
do need that change. So if the revert is the short-term solution,
what should be our mid-term roadmap?

Regards,
Halil
 
> Thanks,
> Robin.

