From: Linus Torvalds <torvalds@linux-foundation.org> To: David Laight <David.Laight@aculab.com> Cc: "Halil Pasic" <pasic@linux.ibm.com>, "Toke Høiland-Jørgensen" <toke@toke.dk>, "Robin Murphy" <robin.murphy@arm.com>, "Christoph Hellwig" <hch@lst.de>, "Maxime Bizon" <mbizon@freebox.fr>, "Oleksandr Natalenko" <oleksandr@natalenko.name>, "Marek Szyprowski" <m.szyprowski@samsung.com>, "Kalle Valo" <kvalo@kernel.org>, "David S. Miller" <davem@davemloft.net>, "Jakub Kicinski" <kuba@kernel.org>, "Paolo Abeni" <pabeni@redhat.com>, "Olha Cherevyk" <olha.cherevyk@gmail.com>, iommu <iommu@lists.linux-foundation.org>, linux-wireless <linux-wireless@vger.kernel.org>, Netdev <netdev@vger.kernel.org>, "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>, stable <stable@vger.kernel.org> Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP Date: Sun, 27 Mar 2022 12:23:34 -0700 [thread overview] Message-ID: <CAHk-=wgLyqNJx=bb8=o0Nk5U8gMnf0-=qx53ShLEb3V=Yrt8fw@mail.gmail.com> (raw) In-Reply-To: <0745b44456d44d1e9fc364e5a3780d9a@AcuMS.aculab.com> On Sun, Mar 27, 2022 at 8:24 AM David Laight <David.Laight@aculab.com> wrote: > > Aren't bounce buffers just a more extreme case on non-coherent > memory accesses? No. In fact, this whoe change came about exactly because bounce buffers are different. The difference is that bounce buffers have that (wait for it) bounce buffer, which can have stale contents. > They just need explicit memory copies rather than just cache > writeback and invalidate operations. That's the thing - the memory copies introduce entirely new issues. I really think that instead of making up abstract rules ("ownership" or "bounce buffers are just extreme cases of non-coherency") we should make the rules very much practical and down to earth, and write out exactly what they *do*. The whole "sync DMA" is odd and abstract enough as a concept on its own, we shouldn't then make the rules for it odd and abstract. We should make them very very practical. So I will propose that we really make it very much about practical concerns, and we document things as (a) the "sync" operation has by definition a "whose _future_ access do we sync for" notion. So "dma_sync_single_for_cpu()" says that the future accesses to this dma area is for the CPU. Note how it does *NOT* say that the "CPU owns the are". That's bullsh*t, and we now know it's BS. (b) but the sync operation also has a "who wrote the data we're syncing" Note that this is *not* "who accessed or owned it last", because that's nonsensical: if we're syncing for the CPU, then the only reason to do so is because we expect that the last access was by the device, so specifying that separately would just be redundant and stupid. But specifying who *wrote* to the area is meaningful and matters. It matters for the non-coherent cache case (because of writeback issues), but it also matters for the bounce buffer case (becasue it determines which way we should copy). Note how this makes sense: a "sync" operation is clearly about taking some past state, and making it palatable for a future use. The past state is pretty much defined by who wrote the data, and then we can use that and the "the next thing to access it" to determine what we need to do about the sync. It is *NOT* about "ownership". So let's go through the cases, and I'm going to ignore the "CPU caches are coherent with device DMA" case because that's always going to be a no-op wrt data movement (but it will still generally need a memory barrier, which I will mention here and then ignore going forward). Syncing for *CPU* accesses (ie dma_sync_single_for_cpu()) has four choices I can see: - nobody wrote the data at all (aka DMA_NONE). This is nonsensical and should warn. If nobody wrote to it, why would the CPU ever validly access it? Maybe you should have written "memset(buffer, 0, size)" instead? - the CPU wrote the data in the first place (aka DMA_TO_DEVICE) This is a no-op (possibly a memory barrier), because even stale CPU caches are fine, and even if it was in a bounce buffer, the original CPU-side data is fine. - the device wrote the data (aka DMA_FROM_DEVICE) This is just the regular case of a device having written something, and the CPU wants to see it. It obviously needs real work, but it's simple and straightforward. For non-coherent caches, it needs a cache invalidate. For a bounce buffer, it needs a copy from the bounce buffer to the "real" buffer. - it's not clear who write the data (aka DMA_BIDIRECTIONAL) This is not really doable for a bounce buffer - we just don't know which buffer contents are valid. I think it's very very questionable for non-coherent caches too, but "writeback and invalidate" probably can't hurt. So probably warn about it, and do whatever we used to do historically. Syncing for device accesses (ie dma_sync_single_for_device()) also has the same four choices I can see, but obviously does different things: - nobody wrote the data at all (aka DMA_NONE) This sounds as nonsensical as the CPU case, but maybe isn't. We may not have any previous explicit writes, but we *do* have that "insecure and possibly stale buffer contents" bounce buffer thing on the device side. So with a bounce buffer, it's actually somewhat sane to say "initialize the bounce buffer to a known state". Because bounce buffers *are* special. Unlike even the "noncoherent caches" issue, they have that entirely *other* hidden state in the form of the bounce buffer itself. Discuss. - the CPU wrote the data in the first place (aka DMA_TO_DEVICE). This is the regular and common case of "we have data on the CPU side that is written to the device". Again, needs work, but is simple and straightforward. For non-coherent caches, we need a writeback on the CPU. For a bounce buffer, we need to copy from the regular buffer to the bounce buffer. - the device wrote the data in the first place (aka DMA_FROM_DEVICE) This is the case that we hit on ath9k. It's *not* obvious, but when we write this out this way, I really think the semantics are pretty clear. For non-coherent caches, we may need an "invalidate". For a bounce buffer, it's a no-op (because the bounce buffer already contains the data) - it's not clear who write the data (aka DMA_BIDIRECTIONAL) This is again not really doable for a bounce buffer. We don't know which buffer contains the right data, we should warn about it and do whatever we used to do historically. Again, it's very questionable for non-coherent caches too, but "writeback and invalidate" probably at least can't hurt. So hey, that's my thinking. The whole "ownership" model is, I think, obviously untenable. But just going through and listing the different cases and making them explicit I think explains exactly what the different situations are, and that then makes it fairly clear what the different combinations should do. No? Linus
WARNING: multiple messages have this Message-ID (diff)
From: Linus Torvalds <torvalds@linux-foundation.org> To: David Laight <David.Laight@aculab.com> Cc: "Toke Høiland-Jørgensen" <toke@toke.dk>, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>, Netdev <netdev@vger.kernel.org>, "Kalle Valo" <kvalo@kernel.org>, linux-wireless <linux-wireless@vger.kernel.org>, "Oleksandr Natalenko" <oleksandr@natalenko.name>, stable <stable@vger.kernel.org>, "David S. Miller" <davem@davemloft.net>, "Halil Pasic" <pasic@linux.ibm.com>, iommu <iommu@lists.linux-foundation.org>, "Olha Cherevyk" <olha.cherevyk@gmail.com>, "Jakub Kicinski" <kuba@kernel.org>, "Maxime Bizon" <mbizon@freebox.fr>, "Paolo Abeni" <pabeni@redhat.com>, "Robin Murphy" <robin.murphy@arm.com>, "Christoph Hellwig" <hch@lst.de>, "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org> Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP Date: Sun, 27 Mar 2022 12:23:34 -0700 [thread overview] Message-ID: <CAHk-=wgLyqNJx=bb8=o0Nk5U8gMnf0-=qx53ShLEb3V=Yrt8fw@mail.gmail.com> (raw) In-Reply-To: <0745b44456d44d1e9fc364e5a3780d9a@AcuMS.aculab.com> On Sun, Mar 27, 2022 at 8:24 AM David Laight <David.Laight@aculab.com> wrote: > > Aren't bounce buffers just a more extreme case on non-coherent > memory accesses? No. In fact, this whoe change came about exactly because bounce buffers are different. The difference is that bounce buffers have that (wait for it) bounce buffer, which can have stale contents. > They just need explicit memory copies rather than just cache > writeback and invalidate operations. That's the thing - the memory copies introduce entirely new issues. I really think that instead of making up abstract rules ("ownership" or "bounce buffers are just extreme cases of non-coherency") we should make the rules very much practical and down to earth, and write out exactly what they *do*. The whole "sync DMA" is odd and abstract enough as a concept on its own, we shouldn't then make the rules for it odd and abstract. We should make them very very practical. So I will propose that we really make it very much about practical concerns, and we document things as (a) the "sync" operation has by definition a "whose _future_ access do we sync for" notion. So "dma_sync_single_for_cpu()" says that the future accesses to this dma area is for the CPU. Note how it does *NOT* say that the "CPU owns the are". That's bullsh*t, and we now know it's BS. (b) but the sync operation also has a "who wrote the data we're syncing" Note that this is *not* "who accessed or owned it last", because that's nonsensical: if we're syncing for the CPU, then the only reason to do so is because we expect that the last access was by the device, so specifying that separately would just be redundant and stupid. But specifying who *wrote* to the area is meaningful and matters. It matters for the non-coherent cache case (because of writeback issues), but it also matters for the bounce buffer case (becasue it determines which way we should copy). Note how this makes sense: a "sync" operation is clearly about taking some past state, and making it palatable for a future use. The past state is pretty much defined by who wrote the data, and then we can use that and the "the next thing to access it" to determine what we need to do about the sync. It is *NOT* about "ownership". So let's go through the cases, and I'm going to ignore the "CPU caches are coherent with device DMA" case because that's always going to be a no-op wrt data movement (but it will still generally need a memory barrier, which I will mention here and then ignore going forward). Syncing for *CPU* accesses (ie dma_sync_single_for_cpu()) has four choices I can see: - nobody wrote the data at all (aka DMA_NONE). This is nonsensical and should warn. If nobody wrote to it, why would the CPU ever validly access it? Maybe you should have written "memset(buffer, 0, size)" instead? - the CPU wrote the data in the first place (aka DMA_TO_DEVICE) This is a no-op (possibly a memory barrier), because even stale CPU caches are fine, and even if it was in a bounce buffer, the original CPU-side data is fine. - the device wrote the data (aka DMA_FROM_DEVICE) This is just the regular case of a device having written something, and the CPU wants to see it. It obviously needs real work, but it's simple and straightforward. For non-coherent caches, it needs a cache invalidate. For a bounce buffer, it needs a copy from the bounce buffer to the "real" buffer. - it's not clear who write the data (aka DMA_BIDIRECTIONAL) This is not really doable for a bounce buffer - we just don't know which buffer contents are valid. I think it's very very questionable for non-coherent caches too, but "writeback and invalidate" probably can't hurt. So probably warn about it, and do whatever we used to do historically. Syncing for device accesses (ie dma_sync_single_for_device()) also has the same four choices I can see, but obviously does different things: - nobody wrote the data at all (aka DMA_NONE) This sounds as nonsensical as the CPU case, but maybe isn't. We may not have any previous explicit writes, but we *do* have that "insecure and possibly stale buffer contents" bounce buffer thing on the device side. So with a bounce buffer, it's actually somewhat sane to say "initialize the bounce buffer to a known state". Because bounce buffers *are* special. Unlike even the "noncoherent caches" issue, they have that entirely *other* hidden state in the form of the bounce buffer itself. Discuss. - the CPU wrote the data in the first place (aka DMA_TO_DEVICE). This is the regular and common case of "we have data on the CPU side that is written to the device". Again, needs work, but is simple and straightforward. For non-coherent caches, we need a writeback on the CPU. For a bounce buffer, we need to copy from the regular buffer to the bounce buffer. - the device wrote the data in the first place (aka DMA_FROM_DEVICE) This is the case that we hit on ath9k. It's *not* obvious, but when we write this out this way, I really think the semantics are pretty clear. For non-coherent caches, we may need an "invalidate". For a bounce buffer, it's a no-op (because the bounce buffer already contains the data) - it's not clear who write the data (aka DMA_BIDIRECTIONAL) This is again not really doable for a bounce buffer. We don't know which buffer contains the right data, we should warn about it and do whatever we used to do historically. Again, it's very questionable for non-coherent caches too, but "writeback and invalidate" probably at least can't hurt. So hey, that's my thinking. The whole "ownership" model is, I think, obviously untenable. But just going through and listing the different cases and making them explicit I think explains exactly what the different situations are, and that then makes it fairly clear what the different combinations should do. No? Linus _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2022-03-27 19:24 UTC|newest] Thread overview: 139+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-03-23 7:19 [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP Oleksandr Natalenko 2022-03-23 7:19 ` Oleksandr Natalenko via iommu 2022-03-23 7:28 ` Kalle Valo 2022-03-23 7:28 ` Kalle Valo 2022-03-23 17:27 ` Linus Torvalds 2022-03-23 17:27 ` Linus Torvalds 2022-03-23 19:06 ` Robin Murphy 2022-03-23 19:06 ` Robin Murphy 2022-03-23 19:16 ` Linus Torvalds 2022-03-23 19:16 ` Linus Torvalds 2022-03-23 20:54 ` Robin Murphy 2022-03-23 20:54 ` Robin Murphy 2022-03-24 5:57 ` Christoph Hellwig 2022-03-24 5:57 ` Christoph Hellwig 2022-03-24 10:25 ` Oleksandr Natalenko 2022-03-24 10:25 ` Oleksandr Natalenko via iommu 2022-03-24 11:05 ` Robin Murphy 2022-03-24 11:05 ` Robin Murphy 2022-03-24 14:27 ` Toke Høiland-Jørgensen 2022-03-24 14:27 ` Toke Høiland-Jørgensen via iommu 2022-03-24 16:29 ` Maxime Bizon 2022-03-24 16:29 ` Maxime Bizon 2022-03-24 16:31 ` Christoph Hellwig 2022-03-24 16:31 ` Christoph Hellwig 2022-03-24 16:52 ` Robin Murphy 2022-03-24 16:52 ` Robin Murphy 2022-03-24 17:07 ` Toke Høiland-Jørgensen 2022-03-24 17:07 ` Toke Høiland-Jørgensen via iommu 2022-03-24 19:26 ` Linus Torvalds 2022-03-24 19:26 ` Linus Torvalds 2022-03-24 21:14 ` Toke Høiland-Jørgensen 2022-03-24 21:14 ` Toke Høiland-Jørgensen via iommu 2022-03-25 10:25 ` Maxime Bizon 2022-03-25 10:25 ` Maxime Bizon 2022-03-25 11:27 ` Robin Murphy 2022-03-25 11:27 ` Robin Murphy 2022-03-25 23:38 ` Halil Pasic 2022-03-25 23:38 ` Halil Pasic 2022-03-26 16:05 ` Toke Høiland-Jørgensen 2022-03-26 16:05 ` Toke Høiland-Jørgensen via iommu 2022-03-26 18:38 ` Linus Torvalds 2022-03-26 18:38 ` Linus Torvalds 2022-03-26 22:38 ` David Laight 2022-03-26 22:38 ` David Laight 2022-03-26 22:41 ` Linus Torvalds 2022-03-26 22:41 ` Linus Torvalds 2022-03-25 16:25 ` Toke Høiland-Jørgensen 2022-03-25 16:25 ` Toke Høiland-Jørgensen via iommu 2022-03-25 16:45 ` Robin Murphy 2022-03-25 16:45 ` Robin Murphy 2022-03-25 18:13 ` Toke Høiland-Jørgensen via iommu 2022-03-25 18:13 ` Toke Høiland-Jørgensen 2022-03-25 18:30 ` Linus Torvalds 2022-03-25 18:30 ` Linus Torvalds 2022-03-25 19:14 ` Robin Murphy 2022-03-25 19:14 ` Robin Murphy 2022-03-25 19:21 ` Linus Torvalds 2022-03-25 19:21 ` Linus Torvalds 2022-03-25 19:26 ` Oleksandr Natalenko via iommu 2022-03-25 19:26 ` Oleksandr Natalenko 2022-03-25 19:27 ` Linus Torvalds 2022-03-25 19:27 ` Linus Torvalds 2022-03-25 19:35 ` Oleksandr Natalenko via iommu 2022-03-25 19:35 ` Oleksandr Natalenko 2022-03-25 20:37 ` Johannes Berg 2022-03-25 20:37 ` Johannes Berg 2022-03-25 20:47 ` Linus Torvalds 2022-03-25 20:47 ` Linus Torvalds 2022-03-25 21:13 ` Johannes Berg 2022-03-25 21:13 ` Johannes Berg 2022-03-25 21:40 ` David Laight 2022-03-25 21:40 ` David Laight 2022-03-25 21:56 ` Linus Torvalds 2022-03-25 21:56 ` Linus Torvalds 2022-03-25 22:41 ` David Laight 2022-03-25 22:41 ` David Laight 2022-03-27 3:15 ` Halil Pasic 2022-03-27 3:15 ` Halil Pasic 2022-03-28 9:48 ` Johannes Berg 2022-03-28 9:48 ` Johannes Berg 2022-03-28 9:50 ` Johannes Berg 2022-03-28 9:50 ` Johannes Berg 2022-03-28 9:57 ` Johannes Berg 2022-03-28 9:57 ` Johannes Berg 2022-03-27 3:48 ` Halil Pasic 2022-03-27 3:48 ` Halil Pasic 2022-03-27 5:06 ` Linus Torvalds 2022-03-27 5:06 ` Linus Torvalds 2022-03-27 5:21 ` Linus Torvalds 2022-03-27 5:21 ` Linus Torvalds 2022-03-27 15:24 ` David Laight 2022-03-27 15:24 ` David Laight 2022-03-27 19:23 ` Linus Torvalds [this message] 2022-03-27 19:23 ` Linus Torvalds 2022-03-27 20:04 ` Linus Torvalds 2022-03-27 20:04 ` Linus Torvalds 2022-03-27 23:52 ` Halil Pasic 2022-03-27 23:52 ` Halil Pasic 2022-03-28 0:30 ` Linus Torvalds 2022-03-28 0:30 ` Linus Torvalds 2022-03-28 12:02 ` Halil Pasic 2022-03-28 12:02 ` Halil Pasic 2022-03-27 23:37 ` Halil Pasic 2022-03-27 23:37 ` Halil Pasic 2022-03-28 0:37 ` Linus Torvalds 2022-03-28 0:37 ` Linus Torvalds 2022-03-25 7:12 ` Oleksandr Natalenko 2022-03-25 7:12 ` Oleksandr Natalenko via iommu 2022-03-25 9:21 ` Thorsten Leemhuis 2022-03-25 9:21 ` Thorsten Leemhuis 2022-03-24 18:31 ` Halil Pasic 2022-03-24 18:31 ` Halil Pasic 2022-03-25 16:31 ` Christoph Hellwig 2022-03-25 16:31 ` Christoph Hellwig 2022-03-24 18:02 ` Halil Pasic 2022-03-24 18:02 ` Halil Pasic 2022-03-25 15:25 ` Halil Pasic 2022-03-25 15:25 ` Halil Pasic 2022-03-25 16:23 ` Robin Murphy 2022-03-25 16:23 ` Robin Murphy 2022-03-25 16:32 ` Christoph Hellwig 2022-03-25 16:32 ` Christoph Hellwig 2022-03-25 18:15 ` Toke Høiland-Jørgensen via iommu 2022-03-25 18:15 ` Toke Høiland-Jørgensen 2022-03-25 18:42 ` Robin Murphy 2022-03-25 18:42 ` Robin Murphy 2022-03-25 18:46 ` Linus Torvalds 2022-03-25 18:46 ` Linus Torvalds 2022-03-28 6:37 ` Christoph Hellwig 2022-03-28 6:37 ` Christoph Hellwig 2022-03-28 8:15 ` David Laight 2022-03-28 8:15 ` David Laight 2022-03-30 12:11 ` Halil Pasic 2022-03-30 12:11 ` Halil Pasic 2022-03-24 8:55 ` Oleksandr Natalenko 2022-03-24 8:55 ` Oleksandr Natalenko via iommu 2022-03-24 12:32 ` [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break ath9k-based AP #forregzbot Thorsten Leemhuis 2022-03-25 9:24 ` Thorsten Leemhuis 2022-03-27 9:00 ` Thorsten Leemhuis
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAHk-=wgLyqNJx=bb8=o0Nk5U8gMnf0-=qx53ShLEb3V=Yrt8fw@mail.gmail.com' \ --to=torvalds@linux-foundation.org \ --cc=David.Laight@aculab.com \ --cc=davem@davemloft.net \ --cc=gregkh@linuxfoundation.org \ --cc=hch@lst.de \ --cc=iommu@lists.linux-foundation.org \ --cc=kuba@kernel.org \ --cc=kvalo@kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-wireless@vger.kernel.org \ --cc=m.szyprowski@samsung.com \ --cc=mbizon@freebox.fr \ --cc=netdev@vger.kernel.org \ --cc=oleksandr@natalenko.name \ --cc=olha.cherevyk@gmail.com \ --cc=pabeni@redhat.com \ --cc=pasic@linux.ibm.com \ --cc=robin.murphy@arm.com \ --cc=stable@vger.kernel.org \ --cc=toke@toke.dk \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.