From: Russell King - ARM Linux <linux-I+IVW8TIWO2tmTQ+vhA3Yw@public.gmane.org> To: Alexey Brodkin <Alexey.Brodkin-HKixBCOQz3hWk0Htik3J/w@public.gmane.org> Cc: "linux-sh-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-sh-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>, "linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org" <linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>, "sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>, "deanbo422-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <deanbo422-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>, "hch-jcswGhMUV9g@public.gmane.org" <hch-jcswGhMUV9g@public.gmane.org>, "linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>, "linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ@public.gmane.org" <linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ@public.gmane.org>, "linux-hexagon-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-hexagon-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>, "linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org" <linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>, "linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw@public.gmane.org" <linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw@public.gmane.org>, "Vineet.Gupta1-HKixBCOQz3hWk0Htik3J/w@public.gmane.org" <Vineet.Gupta1-HKixBCOQz3hWk0Htik3J/w@public.gmane.org>, "linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b@public.gmane.org" <linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b@public.gmane.org>, "openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b@public.gmane.org" <openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b@public.gmane.org>, "green.hu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <green.hu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>, "linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org" <linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>, monstr@mons Subject: Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation) Date: Fri, 18 May 2018 22:33:10 +0100 [thread overview] Message-ID: <20180518213309.GG17671@n2100.armlinux.org.uk> (raw) In-Reply-To: <182840dedb4890a88c672b1c5d556920bf89a8fb.camel-HKixBCOQz3hWk0Htik3J/w@public.gmane.org> On Fri, May 18, 2018 at 07:57:34PM +0000, Alexey Brodkin wrote: > Hi Russel, That's Russell. > On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote: > > It's necessary. Take a moment to think carefully about this: > > > > dma_map_single(, dir) > > > > dma_sync_single_for_cpu(, dir) > > > > dma_sync_single_for_device(, dir) > > > > dma_unmap_single(, dir) > > > > In the case of a DMA-incoherent architecture, the operations done at each > > stage depend on the direction argument: > > > > map for_cpu for_device unmap > > TO_DEV writeback none writeback none > > TO_CPU invalidate invalidate* invalidate invalidate* > > BIDIR writeback invalidate writeback invalidate > > > > * - only necessary if the CPU speculatively prefetches. > > I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even > if CPU doesn't preferch - what if we reuse the same buffer for multiple > reads from DMA device? That's fine - for non-coherent DMA, the CPU caches will only end up containing data for that memory if: - the CPU speculatively fetches data from that memory, or - the CPU explicitly touches that memory > > The multiple invalidations for the TO_CPU case handles different > > conditions that can result in data corruption, and for some CPUs, all > > four are necessary. > > I would agree that map()/unmap() a quite a special cases and so depending > on direction we need to execute in them either for_cpu() or for_device() > call-backs depending on direction. > > As for invalidation in case of for_device(TO_CPU) I still don't see > a rationale behind it. Would be interesting to see a real example where > we benefit from this. Yes, you could avoid that, but depending how you structure the architecture implementation, it can turn out to be a corner case. The above table is precisely how 32-bit ARM is implemented, because the way we implement them is based on who owns the memory - the "map" and "for_device" operations translate internally to a cpu-to-device ownership transition of the buffer. Similar for "unmap" and "to_cpu". It basically avoids having to add additional functions at the lower implementation levels. > > Things get more interesting if the implementation behind the DMA API has > > to copy data between the buffer supplied to the mapping and some DMA > > accessible buffer: > > > > map for_cpu for_device unmap > > TO_DEV copy to dma none copy to dma none > > TO_CPU none copy to cpu none copy to cpu > > BIDIR copy to dma copy to cpu copy to dma copy to cpu > > > > So, in both cases, the value of the direction argument defines what you > > need to do in each call. > > Interesting enough in your seond table (which describes more complicated > case indeed) you set "none" for for_device(TO_CPU) which looks logical > to me. > > So IMHO that's what make sense: > ---------------------------->8----------------------------- > map for_cpu for_device unmap > TO_DEV writeback none writeback none > TO_CPU none invalidate none invalidate* > BIDIR writeback invalidate writeback invalidate* > ---------------------------->8----------------------------- That doesn't make sense for the TO_CPU case. If the caches contain dirty cache lines, and you're DMAing from the device to the system RAM, other system activity can cause the dirty cache lines to be evicted (written back) to memory which the DMA has already overwritten. The result is data corruption. So, you really can't have "none" in the "map" case there. Given that, the "for_cpu" case becomes dependent on whether the CPU speculatively prefetches. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up
WARNING: multiple messages have this Message-ID (diff)
From: Russell King - ARM Linux <linux@armlinux.org.uk> To: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: "deanbo422@gmail.com" <deanbo422@gmail.com>, "linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, "nios2-dev@lists.rocketboards.org" <nios2-dev@lists.rocketboards.org>, "linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>, "linux-m68k@lists.linux-m68k.org" <linux-m68k@lists.linux-m68k.org>, "linux-mm@kvack.org" <linux-mm@kvack.org>, "linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>, "hch@lst.de" <hch@lst.de>, "linux-alpha@vger.kernel.org" <linux-alpha@vger.kernel.org>, "linux-snps-arc@lists.infradead.org" <linux-snps-arc@lists.infradead.org>, "iommu@lists.linux-foundation.org" <iommu@lists.linux-foundation.org>, "green.hu@gmail.com" <green.hu@gmail.com>, "Vineet.Gupta1@synopsys.com" <Vineet.Gupta1@synopsys.com>, "akpm@linux-foundation.org" <akpm@linux-foundation.org>, "openrisc@lists.librecores.org" <openrisc@lists.librecores.org>, "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>, "monstr@monstr.eu" <monstr@monstr.eu>, "linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>, "linux-c6x-dev@linux-c6x.org" <linux-c6x-dev@linux-c6x.org>, "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>, "sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org> Subject: Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation) Date: Fri, 18 May 2018 22:33:10 +0100 [thread overview] Message-ID: <20180518213309.GG17671@n2100.armlinux.org.uk> (raw) Message-ID: <20180518213310.dCtFdGOu0gFb5OkOSG0gygspVsWU94r_NUP1Xsx83O0@z> (raw) In-Reply-To: <182840dedb4890a88c672b1c5d556920bf89a8fb.camel@synopsys.com> On Fri, May 18, 2018 at 07:57:34PM +0000, Alexey Brodkin wrote: > Hi Russel, That's Russell. > On Fri, 2018-05-18 at 18:50 +0100, Russell King - ARM Linux wrote: > > It's necessary. Take a moment to think carefully about this: > > > > dma_map_single(, dir) > > > > dma_sync_single_for_cpu(, dir) > > > > dma_sync_single_for_device(, dir) > > > > dma_unmap_single(, dir) > > > > In the case of a DMA-incoherent architecture, the operations done at each > > stage depend on the direction argument: > > > > map for_cpu for_device unmap > > TO_DEV writeback none writeback none > > TO_CPU invalidate invalidate* invalidate invalidate* > > BIDIR writeback invalidate writeback invalidate > > > > * - only necessary if the CPU speculatively prefetches. > > I think invalidation of DMA buffer is required on for_cpu(TO_CPU) even > if CPU doesn't preferch - what if we reuse the same buffer for multiple > reads from DMA device? That's fine - for non-coherent DMA, the CPU caches will only end up containing data for that memory if: - the CPU speculatively fetches data from that memory, or - the CPU explicitly touches that memory > > The multiple invalidations for the TO_CPU case handles different > > conditions that can result in data corruption, and for some CPUs, all > > four are necessary. > > I would agree that map()/unmap() a quite a special cases and so depending > on direction we need to execute in them either for_cpu() or for_device() > call-backs depending on direction. > > As for invalidation in case of for_device(TO_CPU) I still don't see > a rationale behind it. Would be interesting to see a real example where > we benefit from this. Yes, you could avoid that, but depending how you structure the architecture implementation, it can turn out to be a corner case. The above table is precisely how 32-bit ARM is implemented, because the way we implement them is based on who owns the memory - the "map" and "for_device" operations translate internally to a cpu-to-device ownership transition of the buffer. Similar for "unmap" and "to_cpu". It basically avoids having to add additional functions at the lower implementation levels. > > Things get more interesting if the implementation behind the DMA API has > > to copy data between the buffer supplied to the mapping and some DMA > > accessible buffer: > > > > map for_cpu for_device unmap > > TO_DEV copy to dma none copy to dma none > > TO_CPU none copy to cpu none copy to cpu > > BIDIR copy to dma copy to cpu copy to dma copy to cpu > > > > So, in both cases, the value of the direction argument defines what you > > need to do in each call. > > Interesting enough in your seond table (which describes more complicated > case indeed) you set "none" for for_device(TO_CPU) which looks logical > to me. > > So IMHO that's what make sense: > ---------------------------->8----------------------------- > map for_cpu for_device unmap > TO_DEV writeback none writeback none > TO_CPU none invalidate none invalidate* > BIDIR writeback invalidate writeback invalidate* > ---------------------------->8----------------------------- That doesn't make sense for the TO_CPU case. If the caches contain dirty cache lines, and you're DMAing from the device to the system RAM, other system activity can cause the dirty cache lines to be evicted (written back) to memory which the DMA has already overwritten. The result is data corruption. So, you really can't have "none" in the "map" case there. Given that, the "for_cpu" case becomes dependent on whether the CPU speculatively prefetches. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up
next prev parent reply other threads:[~2018-05-18 21:33 UTC|newest] Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-05-11 7:59 common non-cache coherent direct dma mapping ops Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 01/20] dma-mapping: simplify Kconfig dependencies Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig [not found] ` <20180511075945.16548-1-hch-jcswGhMUV9g@public.gmane.org> 2018-05-11 7:59 ` [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig [not found] ` <20180511075945.16548-3-hch-jcswGhMUV9g@public.gmane.org> 2018-05-18 13:03 ` Alexey Brodkin 2018-05-18 13:03 ` Alexey Brodkin [not found] ` <bad125dff49f6e49c895e818c9d1abb346a46e8e.camel-HKixBCOQz3hWk0Htik3J/w@public.gmane.org> 2018-05-18 13:27 ` hch-jcswGhMUV9g 2018-05-18 13:27 ` hch [not found] ` <20180518132731.GA31125-jcswGhMUV9g@public.gmane.org> 2018-05-18 14:13 ` Alexey Brodkin 2018-05-18 14:13 ` Alexey Brodkin 2018-05-18 17:28 ` Vineet Gupta 2018-05-18 17:28 ` Vineet Gupta 2018-05-18 17:20 ` dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation) Vineet Gupta 2018-05-18 17:20 ` Vineet Gupta 2018-05-18 17:50 ` Russell King - ARM Linux 2018-05-18 17:50 ` Russell King - ARM Linux [not found] ` <20180518175004.GF17671-l+eeeJia6m9URfEZ8mYm6t73F7V6hmMc@public.gmane.org> 2018-05-18 19:57 ` Alexey Brodkin 2018-05-18 19:57 ` Alexey Brodkin [not found] ` <182840dedb4890a88c672b1c5d556920bf89a8fb.camel-HKixBCOQz3hWk0Htik3J/w@public.gmane.org> 2018-05-18 21:33 ` Russell King - ARM Linux [this message] 2018-05-18 21:33 ` Russell King - ARM Linux 2018-05-18 20:35 ` Vineet Gupta 2018-05-18 20:35 ` Vineet Gupta 2018-05-18 21:55 ` Russell King - ARM Linux 2018-05-18 21:55 ` Russell King - ARM Linux 2018-05-18 20:05 ` [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation Helge Deller 2018-05-18 20:05 ` Helge Deller [not found] ` <0c5d27e9-2799-eb38-8b09-47a04c48b5c7-Mmb7MZpHnFY@public.gmane.org> 2018-05-19 6:38 ` hch-jcswGhMUV9g 2018-05-19 6:38 ` hch 2018-05-11 7:59 ` [PATCH 03/20] arc: use generic dma_noncoherent_ops Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig [not found] ` <20180511075945.16548-4-hch-jcswGhMUV9g@public.gmane.org> 2018-05-11 12:44 ` Alexey Brodkin 2018-05-11 12:44 ` Alexey Brodkin 2018-05-11 7:59 ` [PATCH 04/20] arm-nommu: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig [not found] ` <20180511075945.16548-5-hch-jcswGhMUV9g@public.gmane.org> 2018-05-11 9:11 ` Russell King - ARM Linux 2018-05-11 9:11 ` Russell King - ARM Linux [not found] ` <20180511091114.GA16141-l+eeeJia6m9URfEZ8mYm6t73F7V6hmMc@public.gmane.org> 2018-05-22 11:53 ` Christoph Hellwig 2018-05-22 11:53 ` Christoph Hellwig 2018-05-11 13:56 ` John Garry 2018-05-11 13:56 ` John Garry 2018-05-11 7:59 ` [PATCH 05/20] c6x: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig [not found] ` <20180511075945.16548-6-hch-jcswGhMUV9g@public.gmane.org> 2018-05-15 0:25 ` [Linux-c6x-dev] " Mark Salter 2018-05-15 0:25 ` Mark Salter 2018-05-11 7:59 ` [PATCH 06/20] hexagon: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 07/20] m68k: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 08/20] microblaze: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 09/20] microblaze: remove the consistent_sync and consistent_sync_page Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 10/20] nds32: use generic dma_noncoherent_ops Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 11/20] nios2: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 12/20] openrisc: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 13/20] sh: simplify get_arch_dma_ops Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 14/20] sh: introduce a sh_cacheop_vaddr helper Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 15/20] sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 16/20] mm: split arch/sh/mm/consistent.c Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 17/20] sh: use generic dma_noncoherent_ops Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 18/20] xtensa: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 19/20] sparc: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-11 7:59 ` [PATCH 20/20] parisc: " Christoph Hellwig 2018-05-11 7:59 ` Christoph Hellwig 2018-05-13 13:26 ` common non-cache coherent direct dma mapping ops Helge Deller 2018-05-13 13:26 ` Helge Deller
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20180518213309.GG17671@n2100.armlinux.org.uk \ --to=linux-i+ivw8tiwo2tmtq+vha3yw@public.gmane.org \ --cc=Alexey.Brodkin-HKixBCOQz3hWk0Htik3J/w@public.gmane.org \ --cc=Vineet.Gupta1-HKixBCOQz3hWk0Htik3J/w@public.gmane.org \ --cc=deanbo422-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \ --cc=green.hu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \ --cc=hch-jcswGhMUV9g@public.gmane.org \ --cc=linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \ --cc=linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org \ --cc=linux-c6x-dev-jPsnJVOj+W6hPH1hqNUYSQ@public.gmane.org \ --cc=linux-hexagon-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \ --cc=linux-m68k-cunTk1MwBs8S/qaLPR03pWD2FQJk+8+b@public.gmane.org \ --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \ --cc=linux-sh-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \ --cc=linux-snps-arc-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org \ --cc=linux-xtensa-PjhNF2WwrV/0Sa2dR60CXw@public.gmane.org \ --cc=monstr@mons \ --cc=openrisc-cunTk1MwBs9a3B2Vnqf2dGD2FQJk+8+b@public.gmane.org \ --cc=sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).