Linux-arch Archive on lore.kernel.org
From: Russell King - ARM Linux <linux@armlinux.org.uk>
To: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com>,
	"hch@lst.de" <hch@lst.de>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-xtensa@linux-xtensa.org" <linux-xtensa@linux-xtensa.org>,
	"monstr@monstr.eu" <monstr@monstr.eu>,
	"deanbo422@gmail.com" <deanbo422@gmail.com>,
	"linux-c6x-dev@linux-c6x.org" <linux-c6x-dev@linux-c6x.org>,
	"linux-parisc@vger.kernel.org" <linux-parisc@vger.kernel.org>,
	"linux-sh@vger.kernel.org" <linux-sh@vger.kernel.org>,
	"linux-m68k@lists.linux-m68k.org"
	<linux-m68k@lists.linux-m68k.org>,
	"linux-hexagon@vger.kernel.org" <linux-hexagon@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	openrisc@lis
Subject: Re: dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20] dma-mapping: provide a generic dma-noncoherent implementation)
Date: Fri, 18 May 2018 18:50:04 +0100
Message-ID: <20180518175004.GF17671@n2100.armlinux.org.uk> (raw)
In-Reply-To: <5ac5b1e3-9b96-9c7c-4dfe-f65be45ec179@synopsys.com>

On Fri, May 18, 2018 at 10:20:02AM -0700, Vineet Gupta wrote:
> I never understood the need for this direction. And if memory serves me
> right, at that time I was seeing twice the amount of cache flushing !

It's necessary.  Take a moment to think carefully about this:

	dma_map_single(, dir)

	dma_sync_single_for_cpu(, dir)

	dma_sync_single_for_device(, dir)

	dma_unmap_single(, dir)

In the case of a DMA-incoherent architecture, the operations done at each
stage depend on the direction argument:

        map             for_cpu         for_device      unmap
TO_DEV  writeback       none            writeback       none
TO_CPU  invalidate      invalidate*     invalidate      invalidate*
BIDIR   writeback       invalidate      writeback       invalidate

* - only necessary if the CPU speculatively prefetches.

The multiple invalidations for the TO_CPU case handle different
conditions that can result in data corruption, and for some CPUs, all
four are necessary.

This is what is implemented for 32-bit ARM, depending on the CPU
capabilities, as we have DMA incoherent devices and we have CPUs that
speculatively prefetch data, and so may load data into the caches while
DMA is in operation.
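
As a rough illustration of the cache-maintenance table above, here is a
minimal userspace sketch in C that encodes it as a lookup.  Every name
in it (enum dir, enum stage, enum cache_op, cache_op_for, the
cpu_prefetches flag) is hypothetical and chosen for this example; it is
not the 32-bit ARM implementation, only a restatement of the table,
with the starred entries made conditional on speculative prefetching.

```c
#include <assert.h>

/* Hypothetical names for illustration; these are not kernel definitions. */
enum dir      { TO_DEV, TO_CPU, BIDIR };
enum stage    { MAP, FOR_CPU, FOR_DEVICE, UNMAP };
enum cache_op { OP_NONE, OP_WRITEBACK, OP_INVALIDATE };

/* Rows are directions, columns are stages, exactly as in the table. */
static const enum cache_op ops[3][4] = {
	[TO_DEV] = { OP_WRITEBACK,  OP_NONE,       OP_WRITEBACK,  OP_NONE       },
	[TO_CPU] = { OP_INVALIDATE, OP_INVALIDATE, OP_INVALIDATE, OP_INVALIDATE },
	[BIDIR]  = { OP_WRITEBACK,  OP_INVALIDATE, OP_WRITEBACK,  OP_INVALIDATE },
};

/*
 * Return the cache operation for a given direction and stage.
 * The two starred TO_CPU entries (for_cpu and unmap) are only needed
 * when the CPU speculatively prefetches, so they collapse to "none"
 * when cpu_prefetches is zero.
 */
enum cache_op cache_op_for(enum dir d, enum stage s, int cpu_prefetches)
{
	assert((unsigned)d <= BIDIR && (unsigned)s <= UNMAP);

	if (d == TO_CPU && (s == FOR_CPU || s == UNMAP) && !cpu_prefetches)
		return OP_NONE;

	return ops[d][s];
}
```

Calling cache_op_for(TO_DEV, FOR_CPU, 1) yields OP_NONE, which is the
point of the direction argument: without it, the for_cpu call could not
know that no maintenance is required for a TO_DEV mapping.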


Things get more interesting if the implementation behind the DMA API has
to copy data between the buffer supplied to the mapping and some
DMA-accessible buffer:

        map             for_cpu         for_device      unmap
TO_DEV  copy to dma     none            copy to dma     none
TO_CPU  none            copy to cpu     none            copy to cpu
BIDIR   copy to dma     copy to cpu     copy to dma     copy to cpu

So, in both cases, the value of the direction argument defines what you
need to do in each call.
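
The copying case can be sketched the same way.  The following is a
small userspace model in C, assuming a heap allocation stands in for
the DMA-accessible buffer; struct bounce and the bounce_* functions are
hypothetical names made up for this example and are not a real kernel
interface.  Each call copies only in the direction(s) that the table
above lists for it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; these are not kernel definitions. */
enum dir { TO_DEV, TO_CPU, BIDIR };

struct bounce {
	void *cpu_buf;	/* buffer supplied by the caller */
	void *dma_buf;	/* stand-in for the DMA-accessible buffer */
	size_t len;
	enum dir dir;
};

/* "copy to dma": only when data flows towards the device. */
static void copy_to_dma(struct bounce *b)
{
	if (b->dir != TO_CPU)
		memcpy(b->dma_buf, b->cpu_buf, b->len);
}

/* "copy to cpu": only when data flows towards the CPU. */
static void copy_to_cpu(struct bounce *b)
{
	if (b->dir != TO_DEV)
		memcpy(b->cpu_buf, b->dma_buf, b->len);
}

/* map: allocate the bounce buffer, then copy to dma if required. */
int bounce_map_single(struct bounce *b, void *buf, size_t len, enum dir dir)
{
	assert(b && buf);

	b->dma_buf = calloc(1, len);
	if (!b->dma_buf)
		return -1;
	b->cpu_buf = buf;
	b->len = len;
	b->dir = dir;
	copy_to_dma(b);
	return 0;
}

void bounce_sync_for_cpu(struct bounce *b)    { copy_to_cpu(b); }
void bounce_sync_for_device(struct bounce *b) { copy_to_dma(b); }

/* unmap: copy to cpu if required, then release the bounce buffer. */
void bounce_unmap_single(struct bounce *b)
{
	copy_to_cpu(b);
	free(b->dma_buf);
	b->dma_buf = NULL;
}
```

With a TO_DEV mapping, bounce_sync_for_cpu() leaves the caller's buffer
untouched even if the stand-in DMA buffer has changed, whereas with
TO_CPU the same call copies the device's data back.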

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
