* [RFC] generic device DMA implementation
@ 2002-12-04 17:47 James Bottomley
  2002-12-04 18:27 ` Jeff Garzik
                   ` (3 more replies)
  0 siblings, 4 replies; 119+ messages in thread
From: James Bottomley @ 2002-12-04 17:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: James.Bottomley

[-- Attachment #1: Type: text/plain, Size: 1426 bytes --]

Currently, our only DMA API is highly PCI-specific (it forces any non-PCI bus 
with a DMA controller to create fake PCI devices in order to function).

Now that we have the generic device model, it should be equally possible to 
rephrase the entire API for generic devices instead of pci_devs.

This patch does just that (for x86; I also have working code for parisc, 
which is where I actually tested the DMA capability).

The API is substantially the same as the PCI DMA one, with one important 
exception with regard to consistent memory:

The PCI API has pci_alloc_consistent, which allocates only consistent memory 
and fails the allocation if none is available; driver writers who might need 
to function with inconsistent memory must therefore detect the failure and 
employ a fallback strategy.

The new DMA API lets a driver advertise its level of consistent-memory 
compliance to dma_alloc_consistent.  There are essentially two levels:

- I only work with consistent memory: fail the allocation if none is 
available; or
- I can work with inconsistent memory: try consistent first, but return 
inconsistent if it's not available.

The idea is that the memory type can be encoded in the dma_addr_t, which the 
subsequent memory sync operations can use to determine whether the 
wback/invalidate should be a nop or not.

Using this scheme allows me to eliminate all the inconsistent memory fallbacks 
from my drivers.
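
For illustration, a driver that can work with either memory type might do 
something like this (a sketch only; dev, size and the variable names are 
assumed to be in driver scope, and error handling is elided):

	void *buffer;
	dma_addr_t handle;

	buffer = dma_alloc_consistent(dev, size, &handle,
				      DMA_CONFORMANCE_NON_CONSISTENT);
	if (!buffer)
		return -ENOMEM;

	/* ... the device DMAs into the buffer ... */

	/* a wback/invalidate on inconsistent memory; a nop when the
	 * type coded into the dma_addr_t says it is consistent */
	dma_sync_single(dev, handle, size, DMA_FROM_DEVICE);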

Comments welcome.

James


[-- Attachment #2: tmp.diff --]
[-- Type: text/plain, Size: 17207 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.926   -> 1.929  
#	arch/i386/kernel/pci-dma.c	1.8     -> 1.9    
#	   drivers/pci/pci.c	1.50    -> 1.51   
#	include/asm-i386/pci.h	1.17    -> 1.18   
#	 include/linux/pci.h	1.52    -> 1.54   
#	               (new)	        -> 1.3     include/asm-i386/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-generic/dma-mapping.h
#	               (new)	        -> 1.1     include/linux/dma-mapping.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/12/03	jejb@raven.il.steeleye.com	1.927
# Implement a generic device based DMA API
# 
# implement a DMA mapping API based on generic devices instead of
# PCI ones.
# 
# The API is almost identical to the PCI one except for
# 
# - the dma_alloc_consistent takes a conformance level, so the driver
# may choose to support only consistent or also non-consistent memory.
# --------------------------------------------
# 02/12/03	jejb@raven.il.steeleye.com	1.928
# Update to include dma_supported in the API
# --------------------------------------------
# 02/12/03	jejb@raven.il.steeleye.com	1.929
# minor fixes to x86 dma implementation
# --------------------------------------------
#
diff -Nru a/arch/i386/kernel/pci-dma.c b/arch/i386/kernel/pci-dma.c
--- a/arch/i386/kernel/pci-dma.c	Wed Dec  4 11:25:59 2002
+++ b/arch/i386/kernel/pci-dma.c	Wed Dec  4 11:25:59 2002
@@ -13,13 +13,17 @@
 #include <linux/pci.h>
 #include <asm/io.h>
 
-void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
-			   dma_addr_t *dma_handle)
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle,
+			   enum dma_conformance_level level)
 {
 	void *ret;
 	int gfp = GFP_ATOMIC;
 
-	if (hwdev == NULL || ((u32)hwdev->dma_mask != 0xffffffff))
+	if(level == DMA_CONFORMANCE_NONE)
+		return NULL;
+
+	if (dev == NULL || ((u32)*dev->dma_mask != 0xffffffff))
 		gfp |= GFP_DMA;
 	ret = (void *)__get_free_pages(gfp, get_order(size));
 
@@ -30,7 +34,7 @@
 	return ret;
 }
 
-void pci_free_consistent(struct pci_dev *hwdev, size_t size,
+void dma_free_consistent(struct device *dev, size_t size,
 			 void *vaddr, dma_addr_t dma_handle)
 {
 	free_pages((unsigned long)vaddr, get_order(size));
diff -Nru a/drivers/pci/pci.c b/drivers/pci/pci.c
--- a/drivers/pci/pci.c	Wed Dec  4 11:25:59 2002
+++ b/drivers/pci/pci.c	Wed Dec  4 11:25:59 2002
@@ -680,7 +680,7 @@
 int
 pci_set_dma_mask(struct pci_dev *dev, u64 mask)
 {
-	if (!pci_dma_supported(dev, mask))
+	if (!dma_supported(&dev->dev, mask))
 		return -EIO;
 
 	dev->dma_mask = mask;
diff -Nru a/include/asm-generic/dma-mapping.h b/include/asm-generic/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-generic/dma-mapping.h	Wed Dec  4 11:25:59 2002
@@ -0,0 +1,29 @@
+#ifndef _ASM_GENERIC_DMA_MAPPING_H
+#define _ASM_GENERIC_DMA_MAPPING_H
+
+int dma_set_mask(struct device *dev, u64 dma_mask);
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle,
+			   enum dma_conformance_level level);
+enum dma_conformance_level dma_get_conformance(dma_addr_t dma_handle);
+void dma_free_consistent(struct device *dev, size_t size, void *cpu_addr,
+			  dma_addr_t dma_handle);
+dma_addr_t dma_map_single(struct device *dev, void *cpu_addr, size_t size,
+			   enum dma_data_direction direction);
+void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		      enum dma_data_direction direction);
+dma_addr_t dma_map_page(struct device *dev, struct page *page,
+			unsigned long offset, size_t size,
+			enum dma_data_direction direction);
+void dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+		    enum dma_data_direction direction);
+int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	       enum dma_data_direction direction);
+void dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+		  enum dma_data_direction direction);
+void dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		     enum dma_data_direction direction);
+void dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+		 enum dma_data_direction direction);
+#endif
+
diff -Nru a/include/asm-i386/dma-mapping.h b/include/asm-i386/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-i386/dma-mapping.h	Wed Dec  4 11:25:59 2002
@@ -0,0 +1,113 @@
+#ifndef _ASM_I386_DMA_MAPPING_H
+#define _ASM_I386_DMA_MAPPING_H
+
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle,
+			   enum dma_conformance_level level);
+
+void dma_free_consistent(struct device *dev, size_t size,
+			 void *vaddr, dma_addr_t dma_handle);
+
+static inline dma_addr_t
+dma_map_single(struct device *dev, void *ptr, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	flush_write_buffers();
+	return virt_to_phys(ptr);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+{
+	int i;
+
+	BUG_ON(direction == DMA_NONE);
+
+	for (i = 0; i < nents; i++ ) {
+		BUG_ON(!sg[i].page);
+
+		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
+	}
+
+	flush_write_buffers();
+	return nents;
+}
+
+static inline dma_addr_t
+dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+	     size_t size, enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
+}
+
+static inline void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+
+static inline void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+		 enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline enum dma_conformance_level
+dma_get_conformance(dma_addr_t dma_handle)
+{
+	return DMA_CONFORMANCE_CONSISTENT;
+}
+
+static inline int
+dma_supported(struct device *dev, u64 mask)
+{
+        /*
+         * we fall back to GFP_DMA when the mask isn't all 1s,
+         * so we can't guarantee allocations that must be
+         * within a tighter range than GFP_DMA..
+         */
+        if(mask < 0x00ffffff)
+                return 0;
+
+	return 1;
+}
+
+static inline int
+dma_set_mask(struct device *dev, u64 mask)
+{
+	if(!dev->dma_mask || !dma_supported(dev, mask))
+		return -EIO;
+
+	*dev->dma_mask = mask;
+
+	return 0;
+}
+
+#endif
diff -Nru a/include/asm-i386/pci.h b/include/asm-i386/pci.h
--- a/include/asm-i386/pci.h	Wed Dec  4 11:25:59 2002
+++ b/include/asm-i386/pci.h	Wed Dec  4 11:25:59 2002
@@ -6,6 +6,9 @@
 #ifdef __KERNEL__
 #include <linux/mm.h>		/* for struct page */
 
+/* we support the new DMA API, but still provide the old one */
+#define PCI_NEW_DMA_COMPAT_API	1
+
 /* Can be used to override the logic in pci_scan_bus for skipping
    already-configured bus numbers - to be used for buggy BIOSes
    or architectures with incomplete PCI setup by the loader */
@@ -46,78 +49,6 @@
  */
 #define PCI_DMA_BUS_IS_PHYS	(1)
 
-/* Allocate and map kernel buffer using consistent mode DMA for a device.
- * hwdev should be valid struct pci_dev pointer for PCI devices,
- * NULL for PCI-like buses (ISA, EISA).
- * Returns non-NULL cpu-view pointer to the buffer if successful and
- * sets *dma_addrp to the pci side dma address as well, else *dma_addrp
- * is undefined.
- */
-extern void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
-				  dma_addr_t *dma_handle);
-
-/* Free and unmap a consistent DMA buffer.
- * cpu_addr is what was returned from pci_alloc_consistent,
- * size must be the same as what as passed into pci_alloc_consistent,
- * and likewise dma_addr must be the same as what *dma_addrp was set to.
- *
- * References to the memory and mappings associated with cpu_addr/dma_addr
- * past this call are illegal.
- */
-extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
-				void *vaddr, dma_addr_t dma_handle);
-
-/* Map a single buffer of the indicated size for DMA in streaming mode.
- * The 32-bit bus address to use is returned.
- *
- * Once the device is given the dma address, the device owns this memory
- * until either pci_unmap_single or pci_dma_sync_single is performed.
- */
-static inline dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
-					size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-	return virt_to_phys(ptr);
-}
-
-/* Unmap a single streaming mode DMA translation.  The dma_addr and size
- * must match what was provided for in a previous pci_map_single call.  All
- * other usages are undefined.
- *
- * After this call, reads by the cpu to the buffer are guarenteed to see
- * whatever the device wrote there.
- */
-static inline void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
-				    size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/*
- * pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
- * to pci_map_single, but takes a struct page instead of a virtual address
- */
-static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
-				      unsigned long offset, size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
-}
-
-static inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
-				  size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
 /* pci_unmap_{page,single} is a nop so... */
 #define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)
 #define DECLARE_PCI_UNMAP_LEN(LEN_NAME)
@@ -126,84 +57,6 @@
 #define pci_unmap_len(PTR, LEN_NAME)		(0)
 #define pci_unmap_len_set(PTR, LEN_NAME, VAL)	do { } while (0)
 
-/* Map a set of buffers described by scatterlist in streaming
- * mode for DMA.  This is the scather-gather version of the
- * above pci_map_single interface.  Here the scatter gather list
- * elements are each tagged with the appropriate dma address
- * and length.  They are obtained via sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for pci_map_single are
- * the same here.
- */
-static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-			     int nents, int direction)
-{
-	int i;
-
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	for (i = 0; i < nents; i++ ) {
-		if (!sg[i].page)
-			BUG();
-
-		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
-	}
-
-	flush_write_buffers();
-	return nents;
-}
-
-/* Unmap a set of streaming mode DMA translations.
- * Again, cpu read rules concerning calls here are the same as for
- * pci_unmap_single() above.
- */
-static inline void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-				int nents, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/* Make physical memory consistent for a single
- * streaming mode DMA translation after a transfer.
- *
- * If you perform a pci_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, the
- * device again owns the buffer.
- */
-static inline void pci_dma_sync_single(struct pci_dev *hwdev,
-				       dma_addr_t dma_handle,
-				       size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
-
-/* Make physical memory consistent for a set of streaming
- * mode DMA translations after a transfer.
- *
- * The same as pci_dma_sync_single but for a scatter-gather list,
- * same rules and usage.
- */
-static inline void pci_dma_sync_sg(struct pci_dev *hwdev,
-				   struct scatterlist *sg,
-				   int nelems, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
diff -Nru a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/linux/dma-mapping.h	Wed Dec  4 11:25:59 2002
@@ -0,0 +1,32 @@
+#ifndef _ASM_LINUX_DMA_MAPPING_H
+#define _ASM_LINUX_DMA_MAPPING_H
+
+/* This is the level of conformance required for consistent allocation
+ *
+ * DMA_CONFORMANCE_NONE - debugging: always fail consistent allocation
+ * DMA_CONFORMANCE_CONSISTENT - only allocate consistent memory.  Fail
+ *	if consistent memory cannot be allocated.
+ * DMA_CONFORMANCE_NON_CONSISTENT - driver has full writeback/invalidate
+ *	compliance.  Return non-consistent memory if consistent cannot be
+ *	allocated.
+ */
+enum dma_conformance_level {
+	DMA_CONFORMANCE_NONE,
+	DMA_CONFORMANCE_CONSISTENT,
+	DMA_CONFORMANCE_NON_CONSISTENT,
+};
+
+/* These definitions mirror those in pci.h, so they can be used
+ * interchangeably with their PCI_ counterparts */
+enum dma_data_direction {
+	DMA_BIDIRECTIONAL = 0,
+	DMA_TO_DEVICE = 1,
+	DMA_FROM_DEVICE = 2,
+	DMA_NONE = 3,
+};
+
+#include <asm/dma-mapping.h>
+
+#endif
+
+
diff -Nru a/include/linux/pci.h b/include/linux/pci.h
--- a/include/linux/pci.h	Wed Dec  4 11:25:59 2002
+++ b/include/linux/pci.h	Wed Dec  4 11:25:59 2002
@@ -788,5 +788,95 @@
 #define PCIPCI_VIAETBF		8
 #define PCIPCI_VSFX		16
 
+#include <linux/dma-mapping.h>
+
+/* If you define this macro it means you support the new DMA API.
+ *
+ * The old pci_dma... API is deprecated, but for now it is simply translated
+ * into the new generic device based API.
+ */
+#ifdef PCI_NEW_DMA_COMPAT_API
+
+/* note pci_set_dma_mask isn't here, since it's a public function
+ * exported from drivers/pci, use dma_supported instead */
+
+static inline int
+pci_dma_supported(struct pci_dev *hwdev, u64 mask)
+{
+	return dma_supported(&hwdev->dev, mask);
+}
+
+static inline void *
+pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
+		     dma_addr_t *dma_handle)
+{
+	return dma_alloc_consistent(&hwdev->dev, size, dma_handle,
+				    DMA_CONFORMANCE_CONSISTENT);
+}
+
+static inline void
+pci_free_consistent(struct pci_dev *hwdev, size_t size,
+		    void *vaddr, dma_addr_t dma_handle)
+{
+	dma_free_consistent(&hwdev->dev, size, vaddr, dma_handle);
+}
+
+static inline dma_addr_t
+pci_map_single(struct pci_dev *hwdev, void *ptr, size_t size, int direction)
+{
+	return dma_map_single(&hwdev->dev, ptr, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
+		 size_t size, int direction)
+{
+	dma_unmap_single(&hwdev->dev, dma_addr, size, (enum dma_data_direction)direction);
+}
+
+static inline dma_addr_t
+pci_map_page(struct pci_dev *hwdev, struct page *page,
+	     unsigned long offset, size_t size, int direction)
+{
+	return dma_map_page(&hwdev->dev, page, offset, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+	       size_t size, int direction)
+{
+	dma_unmap_page(&hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
+}
+
+static inline int
+pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	   int nents, int direction)
+{
+	return dma_map_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	     int nents, int direction)
+{
+	dma_unmap_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+		    size_t size, int direction)
+{
+	dma_sync_single(&hwdev->dev, dma_handle, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		int nelems, int direction)
+{
+	dma_sync_sg(&hwdev->dev, sg, nelems, (enum dma_data_direction)direction);
+}
+
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */

* [RFT][PATCH] generic device DMA implementation
@ 2002-12-18  3:01 James Bottomley
  2002-12-18  3:13 ` David Mosberger
  2002-12-28 18:14 ` Russell King
  0 siblings, 2 replies; 119+ messages in thread
From: James Bottomley @ 2002-12-18  3:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: James.Bottomley

[-- Attachment #1: Type: text/plain, Size: 463 bytes --]

The attached should represent close to final form for the generic DMA API.  It 
includes documentation (surprise!) and an implementation in terms of the pci_ 
API for every arch (apart from parisc, which will be submitted later).

I've folded in the feedback from the previous thread.  Hopefully, this should 
be ready for inclusion.  If people could test it on x86 and other 
architectures, I'd be grateful.
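
For anyone converting a driver to test with, the change is mostly mechanical.  
As a sketch (not taken from the patch; pdev, ptr, size and busaddr are 
assumed names):

	/* old, PCI-specific */
	busaddr = pci_map_single(pdev, ptr, size, PCI_DMA_TODEVICE);
	pci_unmap_single(pdev, busaddr, size, PCI_DMA_TODEVICE);

	/* new, generic device equivalent */
	busaddr = dma_map_single(&pdev->dev, ptr, size, DMA_TO_DEVICE);
	dma_unmap_single(&pdev->dev, busaddr, size, DMA_TO_DEVICE);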

Comments and feedback from testing are welcome.

James


[-- Attachment #2: tmp.diff --]
[-- Type: text/plain, Size: 39112 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.859   -> 1.861  
#	arch/i386/kernel/pci-dma.c	1.8     -> 1.10   
#	   drivers/pci/pci.c	1.51    -> 1.52   
#	include/asm-i386/pci.h	1.17    -> 1.18   
#	 include/linux/pci.h	1.55    -> 1.56   
#	Documentation/DMA-mapping.txt	1.13    -> 1.14   
#	arch/i386/kernel/i386_ksyms.c	1.40    -> 1.41   
#	               (new)	        -> 1.1     include/asm-s390x/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-arm/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-sparc/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-cris/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-sh/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-ppc64/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-m68knommu/dma-mapping.h
#	               (new)	        -> 1.1     Documentation/DMA-API.txt
#	               (new)	        -> 1.1     include/asm-um/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-ia64/dma-mapping.h
#	               (new)	        -> 1.3     include/asm-generic/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-alpha/dma-mapping.h
#	               (new)	        -> 1.2     include/linux/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-ppc/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-s390/dma-mapping.h
#	               (new)	        -> 1.5     include/asm-i386/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-sparc64/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-m68k/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-mips/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-x86_64/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-v850/dma-mapping.h
#	               (new)	        -> 1.1     include/asm-mips64/dma-mapping.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/12/09	jejb@mulgrave.(none)	1.860
# Merge ssh://raven/BK/dma-generic-device-2.5.50
# into mulgrave.(none):/home/jejb/BK/dma-generic-device-2.5
# --------------------------------------------
# 02/12/16	jejb@mulgrave.(none)	1.861
# Documentation complete
# --------------------------------------------
#
diff -Nru a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/Documentation/DMA-API.txt	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,325 @@
+               Dynamic DMA mapping using the generic device
+               ============================================
+
+        James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
+
+This document describes the DMA API.  For a more gentle introduction
+phrased in terms of the pci_ equivalents (and actual examples) see
+DMA-mapping.txt
+
+This API is split into two pieces.  Part I describes the API and the
+corresponding pci_ API.  Part II describes the extensions to the API
+for supporting non-consistent memory machines.  Unless you know that
+your driver absolutely has to support non-consistent platforms (this
+is usually only legacy platforms) you should only use the API
+described in part I.
+
+Part I - pci_ and dma_ Equivalent API 
+-------------------------------------
+
+To get the pci_ API, you must #include <linux/pci.h>
+To get the dma_ API, you must #include <linux/dma-mapping.h>
+
+void *
+dma_alloc_consistent(struct device *dev, size_t size,
+			     dma_addr_t *dma_handle)
+void *
+pci_alloc_consistent(struct pci_dev *dev, size_t size,
+			     dma_addr_t *dma_handle)
+
+Consistent memory is memory for which a write by either the device or
+the processor can immediately be read by the processor or device
+without having to worry about caching effects.
+
+This routine allocates a region of <size> bytes of consistent memory.
+It also returns a <dma_handle>, which may be cast to an unsigned
+integer the same width as the bus and used as the physical address
+base of the region.
+
+Returns: a pointer to the allocated region (in the processor's virtual
+address space) or NULL if the allocation failed.
+
+Note: consistent memory can be expensive on some platforms, and the
+minimum allocation length may be as big as a page, so you should
+consolidate your requests for consistent memory as much as possible.
+
+void
+dma_free_consistent(struct device *dev, size_t size, void *cpu_addr,
+			   dma_addr_t dma_handle)
+void
+pci_free_consistent(struct pci_dev *dev, size_t size, void *cpu_addr,
+			   dma_addr_t dma_handle)
+
+Free the region of consistent memory you previously allocated.  dev,
+size and dma_handle must all be the same as those passed into the
+consistent allocation.  cpu_addr must be the virtual address returned
+by the consistent allocation.
+
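+For illustration, a minimal (hypothetical) fragment using the dma_
+flavour; dev and size are assumed to be in scope, and the fail label
+is invented:
+
+	dma_addr_t dma_handle;
+	void *cpu_addr;
+
+	cpu_addr = dma_alloc_consistent(dev, size, &dma_handle);
+	if (!cpu_addr)
+		goto fail;	/* no consistent memory available */
+
+	/* ... program the device with dma_handle, access the buffer
+	 * through cpu_addr ... */
+
+	dma_free_consistent(dev, size, cpu_addr, dma_handle);
+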
+int
+dma_supported(struct device *dev, u64 mask)
+int
+pci_dma_supported(struct pci_dev *hwdev, u64 mask)
+
+Checks to see if the device can support DMA to the memory described by
+mask.
+
+Returns: 1 if it can and 0 if it can't.
+
+Notes: This routine merely tests to see if the mask is possible.  It
+won't change the current mask settings.  It is more intended as an
+internal API for use by the platform than an external API for use by
+driver writers.
+
+int
+dma_set_mask(struct device *dev, u64 mask)
+int
+pci_set_dma_mask(struct pci_dev *dev, u64 mask)
+
+Checks to see if the mask is possible and updates the device
+parameters if it is.
+
+Returns: 0 if successful and -EIO if not.
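+
+For example (an illustrative fragment; the device name in the message
+is invented):
+
+	if (dma_set_mask(dev, 0xffffffffULL))
+		printk(KERN_WARNING
+		       "mydev: no suitable DMA available\n");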
+
+dma_addr_t
+dma_map_single(struct device *dev, void *cpu_addr, size_t size,
+		      enum dma_data_direction direction)
+dma_addr_t
+pci_map_single(struct device *dev, void *cpu_addr, size_t size,
+		      int direction)
+
+Maps a piece of processor virtual memory so it can be accessed by the
+device and returns the physical handle of the memory.
+
+The direction for both APIs may be converted freely by casting.
+However, the dma_ API uses a strongly typed enumerator for its
+direction:
+
+DMA_NONE		= PCI_DMA_NONE		no direction (used for
+						debugging)
+DMA_TO_DEVICE		= PCI_DMA_TODEVICE	data is going from the
+						memory to the device
+DMA_FROM_DEVICE		= PCI_DMA_FROMDEVICE	data is coming from
+						the device to the
+						memory
+DMA_BIDIRECTIONAL	= PCI_DMA_BIDIRECTIONAL	direction isn't known
+
+Notes:  Not all memory regions in a machine can be mapped by this
+API.  Further, regions that appear to be physically contiguous in
+kernel virtual space may not be contiguous as physical memory.  Since
+this API does not provide any scatter/gather capability, it will fail
+if the user tries to map a non physically contiguous piece of memory.
+For this reason, it is recommended that memory mapped by this API be
+obtained only from sources which guarantee to be physically contiguous
+(like kmalloc).
+
+Further, the physical address of the memory must be within the
+dma_mask of the device (the dma_mask is a bit mask of the addressable
+region for the device: if the physical address of the memory ANDed
+with the dma_mask is still equal to the physical address, then the
+device can perform DMA to the memory).  To ensure that the memory
+allocated by kmalloc is within the dma_mask, the driver may specify
+various platform-dependent flags to restrict the physical memory
+range of the allocation (e.g. on x86, GFP_DMA guarantees to be within
+the first 16MB of available physical memory, as required by ISA
+devices).
+
+Note also that the above constraints on physical contiguity and
+dma_mask may not apply if the platform has an IOMMU (a device which
+supplies a physical to virtual mapping between the I/O memory bus and
+the device).  However, to be portable, device driver writers may *not*
+assume that such an IOMMU exists.
+
+Warnings:  Memory coherency operates at a granularity called the cache
+line width.  In order for memory mapped by this API to operate
+correctly, the mapped region must begin exactly on a cache line
+boundary and end exactly on one (to prevent two separately mapped
+regions from sharing a single cache line).  Since the cache line size
+may not be known at compile time, the API will not enforce this
+requirement.  Therefore, it is recommended that driver writers who
+don't take special care to determine the cache line size at run time
+only map virtual regions that begin and end on page boundaries (which
+are guaranteed also to be cache line boundaries).
+
+DMA_TO_DEVICE synchronisation must be done after the last modification
+of the memory region by the software and before it is handed off to
+the device.  Once this primitive is used, memory covered by it should
+be treated as read only by the device.  If the device may write to it
+at any point, it should be DMA_BIDIRECTIONAL (see below).
+
+DMA_FROM_DEVICE synchronisation must be done before the driver
+accesses data that may be changed by the device.  This memory should
+be treated as read only by the driver.  If the driver needs to write
+to it at any point, it should be DMA_BIDIRECTIONAL (see below).
+
+DMA_BIDIRECTIONAL requires special handling: it means that the driver
+isn't sure if the memory was modified before being handed off to the
+device and also isn't sure if the device will also modify it.  Thus,
+you must always sync bidirectional memory twice: once before the
+memory is handed off to the device (to make sure all memory changes
+are flushed from the processor) and once before the data may be
+accessed after being used by the device (to make sure any processor
+cache lines are updated with data that the device may have changed).
+
+void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+void
+pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
+		 size_t size, int direction)
+
+Unmaps the region previously mapped.  All the parameters must be
+identical to those passed into (and returned by) the mapping API.
+
+dma_addr_t
+dma_map_page(struct device *dev, struct page *page,
+		    unsigned long offset, size_t size,
+		    enum dma_data_direction direction)
+dma_addr_t
+pci_map_page(struct pci_dev *hwdev, struct page *page,
+		    unsigned long offset, size_t size, int direction)
+void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+void
+pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+	       size_t size, int direction)
+
+API for mapping and unmapping pages.  All the notes and warnings
+for the other mapping APIs apply here.  Also, although the <offset>
+and <size> parameters are provided to do partial page mapping, it is
+recommended that you never use these unless you really know what the
+cache width is.
+
+int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+int
+pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	   int nents, int direction)
+
+Maps a scatter gather list from the block layer.
+
+Returns: the number of physical segments mapped (this may be shorter
+than <nents> passed in if the block layer determines that some
+elements of the scatter/gather list are physically adjacent and thus
+may be mapped with a single entry).
+
+void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+void
+pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	     int nents, int direction)
+
+Unmap the previously mapped scatter/gather list.  All the parameters
+must be the same as those passed into the scatter/gather mapping
+API.
+
+Note: <nents> must be the number you passed in, *not* the number of
+physical entries returned.
+
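+As an illustrative sketch (sglist, nents and the hw_* names are
+hypothetical placeholders for whatever the device needs to be
+programmed with):
+
+	int i, count;
+
+	count = dma_map_sg(dev, sglist, nents, direction);
+	for (i = 0; i < count; i++) {
+		hw_address[i] = sg_dma_address(&sglist[i]);
+		hw_len[i] = sg_dma_len(&sglist[i]);
+	}
+
+	/* ... after the transfer completes ... */
+
+	dma_unmap_sg(dev, sglist, nents, direction);
+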
+void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+void
+pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+			   size_t size, int direction)
+void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+			  enum dma_data_direction direction)
+void
+pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		       int nelems, int direction)
+
+Synchronise a single contiguous or scatter/gather mapping.  All the
+parameters must be the same as those passed into the single mapping
+API.
+
+Notes:  You must do this:
+
+- Before reading values that have been written by DMA from the device
+  (use the DMA_FROM_DEVICE direction)
+- After writing values that will be written to the device using DMA
+  (use the DMA_TO_DEVICE direction)
+- Before *and* after handing memory to the device if the memory is
+  DMA_BIDIRECTIONAL
+
+See also dma_map_single().
+
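+For example, receiving data from a device (a sketch; buf is assumed
+to be a kmalloc'ed buffer of len bytes):
+
+	dma_addr_t mapping;
+
+	mapping = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
+
+	/* ... tell the device to DMA into the region at <mapping>
+	 * and wait for it to finish ... */
+
+	dma_sync_single(dev, mapping, len, DMA_FROM_DEVICE);
+	/* the processor may now read the data in buf */
+
+	dma_unmap_single(dev, mapping, len, DMA_FROM_DEVICE);
+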
+Part II - Advanced dma_ usage
+-----------------------------
+
+Warning: These pieces of the DMA API have no PCI equivalent.  They
+should also not be used in the majority of cases, since they cater for
+unlikely corner cases that don't belong in usual drivers.
+
+If you don't understand how cache line coherency works between a
+processor and an I/O device, you should not be using this part of the
+API at all.
+
+void *
+dma_alloc_nonconsistent(struct device *dev, size_t size,
+			       dma_addr_t *dma_handle)
+
+Identical to dma_alloc_consistent() except that the platform will
+choose to return either consistent or non-consistent memory as it sees
+fit.  By using this API, you are guaranteeing to the platform that you
+have all the correct and necessary sync points for this memory in the
+driver should it choose to return non-consistent memory.
+
+Note: where the platform can return consistent memory, it will
+guarantee that the sync points become nops.
+
+Warning:  Handling non-consistent memory is a real pain.  You should
+only ever use this API if you positively know your driver will be
+required to work on one of the rare (usually non-PCI) architectures
+that simply cannot make consistent memory.
+
+void
+dma_free_nonconsistent(struct device *dev, size_t size, void *cpu_addr,
+			      dma_addr_t dma_handle)
+
+Free memory allocated by the nonconsistent API.  All parameters must
+be identical to those passed into (and returned by)
+dma_alloc_nonconsistent().
+
+int
+dma_is_consistent(dma_addr_t dma_handle)
+
+Returns true if the memory pointed to by the dma_handle is actually
+consistent.
+
+int
+dma_get_cache_alignment(void)
+
+Returns the processor cache alignment.  This is the absolute minimum
+alignment *and* width that you must observe when either mapping
+memory or doing partial flushes.
+
+Notes: This API may return a number *larger* than the actual cache
+line, but it will guarantee that one or more cache lines fit exactly
+into the width returned by this call.  It will also always be a power
+of two for easy alignment.
+
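+For example, to round an intended mapping length up to a safe
+boundary (a sketch):
+
+	int align = dma_get_cache_alignment();
+	size_t len = (size + align - 1) & ~(size_t)(align - 1);
+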
+void
+dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
+		      unsigned long offset, size_t size,
+		      enum dma_data_direction direction)
+
+Does a partial sync, starting at offset and continuing for size.  You
+must be careful to observe the cache alignment and width when doing
+anything like this.  You must also be extra careful about accessing
+memory you intend to sync partially.
+
+void
+dma_cache_sync(void *vaddr, size_t size,
+	       enum dma_data_direction direction)
+
+Do a partial sync of memory that was allocated by
+dma_alloc_nonconsistent(), starting at virtual address vaddr and
+continuing on for size.  Again, you *must* observe the cache line
+boundaries when doing this.
+
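+As a (hypothetical) fragment tying the Part II calls together:
+
+	dma_addr_t dma_handle;
+	void *vaddr;
+
+	vaddr = dma_alloc_nonconsistent(dev, size, &dma_handle);
+
+	/* ... the device writes into the buffer ... */
+
+	/* a real invalidate when dma_is_consistent(dma_handle) is
+	 * false, guaranteed to be a nop when the platform returned
+	 * consistent memory */
+	dma_cache_sync(vaddr, size, DMA_FROM_DEVICE);
+
+	dma_free_nonconsistent(dev, size, vaddr, dma_handle);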
+
diff -Nru a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt
--- a/Documentation/DMA-mapping.txt	Tue Dec 17 20:49:32 2002
+++ b/Documentation/DMA-mapping.txt	Tue Dec 17 20:49:32 2002
@@ -5,6 +5,10 @@
 		 Richard Henderson <rth@cygnus.com>
 		  Jakub Jelinek <jakub@redhat.com>
 
+This document describes the DMA mapping system in terms of the pci_
+API.  For a similar API that works for generic devices, see
+DMA-API.txt.
+
 Most of the 64bit platforms have special hardware that translates bus
 addresses (DMA addresses) into physical addresses.  This is similar to
 how page tables and/or a TLB translates virtual addresses to physical
diff -Nru a/arch/i386/kernel/i386_ksyms.c b/arch/i386/kernel/i386_ksyms.c
--- a/arch/i386/kernel/i386_ksyms.c	Tue Dec 17 20:49:32 2002
+++ b/arch/i386/kernel/i386_ksyms.c	Tue Dec 17 20:49:32 2002
@@ -124,8 +124,8 @@
 EXPORT_SYMBOL(__copy_to_user);
 EXPORT_SYMBOL(strnlen_user);
 
-EXPORT_SYMBOL(pci_alloc_consistent);
-EXPORT_SYMBOL(pci_free_consistent);
+EXPORT_SYMBOL(dma_alloc_consistent);
+EXPORT_SYMBOL(dma_free_consistent);
 
 #ifdef CONFIG_PCI
 EXPORT_SYMBOL(pcibios_penalize_isa_irq);
diff -Nru a/arch/i386/kernel/pci-dma.c b/arch/i386/kernel/pci-dma.c
--- a/arch/i386/kernel/pci-dma.c	Tue Dec 17 20:49:32 2002
+++ b/arch/i386/kernel/pci-dma.c	Tue Dec 17 20:49:32 2002
@@ -13,13 +13,13 @@
 #include <linux/pci.h>
 #include <asm/io.h>
 
-void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
+void *dma_alloc_consistent(struct device *dev, size_t size,
 			   dma_addr_t *dma_handle)
 {
 	void *ret;
 	int gfp = GFP_ATOMIC;
 
-	if (hwdev == NULL || ((u32)hwdev->dma_mask != 0xffffffff))
+	if (dev == NULL || ((u32)*dev->dma_mask != 0xffffffff))
 		gfp |= GFP_DMA;
 	ret = (void *)__get_free_pages(gfp, get_order(size));
 
@@ -30,7 +30,7 @@
 	return ret;
 }
 
-void pci_free_consistent(struct pci_dev *hwdev, size_t size,
+void dma_free_consistent(struct device *dev, size_t size,
 			 void *vaddr, dma_addr_t dma_handle)
 {
 	free_pages((unsigned long)vaddr, get_order(size));
diff -Nru a/include/asm-alpha/dma-mapping.h b/include/asm-alpha/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-alpha/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-arm/dma-mapping.h b/include/asm-arm/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-arm/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-cris/dma-mapping.h b/include/asm-cris/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-cris/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-generic/dma-mapping.h b/include/asm-generic/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-generic/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,154 @@
+/* Copyright (C) 2002 by James.Bottomley@HansenPartnership.com 
+ *
+ * Implements the generic device dma API via the existing pci_ one
+ * for unconverted architectures
+ */
+
+#ifndef _ASM_GENERIC_DMA_MAPPING_H
+#define _ASM_GENERIC_DMA_MAPPING_H
+
+/* we implement the API below in terms of the existing PCI one,
+ * so include it */
+#include <linux/pci.h>
+
+static inline int
+dma_supported(struct device *dev, u64 mask)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_dma_supported(to_pci_dev(dev), mask);
+}
+
+static inline int
+dma_set_mask(struct device *dev, u64 dma_mask)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_set_dma_mask(to_pci_dev(dev), dma_mask);
+}
+
+static inline void *
+dma_alloc_consistent(struct device *dev, size_t size, dma_addr_t *dma_handle)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_alloc_consistent(to_pci_dev(dev), size, dma_handle);
+}
+
+static inline void
+dma_free_consistent(struct device *dev, size_t size, void *cpu_addr,
+		    dma_addr_t dma_handle)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_free_consistent(to_pci_dev(dev), size, cpu_addr, dma_handle);
+}
+
+static inline dma_addr_t
+dma_map_single(struct device *dev, void *cpu_addr, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_map_single(to_pci_dev(dev), cpu_addr, size, (int)direction);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_unmap_single(to_pci_dev(dev), dma_addr, size, (int)direction);
+}
+
+static inline dma_addr_t
+dma_map_page(struct device *dev, struct page *page,
+	     unsigned long offset, size_t size,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_map_page(to_pci_dev(dev), page, offset, size, (int)direction);
+}
+
+static inline void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_unmap_page(to_pci_dev(dev), dma_address, size, (int)direction);
+}
+
+static inline int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	return pci_map_sg(to_pci_dev(dev), sg, nents, (int)direction);
+}
+
+static inline void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_unmap_sg(to_pci_dev(dev), sg, nhwentries, (int)direction);
+}
+
+static inline void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_dma_sync_single(to_pci_dev(dev), dma_handle, size, (int)direction);
+}
+
+static inline void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+	    enum dma_data_direction direction)
+{
+	BUG_ON(dev->bus != &pci_bus_type);
+
+	pci_dma_sync_sg(to_pci_dev(dev), sg, nelems, (int)direction);
+}
+
+/* Now for the API extensions over the pci_ one */
+
+#define dma_alloc_nonconsistent(d, s, h) dma_alloc_consistent(d, s, h)
+#define dma_free_nonconsistent(d, s, v, h) dma_free_consistent(d, s, v, h)
+#define dma_is_consistent(d)	(1)
+
+static inline int
+dma_get_cache_alignment(void)
+{
+	/* no easy way to get cache size on all processors, so return
+	 * the maximum possible, to be safe */
+	return (1 << L1_CACHE_SHIFT_MAX);
+}
+
+static inline void
+dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
+		      unsigned long offset, size_t size,
+		      enum dma_data_direction direction)
+{
+	/* just sync everything, that's all the pci API can do */
+	dma_sync_single(dev, dma_handle, offset+size, direction);
+}
+
+static inline void
+dma_cache_sync(void *vaddr, size_t size,
+	       enum dma_data_direction direction)
+{
+	/* could define this in terms of the dma_cache ... operations,
+	 * but if you get this on a platform, you should convert the platform
+	 * to using the generic device DMA API */
+	BUG();
+}
+
+#endif
+
diff -Nru a/include/asm-i386/dma-mapping.h b/include/asm-i386/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-i386/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,137 @@
+#ifndef _ASM_I386_DMA_MAPPING_H
+#define _ASM_I386_DMA_MAPPING_H
+
+#include <asm/cache.h>
+
+#define dma_alloc_nonconsistent(d, s, h) dma_alloc_consistent(d, s, h)
+#define dma_free_nonconsistent(d, s, v, h) dma_free_consistent(d, s, v, h)
+
+void *dma_alloc_consistent(struct device *dev, size_t size,
+			   dma_addr_t *dma_handle);
+
+void dma_free_consistent(struct device *dev, size_t size,
+			 void *vaddr, dma_addr_t dma_handle);
+
+static inline dma_addr_t
+dma_map_single(struct device *dev, void *ptr, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	flush_write_buffers();
+	return virt_to_phys(ptr);
+}
+
+static inline void
+dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
+		 enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline int
+dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
+	   enum dma_data_direction direction)
+{
+	int i;
+
+	BUG_ON(direction == DMA_NONE);
+
+	for (i = 0; i < nents; i++ ) {
+		BUG_ON(!sg[i].page);
+
+		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
+	}
+
+	flush_write_buffers();
+	return nents;
+}
+
+static inline dma_addr_t
+dma_map_page(struct device *dev, struct page *page, unsigned long offset,
+	     size_t size, enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
+}
+
+static inline void
+dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
+	       enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+
+static inline void
+dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
+	     enum dma_data_direction direction)
+{
+	BUG_ON(direction == DMA_NONE);
+}
+
+static inline void
+dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
+		enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline void
+dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
+		      unsigned long offset, size_t size,
+		      enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+
+static inline void
+dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
+		 enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+static inline int
+dma_supported(struct device *dev, u64 mask)
+{
+        /*
+         * we fall back to GFP_DMA when the mask isn't all 1s,
+         * so we can't guarantee allocations that must be
+         * within a tighter range than GFP_DMA..
+         */
+        if(mask < 0x00ffffff)
+                return 0;
+
+	return 1;
+}
+
+static inline int
+dma_set_mask(struct device *dev, u64 mask)
+{
+	if(!dev->dma_mask || !dma_supported(dev, mask))
+		return -EIO;
+
+	*dev->dma_mask = mask;
+
+	return 0;
+}
+
+static inline int
+dma_get_cache_alignment(void)
+{
+	/* no easy way to get cache size on all x86, so return the
+	 * maximum possible, to be safe */
+	return (1 << L1_CACHE_SHIFT_MAX);
+}
+
+#define dma_is_consistent(d)	(1)
+
+static inline void
+dma_cache_sync(void *vaddr, size_t size,
+	       enum dma_data_direction direction)
+{
+	flush_write_buffers();
+}
+
+#endif
diff -Nru a/include/asm-i386/pci.h b/include/asm-i386/pci.h
--- a/include/asm-i386/pci.h	Tue Dec 17 20:49:32 2002
+++ b/include/asm-i386/pci.h	Tue Dec 17 20:49:32 2002
@@ -6,6 +6,9 @@
 #ifdef __KERNEL__
 #include <linux/mm.h>		/* for struct page */
 
+/* we support the new DMA API, but still provide the old one */
+#define PCI_NEW_DMA_COMPAT_API	1
+
 /* Can be used to override the logic in pci_scan_bus for skipping
    already-configured bus numbers - to be used for buggy BIOSes
    or architectures with incomplete PCI setup by the loader */
@@ -46,78 +49,6 @@
  */
 #define PCI_DMA_BUS_IS_PHYS	(1)
 
-/* Allocate and map kernel buffer using consistent mode DMA for a device.
- * hwdev should be valid struct pci_dev pointer for PCI devices,
- * NULL for PCI-like buses (ISA, EISA).
- * Returns non-NULL cpu-view pointer to the buffer if successful and
- * sets *dma_addrp to the pci side dma address as well, else *dma_addrp
- * is undefined.
- */
-extern void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
-				  dma_addr_t *dma_handle);
-
-/* Free and unmap a consistent DMA buffer.
- * cpu_addr is what was returned from pci_alloc_consistent,
- * size must be the same as what as passed into pci_alloc_consistent,
- * and likewise dma_addr must be the same as what *dma_addrp was set to.
- *
- * References to the memory and mappings associated with cpu_addr/dma_addr
- * past this call are illegal.
- */
-extern void pci_free_consistent(struct pci_dev *hwdev, size_t size,
-				void *vaddr, dma_addr_t dma_handle);
-
-/* Map a single buffer of the indicated size for DMA in streaming mode.
- * The 32-bit bus address to use is returned.
- *
- * Once the device is given the dma address, the device owns this memory
- * until either pci_unmap_single or pci_dma_sync_single is performed.
- */
-static inline dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
-					size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-	return virt_to_phys(ptr);
-}
-
-/* Unmap a single streaming mode DMA translation.  The dma_addr and size
- * must match what was provided for in a previous pci_map_single call.  All
- * other usages are undefined.
- *
- * After this call, reads by the cpu to the buffer are guarenteed to see
- * whatever the device wrote there.
- */
-static inline void pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
-				    size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/*
- * pci_{map,unmap}_single_page maps a kernel page to a dma_addr_t. identical
- * to pci_map_single, but takes a struct page instead of a virtual address
- */
-static inline dma_addr_t pci_map_page(struct pci_dev *hwdev, struct page *page,
-				      unsigned long offset, size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	return (dma_addr_t)(page_to_pfn(page)) * PAGE_SIZE + offset;
-}
-
-static inline void pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
-				  size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
 /* pci_unmap_{page,single} is a nop so... */
 #define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)
 #define DECLARE_PCI_UNMAP_LEN(LEN_NAME)
@@ -126,84 +57,6 @@
 #define pci_unmap_len(PTR, LEN_NAME)		(0)
 #define pci_unmap_len_set(PTR, LEN_NAME, VAL)	do { } while (0)
 
-/* Map a set of buffers described by scatterlist in streaming
- * mode for DMA.  This is the scather-gather version of the
- * above pci_map_single interface.  Here the scatter gather list
- * elements are each tagged with the appropriate dma address
- * and length.  They are obtained via sg_dma_{address,length}(SG).
- *
- * NOTE: An implementation may be able to use a smaller number of
- *       DMA address/length pairs than there are SG table elements.
- *       (for example via virtual mapping capabilities)
- *       The routine returns the number of addr/length pairs actually
- *       used, at most nents.
- *
- * Device ownership issues as mentioned above for pci_map_single are
- * the same here.
- */
-static inline int pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-			     int nents, int direction)
-{
-	int i;
-
-	if (direction == PCI_DMA_NONE)
-		BUG();
-
-	for (i = 0; i < nents; i++ ) {
-		if (!sg[i].page)
-			BUG();
-
-		sg[i].dma_address = page_to_phys(sg[i].page) + sg[i].offset;
-	}
-
-	flush_write_buffers();
-	return nents;
-}
-
-/* Unmap a set of streaming mode DMA translations.
- * Again, cpu read rules concerning calls here are the same as for
- * pci_unmap_single() above.
- */
-static inline void pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
-				int nents, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	/* Nothing to do */
-}
-
-/* Make physical memory consistent for a single
- * streaming mode DMA translation after a transfer.
- *
- * If you perform a pci_map_single() but wish to interrogate the
- * buffer using the cpu, yet do not wish to teardown the PCI dma
- * mapping, you must call this function before doing so.  At the
- * next point you give the PCI dma address back to the card, the
- * device again owns the buffer.
- */
-static inline void pci_dma_sync_single(struct pci_dev *hwdev,
-				       dma_addr_t dma_handle,
-				       size_t size, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
-
-/* Make physical memory consistent for a set of streaming
- * mode DMA translations after a transfer.
- *
- * The same as pci_dma_sync_single but for a scatter-gather list,
- * same rules and usage.
- */
-static inline void pci_dma_sync_sg(struct pci_dev *hwdev,
-				   struct scatterlist *sg,
-				   int nelems, int direction)
-{
-	if (direction == PCI_DMA_NONE)
-		BUG();
-	flush_write_buffers();
-}
 
 /* Return whether the given PCI device DMA address mask can
  * be supported properly.  For example, if your device can
diff -Nru a/include/asm-ia64/dma-mapping.h b/include/asm-ia64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ia64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-m68k/dma-mapping.h b/include/asm-m68k/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-m68k/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-m68knommu/dma-mapping.h b/include/asm-m68knommu/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-m68knommu/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-mips/dma-mapping.h b/include/asm-mips/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-mips/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-mips64/dma-mapping.h b/include/asm-mips64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-mips64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-ppc/dma-mapping.h b/include/asm-ppc/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ppc/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-ppc64/dma-mapping.h b/include/asm-ppc64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-ppc64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-s390/dma-mapping.h b/include/asm-s390/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-s390/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-s390x/dma-mapping.h b/include/asm-s390x/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-s390x/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-sh/dma-mapping.h b/include/asm-sh/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-sh/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-sparc/dma-mapping.h b/include/asm-sparc/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-sparc/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-sparc64/dma-mapping.h b/include/asm-sparc64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-sparc64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-um/dma-mapping.h b/include/asm-um/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-um/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-v850/dma-mapping.h b/include/asm-v850/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-v850/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/asm-x86_64/dma-mapping.h b/include/asm-x86_64/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/asm-x86_64/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1 @@
+#include <asm-generic/dma-mapping.h>
diff -Nru a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ b/include/linux/dma-mapping.h	Tue Dec 17 20:49:32 2002
@@ -0,0 +1,17 @@
+#ifndef _ASM_LINUX_DMA_MAPPING_H
+#define _ASM_LINUX_DMA_MAPPING_H
+
+/* These definitions mirror those in pci.h, so they can be used
+ * interchangeably with their PCI_ counterparts */
+enum dma_data_direction {
+	DMA_BIDIRECTIONAL = 0,
+	DMA_TO_DEVICE = 1,
+	DMA_FROM_DEVICE = 2,
+	DMA_NONE = 3,
+};
+
+#include <asm/dma-mapping.h>
+
+#endif
+
+
diff -Nru a/include/linux/pci.h b/include/linux/pci.h
--- a/include/linux/pci.h	Tue Dec 17 20:49:32 2002
+++ b/include/linux/pci.h	Tue Dec 17 20:49:32 2002
@@ -826,5 +826,92 @@
 #define PCIPCI_VIAETBF		8
 #define PCIPCI_VSFX		16
 
+#include <linux/dma-mapping.h>
+
+/* If you define PCI_NEW_DMA_COMPAT_API it means you support the new DMA API
+ * and you want the pci_ DMA API to be implemented using it.
+ */
+#if defined(PCI_NEW_DMA_COMPAT_API) && defined(CONFIG_PCI)
+
+/* note pci_set_dma_mask isn't here, since it's a public function
+ * exported from drivers/pci, use dma_supported instead */
+
+static inline int
+pci_dma_supported(struct pci_dev *hwdev, u64 mask)
+{
+	return dma_supported(&hwdev->dev, mask);
+}
+
+static inline void *
+pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
+		     dma_addr_t *dma_handle)
+{
+	return dma_alloc_consistent(&hwdev->dev, size, dma_handle);
+}
+
+static inline void
+pci_free_consistent(struct pci_dev *hwdev, size_t size,
+		    void *vaddr, dma_addr_t dma_handle)
+{
+	dma_free_consistent(&hwdev->dev, size, vaddr, dma_handle);
+}
+
+static inline dma_addr_t
+pci_map_single(struct pci_dev *hwdev, void *ptr, size_t size, int direction)
+{
+	return dma_map_single(&hwdev->dev, ptr, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
+		 size_t size, int direction)
+{
+	dma_unmap_single(&hwdev->dev, dma_addr, size, (enum dma_data_direction)direction);
+}
+
+static inline dma_addr_t
+pci_map_page(struct pci_dev *hwdev, struct page *page,
+	     unsigned long offset, size_t size, int direction)
+{
+	return dma_map_page(&hwdev->dev, page, offset, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
+	       size_t size, int direction)
+{
+	dma_unmap_page(&hwdev->dev, dma_address, size, (enum dma_data_direction)direction);
+}
+
+static inline int
+pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	   int nents, int direction)
+{
+	return dma_map_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+	     int nents, int direction)
+{
+	dma_unmap_sg(&hwdev->dev, sg, nents, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
+		    size_t size, int direction)
+{
+	dma_sync_single(&hwdev->dev, dma_handle, size, (enum dma_data_direction)direction);
+}
+
+static inline void
+pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
+		int nelems, int direction)
+{
+	dma_sync_sg(&hwdev->dev, sg, nelems, (enum dma_data_direction)direction);
+}
+
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-07 14:37 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-07 14:37 UTC (permalink / raw)
  To: rmk; +Cc: david, James.Bottomley, jgarzik, linux-kernel, miles

On Sat, 7 Dec 2002, Russell King wrote:
>On Sat, Dec 07, 2002 at 08:45:30PM +1100, David Gibson wrote:
>> Actually, no, since my idea was to remove the "consistent_alloc()"
>> path from the driver entirely - leaving only the map/sync approach.
>> That gives a result which is correct everywhere (afaict) but (as
>> you've since pointed out) will perform poorly on platforms where the
>> map/sync operations are expensive.

>As I've also pointed out in the past couple of days, doing this will
>mean that you then need to teach the drivers to align structures to
>cache line boundaries.  Otherwise, you _will_ get into a situation
>where you _will_ lose data.

	Drivers for such hardware would allocate their memory with
dma_alloc(...,DMA_CONSISTENT), which is what 99.9% of all current
drivers would do, indicating that the allocation should
fail if consistent memory is unavailable.
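
	For illustration, the common case would then look something
like this (a sketch only: dma_alloc() with a conformance argument and
the DMA_CONSISTENT constant are the names being discussed in this
thread, not a settled interface):

void *cpu_addr;
dma_addr_t dma_handle;

/* Strict driver: ask for consistent memory and fail the probe
 * outright if none is available. */
cpu_addr = dma_alloc(dev, size, &dma_handle, DMA_CONSISTENT);
if (cpu_addr == NULL)
	return -ENOMEM;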

	David Gibson was describing a hypothetical platform which
would have both consistent and inconsistent memory but on which the
cache operations were so cheap that he thought it might be more
optimal to give inconsistent memory to those drivers that claimed
to be able to handle it.  (Ignore the question of whether that
really is optimal; let's assume David is right for the sake
of example.)  On such a platform, drivers that did not
claim to be able to handle inconsistent memory would still get
consistent memory (or get NULL).  The optimization that David has
in mind would only be done for drivers that claim to be able to
handle inconsistent memory.

>I would rather keep the consistent_alloc() approach for allocating
>consistent memory, and align structures as they see fit, rather than
>having to teach the drivers to align appropriately.  And you can be
>damned sure that driver writers are _not_ going to get the alignment
>right.

	Nobody is talking about eliminating the mechanism for a
driver to say "fail if you cannot give me consistent memory."
That would be the normal usage.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-07  4:12 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-07  4:12 UTC (permalink / raw)
  To: davem; +Cc: James.Bottomley, linux-kernel, willy

	The question of flags versus an extra procedure name is not
actually that big of a deal to me.  I can live with either approach
for dma_alloc.  However, I'll explain what I see as advantages of
using a flags parameter for others to consider (or to tell me where
they think I'm wrong or haven't thought of something).

On Fri, 06 Dec 2002, David S. Miller wrote:
> I don't want a 'flags' thing, because that tends to be the action
> which opens the flood gates for putting random feature-of-the-day new
> bits.

	It is possible to overuse any extension mechanism.  I think
you've made a general argument against extension mechanisms that is
usually not true in practice, at least in Linux.  I think simple
extension mechanisms, like having a flag word in this case when we
need to express a choice between two options anyhow, tend to do more
good than harm on average, even when I think about the most egregious
cases like filesystems.  Maybe if I could see an example or two of
extension mechanisms that have had a net negative impact in your
opinion, I'd understand better.


> If you have to actually get a real API change made, it will get review
> and won't "sneak on in"

	Or Linux just won't get that optimization because people give
up or leave their changes on the back burner indefinitely, something
that I think happens to most Linux improvements, especially if you
count those that don't make it to implementation because people
correctly foresee this kind of bureaucracy.  If anyone does decide to
propose another flag-like facility for dma_alloc, I expect people will
complain that changing the API may require hundreds of drivers to be
updated.

	I did think about the possibility of a flags parameter
inviting features that aren't worth their complexity or other costs
before I suggested a flags parameter.  My view was and is that if
people handling individual architectures want to add and remove flags
bits, even just to experiment with features, I think the existing
process of getting patches integrated would cause enough review.  I
also think our capacity to process changes is already exceeded by
voluntary submissions as evidenced by backlogs and dropped patches.


> I also don't want architectures adding arch
> specific flag bits that some drivers end up using, for example.

	Here I'm guessing at your intended meaning.  If you mean that
there will be numbering collisions, I would expect these flags to be
defined in include/asm-xxx/dma-mapping.h.  I was going to suggest that
we even do this for DMA_ALLOW_INCONSISTENT.
include/asm-parisc/dma-mapping.h would contain:

#define DMA_ALLOW_INCONSISTENT 	0x1

linux/dma-mapping.h would contain:

#include <asm/dma-mapping.h>
#ifndef DMA_ALLOW_INCONSISTENT
# define DMA_ALLOW_INCONSISTENT		0
#endif

	By that convention, bits would not be used in architectures
that never set them, and it could conceivably simplify some compiler
optimizations like "flags |= DMA_SOME_FLAG;" and
"if (flags & DMA_SOME_FLAG) {....}" on architectures where this is never
true.  Bit assignment would be under the control of the platforms, at
least in the absense of a flag that is meaningful on every platform (if
there were one, it would just simplify the source code to define it only
in linux/dma-mapping.h).
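
	To make the convention concrete, a driver prepared to handle
inconsistent memory would then write something like this (sketch only;
a flags-taking dma_alloc is the variant proposed above, not an
existing interface):

/* On architectures that never define the capability bit, the flag is
 * 0, so the compiler can discard the inconsistent-memory path in the
 * allocator and in any "if (flags & DMA_ALLOW_INCONSISTENT)" test. */
buf = dma_alloc(dev, size, &handle, DMA_ALLOW_INCONSISTENT);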

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06 22:52 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06 22:52 UTC (permalink / raw)
  To: James.Bottomley; +Cc: davem, linux-kernel, willy

James Bottomley wrote:
>how about dma_alloc to take two flags

>DRIVER_SUPPORTS_CONSISTENT_ONLY

	It's pretty much impossible not to support consistent memory.

	I'd suggest a shorter name for code readability and particularly
to hint that this is the standard usage: perhaps DMA_CONSISTENT
or "0".

>and

>DRIVER_SUPPORTS_NON_CONSISTENT

	There is a pretty strong convention for medium to short names
in the kernel, although this name will be used much less, so its
length is not as important.  I'd like something that would match the
names of the corresponding cache flushing and invalidation functions.
I think I had previously suggested DMA_MAYBE_CONSISTENT and wmb_maybe
or dma_sync_maybe but I'm not that attached to the "maybe" word.

[...]

>and dma_alloc_consistent to be equivalent to dma_alloc with  
>DRIVER_SUPPORTS_CONSISTENT_ONLY (and hence equivalent to pci_alloc_consistent)

Why have a separate dma_alloc_consistent function?

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."



^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06 22:17 Adam J. Richter
  2002-12-06 22:26 ` James Bottomley
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06 22:17 UTC (permalink / raw)
  To: willy; +Cc: davem, James.Bottomley, linux-kernel

On Fri, 6 Dec 2002, Matthew Wilcox wrote:
>Leaving aside the T-class, machines that don't support io consistent memory
>generally have:
>
>(drivers that need io consistent memory):
[...]
> - zero to four EISA slots

	So it sounds like any EISA or ISA card could be plugged into
these machines.

	This makes me lean infinitesimally more toward a parameter
to dma_alloc rather than a separate dma_alloc_not_necessarily_consistent
function: if there ever are other dma_alloc variations that we want to
support, it is likely that their users will overlap, and the number of
different function calls would then have to grow exponentially (or we
might then talk about changing the API again, which is not the end of
the world, but is certainly more difficult than not having to do so).
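
	To illustrate the growth (all of these names are hypothetical):

/* one entry point per combination of orthogonal properties... */
dma_alloc_consistent(dev, size, &handle);
dma_alloc_maybe_consistent(dev, size, &handle);
dma_alloc_consistent_low(dev, size, &handle);
dma_alloc_maybe_consistent_low(dev, size, &handle);
/* ...versus one entry point with orthogonal flag bits: */
dma_alloc(dev, size, &handle, DMA_MAYBE_CONSISTENT | DMA_LOW);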

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06 17:39 Adam J. Richter
  2002-12-06 18:07 ` Matthew Wilcox
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06 17:39 UTC (permalink / raw)
  To: willy; +Cc: davem, James.Bottomley, linux-kernel

On Fri, 6 Dec 2002, Matthew Wilcox wrote:
>Machines built with PCXS and PCXT processors are guaranteed not to have
>PCI.  So this only becomes a problem when supporting non-PCI devices.
>The devices you mentioned -- 53c700 & 82596 -- are core IO and really do
>need to be supported.  There's also a large userbase for these machines;
>dropping support for them is not an option.

Back on 7 Nov 2002, James Bottomley wrote:
| The ncr8xxx driver is another one used for the Zalon controller in parisc, so
| it will eventually have the same issues.

	How many other drivers beyond these three do we expect to
need similar sync points if the T class remains unsupported?

>T class machines don't have PCI slots per se, but they do have GSC
>slots into which a card can be plugged that contains a Dino GSC to PCI
>bridge and one or more PCI devices.  Examples of cards that are like
>this include acenic, single and dual tulip.

	Regarding the "T class", I would be intersted in knowing how
old it is, if it is discontinued at this point, how much of a user
base there is, and how many of these PCI-on-GSC cards there are.

	I was previously under the impression that there were some
parisc machines that could take some kind of commodity PCI cards and
lacked consistent memory.  If the reality is that only about six
drivers would ever have to be ported to use these sync points, then I
could see keeping dma_{alloc,free}_consistent, and moving the
capability of dealing with inconsistent memory to some wrappers in a
separate .h file (dma_alloc_maybe_consistent, dma_alloc_maybe_free).

	I suppose another consideration would be how likely it is that
a machine that we might care about without consistent memory will ship
in the future.  In general, the memory hierarchy is getting taller
(levels of caching, non-uniform memory access), but perhaps the
industry will continue to treat consistent memory capability as a
requirement.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06 17:07 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06 17:07 UTC (permalink / raw)
  To: James.Bottomley; +Cc: david, jgarzik, linux-kernel, miles

On Fri, 06 Dec 2002, James Bottomley wrote:
>adam@yggdrasil.com said:
>> 	I like your term DMA_CONSISTENT better than
>> DMA_CONFORMANCE_CONSISTENT.  I think the word "conformance" in there
>> does not reduce the time that it takes to figure out what the symbol
>> means.  I don't think any other facility will want to use the terms
>> DMA_{,IN}CONSISTENT, so I prefer that we go with the more medium
>> sized symbol.

>I'm not so keen on this.  The idea of this parameter is not to tell the 
>allocation routine what type of memory you would like, but to tell it what 
>type of memory the driver can cope with.  I think for the inconsistent case, 
>DMA_INCONSISTENT looks like the driver is requiring inconsistent memory, and 
>expecting to get it.  I'm open to changing the "CONFORMANCE" part, but I'd 
>like to name these parameters something that doesn't imply they're requesting 
>a type of memory.

	How about renaming DMA_INCONSISTENT to DMA_MAYBE_CONSISTENT?

	By the way, I previously suggested a flags field to indicate
what the driver could cope with.  0 would mean consistent memory; set
bits would indicate other things that the driver could cope with, to
be added if and when a real need for them arises (read caching, write-
back caching, cpu-cpu consistency, cache line size smaller than 2**n
bytes, etc.).  Regarding the debugging capability of
DMA_CONFORMANCE_NONE, I don't think it will be as useful as
DMA_DIRECTION_NONE is, because transfer direction is often passed
through the IO path of a device driver and errors in doing so are
common.  In comparison, I think the calls to dma_malloc will typically
have this argument specified as a constant where the call is made,
with the possible exception of some allocations being consolidated in
the generic device layer.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06 16:48 James Bottomley
  0 siblings, 0 replies; 119+ messages in thread
From: James Bottomley @ 2002-12-06 16:48 UTC (permalink / raw)
  To: David S. Miller; +Cc: James.Bottomley, linux-kernel

> These systems simply do not exist.

Yes, they do.  The parisc pcxs and pcxt processors are the prime example that
has annoyed me for a while.  These processors have no ability to control the
cache at the page level (they don't even seem to allow fully disabling the
processor cache---not that you'd want to do that).  The result is that they
cannot ever return consistent memory, so pci_alloc_consistent always fails
(see arch/parisc/kernel/pci-dma.c:fail_alloc_consistent).  I have one of
these machines (an HP9000/715) and I maintain the driver for the SCSI chip,
which also needs to work efficiently on the Intel platform, which is what
got me first thinking about the problem.

Let me say again:  I don't envisage any driver writer worrying about this edge 
case, unless they're already implementing workarounds for it now.

I plan to maintain the current pci_ DMA API exactly as it is, with no 
deviations.  Thus the dma_ API too can be operated in full compatibility mode 
with the pci_ API.  That's the design intent.  However, I want the dma_ API to 
simplify this driver edge case for me (and for others who have to maintain 
similar drivers), which is why it allows a deviation from the pci_ API *if the 
driver writer asks for it*.

James





^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06 16:19 Adam J. Richter
  2002-12-06 16:40 ` Matthew Wilcox
  2002-12-06 18:17 ` David S. Miller
  0 siblings, 2 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06 16:19 UTC (permalink / raw)
  To: davem; +Cc: James.Bottomley, linux-kernel, willy

On Fri, 2002-12-06, David S. Miller wrote:
>On Thu, 2002-12-05 at 23:41, Adam J. Richter wrote:
>> These change
>> will eliminate a difficulty in supporting devices on inconsistent-only
>> machines

>I think you are solving a non-problem, but if you want me to see your
>side of the story you need to give me specific examples of where
>pci_alloc_consistent() is "IMPOSSIBLE".

	I am not a parisc developer, but it is apparently the
case for certain parisc machines with "PCXS/T processors" or
the "T class" machines, as described by Mathew Wilcox:

http://lists.parisc-linux.org/pipermail/parisc-linux/2002-December/018535.html

	These machines currently need the contortions
implemented in linux-2.5.50/drivers/net/lasi_82596.c, and partially
implemented in drivers/scsi/53c700.c, to be repeated in every driver
that they want to use (i.e., what these drivers try to do when
pci_alloc_consistent fails).

	Under the API addition that we've been discussing, the
extra cache flushes and invalidations that these drivers need
would become macros that would expand to nothing on the other
architectures, and the drivers would no longer have to
have "if (consistent_allocation_failed) ..." branches around them.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06  7:41 Adam J. Richter
  2002-12-06 15:50 ` David S. Miller
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06  7:41 UTC (permalink / raw)
  To: linux-kernel

Dave Miller wrote:
>I think it's a huge error to try and move the DMA stuff into
>the generic device interfaces _AND_ change semantics and arguments
>at the same time.

	Nobody is talking about changing the existing pci_xxx
interface.  For the new dma_xxx routines, I think it would actually be
an error to wait to make the particular changes we are discussing,
because now is when there is no compatibility to break and, less
importantly, because having it in 2.6.0 from the start might make one
less #ifdef for those people who want to try to maintain a
multi-version "2.6.x" device driver.  (Notice that I try to describe
underlying advantages or disadvantages when I advocate something.)

	People have already given a lot of thought to the modest
difference in the dma_xxx interface being discussed.  These change
will eliminate a difficulty in supporting devices on inconsistent-only
machines, I real problem that was partly induced by the original
pci_alloc_consistent interface.

	Six months ago, I posted a proposal to turn scatterlists into
linked lists to reduce copying and translation between certain IO
list formats.  David responded "Now is not the time for this, when we
finally have the generic struct device stuff, then you can start doing
DMA stuff at the generic layer" in this posting:
http://marc.theaimsgroup.com/?l=linux-kernel&m=102501406027125&w=2

	I think that David is erring too much on the side of
stagnation right now.  I hope he'll understand this in future.  In the
meantime, I'd be in favor of continuing to work this into a clean
patch that everyone else likes and then asking Linus to integrate that
with or without David's blessing if nobody identifies any real
technical problems with it, at least if nobody else objects.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06  7:14 Adam J. Richter
  2002-12-06 16:26 ` James Bottomley
  2002-12-07  9:45 ` David Gibson
  0 siblings, 2 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06  7:14 UTC (permalink / raw)
  To: david; +Cc: James.Bottomley, jgarzik, linux-kernel, miles

David Gibson wrote:
>On Thu, Dec 05, 2002 at 06:08:22PM -0800, Adam J. Richter wrote:
[...]
>> 	In linux-2.5.50/drivers/net/lasi_82596.c, the macros
>> CHECK_{WBACK,INV,WBACK_INV} have definitions like:
>> 
>> #define  CHECK_WBACK(addr,len) \
>> 	do { if (!dma_consistent) dma_cache_wback((unsigned long)addr,len); } while (0)
>> 
>> 	These macros are even used in IO paths like i596_rx().  The
>> "if()" statement in each of these macros is the extra branch that
>> disappears on most architectures under James's proposal.
>
>Erm... I have no problem with the macros that James's proposal would
>use to take away this branch - I would expect to use exactly the same
>ones.  It's just the notion of "try to get consistent memory, but get
>me any old memory otherwise" that I'm not so convinced by.
>
>In any case, on platforms where the dma_malloc() could really return
>either consistent or non-consistent memory, James's sync macros would
>have to have an equivalent branch within.

	Yeah, I should have said "because then you don't have to have a
branch for the case where the platform always or *never* returns
consistent memory on a given machine."

>> >What performance advantages of consistent memory?

>> [...]  For
>> example, pci_sync_single is 55 lines of C code in
>> linux-2.5.50/arch/sparc64/kernel/pci_iommu.c.
>
>Hmm... fair enough.  Ok, I can see the point of falling back to a
>non-consistent approach given that.  So I guess the idea makes sense,
>so long as dma_malloc() (without the consistent flag) is taken to be
>"give me DMAable memory, consistent or not, whichever is cheaper for
>this platform" rather than "give me DMAable memory, consistent if
>possible".  It was originally presented as the latter which misled me.

	As long as dma_sync_maybe works with the addresses returned by
dma_malloc and dma_malloc only returns the types of memory that the
caller claims to be prepared to deal with, the decision about what
kind of memory dma_malloc should return when it has a choice is up to
the platform implementation.

>I think the change to the parameters which I suggested in a reply to
>James makes this a bit clearer.

	I previously suggested some of the changes in your description:
name them dma_{malloc,free} (which James basically agrees with), have
a flags field.  However, given that it's a parameter and you're going
to pass a constant symbol like DMA_CONSISTENT or DMA_INCONSISTENT to it,
it doesn't really matter if it's an enum or an int to start with, as it
could be changed later with minimal or zero driver changes.

	I like your term DMA_CONSISTENT better than
DMA_CONFORMANCE_CONSISTENT.  I think the word "conformance" in there
does not reduce the time that it takes to figure out what the symbol
means.  I don't think any other facility will want to use the terms
DMA_{,IN}CONSISTENT, so I prefer that we go with the more medium sized
symbol.

	Naming the parameter to dma_malloc "bus" would imply that it
will not look at individual device information like dma_mask, which is
wrong.  Putting the flags field in the middle of the parameter list
will make the dma_malloc and dma_free parameter lists unnecessarily different.
I think these two were just oversights in your posting.

	Anyhow, I think we're in full agreement on the substantive
stuff at this point.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06  6:15 David Brownell
  0 siblings, 0 replies; 119+ messages in thread
From: David Brownell @ 2002-12-06  6:15 UTC (permalink / raw)
  To: linux-kernel

I'm all in favor of making the driver model support dma mapping,
so usb won't need to try any more.  I'd expect that to make some
dma model issues for the sa1100 and uml usb ports vanish, and
ideally to eliminate some code now in usbcore.


 > empty before adding new requests.  I think that the Linux OHCI
 > controller currently only queues one request per bulk or control
 > endpoint, so I don't think it uses this feature.  If it were to, it

In 2.5, all hcds are supposed to queue all kinds of usb requests,
including ohci.  (The ohci driver has supported that feature as
long as I recall.)  Storage is using that by default now, which
lets high speed disks talk using big scatterlist dma requests.

That's a big change from 2.4, where queueing mostly worked but
wasn't really used by many drivers.  In particular, storage
rarely queued more than one page ... now I've seen it queueing
several dozen pages, so faster devices can reach their peak
transfer speeds.  (Tens of MByte/sec, sure.)

- Dave





^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-06  2:08 Adam J. Richter
  2002-12-06  2:53 ` David Gibson
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-06  2:08 UTC (permalink / raw)
  To: david; +Cc: davem, James.Bottomley, jgarzik, linux-kernel, miles

David Gibson wrote:
>On Thu, Dec 05, 2002 at 03:57:53AM -0800, Adam J. Richter wrote:
>> David Gibson wrote:
>> >Since, with James's approach you'd need a dma sync function (which
>> >might compile to NOP) in pretty much the same places you'd need
>> >map/sync calls, I don't see that it does make the source noticeably
>> >simpler.
>> 
>>         Because then you don't have to have a branch for the
>> case where the platform *does* support consistent memory.

>Sorry, you're going to have to explain where this extra branch is, I
>don't see it.

	In linux-2.5.50/drivers/net/lasi_82596.c, the macros
CHECK_{WBACK,INV,WBACK_INV} have definitions like:

#define  CHECK_WBACK(addr,len) \
	do { if (!dma_consistent) dma_cache_wback((unsigned long)addr,len); } while (0)

	These macros are even used in IO paths like i596_rx().  The
"if()" statement in each of these macros is the extra branch that
disappears on most architectures under James's proposal.

[...]
>What performance advantages of consistent memory?  Can you name any
>non-fully-consistent platform where consistent memory is preferable
>when it is not strictly required?  For, all the non-consistent
>platforms I'm aware of getting consistent memory means disabling the
>cache and therefore is to be avoided wherever it can be.

	I believe that the cache synchronization operations for
nonconsistent memory are often expensive enough that consistent
memory is faster on many platforms for small reads and writes, such as
dealing with control and status fields and hardware DMA lists.  For
example, pci_sync_single is 55 lines of C code in
linux-2.5.50/arch/sparc64/kernel/pci_iommu.c.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05 20:27 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05 20:27 UTC (permalink / raw)
  To: rmk; +Cc: linux-kernel

Russell King wrote:
>On Thu, Dec 05, 2002 at 04:21:01AM -0800, Adam J. Richter wrote:
>> Russell King wrote:
>> [An excellent explanation of why you sometimes may need consistent
>> memory.]
>> >In other words, you _will_ lose information in this case, guaranteed.
>> >I'd rather keep our existing pci_* API than be forced into this crap
>> >again.
>> 
>> 	All of the proposed API variants that we have discussed in
>> this thread for pci_alloc_consistent / dma_malloc give you consistent
>> memory (or fail) unless you specifically tell it that returning
>> inconsistent memory is OK.

>How does a driver writer determine if his driver can cope with inconsistent
>memory?  If their view is a 32-byte cache line, and their descriptors are
>32 bytes long, they could well say "we can cope with inconsistent memory".
>When 64 byte cache lines are the norm, the driver magically breaks.
>
>I think we actually want to pass the minimum granularity the driver can
>cope with if we're going to allocate inconsistent memory.  A driver
>writer does not have enough information to determine on their own
>whether inconsistent memory is going to be usable on any architecture.

	I agree with James that dma_malloc should round its allocation
sizes up to a multiple of cache line size (at least if it is returning
inconsistent writeback cached memory), and I would extend that
statement to the pool allocator (currently PCI specific, and an API
that I'd like to change slightly, but that's another matter).  For
dma_malloc, this would currently just be a documentation change, as it
currently always allocates entire pages.  For the pool allocator, it
might require adding a few lines of code.
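
	The rounding itself is a one-liner inside the allocator
(a sketch, using the existing L1_CACHE_BYTES constant):

/* Never let two allocations share a cache line when the memory may
 * be inconsistent. */
size = (size + L1_CACHE_BYTES - 1) & ~(L1_CACHE_BYTES - 1);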

	There may still be other cache size issues, and how to deal
with them will be a driver-specific question.  I think most drivers
will not have this problem, because hardware programming and data
structures are designed so that at any given time either the IO device
or the CPU has an implicit write lock on the data structure and there
is a specific protocol for handing ownership from one to the other
(for example, the CPU sets up the data structures, flushes its write
cache, then sets the "go" bit and does not do further writes until it
sees that the IO device has set the "done" bit).
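
	For illustration, that handoff looks roughly like this (the
field and register names are made up; dma_cache_wback is the existing
flush primitive mentioned elsewhere in this thread):

/* CPU owns the descriptor: fill it in... */
desc->buf = cpu_to_le32(buf_dma);
desc->len = cpu_to_le16(buf_len);
/* ...push the CPU's writes out to the memory the device will read... */
dma_cache_wback((unsigned long)desc, sizeof(*desc));
/* ...then hand ownership over, and keep hands off *desc until the
 * device reports completion. */
writel(1, ioaddr + REG_GO);	/* hypothetical doorbell register */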

	However, not all data structures and protocols are amenable to
such techniques.  For example, OHCI USB controllers have a 16-byte
Endpoint Descriptor which contains a NextTD (next transfer descriptor)
field designed to be writable by the controller and an EndTD (end
transfer descriptor) field, also designed to be writable by the
controller, so that the device driver can add more transfers to an
endpoint while that endpoint descriptor is still hot, as long as the
architecture supports 32-bit atomic writes, instead of waiting for
that endpoint's queue to empty before adding new requests.  I think
that the Linux OHCI controller currently only queues one request per
bulk or control endpoint, so I don't think it uses this feature.  If
it were to, it would have to check that it really did have consistent
memory or that the cache line size was 8 bytes or less (not 4 bytes,
because of where these registers are located).  These checks would
evaluate to a compile-time constant on most or all architectures.
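
	The check would look something like this (a sketch;
dma_is_consistent() stands in for whatever predicate the final API
provides):

/* Sharing a live ED with the controller needs either consistent
 * memory or cache lines no wider than 8 bytes; both sides of this
 * test would be compile-time constants on most architectures. */
if (!dma_is_consistent(dma_handle) && L1_CACHE_BYTES > 8)
	return -ENODEV;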

	For other devices, it may be necessary to use other workarounds
or to fail initialization.  It may also depend on how inconsistent the
memory is.  For example, read-cached, write-through memory may suffice
given read barriers (and it would be interesting to find out if this
kind of memory is available on the inconsistent parisc machines).
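
	If it is, the pattern being speculated about would be just a
read barrier before re-reading a device-written field (a sketch only;
whether this is actually sufficient is exactly the open question):

/* Read barrier: per the speculation above, enough to observe the
 * device's write in read-cached, write-through memory, with no
 * writeback needed on the CPU side. */
rmb();
status = le32_to_cpu(desc->status);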

	I think the question of whether it would actually simplify
things to embed this test in dma_malloc would depend on how common the
case is where you really want the device driver to fail.  I suspect
that it would be simpler to create a symbol like SMP_CACHE_BYTES or
L1_CACHE_BYTES that the affected drivers could examine.  Also, if the
need really is that common, maybe it could be put in struct
device_driver so that it could appear once instead of the typical
two or three times in the drivers (and you could even teach depmod to
read it, although I don't know if that would be useful).

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05 17:49 Manfred Spraul
  2002-12-06  0:08 ` David Gibson
  0 siblings, 1 reply; 119+ messages in thread
From: Manfred Spraul @ 2002-12-05 17:49 UTC (permalink / raw)
  To: David Gibson; +Cc: linux-kernel

>
>
>Hmm... that doesn't seem sufficient to explain it.
>
>Some background: I work with PPC embedded chips (the 4xx family) whose
>only way to get consistent memory is by entirely disabling the cache.
>
What do you mean by "disable"?
Do you have to disable the cache entirely when you encounter the first 
pci_alloc_consistent() call, or do you disable the cache just for the 
region that is returned by pci_alloc_consistent()?

If you disable it entirely - would "before_access_consistent_area() / 
after_access_consistent_area()" macros help to avoid that, or are there 
other problems?

--
    Manfred


^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05 12:21 Adam J. Richter
  2002-12-05 12:44 ` Russell King
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05 12:21 UTC (permalink / raw)
  To: rmk; +Cc: linux-kernel

Russell King wrote:
[An excellent explanation of why you sometimes may need consistent
memory.]
>In other words, you _will_ lose information in this case, guaranteed.
>I'd rather keep our existing pci_* API than be forced into this crap
>again.

	All of the proposed API variants that we have discussed in
this thread for pci_alloc_consistent / dma_malloc give you consistent
memory (or fail) unless you specifically tell it that returning
inconsistent memory is OK.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05 12:13 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05 12:13 UTC (permalink / raw)
  To: benh; +Cc: James.Bottomley, linux-kernel, miles

Benjamin Herrenschmidt wrote:
>On Wed, 2002-12-04 at 22:46, James Bottomley wrote:
>> If you have a machine that has both consistent and inconsistent blocks, you 
>> need to encode that in dma_addr_t (which is a platform definable type).
>
>I don't agree here. Encoding things in dma_addr_t, then special casing
>in consistent_{map,unmap,sync,....) looks really ugly to me ! You want
>dma_addr_t to contain a bus address for the given bus you are working
>with and pass that to your device, period.

	I don't think that James meant actually defining flag bits
inside of dma_addr_t, although I suppose you could do it for some
unused high bits on some architectures.  I think the implication
was that you could have something like:


static inline int is_consistent(dma_addr_t addr)
{
	return (addr >= CONSISTENT_AREA_START && addr < CONSISTENT_AREA_END);
}

	I also don't recall anyone proposing special casing
dma_{map,unmap,sync} based on the results of such a check.  I think
the only function that might use it would be maybe_wmb(addr,len),
which, on some machines, might be:

static inline void maybe_wmb(dma_addr_t addr, size_t len)
{
	if (!is_consistent(addr))
		wmb();
}

	In practice, I think dma_malloc() would either always return
consistent memory or never return consistent memory on a given
machine, so maybe_wmb would probably never do such range checking.
Instead it would compile to nothing on machines where dma_alloc always
returned consistent memory, would compile to wmb() on systems where
dma_alloc would only succeed if it could return non-consistent memory,
and would compile to a procedure pointer on parisc that would be set
to point to either a no-op or wmb, depending on which kind of machine the
kernel was booted on.


Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05 11:57 Adam J. Richter
  2002-12-06  0:06 ` David Gibson
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05 11:57 UTC (permalink / raw)
  To: david; +Cc: davem, James.Bottomley, jgarzik, linux-kernel, miles

David Gibson wrote:
>Since, with James's approach you'd need a dma sync function (which
>might compile to NOP) in pretty much the same places you'd need
>map/sync calls, I don't see that it does make the source noticeably
>simpler.

        Because then you don't have to have a branch for the
case where the platform *does* support consistent memory.

>>       If we were to try the approach of using pci_{map,sync}_single
>> always (i.e., just writing the code not to use alloc_consistent),
>> that would have a performance cost on machines where using
>> consistent memory for writing small amounts of data is cheaper than
>> the cost of the cache flushes that would otherwise be required.
>
>Well, I'm only talking about the cases where we actually care about
>reducing the use of consistent memory.

        Then you're not fully weighing the benefits of this facility.
The primary beneficiaries of this facility are device drivers for
which we'd like to have the performance advantages of consistent
memory when available (at least on machines that always return
consistent memory) but which we'd also like to have work as
efficiently as possible on platforms that lack consistent memory or
have so little that we want the device driver to still work even when
no consistent memory is available.  That includes all PCI devices that
users of the inconsistent parisc machines want to use.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05  5:20 Adam J. Richter
  0 siblings, 0 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05  5:20 UTC (permalink / raw)
  To: david; +Cc: davem, James.Bottomley, jgarzik, linux-kernel, miles

At the risk of beating a dead horse, I'd like to clarify a potential
ambiguity.

David Gibson wrote:
>It seems the "try to get consistent memory, but otherwise give me
>inconsistent" is only useful on machines which:
>	(1) Are not fully consistent, BUT
>	(2) Can get consistent memory without disabling the cache, BUT
>	(3) Not very much of it, so you might run out.

>The point is, there has to be an advantage to using consistent memory
>if it is available AND the possibility of it not being available.

	It is enough that there is an advantage to using consistent
memory on one platform (such as sparc64?) and the possibility of it
not being available on another platform (such as parisc), given that
you want the driver on both platforms (such as 53c700).  In that case,
we have identified three possible choices so far:

APPROACH				PROBLEMS

1. Use both memory allocators.		Increased source and object size,
   (as 53c700 currently does)		rarely used code branches, unneeded
					"if (!consistent)" tests on platforms
					where the answer is constant.

2. Assume only inconsistent memory.	Slower on platforms where consistent
					memory has a speed advantage

3. Have "maybe consistent" allocation
   and {w,r}mb_maybe(addr,len) macros.


Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."


^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05  3:02 Adam J. Richter
  2002-12-05  6:15 ` David Gibson
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05  3:02 UTC (permalink / raw)
  To: david; +Cc: davem, James.Bottomley, jgarzik, linux-kernel, miles

>On Wed, Dec 04, 2002 at 07:44:17PM -0600, James Bottomley wrote:
>> david@gibson.dropbear.id.au said:
>> > Do you have an example of where the second option is useful?  Off hand
>> > the only places I can think of where you'd use a consistent_alloc()
>> > rather than map_single() and friends is in cases where the hardware's
>> > behaviour means you absolutely positively have to have consistent
>> > memory. 
>> 
>> Well, it comes from parisc drivers.  Here you'd really rather have
>> consistent memory because it's more efficient, but on certain
>> platforms it's just not possible.

>Hmm... that doesn't seem sufficient to explain it.

	The question is not what is possible, but what is optimal.

	Yes, it is possible to write drivers for machines without
consistent memory that work with any DMA device, by using
dma_{map,sync}_single as you suggest, even if caching could be
disabled.  That is how drivers/scsi/53c700.c and
drivers/net/lasi_82596.c work today.

	The advantage of James's approach is that it will result in
these drivers having simpler source code and even smaller object code
on machines that do not have this problem.

	If we were to try the approach of using pci_{map,sync}_single
always (i.e., just writing the code not to use alloc_consistent), that
would have a performance cost on machines where using consistent
memory for writing small amounts of data is cheaper than the cost of
the cache flushes that would otherwise be required.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."



^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05  1:21 Adam J. Richter
  2002-12-05  2:40 ` David Gibson
  0 siblings, 1 reply; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05  1:21 UTC (permalink / raw)
  To: jgarzik; +Cc: davem, david, James.Bottomley, linux-kernel, miles

David Gibson wrote:
>On Wed, Dec 04, 2002 at 11:47:14AM -0600, James Bottomley wrote:
[...]
>> The new DMA API allows a driver to advertise its level of consistent memory 
>> compliance to dma_alloc_consistent.  There are essentially two levels:
>> 
>> - I only work with consistent memory, fail if I cannot get it, or
>> - I can work with inconsistent memory, try consistent first but return 
>> inconsistent if it's not available.
>
>Do you have an example of where the second option is useful?

	From a previous discussion, I understand that there are some
PCI bus parisc machines without consistent memory.

>Off hand
>the only places I can think of where you'd use a consistent_alloc()
>rather than map_single() and friends is in cases where the hardware's
>behaviour means you absolutely positively have to have consistent
>memory.

	That would result in big, rarely used branches in device
drivers or lots of ifdef's and the equivalent.  With James's approach,
porting a driver to support those parisc machines (for example) would
involve sprinkling in some calls to macros that would compile to
nothing on the other machines.

	Compare the code clutter involved in allowing those
inconsistent parisc machines to run, say, the ten most popular
ethernet controllers and the four most popular scsi controllers.  I
think the difference in the resulting source code size would already
be in the hundreds of lines.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."




^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [RFC] generic device DMA implementation
@ 2002-12-05  0:43 Adam J. Richter
  2002-12-05  0:55 ` Jeff Garzik
  2002-12-05  2:02 ` James Bottomley
  0 siblings, 2 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-12-05  0:43 UTC (permalink / raw)
  To: James.Bottomley; +Cc: davem, jgarzik, linux-kernel, miles

On 2002-12-04, James Bottomley wrote:

>Now that we have the generic device model, it should be equally possible to 
>rephrase the entire [DMA] API for generic devices instead of pci_devs.

	Yes.  This issue has come up repeatedly.  I'd really like to
see a change like yours integrated soon to stop the spread of fake PCI
devices (including the pcidev==NULL convention) and other contortions
being used to work around this.  Also, such a change would enable
consolidation of certain memory allocations and their often buggy
error branches from hundreds of drivers into a few places.

	As you know, I posted a similar patch that created a new field
in struct bus_type, as Miles Bader suggested just now, although only
for {alloc,free}_consistent.  If the bus-specific variation can be
confined to some smaller part of these routines or eliminated, then
I'm all in favor of skipping the extra indirection and going with your
approach.  It will be interesting to see if your model allows most of
the sbus_ and pci_ DMA mapping routines in sparc to be merged.  I
suspect that you will have to adopt some kind of convention, such as
that device->parent->driver_private will have a common meaning for pci
and sbus devices on that platform.


>The new DMA API allows a driver to advertise its level of consistent memory 
>compliance to dma_alloc_consistent.  There are essentially two levels:
>
>- I only work with consistent memory, fail if I cannot get it, or
>- I can work with inconsistent memory, try consistent first but return 
>inconsistent if it's not available.

	If these routines can allocate non-consistent memory, then how
about renaming them to something less misleading, like dma_{malloc,free}?

	Can you please define the "consistency" argument to these
two routines as a bit mask?  There are probably other kinds of memory
inconsistency a driver might be able to accomodate in the future (CPU
read caching, CPU writeback, incosistency across mulitple CPU's if the
driver knows that it is only going to run on one CPU).  I think 0
should be the "most consistent" kind of memory.  That way, DMA memory
allocators could ignore bits that they don't know about, as those bits
would only advertise extra capabilities of a driver.  I think this
extensibility is more useful than the debugging value of
DMA_CONFORMANCE_NONE.


>The idea is that the memory type can be coded into dma_addr_t which the 
>subsequent memory sync operations can use to determine whether 
>wback/invalidate should be a nop or not.

	Your patch does not have to wait for this, but I would like
macros like {r,w}mb_maybe(dma_addr, len) that would compile to nothing
on machines where dma_malloc always returned consistent memory,
compile to your proposed range checking versions on machines that
could return consistent or inconsistent memory, and compile to
dma_cache_wback and rmb(?) on machines that always returned
inconsistent memory.  The existing dma_cache_wback routines would
still never do the range checks, because they would continue to be
used only in cases where the need for flushing is known at compile
time (they would always compile to either the barrier code or nothing).

	Also something that could be added later is a
bus_type.mem_mapped flag so that these DMA routines could do:

		BUG_ON(!dev->bus.mem_mapped);

	...to catch attempts to allocate memory for devices that are
not mapped.  Alternatively, we could have a struct mem_device that
embeds a struct device and represents only those types of devices
that can be mapped into memory.

	It is also possible that we might want to add a field to
struct device identifying the memory mapped "parent" of a
non-memory-mapped device, such as the PCI-based USB host adapter of a
USB network device so that mapping of network packets for transmission
could be centralized.  That's probably a separate patch though.

	P.S., Did you miss a patch for include/linux/device.h adding
device.dma_mask, or is that change already queued for 2.5.51?

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread
* Re: [parisc-linux] Untested port of parisc_device to generic device interface
@ 2002-11-09  4:51 Adam J. Richter
  2002-11-09  5:21 ` Matthew Wilcox
  2002-11-09 18:04 ` Grant Grundler
  0 siblings, 2 replies; 119+ messages in thread
From: Adam J. Richter @ 2002-11-09  4:51 UTC (permalink / raw)
  To: willy; +Cc: andmike, hch, James.Bottomley, linux-kernel, mochel, parisc-linux

Matthew Wilcox wrote:
>Actually I think the generic device model is crap. [...]

My patch is a net deletion of 57 lines and will allow simplification
of parisc DMA allocation.

Although I agree with most of your criticisms about the generic device
model, most of the problems with it come from the way people use it (the
first thing everyone wants to do is a driverfs file system) and some
conventions that I disagree with, such as the idea that drivers that
embed struct device and struct device_driver should not initialize
those fields directly, but should have xxx_register_device copy them
in.  parisc can use the generic driver API without getting fat.

Problems specific to the generic device API can be incrementally
improved and nobody is treating it as set in stone.  I think the
generic device API is close enough already so that it's worth porting
to, even if future clean-ups will then require some small changes to
the code that is ported to it.

Please do not throw the baby out with the bath water.  The generic
driver interface in its present form really can make parisc smaller
and cleaner.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 119+ messages in thread

end of thread, other threads:[~2002-12-28 18:11 UTC | newest]

Thread overview: 119+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-12-04 17:47 [RFC] generic device DMA implementation James Bottomley
2002-12-04 18:27 ` Jeff Garzik
2002-12-04 19:36   ` James Bottomley
2002-12-04 21:19 ` Miles Bader
2002-12-04 21:21 ` Miles Bader
2002-12-04 21:42   ` James Bottomley
2002-12-05  5:44     ` Miles Bader
2002-12-04 21:46   ` James Bottomley
2002-12-05  2:31     ` Miles Bader
2002-12-05  3:06       ` James Bottomley
2002-12-05  5:02       ` David Gibson
2002-12-05 11:15     ` Benjamin Herrenschmidt
2002-12-05 11:16       ` William Lee Irwin III
2002-12-05 15:12       ` James Bottomley
2002-12-05  0:47 ` David Gibson
2002-12-05  0:54   ` Jeff Garzik
2002-12-05  1:44   ` James Bottomley
2002-12-05  2:38     ` David Gibson
2002-12-05  3:13       ` James Bottomley
2002-12-05  5:05         ` David Gibson
2002-12-05 15:03           ` James Bottomley
2002-12-05 23:54             ` David Gibson
2002-12-05  3:17       ` Miles Bader
2002-12-05  6:06         ` David Gibson
2002-12-05  6:43           ` Miles Bader
2002-12-05 23:44             ` David Gibson
2002-12-06  2:23               ` Miles Bader
2002-12-05  3:41       ` Jeff Garzik
2002-12-05  6:04         ` David Gibson
2002-12-05 16:29           ` Jeff Garzik
2002-12-05 23:59             ` David Gibson
2002-12-05 11:08   ` Benjamin Herrenschmidt
2002-12-05 11:35     ` Russell King
2002-12-05 15:24       ` James Bottomley
2002-12-06  0:01     ` David Gibson
  -- strict thread matches above, loose matches on Subject: below --
2002-12-18  3:01 [RFT][PATCH] " James Bottomley
2002-12-18  3:13 ` David Mosberger
2002-12-28 18:14 ` Russell King
2002-12-28 18:19   ` James Bottomley
2002-12-07 14:37 [RFC] " Adam J. Richter
2002-12-07  4:12 Adam J. Richter
2002-12-06 22:52 Adam J. Richter
2002-12-06 22:17 Adam J. Richter
2002-12-06 22:26 ` James Bottomley
2002-12-06 22:29   ` David S. Miller
2002-12-06 22:48     ` James Bottomley
2002-12-06 22:49       ` David S. Miller
2002-12-06 22:32   ` Arjan van de Ven
2002-12-06 17:39 Adam J. Richter
2002-12-06 18:07 ` Matthew Wilcox
2002-12-06 17:07 Adam J. Richter
2002-12-06 16:48 James Bottomley
2002-12-06 16:19 Adam J. Richter
2002-12-06 16:40 ` Matthew Wilcox
2002-12-06 18:17 ` David S. Miller
2002-12-06 18:29   ` James Bottomley
2002-12-06 18:31     ` David S. Miller
2002-12-06 18:40       ` James Bottomley
2002-12-06 18:42         ` David S. Miller
2002-12-06 21:04           ` Oliver Xymoron
2002-12-07 10:19       ` David Gibson
2002-12-06 18:36   ` Matthew Wilcox
2002-12-06 18:38     ` David S. Miller
2002-12-06  7:41 Adam J. Richter
2002-12-06 15:50 ` David S. Miller
2002-12-06  7:14 Adam J. Richter
2002-12-06 16:26 ` James Bottomley
2002-12-06 17:48   ` Miles Bader
2002-12-07  9:56   ` David Gibson
2002-12-07  9:45 ` David Gibson
2002-12-07 11:26   ` Russell King
2002-12-08  5:28     ` David Gibson
2002-12-06  6:15 David Brownell
2002-12-06  2:08 Adam J. Richter
2002-12-06  2:53 ` David Gibson
2002-12-06  4:03   ` David S. Miller
2002-12-05 20:27 Adam J. Richter
2002-12-05 17:49 Manfred Spraul
2002-12-06  0:08 ` David Gibson
2002-12-05 12:21 Adam J. Richter
2002-12-05 12:44 ` Russell King
2002-12-05 12:13 Adam J. Richter
2002-12-05 11:57 Adam J. Richter
2002-12-06  0:06 ` David Gibson
2002-12-05  5:20 Adam J. Richter
2002-12-05  3:02 Adam J. Richter
2002-12-05  6:15 ` David Gibson
2002-12-05  1:21 Adam J. Richter
2002-12-05  2:40 ` David Gibson
2002-12-05  2:49   ` Miles Bader
2002-12-05  6:12     ` David Gibson
2002-12-05  0:43 Adam J. Richter
2002-12-05  0:55 ` Jeff Garzik
2002-12-05  2:02 ` James Bottomley
2002-11-09  4:51 [parisc-linux] Untested port of parisc_device to generic device interface Adam J. Richter
2002-11-09  5:21 ` Matthew Wilcox
2002-11-09  6:03   ` Greg KH
2002-11-09 15:33     ` J.E.J. Bottomley
2002-11-13  6:13       ` Greg KH
2002-11-13  7:46         ` Miles Bader
2002-11-13  7:52           ` Greg KH
2002-11-13  8:02             ` Miles Bader
2002-11-13  8:10               ` Greg KH
2002-11-13  8:26                 ` Miles Bader
2002-11-13  8:25                   ` Greg KH
2002-11-13  9:05                     ` Miles Bader
     [not found]               ` <miles@lsi.nec.co.jp>
2002-11-13 20:13                 ` Grant Grundler
2002-11-13 20:21                   ` J.E.J. Bottomley
2002-11-13 20:37                     ` Grant Grundler
2002-11-13 11:59             ` Ivan Kokshaysky
2002-11-13 12:36               ` Marc Zyngier
2002-11-13 16:32             ` Bjorn Helgaas
2002-11-13 17:23               ` J.E.J. Bottomley
2002-11-13 20:33                 ` Grant Grundler
2002-11-13 20:44                   ` J.E.J. Bottomley
2002-11-13 21:42                     ` Grant Grundler
2002-11-13 20:12             ` Grant Grundler
2002-11-09  7:58   ` Marc Zyngier
2002-11-09 18:04 ` Grant Grundler
