linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/9] wii: add usb 2.0 support
@ 2010-02-28 14:07 Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 1/9] powerpc: add per-device dma coherent support Albert Herranz
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

The following patch series adds USB 2.0 support for the Wii PowerPC
platform via the EHCI controller present in the "Hollywood" chipset
of the video game console.

v1 -> v2
- enable per-device DMA coherent support on the Wii
- add a generic dmabounce based on ARM dmabounce
- make PowerPC and ARM use the generic dmabounce support
  (ARM just compile-tested)
- do not bounce buffers at the USB layer
- handle bounce buffers on the Wii at the platform support layer via the
  DMA API
- introduce a flag HCD_NO_COHERENT_MEM to avoid using coherent memory for
  USB buffers
- make the "Hollywood" EHCI controller use HCD_NO_COHERENT_MEM and honor
  MEM2 DMA constraints to solve the platform restrictions accessing
  coherent memory.

Albert Herranz (9):
  powerpc: add per-device dma coherent support
  wii: have generic dma coherent
  dma-coherent: fix bitmap access races
  add generic dmabounce support
  arm: use generic dmabounce support
  powerpc: add optional per-device dmabounce support
  wii: add mem2 dma mapping ops
  USB: add HCD_NO_COHERENT_MEM host controller driver flag
  wii: hollywood ehci controller support

 arch/Kconfig                                 |    3 +
 arch/arm/Kconfig                             |    4 +-
 arch/arm/common/Kconfig                      |    6 +-
 arch/arm/common/dmabounce.c                  |  376 +++--------------
 arch/arm/include/asm/device.h                |    2 +-
 arch/powerpc/boot/wii.c                      |   34 ++
 arch/powerpc/include/asm/device.h            |    3 +
 arch/powerpc/include/asm/dma-mapping.h       |    1 +
 arch/powerpc/include/asm/wii.h               |   25 ++
 arch/powerpc/kernel/dma.c                    |    5 +
 arch/powerpc/platforms/embedded6xx/Kconfig   |    3 +
 arch/powerpc/platforms/embedded6xx/Makefile  |    2 +-
 arch/powerpc/platforms/embedded6xx/wii-dma.c |  557 ++++++++++++++++++++++++++
 drivers/base/dma-coherent.c                  |   11 +
 drivers/usb/core/buffer.c                    |   15 +-
 drivers/usb/core/hcd.c                       |   50 ++-
 drivers/usb/core/hcd.h                       |   13 +-
 drivers/usb/host/Kconfig                     |    8 +
 drivers/usb/host/ehci-hcd.c                  |    5 +
 drivers/usb/host/ehci-hlwd.c                 |  233 +++++++++++
 drivers/usb/host/ehci.h                      |   23 +
 include/linux/dmabounce.h                    |   71 ++++
 lib/Kconfig                                  |   10 +
 lib/Makefile                                 |    2 +
 lib/dmabounce.c                              |  389 ++++++++++++++++++
 25 files changed, 1512 insertions(+), 339 deletions(-)
 create mode 100644 arch/powerpc/include/asm/wii.h
 create mode 100755 arch/powerpc/platforms/embedded6xx/wii-dma.c
 create mode 100644 drivers/usb/host/ehci-hlwd.c
 create mode 100644 include/linux/dmabounce.h
 create mode 100644 lib/dmabounce.c

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 1/9] powerpc: add per-device dma coherent support
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
@ 2010-02-28 14:07 ` Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 2/9] wii: have generic dma coherent Albert Herranz
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

Use the generic per-device dma coherent allocator on powerpc.
This allows a driver to declare coherent memory area from where
a device can allocate coherent memory chunks.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/powerpc/include/asm/dma-mapping.h |    1 +
 arch/powerpc/kernel/dma.c              |    5 +++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index 80a973b..18ecec8 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -17,6 +17,7 @@
 #include <linux/dma-debug.h>
 #include <asm/io.h>
 #include <asm/swiotlb.h>
+#include <asm-generic/dma-coherent.h>
 
 #define DMA_ERROR_CODE		(~(dma_addr_t)0x0)
 
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 6215062..83d5232 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -27,6 +27,9 @@ void *dma_direct_alloc_coherent(struct device *dev, size_t size,
 {
 	void *ret;
 #ifdef CONFIG_NOT_COHERENT_CACHE
+	if (dma_alloc_from_coherent(dev, size, dma_handle, &ret))
+		return ret;
+
 	ret = __dma_alloc_coherent(dev, size, dma_handle, flag);
 	if (ret == NULL)
 		return NULL;
@@ -54,6 +57,8 @@ void dma_direct_free_coherent(struct device *dev, size_t size,
 			      void *vaddr, dma_addr_t dma_handle)
 {
 #ifdef CONFIG_NOT_COHERENT_CACHE
+	if (dma_release_from_coherent(dev, get_order(size), vaddr))
+		return;
 	__dma_free_coherent(size, vaddr);
 #else
 	free_pages((unsigned long)vaddr, get_order(size));
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 2/9] wii: have generic dma coherent
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 1/9] powerpc: add per-device dma coherent support Albert Herranz
@ 2010-02-28 14:07 ` Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 3/9] dma-coherent: fix bitmap access races Albert Herranz
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

Let the Nintendo Wii gaming console use per-device dma coherent allocations.
This will be used later by some of its drivers.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/powerpc/platforms/embedded6xx/Kconfig |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/embedded6xx/Kconfig b/arch/powerpc/platforms/embedded6xx/Kconfig
index 524d971..fe77ab2 100644
--- a/arch/powerpc/platforms/embedded6xx/Kconfig
+++ b/arch/powerpc/platforms/embedded6xx/Kconfig
@@ -119,6 +119,7 @@ config WII
 	bool "Nintendo-Wii"
 	depends on EMBEDDED6xx
 	select GAMECUBE_COMMON
+	select HAVE_GENERIC_DMA_COHERENT
 	help
 	  Select WII if configuring for the Nintendo Wii.
 	  More information at: <http://gc-linux.sourceforge.net/>
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 3/9] dma-coherent: fix bitmap access races
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 1/9] powerpc: add per-device dma coherent support Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 2/9] wii: have generic dma coherent Albert Herranz
@ 2010-02-28 14:07 ` Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 4/9] add generic dmabounce support Albert Herranz
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

The coherent per-device memory handling functions use the in-kernel
bitmap library to account for the allocated regions.

The bitmap functions, though, do not protect the bitmap structure from
being modified concurrently. This can lead, for example, to double
allocations if dma_alloc_from_coherent() is called while another
dma_alloc_from_coherent() is already in progress.

Fix those races by protecting concurrent modifications of the allocation
bitmap. spin_lock_irqsave()/spin_unlock_irqrestore() is used as the
allocation/release functions are planned to be used in interrupt context
for streaming DMA mappings/unmappings via bounce buffers.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 drivers/base/dma-coherent.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 962a3b5..9d27d63 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -11,6 +11,7 @@ struct dma_coherent_mem {
 	int		size;
 	int		flags;
 	unsigned long	*bitmap;
+	spinlock_t	lock;	/* protect bitmap operations */
 };
 
 int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
@@ -44,6 +45,7 @@ int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
 	dev->dma_mem->device_base = device_addr;
 	dev->dma_mem->size = pages;
 	dev->dma_mem->flags = flags;
+	spin_lock_init(&dev->dma_mem->lock);
 
 	if (flags & DMA_MEMORY_MAP)
 		return DMA_MEMORY_MAP;
@@ -77,6 +79,7 @@ void *dma_mark_declared_memory_occupied(struct device *dev,
 {
 	struct dma_coherent_mem *mem = dev->dma_mem;
 	int pos, err;
+	unsigned long flags;
 
 	size += device_addr & ~PAGE_MASK;
 
@@ -84,7 +87,9 @@ void *dma_mark_declared_memory_occupied(struct device *dev,
 		return ERR_PTR(-EINVAL);
 
 	pos = (device_addr - mem->device_base) >> PAGE_SHIFT;
+	spin_lock_irqsave(&mem->lock, flags);
 	err = bitmap_allocate_region(mem->bitmap, pos, get_order(size));
+	spin_unlock_irqrestore(&mem->lock, flags);
 	if (err != 0)
 		return ERR_PTR(err);
 	return mem->virt_base + (pos << PAGE_SHIFT);
@@ -112,6 +117,7 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
 	struct dma_coherent_mem *mem;
 	int order = get_order(size);
 	int pageno;
+	unsigned long flags;
 
 	if (!dev)
 		return 0;
@@ -124,7 +130,9 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
 	if (unlikely(size > (mem->size << PAGE_SHIFT)))
 		goto err;
 
+	spin_lock_irqsave(&mem->lock, flags);
 	pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
+	spin_unlock_irqrestore(&mem->lock, flags);
 	if (unlikely(pageno < 0))
 		goto err;
 
@@ -163,12 +171,15 @@ EXPORT_SYMBOL(dma_alloc_from_coherent);
 int dma_release_from_coherent(struct device *dev, int order, void *vaddr)
 {
 	struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+	unsigned long flags;
 
 	if (mem && vaddr >= mem->virt_base && vaddr <
 		   (mem->virt_base + (mem->size << PAGE_SHIFT))) {
 		int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
 
+		spin_lock_irqsave(&mem->lock, flags);
 		bitmap_release_region(mem->bitmap, page, order);
+		spin_unlock_irqrestore(&mem->lock, flags);
 		return 1;
 	}
 	return 0;
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 4/9] add generic dmabounce support
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
                   ` (2 preceding siblings ...)
  2010-02-28 14:07 ` [RFC PATCH v2 3/9] dma-coherent: fix bitmap access races Albert Herranz
@ 2010-02-28 14:07 ` Albert Herranz
  2010-02-28 14:20   ` Russell King - ARM Linux
  2010-02-28 14:07 ` [RFC PATCH v2 5/9] arm: use " Albert Herranz
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

This patch makes part of the ARM dmabounce code available to other
architectures as a generic API.
See included kernel-doc annotations for the actual API implemented.

An architecture can opt-in for generic dmabounce support by defining
HAVE_DMABOUNCE.

This support will be used later to address DMA memory access restrictions
on the Nintendo Wii video game console.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/Kconfig              |    3 +
 include/linux/dmabounce.h |   77 +++++++++
 lib/Kconfig               |   10 ++
 lib/Makefile              |    2 +
 lib/dmabounce.c           |  395 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 487 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/dmabounce.h
 create mode 100644 lib/dmabounce.c

diff --git a/arch/Kconfig b/arch/Kconfig
index 9d055b4..98ff26d 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -130,6 +130,9 @@ config HAVE_CLK
 config HAVE_DMA_API_DEBUG
 	bool
 
+config HAVE_DMABOUNCE
+	bool
+
 config HAVE_DEFAULT_NO_SPIN_MUTEXES
 	bool
 
diff --git a/include/linux/dmabounce.h b/include/linux/dmabounce.h
new file mode 100644
index 0000000..d60dc04
--- /dev/null
+++ b/include/linux/dmabounce.h
@@ -0,0 +1,77 @@
+#ifndef _LINUX_DMABOUNCE_H
+#define _LINUX_DMABOUNCE_H
+
+/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
+#ifndef CONFIG_ARM
+
+#ifdef CONFIG_DMABOUNCE
+
+#include <linux/dmapool.h>
+#include <linux/dma-mapping.h>
+#include <linux/list.h>
+#include <linux/types.h>
+#include <linux/device.h>
+
+struct dmabounce_info;
+
+#ifdef CONFIG_DMABOUNCE_STATS
+struct dmabounce_stats {
+	unsigned long total_allocs;
+	unsigned long map_op_count;
+	unsigned long bounce_count;
+};
+extern struct dmabounce_stats *dmabounce_get_stats(struct dmabounce_info *);
+#define DMABOUNCE_DO_STATS(i, X) do { dmabounce_get_stats(i)->X ; } while (0)
+#else
+#define DMABOUNCE_DO_STATS(i, X) do { } while (0)
+#endif /* CONFIG_DMABOUNCE_STATS */
+
+struct dmabounce_pool {
+	unsigned long	size;
+	struct dma_pool	*pool;
+#ifdef CONFIG_DMABOUNCE_STATS
+	unsigned long allocs;
+#endif
+};
+
+struct dmabounce_buffer {
+	struct list_head node;
+
+	/* original buffer */
+	void		*buf;
+	size_t		size;
+	enum dma_data_direction dir;
+
+	/* bounced buffer */
+	void		*bounce_buf;
+	dma_addr_t	bounce_buf_dma;
+
+	struct dmabounce_pool	*pool;
+};
+
+extern struct dmabounce_buffer *
+dmabounce_alloc_buffer(struct dmabounce_info *info,
+		       void *buf, size_t size, enum dma_data_direction dir,
+		       gfp_t gfp);
+extern void dmabounce_free_buffer(struct dmabounce_info *info,
+				  struct dmabounce_buffer *bb);
+extern struct dmabounce_buffer *
+dmabounce_find_buffer(struct dmabounce_info *info, dma_addr_t bounce_buf_dma,
+		      size_t size, enum dma_data_direction dir);
+
+extern struct dmabounce_info *
+dmabounce_info_alloc(struct device *dev,
+		     size_t small_buffer_size, size_t large_buffer_size,
+		     size_t align, size_t boundary);
+extern void dmabounce_info_free(struct dmabounce_info *info);
+
+extern int dmabounce_info_register(struct device *dev,
+				   struct dmabounce_info *info);
+extern void dmabounce_info_unregister(struct device *dev);
+
+#endif /* CONFIG_DMABOUNCE */
+
+/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
+#endif /* !CONFIG_ARM */
+
+#endif /* _LINUX_DMABOUNCE_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 97b136f..b53b7dc 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -207,4 +207,14 @@ config GENERIC_ATOMIC64
 config LRU_CACHE
 	tristate
 
+config DMABOUNCE
+	bool
+	depends on HAVE_DMABOUNCE
+	select ZONE_DMA if ARM
+	default y
+
+config DMABOUNCE_STATS
+	bool "Track dmabounce statistics"
+	depends on DMABOUNCE
+
 endmenu
diff --git a/lib/Makefile b/lib/Makefile
index 3b0b4a6..097c2ed 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -100,6 +100,8 @@ obj-$(CONFIG_GENERIC_CSUM) += checksum.o
 
 obj-$(CONFIG_GENERIC_ATOMIC64) += atomic64.o
 
+obj-$(CONFIG_DMABOUNCE) += dmabounce.o
+
 hostprogs-y	:= gen_crc32table
 clean-files	:= crc32table.h
 
diff --git a/lib/dmabounce.c b/lib/dmabounce.c
new file mode 100644
index 0000000..620d314
--- /dev/null
+++ b/lib/dmabounce.c
@@ -0,0 +1,395 @@
+/*
+ * lib/dmabounce.c
+ *
+ * Generic DMA bounce buffer functions.
+ * Copyright (C) 2010 Albert Herranz <albert_herranz@yahoo.es>
+ *
+ * Based on arch/arm/common/dmabounce.c
+ *
+ * Original version by Brad Parker (brad@heeltoe.com)
+ * Re-written by Christopher Hoover <ch@murgatroid.com>
+ * Made generic by Deepak Saxena <dsaxena@plexity.net>
+ *
+ * Copyright (C) 2002 Hewlett Packard Company.
+ * Copyright (C) 2004 MontaVista Software, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+
+/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
+#ifndef CONFIG_ARM
+
+#define DRV_MODULE_NAME "dmabounce"
+#define DRV_DESCRIPTION "Generic DMA bounce buffer functions"
+#define DRV_AUTHOR      "Christopher Hoover <ch@hpl.hp.com>, " \
+			"Deepak Saxena <dsaxena@plexity.net>, " \
+			"Albert Herranz <albert_herranz@yahoo.es>"
+
+#define pr_fmt(fmt) DRV_MODULE_NAME ": " fmt
+#include <linux/kernel.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/dmabounce.h>
+
+
+struct dmabounce_info {
+	struct device *dev;
+	struct list_head bounce_buffers;
+
+#ifdef CONFIG_DMABOUNCE_STATS
+	struct dmabounce_stats stats;
+	int attr_res;
+#endif
+	struct dmabounce_pool	small;
+	struct dmabounce_pool	large;
+
+	rwlock_t lock;
+};
+
+#ifdef CONFIG_DMABOUNCE_STATS
+struct dmabounce_stats *dmabounce_get_stats(struct dmabounce_info *info)
+{
+	return &info->stats;
+}
+
+static ssize_t dmabounce_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+	return sprintf(buf, "%lu %lu %lu %lu %lu %lu\n",
+		       info->small.allocs,
+		       info->large.allocs,
+		       info->stats.total_allocs -
+				info->small.allocs - info->large.allocs,
+		       info->stats.total_allocs,
+		       info->stats.map_op_count,
+		       info->stats.bounce_count);
+}
+
+static DEVICE_ATTR(dmabounce_stats, 0400, dmabounce_show, NULL);
+#endif /* CONFIG_DMABOUNCE_STATS */
+
+/**
+ * dmabounce_alloc_buffer() - try to allocate a coherent DMA bounce buffer
+ * @info:	This dmabounce_info.
+ * @buf:	Original buffer virtual address.
+ * @size:	Original buffer length.
+ * @dir:	Direction of DMA transfer for this buffer.
+ * @gfp:	Flags used for memory allocations.
+ *
+ * Use this function to allocate a coherent buffer of size @size associated
+ * to a dmabounce_info structure @info.
+ * The allocation will be performed using the @gfp allocation flags.
+ *
+ * The coherent buffer can be used later to bounce data from/to the
+ * corresponding normal buffer @buf with the specified direction @dir.
+ *
+ * If the buffer cannot be allocated the function returns NULL.
+ * Otherwise, the allocated buffer is returned.
+ */
+struct dmabounce_buffer *
+dmabounce_alloc_buffer(struct dmabounce_info *info,
+		       void *buf, size_t size, enum dma_data_direction dir,
+		       gfp_t gfp)
+{
+	struct device *dev = info->dev;
+	struct dmabounce_pool *pool = NULL;
+	struct dmabounce_buffer *bb;
+	unsigned long flags;
+
+	dev_dbg(dev, "%s(buf=%p, size=%d, dir=%d)\n", __func__, buf, size, dir);
+
+	if (size <= info->small.size)
+		pool = &info->small;
+	else if (size <= info->large.size)
+		pool = &info->large;
+
+	bb = kzalloc(sizeof(*bb), gfp);
+	if (!bb) {
+		dev_err(dev, "%s: kmalloc failed\n", __func__);
+		goto out;
+	}
+
+	if (pool) {
+		bb->bounce_buf = dma_pool_alloc(pool->pool, gfp,
+						&bb->bounce_buf_dma);
+	} else {
+		bb->bounce_buf = dma_alloc_coherent(dev, size,
+						    &bb->bounce_buf_dma, gfp);
+	}
+
+	if (!bb->bounce_buf) {
+		dev_err(dev, "%s: error allocating DMA memory (size=%d)\n",
+			__func__, size);
+		kfree(bb);
+		bb = NULL;
+		goto out;
+	}
+
+#ifdef CONFIG_DMABOUNCE_STATS
+	if (pool)
+		pool->allocs++;
+#endif
+	DMABOUNCE_DO_STATS(info, total_allocs++);
+
+	bb->buf = buf;
+	bb->size = size;
+	bb->dir = dir;
+	bb->pool = pool;
+
+	write_lock_irqsave(&info->lock, flags);
+	list_add(&bb->node, &info->bounce_buffers);
+	write_unlock_irqrestore(&info->lock, flags);
+out:
+	return bb;
+}
+EXPORT_SYMBOL(dmabounce_alloc_buffer);
+
+/**
+ * dmabounce_free_buffer() - free a coherent DMA bounce buffer
+ * @info:	This dmabounce_info.
+ * @bb:		The coherent DMA bounce buffer to free.
+ *
+ * Free a previously allocated coherent DMA bounce buffer @bb from its
+ * associated dmabounce_info structure @info.
+ *
+ * The coherent DMA bounce buffer @bb must have been previously allocated
+ * using dmabounce_alloc_buffer().
+ */
+void
+dmabounce_free_buffer(struct dmabounce_info *info, struct dmabounce_buffer *bb)
+{
+	unsigned long flags;
+
+	dev_dbg(info->dev, "%s(buf=%p)\n", __func__, bb->buf);
+
+	write_lock_irqsave(&info->lock, flags);
+	list_del(&bb->node);
+	write_unlock_irqrestore(&info->lock, flags);
+
+	if (bb->pool)
+		dma_pool_free(bb->pool->pool, bb->bounce_buf,
+			      bb->bounce_buf_dma);
+	else {
+		dma_free_coherent(info->dev, bb->size, bb->bounce_buf,
+				  bb->bounce_buf_dma);
+	}
+
+	kfree(bb);
+}
+EXPORT_SYMBOL(dmabounce_free_buffer);
+
+/**
+ * dmabounce_find_buffer() - locate an existing coherent DMA bounce buffer
+ * @info:		This dmabounce_info.
+ * @bounce_buf_dma:	DMA handle for the bounce buffer to find.
+ * @size:		Size of the bounce buffer to find.
+ * @dir:		Direction of DMA transfer for the buffer to find.
+ *
+ * Finds a previously allocated coherent DMA bounce buffer associated
+ * to a dmabounce_info structure @info.
+ *
+ * The coherent DMA bounce buffer searched must have the given
+ * @bounce_buf_dma DMA handle, @size size and DMA direction @dir.
+ * If @size is zero, the searched buffer can have any size.
+ *
+ * If no matching coherent DMA bounce buffer is found the function returns
+ * NULL. Otherwise, the matching buffer is returned.
+ */
+struct dmabounce_buffer *
+dmabounce_find_buffer(struct dmabounce_info *info, dma_addr_t bounce_buf_dma,
+		      size_t size, enum dma_data_direction dir)
+{
+	struct dmabounce_buffer *bb, *needle = NULL;
+	unsigned long flags;
+
+	read_lock_irqsave(&info->lock, flags);
+
+	list_for_each_entry(bb, &info->bounce_buffers, node) {
+		if (bb->bounce_buf_dma == bounce_buf_dma) {
+			/* we should get a perfect match here */
+			BUG_ON((size && bb->size != size) || bb->dir != dir);
+
+			needle = bb;
+			break;
+		}
+	}
+
+	read_unlock_irqrestore(&info->lock, flags);
+	return needle;
+}
+EXPORT_SYMBOL(dmabounce_find_buffer);
+
+static int dmabounce_init_pool(struct dmabounce_pool *pool,
+			       struct device *dev, const char *name,
+			       size_t size, size_t align, size_t boundary)
+{
+	pool->size = size;
+#ifdef CONFIG_DMABOUNCE_STATS
+	pool->allocs = 0;
+#endif
+	pool->pool = dma_pool_create(name, dev, size, align, boundary);
+
+	return pool->pool ? 0 : -ENOMEM;
+}
+
+/**
+ * dmabounce_info_alloc() - allocate a dmabounce_info structure
+ * @dev:		Device for which coherent memory allocations are done.
+ * @small_buffer_size:	Buffer size for allocations from the small pool.
+ * @large_buffer_size:	Buffer size for allocations from the large pool.
+ * @align:		Alignment for pool based allocations.
+ * @boundary:		Boundary for pool based allocations.
+ *
+ * Use this function to allocate a dmabounce_info structure.
+ * A dmabounce_info structure can be used to manage a set of related
+ * coherent DMA bounce buffers.
+ *
+ * Memory for the coherent DMA bounce buffers will be allocated for the
+ * specified device @dev.
+ * If @small_buffer_size or @large_buffer_size are non-zero, allocations
+ * will be tried from the closest associated DMA pool which has a size
+ * greater or equal than the specified @size size. Allocations from DMA
+ * pools will honor the alignment and boundary crossing restrictions
+ * specified in @align and @boundary.
+ * Otherwise, allocations will be performed from non-pool coherent memory.
+ *
+ * If the dmabounce_info structure cannot be allocated the function
+ * returns NULL. Otherwise, the allocated dmabounce_info structure is returned.
+ */
+struct dmabounce_info *
+dmabounce_info_alloc(struct device *dev,
+		     size_t small_buffer_size, size_t large_buffer_size,
+		     size_t align, size_t boundary)
+{
+	struct dmabounce_info *info;
+	int error;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info) {
+		dev_err(dev, "%s: allocation error\n", __func__);
+		goto out;
+	}
+
+	if (small_buffer_size) {
+		error = dmabounce_init_pool(&info->small, dev,
+					    "small_dmabounce_pool",
+					    small_buffer_size, align, boundary);
+		if (error) {
+			dev_err(dev, "error %d allocating DMA pool for %zu"
+				" byte objects\n", error, small_buffer_size);
+			goto err_alloc_small;
+		}
+	}
+
+	if (large_buffer_size) {
+		error = dmabounce_init_pool(&info->large, dev,
+					    "large_dmabounce_pool",
+					    large_buffer_size, align, boundary);
+		if (error) {
+			dev_err(dev, "error %d allocating DMA pool for %zu"
+				" byte objects\n", error, large_buffer_size);
+			goto err_alloc_large;
+		}
+	}
+
+	info->dev = dev;
+	INIT_LIST_HEAD(&info->bounce_buffers);
+	rwlock_init(&info->lock);
+
+	DMABOUNCE_DO_STATS(info, total_allocs = 0);
+	DMABOUNCE_DO_STATS(info, map_op_count = 0);
+	DMABOUNCE_DO_STATS(info, bounce_count = 0);
+	goto out;
+
+err_alloc_large:
+	dma_pool_destroy(info->small.pool);
+err_alloc_small:
+	kfree(info);
+	info = NULL;
+out:
+	return info;
+}
+EXPORT_SYMBOL(dmabounce_info_alloc);
+
+/**
+ * dmabounce_info_free() - free a dmabounce_info
+ * @info:	This dmabounce_info.
+ *
+ * Free a previously allocated dmabounce_info structure @info.
+ *
+ * The dmabounce_info structure @info must have been previously allocated
+ * using dmabounce_info_alloc().
+ */
+void dmabounce_info_free(struct dmabounce_info *info)
+{
+	if (!list_empty(&info->bounce_buffers)) {
+		dev_err(info->dev, "freeing dmabounce with pending buffers!\n");
+		BUG();
+	}
+
+	if (info->small.pool)
+		dma_pool_destroy(info->small.pool);
+	if (info->large.pool)
+		dma_pool_destroy(info->large.pool);
+
+	kfree(info);
+}
+EXPORT_SYMBOL(dmabounce_info_free);
+
+/**
+ * dmabounce_info_register() - register a dmabounce_info for a device
+ * @dev:	Device for which the dmabounce_info is registered.
+ * @info:	dmabounce_info to register.
+ *
+ * Use this function to register a given dmabounce_info @info into a
+ * device @dev.
+ *
+ * A device can only have one dmabounce_info registered with.
+ * The same dmabounce_info may be registered for many devices.
+ */
+int dmabounce_info_register(struct device *dev, struct dmabounce_info *info)
+{
+	if (dev->archdata.dmabounce)
+		return -EBUSY;
+
+	dev->archdata.dmabounce = info;
+
+#ifdef CONFIG_DMABOUNCE_STATS
+	info->attr_res = device_create_file(dev, &dev_attr_dmabounce_stats);
+#endif
+
+	dev_info(dev, pr_fmt("device registered\n"));
+	return 0;
+}
+EXPORT_SYMBOL(dmabounce_info_register);
+
+/**
+ * dmabounce_info_unregister() - unregister the dmabounce_info from a device
+ * @dev:	Device for which the dmabounce_info is unregistered.
+ *
+ * Use this function to unregister a previously registered dmabounce_info
+ * from a device @dev.
+ */
+void dmabounce_info_unregister(struct device *dev)
+{
+#ifdef CONFIG_DMABOUNCE_STATS
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+
+	if (info && info->attr_res == 0)
+		device_remove_file(dev, &dev_attr_dmabounce_stats);
+#endif
+	dev->archdata.dmabounce = NULL;
+
+	dev_info(dev, pr_fmt("device unregistered\n"));
+}
+EXPORT_SYMBOL(dmabounce_info_unregister);
+
+MODULE_AUTHOR(DRV_AUTHOR);
+MODULE_DESCRIPTION(DRV_DESCRIPTION);
+MODULE_LICENSE("GPL");
+
+/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
+#endif /* !CONFIG_ARM */
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 5/9] arm: use generic dmabounce support
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
                   ` (3 preceding siblings ...)
  2010-02-28 14:07 ` [RFC PATCH v2 4/9] add generic dmabounce support Albert Herranz
@ 2010-02-28 14:07 ` Albert Herranz
  2010-02-28 14:07 ` [RFC PATCH v2 6/9] powerpc: add optional per-device " Albert Herranz
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

Update ARM dmabounce to use the generic dmabounce support.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/arm/Kconfig              |    4 +-
 arch/arm/common/Kconfig       |    6 +-
 arch/arm/common/dmabounce.c   |  376 +++++++---------------------------------
 arch/arm/include/asm/device.h |    2 +-
 include/linux/dmabounce.h     |    6 -
 lib/dmabounce.c               |    6 -
 6 files changed, 70 insertions(+), 330 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 184a6bd..25fe1d0 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -417,7 +417,7 @@ config ARCH_IXP4XX
 	select GENERIC_GPIO
 	select GENERIC_TIME
 	select GENERIC_CLOCKEVENTS
-	select DMABOUNCE if PCI
+	select HAVE_DMABOUNCE if PCI
 	help
 	  Support for Intel's IXP4XX (XScale) family of processors.
 
@@ -974,7 +974,7 @@ config PCI_HOST_ITE8152
 	bool
 	depends on PCI && MACH_ARMCORE
 	default y
-	select DMABOUNCE
+	select HAVE_DMABOUNCE
 
 source "drivers/pci/Kconfig"
 
diff --git a/arch/arm/common/Kconfig b/arch/arm/common/Kconfig
index 4efbb9d..01869a5 100644
--- a/arch/arm/common/Kconfig
+++ b/arch/arm/common/Kconfig
@@ -20,11 +20,7 @@ config ICST307
 
 config SA1111
 	bool
-	select DMABOUNCE if !ARCH_PXA
-
-config DMABOUNCE
-	bool
-	select ZONE_DMA
+	select HAVE_DMABOUNCE if !ARCH_PXA
 
 config TIMER_ACORN
 	bool
diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index cc32c1e..a114ee1 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -31,203 +31,38 @@
 #include <linux/dmapool.h>
 #include <linux/list.h>
 #include <linux/scatterlist.h>
+#include <linux/dmabounce.h>
 
 #include <asm/cacheflush.h>
 
-#undef STATS
-
-#ifdef STATS
-#define DO_STATS(X) do { X ; } while (0)
-#else
-#define DO_STATS(X) do { } while (0)
-#endif
-
-/* ************************************************** */
-
-struct safe_buffer {
-	struct list_head node;
-
-	/* original request */
-	void		*ptr;
-	size_t		size;
-	int		direction;
-
-	/* safe buffer info */
-	struct dmabounce_pool *pool;
-	void		*safe;
-	dma_addr_t	safe_dma_addr;
-};
-
-struct dmabounce_pool {
-	unsigned long	size;
-	struct dma_pool	*pool;
-#ifdef STATS
-	unsigned long	allocs;
-#endif
-};
-
-struct dmabounce_device_info {
-	struct device *dev;
-	struct list_head safe_buffers;
-#ifdef STATS
-	unsigned long total_allocs;
-	unsigned long map_op_count;
-	unsigned long bounce_count;
-	int attr_res;
-#endif
-	struct dmabounce_pool	small;
-	struct dmabounce_pool	large;
-
-	rwlock_t lock;
-};
-
-#ifdef STATS
-static ssize_t dmabounce_show(struct device *dev, struct device_attribute *attr,
-			      char *buf)
-{
-	struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
-	return sprintf(buf, "%lu %lu %lu %lu %lu %lu\n",
-		device_info->small.allocs,
-		device_info->large.allocs,
-		device_info->total_allocs - device_info->small.allocs -
-			device_info->large.allocs,
-		device_info->total_allocs,
-		device_info->map_op_count,
-		device_info->bounce_count);
-}
-
-static DEVICE_ATTR(dmabounce_stats, 0400, dmabounce_show, NULL);
-#endif
-
-
-/* allocate a 'safe' buffer and keep track of it */
-static inline struct safe_buffer *
-alloc_safe_buffer(struct dmabounce_device_info *device_info, void *ptr,
-		  size_t size, enum dma_data_direction dir)
-{
-	struct safe_buffer *buf;
-	struct dmabounce_pool *pool;
-	struct device *dev = device_info->dev;
-	unsigned long flags;
-
-	dev_dbg(dev, "%s(ptr=%p, size=%d, dir=%d)\n",
-		__func__, ptr, size, dir);
-
-	if (size <= device_info->small.size) {
-		pool = &device_info->small;
-	} else if (size <= device_info->large.size) {
-		pool = &device_info->large;
-	} else {
-		pool = NULL;
-	}
-
-	buf = kmalloc(sizeof(struct safe_buffer), GFP_ATOMIC);
-	if (buf == NULL) {
-		dev_warn(dev, "%s: kmalloc failed\n", __func__);
-		return NULL;
-	}
-
-	buf->ptr = ptr;
-	buf->size = size;
-	buf->direction = dir;
-	buf->pool = pool;
-
-	if (pool) {
-		buf->safe = dma_pool_alloc(pool->pool, GFP_ATOMIC,
-					   &buf->safe_dma_addr);
-	} else {
-		buf->safe = dma_alloc_coherent(dev, size, &buf->safe_dma_addr,
-					       GFP_ATOMIC);
-	}
-
-	if (buf->safe == NULL) {
-		dev_warn(dev,
-			 "%s: could not alloc dma memory (size=%d)\n",
-			 __func__, size);
-		kfree(buf);
-		return NULL;
-	}
-
-#ifdef STATS
-	if (pool)
-		pool->allocs++;
-	device_info->total_allocs++;
-#endif
-
-	write_lock_irqsave(&device_info->lock, flags);
-	list_add(&buf->node, &device_info->safe_buffers);
-	write_unlock_irqrestore(&device_info->lock, flags);
-
-	return buf;
-}
-
-/* determine if a buffer is from our "safe" pool */
-static inline struct safe_buffer *
-find_safe_buffer(struct dmabounce_device_info *device_info, dma_addr_t safe_dma_addr)
-{
-	struct safe_buffer *b, *rb = NULL;
-	unsigned long flags;
-
-	read_lock_irqsave(&device_info->lock, flags);
-
-	list_for_each_entry(b, &device_info->safe_buffers, node)
-		if (b->safe_dma_addr == safe_dma_addr) {
-			rb = b;
-			break;
-		}
-
-	read_unlock_irqrestore(&device_info->lock, flags);
-	return rb;
-}
-
-static inline void
-free_safe_buffer(struct dmabounce_device_info *device_info, struct safe_buffer *buf)
-{
-	unsigned long flags;
-
-	dev_dbg(device_info->dev, "%s(buf=%p)\n", __func__, buf);
-
-	write_lock_irqsave(&device_info->lock, flags);
-
-	list_del(&buf->node);
-
-	write_unlock_irqrestore(&device_info->lock, flags);
-
-	if (buf->pool)
-		dma_pool_free(buf->pool->pool, buf->safe, buf->safe_dma_addr);
-	else
-		dma_free_coherent(device_info->dev, buf->size, buf->safe,
-				    buf->safe_dma_addr);
-
-	kfree(buf);
-}
-
-/* ************************************************** */
-
-static struct safe_buffer *find_safe_buffer_dev(struct device *dev,
-		dma_addr_t dma_addr, const char *where)
+static struct dmabounce_buffer *
+find_safe_buffer_dev(struct device *dev, dma_addr_t dma_addr, size_t size,
+		     enum dma_data_direction dir, const char *where)
 {
 	if (!dev || !dev->archdata.dmabounce)
 		return NULL;
 	if (dma_mapping_error(dev, dma_addr)) {
 		if (dev)
 			dev_err(dev, "Trying to %s invalid mapping\n", where);
-		else
-			pr_err("unknown device: Trying to %s invalid mapping\n", where);
+		else {
+			pr_err("unknown device: "
+			       "Trying to %s invalid mapping\n", where);
+		}
 		return NULL;
 	}
-	return find_safe_buffer(dev->archdata.dmabounce, dma_addr);
+	return dmabounce_find_buffer(dev->archdata.dmabounce, dma_addr,
+				     size, dir);
 }
 
 static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
 		enum dma_data_direction dir)
 {
-	struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
+	struct dmabounce_info *info = dev->archdata.dmabounce;
 	dma_addr_t dma_addr;
 	int needs_bounce = 0;
 
-	if (device_info)
-		DO_STATS ( device_info->map_op_count++ );
+	if (info)
+		DMABOUNCE_DO_STATS(info, map_op_count++);
 
 	dma_addr = virt_to_dma(dev, ptr);
 
@@ -248,11 +83,11 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
 		needs_bounce = (dma_addr | (dma_addr + size - 1)) & ~mask;
 	}
 
-	if (device_info && (needs_bounce || dma_needs_bounce(dev, dma_addr, size))) {
-		struct safe_buffer *buf;
+	if (info && (needs_bounce || dma_needs_bounce(dev, dma_addr, size))) {
+		struct dmabounce_buffer *bb;
 
-		buf = alloc_safe_buffer(device_info, ptr, size, dir);
-		if (buf == 0) {
+		bb = dmabounce_alloc_buffer(info, ptr, size, dir, GFP_ATOMIC);
+		if (!bb) {
 			dev_err(dev, "%s: unable to map unsafe buffer %p!\n",
 			       __func__, ptr);
 			return 0;
@@ -260,18 +95,17 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
 
 		dev_dbg(dev,
 			"%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
-			__func__, buf->ptr, virt_to_dma(dev, buf->ptr),
-			buf->safe, buf->safe_dma_addr);
+			__func__, bb->buf, virt_to_dma(dev, bb->buf),
+			bb->bounce_buf, bb->bounce_buf_dma);
 
 		if ((dir == DMA_TO_DEVICE) ||
 		    (dir == DMA_BIDIRECTIONAL)) {
 			dev_dbg(dev, "%s: copy unsafe %p to safe %p, size %d\n",
-				__func__, ptr, buf->safe, size);
-			memcpy(buf->safe, ptr, size);
+				__func__, ptr, bb->bounce_buf, size);
+			memcpy(bb->bounce_buf, ptr, size);
 		}
-		ptr = buf->safe;
 
-		dma_addr = buf->safe_dma_addr;
+		dma_addr = bb->bounce_buf_dma;
 	} else {
 		/*
 		 * We don't need to sync the DMA buffer since
@@ -286,26 +120,25 @@ static inline dma_addr_t map_single(struct device *dev, void *ptr, size_t size,
 static inline void unmap_single(struct device *dev, dma_addr_t dma_addr,
 		size_t size, enum dma_data_direction dir)
 {
-	struct safe_buffer *buf = find_safe_buffer_dev(dev, dma_addr, "unmap");
-
-	if (buf) {
-		BUG_ON(buf->size != size);
-		BUG_ON(buf->direction != dir);
+	struct dmabounce_buffer *bb;
 
+	bb = find_safe_buffer_dev(dev, dma_addr, size, dir, "unmap");
+	if (bb) {
 		dev_dbg(dev,
-			"%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
-			__func__, buf->ptr, virt_to_dma(dev, buf->ptr),
-			buf->safe, buf->safe_dma_addr);
+			"%s: unsafe buffer %p (dma=%#x) mapped to"
+			" %p (dma=%#x)\n",
+			__func__, bb->buf, virt_to_dma(dev, bb->buf),
+			bb->bounce_buf, bb->bounce_buf_dma);
 
-		DO_STATS(dev->archdata.dmabounce->bounce_count++);
+		DMABOUNCE_DO_STATS(dev->archdata.dmabounce, bounce_count++);
 
 		if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
-			void *ptr = buf->ptr;
+			void *ptr = bb->buf;
 
 			dev_dbg(dev,
 				"%s: copy back safe %p to unsafe %p size %d\n",
-				__func__, buf->safe, ptr, size);
-			memcpy(ptr, buf->safe, size);
+				__func__, bb->bounce_buf, ptr, size);
+			memcpy(ptr, bb->bounce_buf, size);
 
 			/*
 			 * Since we may have written to a page cache page,
@@ -314,7 +147,7 @@ static inline void unmap_single(struct device *dev, dma_addr_t dma_addr,
 			 */
 			__cpuc_flush_dcache_area(ptr, size);
 		}
-		free_safe_buffer(dev->archdata.dmabounce, buf);
+		dmabounce_free_buffer(dev->archdata.dmabounce, bb);
 	}
 }
 
@@ -391,27 +224,25 @@ EXPORT_SYMBOL(dma_unmap_page);
 int dmabounce_sync_for_cpu(struct device *dev, dma_addr_t addr,
 		unsigned long off, size_t sz, enum dma_data_direction dir)
 {
-	struct safe_buffer *buf;
+	struct dmabounce_buffer *bb;
 
 	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
 		__func__, addr, off, sz, dir);
 
-	buf = find_safe_buffer_dev(dev, addr, __func__);
-	if (!buf)
+	bb = find_safe_buffer_dev(dev, addr, 0, dir, __func__);
+	if (!bb)
 		return 1;
 
-	BUG_ON(buf->direction != dir);
-
 	dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
-		__func__, buf->ptr, virt_to_dma(dev, buf->ptr),
-		buf->safe, buf->safe_dma_addr);
+		__func__, bb->buf, virt_to_dma(dev, bb->buf),
+		bb->bounce_buf, bb->bounce_buf_dma);
 
-	DO_STATS(dev->archdata.dmabounce->bounce_count++);
+	DMABOUNCE_DO_STATS(dev->archdata.dmabounce, bounce_count++);
 
 	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
 		dev_dbg(dev, "%s: copy back safe %p to unsafe %p size %d\n",
-			__func__, buf->safe + off, buf->ptr + off, sz);
-		memcpy(buf->ptr + off, buf->safe + off, sz);
+			__func__, bb->bounce_buf + off, bb->buf + off, sz);
+		memcpy(bb->buf + off, bb->bounce_buf + off, sz);
 	}
 	return 0;
 }
@@ -420,135 +251,60 @@ EXPORT_SYMBOL(dmabounce_sync_for_cpu);
 int dmabounce_sync_for_device(struct device *dev, dma_addr_t addr,
 		unsigned long off, size_t sz, enum dma_data_direction dir)
 {
-	struct safe_buffer *buf;
+	struct dmabounce_buffer *bb;
 
 	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
 		__func__, addr, off, sz, dir);
 
-	buf = find_safe_buffer_dev(dev, addr, __func__);
-	if (!buf)
+	bb = find_safe_buffer_dev(dev, addr, 0, dir, __func__);
+	if (!bb)
 		return 1;
 
-	BUG_ON(buf->direction != dir);
-
 	dev_dbg(dev, "%s: unsafe buffer %p (dma=%#x) mapped to %p (dma=%#x)\n",
-		__func__, buf->ptr, virt_to_dma(dev, buf->ptr),
-		buf->safe, buf->safe_dma_addr);
+		__func__, bb->buf, virt_to_dma(dev, bb->buf),
+		bb->bounce_buf, bb->bounce_buf_dma);
 
-	DO_STATS(dev->archdata.dmabounce->bounce_count++);
+	DMABOUNCE_DO_STATS(dev->archdata.dmabounce, bounce_count++);
 
 	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) {
 		dev_dbg(dev, "%s: copy out unsafe %p to safe %p, size %d\n",
-			__func__,buf->ptr + off, buf->safe + off, sz);
-		memcpy(buf->safe + off, buf->ptr + off, sz);
+			__func__, bb->buf + off, bb->bounce_buf + off, sz);
+		memcpy(bb->bounce_buf + off, bb->buf + off, sz);
 	}
 	return 0;
 }
 EXPORT_SYMBOL(dmabounce_sync_for_device);
 
-static int dmabounce_init_pool(struct dmabounce_pool *pool, struct device *dev,
-		const char *name, unsigned long size)
-{
-	pool->size = size;
-	DO_STATS(pool->allocs = 0);
-	pool->pool = dma_pool_create(name, dev, size,
-				     0 /* byte alignment */,
-				     0 /* no page-crossing issues */);
-
-	return pool->pool ? 0 : -ENOMEM;
-}
-
 int dmabounce_register_dev(struct device *dev, unsigned long small_buffer_size,
 		unsigned long large_buffer_size)
 {
-	struct dmabounce_device_info *device_info;
-	int ret;
-
-	device_info = kmalloc(sizeof(struct dmabounce_device_info), GFP_ATOMIC);
-	if (!device_info) {
-		dev_err(dev,
-			"Could not allocated dmabounce_device_info\n");
-		return -ENOMEM;
-	}
-
-	ret = dmabounce_init_pool(&device_info->small, dev,
-				  "small_dmabounce_pool", small_buffer_size);
-	if (ret) {
-		dev_err(dev,
-			"dmabounce: could not allocate DMA pool for %ld byte objects\n",
-			small_buffer_size);
-		goto err_free;
-	}
-
-	if (large_buffer_size) {
-		ret = dmabounce_init_pool(&device_info->large, dev,
-					  "large_dmabounce_pool",
-					  large_buffer_size);
-		if (ret) {
-			dev_err(dev,
-				"dmabounce: could not allocate DMA pool for %ld byte objects\n",
-				large_buffer_size);
-			goto err_destroy;
-		}
-	}
-
-	device_info->dev = dev;
-	INIT_LIST_HEAD(&device_info->safe_buffers);
-	rwlock_init(&device_info->lock);
-
-#ifdef STATS
-	device_info->total_allocs = 0;
-	device_info->map_op_count = 0;
-	device_info->bounce_count = 0;
-	device_info->attr_res = device_create_file(dev, &dev_attr_dmabounce_stats);
-#endif
-
-	dev->archdata.dmabounce = device_info;
-
-	dev_info(dev, "dmabounce: registered device\n");
-
-	return 0;
-
- err_destroy:
-	dma_pool_destroy(device_info->small.pool);
- err_free:
-	kfree(device_info);
-	return ret;
+	struct dmabounce_info *info;
+	int error = -ENOMEM;
+
+	info = dmabounce_info_alloc(dev, small_buffer_size, large_buffer_size,
+				    0, 0);
+	if (!info)
+		goto out;
+	error = dmabounce_info_register(dev, info);
+	if (error)
+		dmabounce_info_free(info);
+out:
+	return error;
 }
 EXPORT_SYMBOL(dmabounce_register_dev);
 
 void dmabounce_unregister_dev(struct device *dev)
 {
-	struct dmabounce_device_info *device_info = dev->archdata.dmabounce;
-
-	dev->archdata.dmabounce = NULL;
+	struct dmabounce_info *info = dev->archdata.dmabounce;
 
-	if (!device_info) {
+	if (!info) {
 		dev_warn(dev,
 			 "Never registered with dmabounce but attempting"
 			 "to unregister!\n");
 		return;
 	}
-
-	if (!list_empty(&device_info->safe_buffers)) {
-		dev_err(dev,
-			"Removing from dmabounce with pending buffers!\n");
-		BUG();
-	}
-
-	if (device_info->small.pool)
-		dma_pool_destroy(device_info->small.pool);
-	if (device_info->large.pool)
-		dma_pool_destroy(device_info->large.pool);
-
-#ifdef STATS
-	if (device_info->attr_res == 0)
-		device_remove_file(dev, &dev_attr_dmabounce_stats);
-#endif
-
-	kfree(device_info);
-
-	dev_info(dev, "dmabounce: device unregistered\n");
+	dmabounce_info_unregister(dev);
+	dmabounce_info_free(info);
 }
 EXPORT_SYMBOL(dmabounce_unregister_dev);
 
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index 9f390ce..709d501 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -8,7 +8,7 @@
 
 struct dev_archdata {
 #ifdef CONFIG_DMABOUNCE
-	struct dmabounce_device_info *dmabounce;
+	struct dmabounce_info *dmabounce;
 #endif
 };
 
diff --git a/include/linux/dmabounce.h b/include/linux/dmabounce.h
index d60dc04..dc3ad5b 100644
--- a/include/linux/dmabounce.h
+++ b/include/linux/dmabounce.h
@@ -1,9 +1,6 @@
 #ifndef _LINUX_DMABOUNCE_H
 #define _LINUX_DMABOUNCE_H
 
-/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
-#ifndef CONFIG_ARM
-
 #ifdef CONFIG_DMABOUNCE
 
 #include <linux/dmapool.h>
@@ -71,7 +68,4 @@ extern void dmabounce_info_unregister(struct device *dev);
 
 #endif /* CONFIG_DMABOUNCE */
 
-/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
-#endif /* !CONFIG_ARM */
-
 #endif /* _LINUX_DMABOUNCE_H */
diff --git a/lib/dmabounce.c b/lib/dmabounce.c
index 620d314..7cb4740 100644
--- a/lib/dmabounce.c
+++ b/lib/dmabounce.c
@@ -18,9 +18,6 @@
  * version 2 as published by the Free Software Foundation.
  */
 
-/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
-#ifndef CONFIG_ARM
-
 #define DRV_MODULE_NAME "dmabounce"
 #define DRV_DESCRIPTION "Generic DMA bounce buffer functions"
 #define DRV_AUTHOR      "Christopher Hoover <ch@hpl.hp.com>, " \
@@ -390,6 +387,3 @@ EXPORT_SYMBOL(dmabounce_info_unregister);
 MODULE_AUTHOR(DRV_AUTHOR);
 MODULE_DESCRIPTION(DRV_DESCRIPTION);
 MODULE_LICENSE("GPL");
-
-/* FIXME remove later when arch/arm/common/dmabounce.c is updated */
-#endif /* !CONFIG_ARM */
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 6/9] powerpc: add optional per-device dmabounce support
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
                   ` (4 preceding siblings ...)
  2010-02-28 14:07 ` [RFC PATCH v2 5/9] arm: use " Albert Herranz
@ 2010-02-28 14:07 ` Albert Herranz
  2010-02-28 14:08 ` [RFC PATCH v2 7/9] wii: add mem2 dma mapping ops Albert Herranz
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:07 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

Add support for the PowerPC architecture for using the generic dmabounce
support on a per-device basis.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/powerpc/include/asm/device.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/device.h b/arch/powerpc/include/asm/device.h
index 6d94d27..b9a47e8 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -28,6 +28,9 @@ struct dev_archdata {
 #ifdef CONFIG_SWIOTLB
 	dma_addr_t		max_direct_dma_addr;
 #endif
+#ifdef CONFIG_DMABOUNCE
+	struct dmabounce_info	*dmabounce;
+#endif
 };
 
 static inline void dev_archdata_set_node(struct dev_archdata *ad,
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 7/9] wii: add mem2 dma mapping ops
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
                   ` (5 preceding siblings ...)
  2010-02-28 14:07 ` [RFC PATCH v2 6/9] powerpc: add optional per-device " Albert Herranz
@ 2010-02-28 14:08 ` Albert Herranz
  2010-02-28 14:08 ` [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag Albert Herranz
  2010-02-28 14:08 ` [RFC PATCH v2 9/9] wii: hollywood ehci controller support Albert Herranz
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

Some of the devices in the "Hollywood" chipset of the Nintendo Wii video
game console have restrictions performing DMA transfers to the first
contiguous RAM region (known as MEM1).
For example, up to 3 bytes of the last word of a DMA transfer of a
non-32 bit aligned length to MEM1 may be lost.
Such restrictions do not apply when using the second contiguous RAM
region (known as MEM2).

Add a set of DMA mapping operations which said devices can use to make
sure that DMA transfers are always performed to/from memory buffers
within MEM2.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>

QUICKFIX: wii-dma

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/powerpc/boot/wii.c                      |   34 ++
 arch/powerpc/include/asm/wii.h               |   25 ++
 arch/powerpc/platforms/embedded6xx/Kconfig   |    1 +
 arch/powerpc/platforms/embedded6xx/Makefile  |    2 +-
 arch/powerpc/platforms/embedded6xx/wii-dma.c |  557 ++++++++++++++++++++++++++
 5 files changed, 618 insertions(+), 1 deletions(-)
 create mode 100644 arch/powerpc/include/asm/wii.h
 create mode 100755 arch/powerpc/platforms/embedded6xx/wii-dma.c

diff --git a/arch/powerpc/boot/wii.c b/arch/powerpc/boot/wii.c
index 2ebaec0..f884006 100644
--- a/arch/powerpc/boot/wii.c
+++ b/arch/powerpc/boot/wii.c
@@ -30,6 +30,9 @@ BSS_STACK(8192);
 #define MEM2_TOP		(0x10000000 + 64*1024*1024)
 #define FIRMWARE_DEFAULT_SIZE	(12*1024*1024)
 
+#define MEM2_DMA_BASE_PROP	"linux,wii-mem2-dma-base"
+#define MEM2_DMA_SIZE_PROP	"linux,wii-mem2-dma-size"
+#define MEM2_DMA_DEFAULT_SIZE	(1*1024*1024)
 
 struct mipc_infohdr {
 	char magic[3];
@@ -101,6 +104,30 @@ out:
 
 }
 
+static void mem2_fixups(u32 *top, u32 *reg)
+{
+	void *chosen;
+	u32 dma_base, dma_size;
+	int len;
+
+	chosen = finddevice("/chosen");
+	if (!chosen)
+		fatal("Can't find chosen node\n");
+
+	len = getprop(chosen, MEM2_DMA_SIZE_PROP, &dma_size, sizeof(dma_size));
+	if (len != sizeof(dma_size))
+		dma_size = MEM2_DMA_DEFAULT_SIZE;
+	if (dma_size > reg[3])
+		dma_size = reg[3];
+	setprop_val(chosen, MEM2_DMA_SIZE_PROP, dma_size);
+
+	*top -= dma_size;
+	dma_base = *top;
+	setprop_val(chosen, MEM2_DMA_BASE_PROP, dma_base);
+
+	printf("mem2_dma: %08X@%08X\n", dma_size, dma_base);
+}
+
 static void platform_fixups(void)
 {
 	void *mem;
@@ -127,9 +154,16 @@ static void platform_fixups(void)
 		mem2_boundary = MEM2_TOP - FIRMWARE_DEFAULT_SIZE;
 	}
 
+	mem2_fixups(&mem2_boundary, reg);
+
 	if (mem2_boundary > reg[2] && mem2_boundary < reg[2] + reg[3]) {
 		reg[3] = mem2_boundary - reg[2];
 		printf("top of MEM2 @ %08X\n", reg[2] + reg[3]);
+		/*
+		 * Find again the memory node as it may have changed its
+		 * position after adding some non-existing properties.
+		 */
+		mem = finddevice("/memory");
 		setprop(mem, "reg", reg, sizeof(reg));
 	}
 
diff --git a/arch/powerpc/include/asm/wii.h b/arch/powerpc/include/asm/wii.h
new file mode 100644
index 0000000..bb83c32
--- /dev/null
+++ b/arch/powerpc/include/asm/wii.h
@@ -0,0 +1,25 @@
+/*
+ * arch/powerpc/include/asm/wii.h
+ *
+ * Nintendo Wii board-specific definitions
+ * Copyright (C) 2010 The GameCube Linux Team
+ * Copyright (C) 2010 Albert Herranz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ */
+
+#ifndef __ASM_POWERPC_WII_H
+#define __ASM_POWERPC_WII_H
+
+/*
+ * DMA operations for the Nintendo Wii.
+ */
+extern struct dma_map_ops wii_mem2_dma_ops;
+
+extern int wii_set_mem2_dma_constraints(struct device *dev);
+extern void wii_clear_mem2_dma_constraints(struct device *dev);
+
+#endif /* __ASM_POWERPC_WII_H */
diff --git a/arch/powerpc/platforms/embedded6xx/Kconfig b/arch/powerpc/platforms/embedded6xx/Kconfig
index fe77ab2..4d33755 100644
--- a/arch/powerpc/platforms/embedded6xx/Kconfig
+++ b/arch/powerpc/platforms/embedded6xx/Kconfig
@@ -120,6 +120,7 @@ config WII
 	depends on EMBEDDED6xx
 	select GAMECUBE_COMMON
 	select HAVE_GENERIC_DMA_COHERENT
+	select HAVE_DMABOUNCE
 	help
 	  Select WII if configuring for the Nintendo Wii.
 	  More information at: <http://gc-linux.sourceforge.net/>
diff --git a/arch/powerpc/platforms/embedded6xx/Makefile b/arch/powerpc/platforms/embedded6xx/Makefile
index 66c23e4..4d4c776 100644
--- a/arch/powerpc/platforms/embedded6xx/Makefile
+++ b/arch/powerpc/platforms/embedded6xx/Makefile
@@ -10,4 +10,4 @@ obj-$(CONFIG_PPC_C2K)		+= c2k.o
 obj-$(CONFIG_USBGECKO_UDBG)	+= usbgecko_udbg.o
 obj-$(CONFIG_GAMECUBE_COMMON)	+= flipper-pic.o
 obj-$(CONFIG_GAMECUBE)		+= gamecube.o
-obj-$(CONFIG_WII)		+= wii.o hlwd-pic.o
+obj-$(CONFIG_WII)		+= wii.o hlwd-pic.o wii-dma.o
diff --git a/arch/powerpc/platforms/embedded6xx/wii-dma.c b/arch/powerpc/platforms/embedded6xx/wii-dma.c
new file mode 100755
index 0000000..d0d2c1f
--- /dev/null
+++ b/arch/powerpc/platforms/embedded6xx/wii-dma.c
@@ -0,0 +1,557 @@
+/*
+ * arch/powerpc/platforms/embedded6xx/wii-dma.c
+ *
+ * DMA functions for the Nintendo Wii video game console.
+ * Copyright (C) 2010 The GameCube Linux Team
+ * Copyright (C) 2010 Albert Herranz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ */
+#undef BOUNCE_ALL
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/kernel.h>
+#include <linux/platform_device.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmabounce.h>
+#include <linux/of.h>
+#include <linux/lmb.h>
+#include <asm/wii.h>
+
+#include "mm/mmu_decl.h"
+
+#define MEM2_DMA_BASE_PROP	"linux,wii-mem2-dma-base"
+#define MEM2_DMA_SIZE_PROP	"linux,wii-mem2-dma-size"
+
+#define MEM2_DMA_MAPPING_ERROR	((~(dma_addr_t)0)-1)
+
+#define MEM2_BOUNCE_ALIGN	32	/* cache line size */
+#define MEM2_ACCESS_ALIGN	4
+
+#define __align_ofs(x, size)	((unsigned long)(x)&((size)-1))
+#define __align_up(x, size)	(((unsigned long)(x)+((size)-1))&(~((size)-1)))
+#define __align_down(x, size)	(((unsigned long)(x))&(~((size)-1)))
+
+/* get the offset from the previous aligned access boundary */
+#define mem2_align_ofs(x)		__align_ofs((x), MEM2_ACCESS_ALIGN)
+/* adjust to the previous aligned access boundary */
+#define mem2_align_down(x)		__align_down((x), MEM2_ACCESS_ALIGN)
+/* adjust to the next aligned bouncing boundary */
+#define mem2_align_up(x)		__align_up((x), MEM2_BOUNCE_ALIGN)
+
+/*
+ * The Nintendo Wii video game console is a NOT_COHERENT_CACHE
+ * platform that is unable to safely perform non-32 bit uncached writes
+ * to RAM because the byte enables are not connected to the bus.
+ * Thus, in this platform, "coherent" DMA buffers cannot be directly used
+ * by the kernel code unless it guarantees that all write accesses
+ * to said buffers are done in 32 bit chunks.
+ *
+ * In addition, some of the devices in the "Hollywood" chipset have a
+ * similar restriction regarding DMA transfers: those with non-32bit
+ * aligned lengths only work when performed to/from the second contiguous
+ * region of memory (known as MEM2).
+ *
+ * To solve these issues a specific set of dma mapping operations is made
+ * available for devices requiring it. When enabled, the kernel will make
+ * sure that DMA buffers sitting in MEM1 get bounced to/from coherent DMA
+ * buffers allocated from MEM2, and that the actual bouncing is done by
+ * using 32-bit accesses to coherent memory.
+ *
+ * Bouncing is performed with the help of the generic dmabounce support.
+ */
+
+/*
+ * Copies @len bytes from the coherent memory region starting at
+ * @src + @offset to the memory region starting at @dst + @offset.
+ * The source coherent memory region length is guaranteed to be
+ * 32-bit aligned _and_ equal or greater than @len.
+ *
+ * Reads from coherent memory have no access restrictions.
+ */
+static inline void *memcpy_from_coherent(void *dst, void *src,
+					 unsigned long offset, size_t len)
+{
+	memcpy(dst + offset, src + offset, len);
+	return dst;
+}
+
+/*
+ * Copies @len bytes from the memory region starting at @src + @offset
+ * to the coherent memory region starting at @dst + @offset.
+ * The destination coherent memory region length is guaranteed to be
+ * 32-bit aligned _and_ equal or greater than @len.
+ *
+ * Because of the write access restrictions, all writes to the
+ * destination coherent memory region must be performed in 32-bit chunks.
+ */
+static void *memcpy_to_coherent(void *dst, void *src,
+				unsigned long offset, size_t len)
+{
+	u32 *p4, *q4, v4;
+	u8 *p1, *q1;
+	size_t chunk_delta, chunk_size;
+
+	if (!len)
+		return dst;
+
+	/* first copy the unaligned prefix, if available */
+	q4 = dst + offset;
+	p4 = src + offset;
+	chunk_size = 0;
+	chunk_delta = mem2_align_ofs(q4);
+	if (chunk_delta) {
+		chunk_size = min(len, MEM2_ACCESS_ALIGN - chunk_delta);
+		q4 = (u32 *)mem2_align_down(q4);
+		v4 = *q4;
+		q1 = (u8 *)&v4;
+		memcpy(q1 + chunk_delta, p4, chunk_size);
+		*q4++ = v4;
+		p4 = src + offset + chunk_size;
+		len -= chunk_size;
+	}
+
+	/* second, perform the aligned central copy */
+	while (len >= 4) {
+		*q4++ = *p4++;
+		len -= 4;
+	}
+
+	/* finally, copy the unaligned trailing chunk if needed */
+	p1 = (u8 *)p4;
+	v4 = *q4;
+	q1 = (u8 *)&v4;
+	switch (len) {
+	case 3:
+		*q4 = p1[0] << 24 | p1[1] << 16 | p1[2] << 8 | q1[3];
+		break;
+	case 2:
+		*q4 = p1[0] << 24 | p1[1] << 16 | q1[2] << 8 | q1[3];
+		break;
+	case 1:
+		*q4 = p1[0] << 24 | q1[1] << 16 | q1[2] << 8 | q1[3];
+		break;
+	default:
+		break;
+	}
+	return dst;
+}
+
+/*
+ * Determines if a given DMA region specified by @dma_handle and @size
+ * requires bouncing.
+ *
+ * Bouncing is required if the DMA region falls within MEM1.
+ */
+static int mem2_needs_dmabounce(dma_addr_t dma_handle)
+{
+#ifndef BOUNCE_ALL
+	return dma_handle < wii_hole_start;
+#else
+	return 1;
+#endif
+}
+
+static int mem2_mapping_error(struct device *dev, dma_addr_t dma_handle)
+{
+	return dma_handle == MEM2_DMA_MAPPING_ERROR;
+}
+
+/*
+ * Use the dma_direct_ops hooks for allocating and freeing coherent memory.
+ * Idem for checking if a device supports a specific DMA mask.
+ */
+
+static void *mem2_alloc_coherent(struct device *dev, size_t size,
+				 dma_addr_t *dma_handle, gfp_t gfp)
+{
+	return dma_direct_ops.alloc_coherent(dev, size, dma_handle, gfp);
+}
+
+static void mem2_free_coherent(struct device *dev, size_t size,
+			       void *vaddr, dma_addr_t dma_handle)
+{
+	return dma_direct_ops.free_coherent(dev, size, vaddr, dma_handle);
+}
+
+static int mem2_dma_supported(struct device *dev, u64 mask)
+{
+	return dma_direct_ops.dma_supported(dev, mask);
+}
+
+/*
+ * Maps (part of) a page so it can be safely accessed by a device.
+ *
+ * Calls the corresponding dma_direct_ops hook if the page region falls
+ * within MEM2.
+ * Otherwise, a bounce buffer allocated from MEM2 coherent memory is used.
+ */
+static dma_addr_t mem2_map_page(struct device *dev, struct page *page,
+				unsigned long offset, size_t size,
+				enum dma_data_direction dir,
+				struct dma_attrs *attrs)
+{
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+	void *buf = page_address(page) + offset;
+	dma_addr_t dma_handle = phys_to_dma(dev, page_to_phys(page) + offset);
+	size_t up_size = mem2_align_up(size);
+	struct dmabounce_buffer *bb;
+
+	BUG_ON(!info);
+	DMABOUNCE_DO_STATS(info, map_op_count++);
+
+	if (!mem2_needs_dmabounce(dma_handle)) {
+		dma_handle = dma_direct_ops.map_page(dev, page, offset, size,
+						     dir, attrs);
+		goto out;
+	}
+
+	bb = dmabounce_alloc_buffer(info, buf, up_size, dir, GFP_ATOMIC);
+	if (!bb) {
+		pr_debug("%s: dmabounce_alloc_buffer error\n", __func__);
+		dma_handle = MEM2_DMA_MAPPING_ERROR;
+		goto out;
+	}
+
+	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
+		memcpy_to_coherent(bb->bounce_buf, bb->buf, 0, size);
+	dma_handle = bb->bounce_buf_dma;
+out:
+	return dma_handle;
+}
+
+/*
+ * Unmaps (part of) a page previously mapped.
+ *
+ * Calls the corresponding dma_direct_ops hook if the DMA region associated
+ * to the dma handle @dma_handle wasn't bounced.
+ * Otherwise, the associated bounce buffer is de-bounced.
+ */
+static void mem2_unmap_page(struct device *dev, dma_addr_t dma_handle,
+			    size_t size, enum dma_data_direction dir,
+			    struct dma_attrs *attrs)
+{
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+	size_t up_size = mem2_align_up(size);
+	struct dmabounce_buffer *bb;
+
+	BUG_ON(!info);
+
+	bb = dmabounce_find_buffer(info, dma_handle, up_size, dir);
+	if (!bb) {
+		dma_direct_ops.unmap_page(dev, dma_handle, size, dir, attrs);
+		return;
+	}
+
+	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
+		DMABOUNCE_DO_STATS(info, bounce_count++);
+		memcpy_from_coherent(bb->buf, bb->bounce_buf, 0, size);
+		__dma_sync(bb->buf, size, DMA_BIDIRECTIONAL);
+	}
+	dmabounce_free_buffer(info, bb);
+}
+
+/*
+ * Unmaps a scatter/gather list by unmapping each entry.
+ */
+void mem2_unmap_sg(struct device *dev, struct scatterlist *sgl, int nents,
+		   enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i)
+		mem2_unmap_page(dev, sg->dma_address, sg->length, dir, attrs);
+}
+
+/*
+ * Maps a scatter/gather list by mapping each entry.
+ */
+static int mem2_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
+		       enum dma_data_direction dir, struct dma_attrs *attrs)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i) {
+		sg->dma_length = sg->length;
+		sg->dma_address = mem2_map_page(dev, sg_page(sg), sg->offset,
+						sg->length, dir, attrs);
+		if (mem2_mapping_error(dev, sg->dma_address)) {
+			mem2_unmap_sg(dev, sgl, i, dir, attrs);
+			nents = 0;
+			sgl[nents].dma_length = 0;
+			pr_debug("%s: mem2_map_page error\n", __func__);
+			break;
+		}
+	}
+	return nents;
+}
+
+/*
+ * The sync functions synchronize streaming mode DMA translations
+ * making physical memory consistent before/after a DMA transfer.
+ *
+ * They call the corresponding dma_direct_ops hook if the DMA region
+ * associated to the dma handle @dma_handle wasn't bounced.
+ * Otherwise, original DMA buffers and their matching bounce buffers are put
+ * in sync.
+ * Also, for correctness, original DMA buffers are also made consistent.
+ */
+
+static void mem2_sync_range_for_cpu(struct device *dev, dma_addr_t dma_handle,
+				    unsigned long offset, size_t size,
+				    enum dma_data_direction dir)
+{
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+	struct dmabounce_buffer *bb;
+
+	BUG_ON(!info);
+
+	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
+		__func__, dma_handle, offset, size, dir);
+
+	bb = dmabounce_find_buffer(info, dma_handle, 0, dir);
+	if (!bb) {
+		dma_direct_ops.sync_single_range_for_cpu(dev, dma_handle,
+							 offset, size, dir);
+		return;
+	}
+	BUG_ON(offset + size > bb->size);
+
+	DMABOUNCE_DO_STATS(info, bounce_count++);
+
+	if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) {
+		dev_dbg(dev, "%s: copy back bounce %p to buf %p, size %d\n",
+			__func__, bb->bounce_buf + offset, bb->buf + offset,
+			size);
+		memcpy_from_coherent(bb->buf, bb->bounce_buf, offset, size);
+		__dma_sync(bb->buf + offset, size, DMA_BIDIRECTIONAL);
+	}
+}
+
+static void mem2_sync_range_for_device(struct device *dev,
+				       dma_addr_t dma_handle,
+				       unsigned long offset, size_t size,
+				       enum dma_data_direction dir)
+{
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+	struct dmabounce_buffer *bb;
+
+	BUG_ON(!info);
+
+	dev_dbg(dev, "%s(dma=%#x,off=%#lx,sz=%zx,dir=%x)\n",
+		__func__, dma_handle, offset, size, dir);
+
+	bb = dmabounce_find_buffer(info, dma_handle, 0, dir);
+	if (!bb) {
+		WARN_ON(1);
+		dma_direct_ops.sync_single_range_for_device(dev, dma_handle,
+							    offset, size, dir);
+		return;
+	}
+	BUG_ON(offset + size > bb->size);
+
+	DMABOUNCE_DO_STATS(info, bounce_count++);
+
+	if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) {
+		dev_dbg(dev, "%s: copy out buf %p to bounce %p, size %d\n",
+			__func__, bb->buf + offset, bb->bounce_buf + offset,
+			size);
+		memcpy_to_coherent(bb->bounce_buf, bb->buf, offset, size);
+	}
+}
+
+static void mem2_sync_sg_for_cpu(struct device *dev,
+				 struct scatterlist *sgl, int nents,
+				 enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i) {
+		mem2_sync_range_for_cpu(dev, sg_dma_address(sg), sg->offset,
+					sg_dma_len(sg), dir);
+	}
+}
+
+static void mem2_sync_sg_for_device(struct device *dev,
+				    struct scatterlist *sgl, int nents,
+				    enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i) {
+		mem2_sync_range_for_device(dev, sg_dma_address(sg), sg->offset,
+					   sg_dma_len(sg), dir);
+	}
+}
+
+/*
+ * The mem2_dma "device".
+ *
+ * This device "owns" a pool of coherent MEM2 memory that can be shared among
+ * several devices requiring MEM2 DMA buffers, instead of dedicating specific
+ * pools for each device.
+ *
+ * A device can use the shared coherent MEM2 memory pool by:
+ * - allocating a dmabounce_info struct associated to the mem2_dma device
+ * - registering this dmabounce_info with the device
+ * - making wii_mem2_dma_ops the default DMA operations for the device
+ */
+
+struct mem2_dma {
+	struct platform_device *pdev;
+	struct dmabounce_info *info;
+
+	dma_addr_t dma_base;
+	void *base;
+	size_t size;
+};
+
+static struct mem2_dma mem2_dma_instance;
+
+static inline struct mem2_dma *mem2_dma_get_instance(void)
+{
+	return &mem2_dma_instance;
+}
+
+static int __init mem2_dma_init(dma_addr_t dma_base, size_t size)
+{
+	struct mem2_dma *mem2_dma = mem2_dma_get_instance();
+	const int flags = DMA_MEMORY_MAP | DMA_MEMORY_EXCLUSIVE;
+	struct device *dev;
+	int error = 0;
+
+	mem2_dma->pdev = platform_device_register_simple("mem2_dma",
+							 0, NULL, 0);
+	if (IS_ERR(mem2_dma->pdev)) {
+		error = PTR_ERR(mem2_dma->pdev);
+		pr_err("error %d registering platform device\n", error);
+		goto err_pdev_register;
+	}
+	dev = &mem2_dma->pdev->dev;
+
+	if (!dma_declare_coherent_memory(dev, dma_base, dma_base,
+					 size, flags)) {
+		dev_err(dev, "error declaring coherent memory %zu@%Lx\n",
+			size, (unsigned long long)dma_base);
+		error = -EBUSY;
+		goto err_declare_coherent;
+	}
+	mem2_dma->dma_base = dma_base;
+	mem2_dma->size = size;
+	dev_info(dev, "using %zu KiB at 0x%Lx\n", size / 1024,
+		 (unsigned long long)dma_base);
+	goto out;
+
+err_declare_coherent:
+	platform_device_unregister(mem2_dma->pdev);
+err_pdev_register:
+	mem2_dma->pdev = NULL;
+out:
+	return error;
+}
+
+static int __init mem2_dma_setup(void)
+{
+	const dma_addr_t *dma_base;
+	const size_t *dma_size;
+	int error = -ENODEV;
+
+	dma_base = of_get_property(of_chosen, MEM2_DMA_BASE_PROP, NULL);
+	if (!dma_base) {
+		pr_err("can't find %s property\n", MEM2_DMA_BASE_PROP);
+		goto out;
+	}
+
+	dma_size = of_get_property(of_chosen, MEM2_DMA_SIZE_PROP, NULL);
+	if (!dma_size) {
+		pr_err("can't find %s property\n", MEM2_DMA_SIZE_PROP);
+		goto out;
+	}
+
+	error = mem2_dma_init(*dma_base, *dma_size);
+	if (error)
+		pr_err("error %d during setup\n", error);
+out:
+	return error;
+}
+arch_initcall(mem2_dma_setup);
+
+/**
+ * wii_mem2_dma_dev() - returns the device "owning" the shared MEM2 DMA region
+ *
+ * Use this function to retrieve the device for which the shared pool of
+ * coherent MEM2 memory has been registered.
+ */
+static struct device *wii_mem2_dma_dev(void)
+{
+	struct mem2_dma *mem2_dma = mem2_dma_get_instance();
+	BUG_ON(!mem2_dma->pdev);
+	return &mem2_dma->pdev->dev;
+}
+
+/**
+ * wii_set_mem2_dma_constraints() - forces device to use MEM2 DMA buffers only
+ * @dev:	device for which DMA constraints are defined
+ *
+ * Instructs device @dev to always use MEM2 DMA buffers for DMA transfers.
+ */
+int wii_set_mem2_dma_constraints(struct device *dev)
+{
+	struct dmabounce_info *info;
+	int error = -ENOMEM;
+
+	info = dmabounce_info_alloc(wii_mem2_dma_dev(), 0, 0, 4, 0);
+	if (!info)
+		goto out;
+	error = dmabounce_info_register(dev, info);
+	if (error)
+		goto out;
+	set_dma_ops(dev, &wii_mem2_dma_ops);
+out:
+	return error;
+}
+EXPORT_SYMBOL(wii_set_mem2_dma_constraints);
+
+/**
+ * wii_clear_mem2_dma_constraints() - clears device MEM2 DMA constraints
+ * @dev:	device for which DMA constraints are cleared
+ *
+ * Instructs device @dev to stop using MEM2 DMA buffers for DMA transfers.
+ * Must be called to undo wii_set_mem2_dma_constraints().
+ */
+void wii_clear_mem2_dma_constraints(struct device *dev)
+{
+	struct dmabounce_info *info = dev->archdata.dmabounce;
+
+	if (info) {
+		dmabounce_info_unregister(dev);
+		dmabounce_info_free(info);
+		set_dma_ops(dev, &dma_direct_ops);
+	}
+}
+EXPORT_SYMBOL(wii_clear_mem2_dma_constraints);
+
+/*
+ * Set of DMA operations for devices requiring MEM2 DMA buffers.
+ */
+struct dma_map_ops wii_mem2_dma_ops = {
+	.alloc_coherent = mem2_alloc_coherent,
+	.free_coherent = mem2_free_coherent,
+	.map_sg = mem2_map_sg,
+	.unmap_sg = mem2_unmap_sg,
+	.map_page = mem2_map_page,
+	.unmap_page = mem2_unmap_page,
+	.sync_single_range_for_cpu = mem2_sync_range_for_cpu,
+	.sync_single_range_for_device = mem2_sync_range_for_device,
+	.sync_sg_for_cpu = mem2_sync_sg_for_cpu,
+	.sync_sg_for_device = mem2_sync_sg_for_device,
+	.dma_supported = mem2_dma_supported,
+	.mapping_error = mem2_mapping_error,
+};
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
                   ` (6 preceding siblings ...)
  2010-02-28 14:08 ` [RFC PATCH v2 7/9] wii: add mem2 dma mapping ops Albert Herranz
@ 2010-02-28 14:08 ` Albert Herranz
  2010-03-01 14:49   ` Alan Stern
  2010-02-28 14:08 ` [RFC PATCH v2 9/9] wii: hollywood ehci controller support Albert Herranz
  8 siblings, 1 reply; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

The HCD_NO_COHERENT_MEM USB host controller driver flag can be enabled
to instruct the USB stack to avoid allocating coherent memory for USB
buffers.

This flag is useful to overcome some esoteric memory access restrictions
found in some platforms.
For example, the Nintendo Wii video game console is a NOT_COHERENT_CACHE
platform that is unable to safely perform non-32 bit uncached writes
to RAM because the byte enables are not connected to the bus.
Thus, in that platform, "coherent" DMA buffers cannot be directly used
by the kernel code unless it guarantees that all write accesses
to said buffers are done in 32 bit chunks (which is not the case in the
USB subsystem).

To avoid this unwanted behaviour HCD_NO_COHERENT_MEM can be enabled at
the HCD controller, causing USB buffer allocations to be satisfied from
normal kernel memory. In this case, the USB stack will make sure that
the buffers get properly mapped/unmapped for DMA transfers using the DMA
streaming mapping API.

Note that other parts of the USB stack may also use coherent memory,
like for example the hardware descriptors used in the EHCI controllers.
This needs to be checked and addressed separately. In the EHCI example,
hardware descriptors are accessed in 32-bit (or 64-bit) chunks, causing
no problems in the described scenario.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 drivers/usb/core/buffer.c |   15 ++++++++-----
 drivers/usb/core/hcd.c    |   50 +++++++++++++++++++++++++++++++++++++-------
 drivers/usb/core/hcd.h    |   13 ++++++-----
 3 files changed, 58 insertions(+), 20 deletions(-)

diff --git a/drivers/usb/core/buffer.c b/drivers/usb/core/buffer.c
index 3ba2fff..fede7a0 100644
--- a/drivers/usb/core/buffer.c
+++ b/drivers/usb/core/buffer.c
@@ -53,8 +53,9 @@ int hcd_buffer_create(struct usb_hcd *hcd)
 	char		name[16];
 	int 		i, size;
 
-	if (!hcd->self.controller->dma_mask &&
-	    !(hcd->driver->flags & HCD_LOCAL_MEM))
+	if ((!hcd->self.controller->dma_mask &&
+	    !(hcd->driver->flags & HCD_LOCAL_MEM)) ||
+	    (hcd->driver->flags & HCD_NO_COHERENT_MEM))
 		return 0;
 
 	for (i = 0; i < HCD_BUFFER_POOLS; i++) {
@@ -109,8 +110,9 @@ void *hcd_buffer_alloc(
 	int 			i;
 
 	/* some USB hosts just use PIO */
-	if (!bus->controller->dma_mask &&
-	    !(hcd->driver->flags & HCD_LOCAL_MEM)) {
+	if ((!bus->controller->dma_mask &&
+	    !(hcd->driver->flags & HCD_LOCAL_MEM)) ||
+	    (hcd->driver->flags & HCD_NO_COHERENT_MEM)) {
 		*dma = ~(dma_addr_t) 0;
 		return kmalloc(size, mem_flags);
 	}
@@ -135,8 +137,9 @@ void hcd_buffer_free(
 	if (!addr)
 		return;
 
-	if (!bus->controller->dma_mask &&
-	    !(hcd->driver->flags & HCD_LOCAL_MEM)) {
+	if ((!bus->controller->dma_mask &&
+	    !(hcd->driver->flags & HCD_LOCAL_MEM)) ||
+	    (hcd->driver->flags & HCD_NO_COHERENT_MEM)) {
 		kfree(addr);
 		return;
 	}
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 80995ef..c2596c5 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1260,6 +1260,34 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle,
 	*dma_handle = 0;
 }
 
+static int urb_needs_setup_dma_map(struct usb_hcd *hcd, struct urb *urb)
+{
+	return !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP) ||
+	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
+		urb->setup_dma == ~(dma_addr_t)0);
+}
+
+static int urb_needs_setup_dma_unmap(struct usb_hcd *hcd, struct urb *urb)
+{
+	return !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP) ||
+	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
+		urb->setup_dma && urb->setup_dma != ~(dma_addr_t)0);
+}
+
+static int urb_needs_transfer_dma_map(struct usb_hcd *hcd, struct urb *urb)
+{
+	return !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
+	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
+		urb->transfer_dma == ~(dma_addr_t)0);
+}
+
+static int urb_needs_transfer_dma_unmap(struct usb_hcd *hcd, struct urb *urb)
+{
+	return !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
+	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
+		urb->transfer_dma && urb->transfer_dma != ~(dma_addr_t)0);
+}
+
 static int map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
 			   gfp_t mem_flags)
 {
@@ -1275,7 +1303,7 @@ static int map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
 		return 0;
 
 	if (usb_endpoint_xfer_control(&urb->ep->desc)
-	    && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
+	    && urb_needs_setup_dma_map(hcd, urb)) {
 		if (hcd->self.uses_dma) {
 			urb->setup_dma = dma_map_single(
 					hcd->self.controller,
@@ -1296,7 +1324,7 @@ static int map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
 
 	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
 	if (ret == 0 && urb->transfer_buffer_length != 0
-	    && !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
+	    && urb_needs_transfer_dma_map(hcd, urb)) {
 		if (hcd->self.uses_dma) {
 			urb->transfer_dma = dma_map_single (
 					hcd->self.controller,
@@ -1334,12 +1362,15 @@ static void unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
 		return;
 
 	if (usb_endpoint_xfer_control(&urb->ep->desc)
-	    && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
-		if (hcd->self.uses_dma)
+	    && urb_needs_setup_dma_unmap(hcd, urb)) {
+		if (hcd->self.uses_dma) {
 			dma_unmap_single(hcd->self.controller, urb->setup_dma,
 					sizeof(struct usb_ctrlrequest),
 					DMA_TO_DEVICE);
-		else if (hcd->driver->flags & HCD_LOCAL_MEM)
+			if ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
+			    (urb->transfer_flags & URB_NO_SETUP_DMA_MAP))
+				urb->setup_dma = ~(dma_addr_t)0;
+		} else if (hcd->driver->flags & HCD_LOCAL_MEM)
 			hcd_free_coherent(urb->dev->bus, &urb->setup_dma,
 					(void **)&urb->setup_packet,
 					sizeof(struct usb_ctrlrequest),
@@ -1348,13 +1379,16 @@ static void unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
 
 	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
 	if (urb->transfer_buffer_length != 0
-	    && !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
-		if (hcd->self.uses_dma)
+	    && urb_needs_transfer_dma_unmap(hcd, urb)) {
+		if (hcd->self.uses_dma) {
 			dma_unmap_single(hcd->self.controller,
 					urb->transfer_dma,
 					urb->transfer_buffer_length,
 					dir);
-		else if (hcd->driver->flags & HCD_LOCAL_MEM)
+			if ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
+			    (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
+				urb->transfer_dma = ~(dma_addr_t)0;
+		} else if (hcd->driver->flags & HCD_LOCAL_MEM)
 			hcd_free_coherent(urb->dev->bus, &urb->transfer_dma,
 					&urb->transfer_buffer,
 					urb->transfer_buffer_length,
diff --git a/drivers/usb/core/hcd.h b/drivers/usb/core/hcd.h
index bbe2b92..cde42f3 100644
--- a/drivers/usb/core/hcd.h
+++ b/drivers/usb/core/hcd.h
@@ -183,12 +183,13 @@ struct hc_driver {
 	irqreturn_t	(*irq) (struct usb_hcd *hcd);
 
 	int	flags;
-#define	HCD_MEMORY	0x0001		/* HC regs use memory (else I/O) */
-#define	HCD_LOCAL_MEM	0x0002		/* HC needs local memory */
-#define	HCD_USB11	0x0010		/* USB 1.1 */
-#define	HCD_USB2	0x0020		/* USB 2.0 */
-#define	HCD_USB3	0x0040		/* USB 3.0 */
-#define	HCD_MASK	0x0070
+#define	HCD_MEMORY		0x0001	/* HC regs use memory (else I/O) */
+#define	HCD_LOCAL_MEM		0x0002	/* HC needs local memory */
+#define	HCD_NO_COHERENT_MEM	0x0004	/* HC avoids use of "coherent" mem */
+#define	HCD_USB11		0x0010	/* USB 1.1 */
+#define	HCD_USB2		0x0020	/* USB 2.0 */
+#define	HCD_USB3		0x0040	/* USB 3.0 */
+#define	HCD_MASK		0x0070
 
 	/* called to init HCD and root hub */
 	int	(*reset) (struct usb_hcd *hcd);
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v2 9/9] wii: hollywood ehci controller support
  2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
                   ` (7 preceding siblings ...)
  2010-02-28 14:08 ` [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag Albert Herranz
@ 2010-02-28 14:08 ` Albert Herranz
  8 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 14:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-arm-kernel, linux-usb; +Cc: Albert Herranz

Add support for the USB Enhanced Host Controller Interface included
in the "Hollywood" chipset of the Nintendo Wii video game console.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
---
 arch/powerpc/platforms/embedded6xx/Kconfig |    1 +
 drivers/usb/host/Kconfig                   |    8 +
 drivers/usb/host/ehci-hcd.c                |    5 +
 drivers/usb/host/ehci-hlwd.c               |  233 ++++++++++++++++++++++++++++
 drivers/usb/host/ehci.h                    |   23 +++
 5 files changed, 270 insertions(+), 0 deletions(-)
 create mode 100644 drivers/usb/host/ehci-hlwd.c

diff --git a/arch/powerpc/platforms/embedded6xx/Kconfig b/arch/powerpc/platforms/embedded6xx/Kconfig
index 4d33755..0eb56a7 100644
--- a/arch/powerpc/platforms/embedded6xx/Kconfig
+++ b/arch/powerpc/platforms/embedded6xx/Kconfig
@@ -121,6 +121,7 @@ config WII
 	select GAMECUBE_COMMON
 	select HAVE_GENERIC_DMA_COHERENT
 	select HAVE_DMABOUNCE
+	select USB_ARCH_HAS_EHCI
 	help
 	  Select WII if configuring for the Nintendo Wii.
 	  More information at: <http://gc-linux.sourceforge.net/>
diff --git a/drivers/usb/host/Kconfig b/drivers/usb/host/Kconfig
index 2678a16..342954f 100644
--- a/drivers/usb/host/Kconfig
+++ b/drivers/usb/host/Kconfig
@@ -131,6 +131,14 @@ config USB_EHCI_HCD_PPC_OF
 	  Enables support for the USB controller present on the PowerPC
 	  OpenFirmware platform bus.
 
+config USB_EHCI_HCD_HLWD
+	bool "Nintendo Wii (Hollywood) EHCI USB controller support"
+	depends on USB_EHCI_HCD && WII
+	default y
+	---help---
+	  Say Y here to support the EHCI USB controller found in the
+	  Hollywood chipset of the Nintendo Wii video game console.
+
 config USB_W90X900_EHCI
 	bool "W90X900(W90P910) EHCI support"
 	depends on USB_EHCI_HCD && ARCH_W90X900
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 1ec3857..395c6a1 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -1133,6 +1133,11 @@ MODULE_LICENSE ("GPL");
 #define OF_PLATFORM_DRIVER	ehci_hcd_ppc_of_driver
 #endif
 
+#ifdef CONFIG_USB_EHCI_HCD_HLWD
+#include "ehci-hlwd.c"
+#define OF_PLATFORM_DRIVER	ehci_hcd_hlwd_driver
+#endif
+
 #ifdef CONFIG_XPS_USB_HCD_XILINX
 #include "ehci-xilinx-of.c"
 #define OF_PLATFORM_DRIVER	ehci_hcd_xilinx_of_driver
diff --git a/drivers/usb/host/ehci-hlwd.c b/drivers/usb/host/ehci-hlwd.c
new file mode 100644
index 0000000..129e96b
--- /dev/null
+++ b/drivers/usb/host/ehci-hlwd.c
@@ -0,0 +1,233 @@
+/*
+ * drivers/usb/host/ehci-hlwd.c
+ *
+ * Nintendo Wii (Hollywood) USB Enhanced Host Controller Interface.
+ * Copyright (C) 2009-2010 The GameCube Linux Team
+ * Copyright (C) 2009,2010 Albert Herranz
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * Based on ehci-ppc-of.c
+ *
+ * EHCI HCD (Host Controller Driver) for USB.
+ *
+ * Bus Glue for PPC On-Chip EHCI driver on the of_platform bus
+ * Tested on AMCC PPC 440EPx
+ *
+ * Valentine Barshak <vbarshak@ru.mvista.com>
+ *
+ * Based on "ehci-ppc-soc.c" by Stefan Roese <sr@denx.de>
+ * and "ohci-ppc-of.c" by Sylvain Munaut <tnt@246tNt.com>
+ *
+ * This file is licenced under the GPL.
+ */
+
+#include <linux/signal.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <asm/wii.h>
+
+#define DRV_MODULE_NAME "ehci-hlwd"
+#define DRV_DESCRIPTION "Nintendo Wii EHCI Host Controller"
+#define DRV_AUTHOR      "Albert Herranz"
+
+/*
+ * Non-standard registers.
+ */
+#define HLWD_EHCI_CTL		0x00cc	/* Controller Control */
+#define HLWD_EHCI_CTL_INTE	(1<<15)	/* Notify EHCI interrupts */
+
+/* called during probe() after chip reset completes */
+static int ehci_hlwd_reset(struct usb_hcd *hcd)
+{
+	struct ehci_hcd	*ehci = hcd_to_ehci(hcd);
+	int error;
+
+	dbg_hcs_params(ehci, "reset");
+	dbg_hcc_params(ehci, "reset");
+
+	error = ehci_halt(ehci);
+	if (error)
+		goto out;
+
+	error = ehci_init(hcd);
+	if (error)
+		goto out;
+
+	/* enable notification of EHCI interrupts */
+	setbits32(hcd->regs + HLWD_EHCI_CTL, HLWD_EHCI_CTL_INTE);
+
+	ehci->sbrn = 0x20;
+	error = ehci_reset(ehci);
+	ehci_port_power(ehci, 0);
+out:
+	return error;
+}
+
+static const struct hc_driver ehci_hlwd_hc_driver = {
+	.description		= hcd_name,
+	.product_desc		= "Nintendo Wii EHCI Host Controller",
+	.hcd_priv_size		= sizeof(struct ehci_hcd),
+
+	/*
+	 * generic hardware linkage
+	 */
+	.irq			= ehci_irq,
+	.flags			= HCD_USB2 | HCD_NO_COHERENT_MEM,
+
+	/*
+	 * basic lifecycle operations
+	 */
+	.reset			= ehci_hlwd_reset,
+	.start			= ehci_run,
+	.stop			= ehci_stop,
+	.shutdown		= ehci_shutdown,
+
+	/*
+	 * managing i/o requests and associated device resources
+	 */
+	.urb_enqueue		= ehci_urb_enqueue,
+	.urb_dequeue		= ehci_urb_dequeue,
+	.endpoint_disable	= ehci_endpoint_disable,
+	.endpoint_reset		= ehci_endpoint_reset,
+
+	/*
+	 * scheduling support
+	 */
+	.get_frame_number	= ehci_get_frame,
+
+	/*
+	 * root hub support
+	 */
+	.hub_status_data	= ehci_hub_status_data,
+	.hub_control		= ehci_hub_control,
+#ifdef	CONFIG_PM
+	.bus_suspend		= ehci_bus_suspend,
+	.bus_resume		= ehci_bus_resume,
+#endif
+	.relinquish_port	= ehci_relinquish_port,
+	.port_handed_over	= ehci_port_handed_over,
+
+	.clear_tt_buffer_complete	= ehci_clear_tt_buffer_complete,
+};
+
+static int __devinit
+ehci_hcd_hlwd_probe(struct of_device *op, const struct of_device_id *match)
+{
+	struct device_node *dn = op->node;
+	struct usb_hcd *hcd;
+	struct ehci_hcd	*ehci = NULL;
+	struct resource res;
+	int irq;
+	int error = -ENODEV;
+
+	if (usb_disabled())
+		goto out;
+
+	dev_dbg(&op->dev, "initializing " DRV_MODULE_NAME " USB Controller\n");
+
+	error = of_address_to_resource(dn, 0, &res);
+	if (error)
+		goto out;
+
+	hcd = usb_create_hcd(&ehci_hlwd_hc_driver, &op->dev, DRV_MODULE_NAME);
+	if (!hcd) {
+		error = -ENOMEM;
+		goto out;
+	}
+
+	hcd->rsrc_start = res.start;
+	hcd->rsrc_len = resource_size(&res);
+
+	irq = irq_of_parse_and_map(dn, 0);
+	if (irq == NO_IRQ) {
+		printk(KERN_ERR __FILE__ ": irq_of_parse_and_map failed\n");
+		error = -EBUSY;
+		goto err_irq;
+	}
+
+	hcd->regs = ioremap(hcd->rsrc_start, hcd->rsrc_len);
+	if (!hcd->regs) {
+		printk(KERN_ERR __FILE__ ": ioremap failed\n");
+		error = -EBUSY;
+		goto err_ioremap;
+	}
+
+	/* this device requires MEM2 DMA buffers */
+	error = wii_set_mem2_dma_constraints(&op->dev);
+	if (error)
+		goto err_mem2_constraints;
+
+	ehci = hcd_to_ehci(hcd);
+	ehci->caps = hcd->regs;
+	ehci->regs = hcd->regs +
+			HC_LENGTH(ehci_readl(ehci, &ehci->caps->hc_capbase));
+
+	/* cache this readonly data; minimize chip reads */
+	ehci->hcs_params = ehci_readl(ehci, &ehci->caps->hcs_params);
+
+	error = usb_add_hcd(hcd, irq, 0);
+	if (error == 0)
+		return 0;
+
+err_mem2_constraints:
+	iounmap(hcd->regs);
+err_ioremap:
+	irq_dispose_mapping(irq);
+err_irq:
+	usb_put_hcd(hcd);
+out:
+	return error;
+}
+
+
+static int ehci_hcd_hlwd_remove(struct of_device *op)
+{
+	struct usb_hcd *hcd = dev_get_drvdata(&op->dev);
+
+	dev_set_drvdata(&op->dev, NULL);
+
+	dev_dbg(&op->dev, "stopping " DRV_MODULE_NAME " USB Controller\n");
+
+	usb_remove_hcd(hcd);
+	wii_clear_mem2_dma_constraints(&op->dev);
+	iounmap(hcd->regs);
+	irq_dispose_mapping(hcd->irq);
+	usb_put_hcd(hcd);
+
+	return 0;
+}
+
+
+static int ehci_hcd_hlwd_shutdown(struct of_device *op)
+{
+	struct usb_hcd *hcd = dev_get_drvdata(&op->dev);
+
+	if (hcd->driver->shutdown)
+		hcd->driver->shutdown(hcd);
+
+	return 0;
+}
+
+
+static struct of_device_id ehci_hcd_hlwd_match[] = {
+	{ .compatible = "nintendo,hollywood-usb-ehci", },
+	{},
+};
+MODULE_DEVICE_TABLE(of, ehci_hcd_hlwd_match);
+
+static struct of_platform_driver ehci_hcd_hlwd_driver = {
+	.name		= DRV_MODULE_NAME,
+	.match_table	= ehci_hcd_hlwd_match,
+	.probe		= ehci_hcd_hlwd_probe,
+	.remove		= ehci_hcd_hlwd_remove,
+	.shutdown	= ehci_hcd_hlwd_shutdown,
+	.driver		= {
+		.name	= DRV_MODULE_NAME,
+		.owner	= THIS_MODULE,
+	},
+};
+
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index 2d85e21..8fb519b 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -599,6 +599,27 @@ ehci_port_speed(struct ehci_hcd *ehci, unsigned int portsc)
 #define ehci_big_endian_mmio(e)		0
 #endif
 
+#ifdef CONFIG_USB_EHCI_HCD_HLWD
+
+/*
+ * The Nintendo Wii video game console has no PCI hardware.
+ * The USB controllers are part of the "Hollywood" chipset and are
+ * accessed via the platform internal I/O accessors.
+ */
+static inline unsigned int ehci_readl(const struct ehci_hcd *ehci,
+				      __u32 __iomem *regs)
+{
+	return in_be32(regs);
+}
+
+static inline void ehci_writel(const struct ehci_hcd *ehci,
+			       const unsigned int val, __u32 __iomem *regs)
+{
+	out_be32(regs, val);
+}
+
+#else
+
 /*
  * Big-endian read/write functions are arch-specific.
  * Other arches can be added if/when they're needed.
@@ -632,6 +653,8 @@ static inline void ehci_writel(const struct ehci_hcd *ehci,
 #endif
 }
 
+#endif /* CONFIG_USB_EHCI_HCD_HLWD */
+
 /*
  * On certain ppc-44x SoC there is a HW issue, that could only worked around with
  * explicit suspend/operate of OHCI. This function hereby makes sense only on that arch.
-- 
1.6.3.3

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 4/9] add generic dmabounce support
  2010-02-28 14:07 ` [RFC PATCH v2 4/9] add generic dmabounce support Albert Herranz
@ 2010-02-28 14:20   ` Russell King - ARM Linux
  2010-02-28 16:37     ` Albert Herranz
  0 siblings, 1 reply; 21+ messages in thread
From: Russell King - ARM Linux @ 2010-02-28 14:20 UTC (permalink / raw)
  To: Albert Herranz; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

On Sun, Feb 28, 2010 at 03:07:57PM +0100, Albert Herranz wrote:
> This patch makes part of the ARM dmabounce code available to other
> architectures as a generic API.

There is already a generic dma bounce implementation - it's called
swiotlb - lib/swiotlb.c.  We should eventually switch the ARM
dmabounce stuff over to that instead of keeping dmabounce around.

The only problem I forsee is that on ARM, we have devices which can
only address the least significant N bits of RAM, but there may be
an offset on the base address of RAM on the bus which isn't included
in these N bits.

Even more fun is where we have a DMA controller which can address
N bits of RAM, except for bit M which must be zero... (where N > M).

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 4/9] add generic dmabounce support
  2010-02-28 14:20   ` Russell King - ARM Linux
@ 2010-02-28 16:37     ` Albert Herranz
  0 siblings, 0 replies; 21+ messages in thread
From: Albert Herranz @ 2010-02-28 16:37 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

Russell King - ARM Linux wrote:
> On Sun, Feb 28, 2010 at 03:07:57PM +0100, Albert Herranz wrote:
>> This patch makes part of the ARM dmabounce code available to other
>> architectures as a generic API.
> 
> There is already a generic dma bounce implementation - it's called
> swiotlb - lib/swiotlb.c.  We should eventually switch the ARM
> dmabounce stuff over to that instead of keeping dmabounce around.
> 
> The only problem I forsee is that on ARM, we have devices which can
> only address the least significant N bits of RAM, but there may be
> an offset on the base address of RAM on the bus which isn't included
> in these N bits.
> 
> Even more fun is where we have a DMA controller which can address
> N bits of RAM, except for bit M which must be zero... (where N > M).
> 

In the Wii we have several limitations:
- it is a NOT_COHERENT_CACHE platform
- write accesses to coherent memory from the main processor must be done always in 32-bit chunks
- some devices can only reliably perform DMA to/from a specific region of memory (the second block of RAM, 64MB at 0x10000000, called MEM2)

So if swiotlb is the way to go I can try to adapt it to take into account these cases:
- it should allocate the io_tlb_start and io_tlb_overflow_buffer areas from MEM2 (add allocation/freeing hooks for these areas?)
- it should copy data from coherent memory in 32-bit chunks (add copy to/from coherent hooks?)
- it should decide to bounce buffers sitting in MEM1 (add a dma_needs_bounce() hook?)

Thanks,
Albert

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-02-28 14:08 ` [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag Albert Herranz
@ 2010-03-01 14:49   ` Alan Stern
  2010-03-01 18:38     ` Albert Herranz
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Stern @ 2010-03-01 14:49 UTC (permalink / raw)
  To: Albert Herranz; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

On Sun, 28 Feb 2010, Albert Herranz wrote:

> The HCD_NO_COHERENT_MEM USB host controller driver flag can be enabled
> to instruct the USB stack to avoid allocating coherent memory for USB
> buffers.
> 
> This flag is useful to overcome some esoteric memory access restrictions
> found in some platforms.
> For example, the Nintendo Wii video game console is a NOT_COHERENT_CACHE
> platform that is unable to safely perform non-32 bit uncached writes
> to RAM because the byte enables are not connected to the bus.
> Thus, in that platform, "coherent" DMA buffers cannot be directly used
> by the kernel code unless it guarantees that all write accesses
> to said buffers are done in 32 bit chunks (which is not the case in the
> USB subsystem).
> 
> To avoid this unwanted behaviour HCD_NO_COHERENT_MEM can be enabled at
> the HCD controller, causing USB buffer allocations to be satisfied from
> normal kernel memory. In this case, the USB stack will make sure that
> the buffers get properly mapped/unmapped for DMA transfers using the DMA
> streaming mapping API.
> 
> Note that other parts of the USB stack may also use coherent memory,
> like for example the hardware descriptors used in the EHCI controllers.
> This needs to be checked and addressed separately. In the EHCI example,
> hardware descriptors are accessed in 32-bit (or 64-bit) chunks, causing
> no problems in the described scenario.

> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1260,6 +1260,34 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle,
>  	*dma_handle = 0;
>  }
>  
> +static int urb_needs_setup_dma_map(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	return !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP) ||
> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
> +		urb->setup_dma == ~(dma_addr_t)0);
> +}
> +
> +static int urb_needs_setup_dma_unmap(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	return !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP) ||
> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
> +		urb->setup_dma && urb->setup_dma != ~(dma_addr_t)0);
> +}
> +
> +static int urb_needs_transfer_dma_map(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	return !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
> +		urb->transfer_dma == ~(dma_addr_t)0);
> +}
> +
> +static int urb_needs_transfer_dma_unmap(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	return !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
> +		urb->transfer_dma && urb->transfer_dma != ~(dma_addr_t)0);
> +}
> +

These functions would be a lot easier to understand if they were 
expanded into multiple test and return statements, rather than 
squeezing all the Boolean manipulations into single expressions.  (Not 
to mention the fact that other developement is going to make them even 
more complicated than they are now...)

Also, I can't help thinking that the corresponding *_map() and 
*_unmap() routines are so similar, it ought to be possible to combine 
them.  The only difference is a check for a NULL DMA address, and it's 
not clear to me why it is present.  It's also not clear why the test 
for a DMA address of all ones is present.  Maybe they both can be 
removed.

Alan Stern

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-01 14:49   ` Alan Stern
@ 2010-03-01 18:38     ` Albert Herranz
  2010-03-01 19:23       ` Alan Stern
  0 siblings, 1 reply; 21+ messages in thread
From: Albert Herranz @ 2010-03-01 18:38 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

Alan Stern wrote:
>> --- a/drivers/usb/core/hcd.c
>> +++ b/drivers/usb/core/hcd.c
>> @@ -1260,6 +1260,34 @@ static void hcd_free_coherent(struct usb_bus *bus, dma_addr_t *dma_handle,
>>  	*dma_handle = 0;
>>  }
>>  
>> +static int urb_needs_setup_dma_map(struct usb_hcd *hcd, struct urb *urb)
>> +{
>> +	return !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP) ||
>> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
>> +		urb->setup_dma == ~(dma_addr_t)0);
>> +}
>> +
>> +static int urb_needs_setup_dma_unmap(struct usb_hcd *hcd, struct urb *urb)
>> +{
>> +	return !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP) ||
>> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
>> +		urb->setup_dma && urb->setup_dma != ~(dma_addr_t)0);
>> +}
>> +
>> +static int urb_needs_transfer_dma_map(struct usb_hcd *hcd, struct urb *urb)
>> +{
>> +	return !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
>> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
>> +		urb->transfer_dma == ~(dma_addr_t)0);
>> +}
>> +
>> +static int urb_needs_transfer_dma_unmap(struct usb_hcd *hcd, struct urb *urb)
>> +{
>> +	return !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
>> +	       ((hcd->driver->flags & HCD_NO_COHERENT_MEM) &&
>> +		urb->transfer_dma && urb->transfer_dma != ~(dma_addr_t)0);
>> +}
>> +
> 
> These functions would be a lot easier to understand if they were 
> expanded into multiple test and return statements, rather than 
> squeezing all the Boolean manipulations into single expressions.  (Not 
> to mention the fact that other developement is going to make them even 
> more complicated than they are now...)
> 

Yes, agreed. I'll enhance that, thanks.

> Also, I can't help thinking that the corresponding *_map() and 
> *_unmap() routines are so similar, it ought to be possible to combine 
> them.  The only difference is a check for a NULL DMA address, and it's 
> not clear to me why it is present.  It's also not clear why the test 
> for a DMA address of all ones is present.  Maybe they both can be 
> removed.
> 

I think too that I can simplify that logic.
I added those checks in a defensive way seeking robustness while I familiarize with the USB stack innards. So far, those cases are just avoiding mappings when urb_needs_transfer_dma_map()/urb_needs_transfer_dma_unmap() are called with urb->transfer_buffer == 0 and urb->transfer_dma == 0.

I guess that those cases are related to scatterlist-based urb requests.
What should be the correct way to check if a urb has already been scatter/gather-mapped?

The final logic would be something like:
- map if URB_NO_TRANSFER_DMA_MAP is cleared
- otherwise (URB_TRANSFER_NO_DMA_MAP is set so) map if HCD_NO_COHERENT_MEM is set _and_ it's not a scatter/gather request (as that should have been mapped already by usb_buffer_map_sg())

Am I on the right path?

> Alan Stern
> 

Thanks,
Albert

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-01 18:38     ` Albert Herranz
@ 2010-03-01 19:23       ` Alan Stern
  2010-03-01 20:11         ` Albert Herranz
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Stern @ 2010-03-01 19:23 UTC (permalink / raw)
  To: Albert Herranz; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

On Mon, 1 Mar 2010, Albert Herranz wrote:

> > Also, I can't help thinking that the corresponding *_map() and 
> > *_unmap() routines are so similar, it ought to be possible to combine 
> > them.  The only difference is a check for a NULL DMA address, and it's 
> > not clear to me why it is present.  It's also not clear why the test 
> > for a DMA address of all ones is present.  Maybe they both can be 
> > removed.
> > 
> 
> I think too that I can simplify that logic.
> I added those checks in a defensive way seeking robustness while I
> familiarize with the USB stack innards. So far, those cases are just
> avoiding mappings when
> urb_needs_transfer_dma_map()/urb_needs_transfer_dma_unmap() are
> called with urb->transfer_buffer == 0 and urb->transfer_dma == 0.
> 
> I guess that those cases are related to scatterlist-based urb requests.
> What should be the correct way to check if a urb has already been
> scatter/gather-mapped?

If urb->num_sgs > 0 then urb has been s-g mapped.  Although we don't
currently check for it, quite a few URBs have transfer_buffer_length ==
0 (a number of control requests are like this, for example) so they
don't need a mapping either.

> The final logic would be something like:
> - map if URB_NO_TRANSFER_DMA_MAP is cleared
> - otherwise (URB_TRANSFER_NO_DMA_MAP is set so) map if
> HCD_NO_COHERENT_MEM is set _and_ it's not a scatter/gather request
> (as that should have been mapped already by usb_buffer_map_sg())
> 
> Am I on the right path?

More or less.  I would do it somewhat differently:

	If URB_NO_TRANSFER_DMA_MAP is set then no map is needed.
	Otherwise if num_sgs > 0 then no map is needed.
	Otherwise if HCD_NO_COHERENT_MEM is set then use
		hcd_alloc_coherent().
	Otherwise if transfer_buffer_length > 0 then use
		dma_map_single().

Similar logic is needed for the setup buffer mapping, but you can 
assume that control URBs never use scatter-gather so there's no need to 
check num_sgs (and there's no need to check the transfer length, since 
setup packets are always 8 bytes long).

In fact, I think URB_NO_SETUP_DMA_MAP doesn't really offer any
worthwhile advantages.  (About the only place where multiple control
requests are used in rapid succession is during firmware transfers, and
those aren't time-constrained.)  It is currently used in a few
drivers, but we ought to be able to remove it without too much effort.  
That might make a good project.

Alan Stern

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-01 19:23       ` Alan Stern
@ 2010-03-01 20:11         ` Albert Herranz
  2010-03-01 20:45           ` Alan Stern
  0 siblings, 1 reply; 21+ messages in thread
From: Albert Herranz @ 2010-03-01 20:11 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

Alan Stern wrote:
> If urb->num_sgs > 0 then urb has been s-g mapped.  Although we don't
> currently check for it, quite a few URBs have transfer_buffer_length ==
> 0 (a number of control requests are like this, for example) so they
> don't need a mapping either.
> 

Ok, I'll use urb->num_sgs > 0 to check for s-g mapped requests.

>> The final logic would be something like:
>> - map if URB_NO_TRANSFER_DMA_MAP is cleared
>> - otherwise (URB_TRANSFER_NO_DMA_MAP is set so) map if
>> HCD_NO_COHERENT_MEM is set _and_ it's not a scatter/gather request
>> (as that should have been mapped already by usb_buffer_map_sg())
>>
>> Am I on the right path?
> 
> More or less.  I would do it somewhat differently:
> 
> 	If URB_NO_TRANSFER_DMA_MAP is set then no map is needed.
> 	Otherwise if num_sgs > 0 then no map is needed.
> 	Otherwise if HCD_NO_COHERENT_MEM is set then use
> 		hcd_alloc_coherent().
> 	Otherwise if transfer_buffer_length > 0 then use
> 		dma_map_single().
> 

I think that logic is not quite right.
Remember that the final goal is to avoid allocating USB buffers from coherent memory (because the USB drivers access USB buffers without access restrictions but the platform I'm working on can't write to coherent memory unless it is done in 32-bit chunks).
And we want to avoid bouncing at the USB layer too (that's what v1 did).

The strategy so far is:
- we have modified the USB buffer allocation routines hcd_buffer_alloc()/hcd_buffer_free() to always return normal kernel memory when HCD_NO_COHERENT_MEM is set on the host controller.
- during map_urb_for_dma()/unmap_urb_for_dma() we need to make sure that those USB buffers are sync'ed, even if we are told USB_NO_{SETUP,TRANSFER}_DMA_MAP

So the logic would be:

 	If URB_NO_TRANSFER_DMA_MAP is _cleared_ then do the mapping
	- this case covers normal kernel memory used as a buffer and not already DMA mapped by a USB driver

 	Otherwise if HCD_NO_COHERENT_MEM is set _and_ num_sgs == 0 _and_ transfer_buffer_length > 0 then do the mapping too
	- this case covers USB buffers allocated via usb_buffer_alloc() and marked URB_NO_TRANSFER_DMA_MAP by a USB driver, which are allocated from normal kernel memory when HCD_NO_COHERENT_MEM is set (we avoid bouncing buffers here too, at least if they sit already within MEM2 in the Wii, but anyway that's part of the platform DMA mapping code)

	s-g urbs do not need a mapping as they have already been mapped, marked URB_NO_TRANSFER_DMA_MAP and have num_sgs > 0
	
> Similar logic is needed for the setup buffer mapping, but you can 
> assume that control URBs never use scatter-gather so there's no need to 
> check num_sgs (and there's no need to check the transfer length, since 
> setup packets are always 8 bytes long).
> 
> In fact, I think URB_NO_SETUP_DMA_MAP doesn't really offer any
> worthwhile advantages.  (About the only place where multiple control
> requests are used in rapid succession is during firmware transfers, and
> those aren't time-constrained.)  It is currently used in a few
> drivers, but we ought to be able to remove it without too much effort.  
> That might make a good project.
> 

I'll leave that projects to others or, in any case, address it later :)

> Alan Stern
> 

Thanks,
Albert

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-01 20:11         ` Albert Herranz
@ 2010-03-01 20:45           ` Alan Stern
  2010-03-01 22:55             ` Albert Herranz
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Stern @ 2010-03-01 20:45 UTC (permalink / raw)
  To: Albert Herranz; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

On Mon, 1 Mar 2010, Albert Herranz wrote:

> >> Am I on the right path?
> > 
> > More or less.  I would do it somewhat differently:
> > 
> > 	If URB_NO_TRANSFER_DMA_MAP is set then no map is needed.
> > 	Otherwise if num_sgs > 0 then no map is needed.
> > 	Otherwise if HCD_NO_COHERENT_MEM is set then use
> > 		hcd_alloc_coherent().
> > 	Otherwise if transfer_buffer_length > 0 then use
> > 		dma_map_single().
> > 
> 
> I think that logic is not quite right.
> Remember that the final goal is to avoid allocating USB buffers from coherent memory (because the USB drivers access USB buffers without access restrictions but the platform I'm working on can't write to coherent memory unless it is done in 32-bit chunks).

Actually the final goal is to make the mapping/unmapping algorithms 
clear and correct.  One of the subgoals involves avoiding coherent USB 
buffers, but there are others as well (if you look back the through the 
linux-usb mailing list for the last few weeks you'll see a discussion 
about a controller which has to use PIO for control transfers but can 
use DMA for other types).

> And we want to avoid bouncing at the USB layer too (that's what v1 did).
> 
> The strategy so far is:
> - we have modified the USB buffer allocation routines hcd_buffer_alloc()/hcd_buffer_free() to always return normal kernel memory when HCD_NO_COHERENT_MEM is set on the host controller.
> - during map_urb_for_dma()/unmap_urb_for_dma() we need to make sure that those USB buffers are sync'ed, even if we are told USB_NO_{SETUP,TRANSFER}_DMA_MAP
> 
> So the logic would be:
> 
>  	If URB_NO_TRANSFER_DMA_MAP is _cleared_ then do the mapping

No, that's wrong because it ignores the HCD_LOCAL_MEM flag.

> 	- this case covers normal kernel memory used as a buffer and not already DMA mapped by a USB driver
> 
>  	Otherwise if HCD_NO_COHERENT_MEM is set _and_ num_sgs == 0 _and_ transfer_buffer_length > 0 then do the mapping too
> 	- this case covers USB buffers allocated via usb_buffer_alloc() and marked URB_NO_TRANSFER_DMA_MAP by a USB driver, which are allocated from normal kernel memory when HCD_NO_COHERENT_MEM is set (we avoid bouncing buffers here too, at least if they sit already within MEM2 in the Wii, but anyway that's part of the platform DMA mapping code)
> 
> 	s-g urbs do not need a mapping as they have already been mapped, marked URB_NO_TRANSFER_DMA_MAP and have num_sgs > 0

Actually the test for transfer_buffer_length == 0 should be done first, 
since obviously no mapping is needed if there's no data.  (And in fact 
the current code does do this; I was wrong earlier when I said it 
doesn't.)

So let's make things a little easier by first testing the conditions 
under which no mapping is needed:

	If transfer_buffer_length is 0 then do nothing.
	Otherwise if num_sgs > 0 then do nothing.
	Otherwise if URB_NO_TRANSFER_DMA_MAP and transfer_dma
		are both set (this avoids your HCD_NO_COHERENT_MEM
		case) then do nothing.

The remaining cases all need mapping and/or bounce buffers:

	Otherwise if HCD_LOCAL_MEM is set then call hcd_alloc_coherent.
	Otherwise call dma_map_single.

Finally, the unmapping tests can be simplified greatly if the kind of
mapping is recorded in the URB flags.

Alan Stern

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-01 20:45           ` Alan Stern
@ 2010-03-01 22:55             ` Albert Herranz
  2010-03-02 15:50               ` Alan Stern
  0 siblings, 1 reply; 21+ messages in thread
From: Albert Herranz @ 2010-03-01 22:55 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

Alan Stern wrote:
> On Mon, 1 Mar 2010, Albert Herranz wrote:
> 
>>>> Am I on the right path?
>>> More or less.  I would do it somewhat differently:
>>>
>>> 	If URB_NO_TRANSFER_DMA_MAP is set then no map is needed.
>>> 	Otherwise if num_sgs > 0 then no map is needed.
>>> 	Otherwise if HCD_NO_COHERENT_MEM is set then use
>>> 		hcd_alloc_coherent().
>>> 	Otherwise if transfer_buffer_length > 0 then use
>>> 		dma_map_single().
>>>
>> I think that logic is not quite right.
>> Remember that the final goal is to avoid allocating USB buffers from coherent memory (because the USB drivers access USB buffers without access restrictions but the platform I'm working on can't write to coherent memory unless it is done in 32-bit chunks).
> 
> Actually the final goal is to make the mapping/unmapping algorithms 
> clear and correct.  One of the subgoals involves avoiding coherent USB 
> buffers, but there are others as well (if you look back the through the 
> linux-usb mailing list for the last few weeks you'll see a discussion 
> about a controller which has to use PIO for control transfers but can 
> use DMA for other types).
> 

Well, I was talking about our particular case here. I did not imply that we should forget about the other cases.

>> And we want to avoid bouncing at the USB layer too (that's what v1 did).
>>
>> The strategy so far is:
>> - we have modified the USB buffer allocation routines hcd_buffer_alloc()/hcd_buffer_free() to always return normal kernel memory when HCD_NO_COHERENT_MEM is set on the host controller.
>> - during map_urb_for_dma()/unmap_urb_for_dma() we need to make sure that those USB buffers are sync'ed, even if we are told USB_NO_{SETUP,TRANSFER}_DMA_MAP
>>
>> So the logic would be:
>>
>>  	If URB_NO_TRANSFER_DMA_MAP is _cleared_ then do the mapping
> 
> No, that's wrong because it ignores the HCD_LOCAL_MEM flag.
> 

When I said "do the mapping" there I meant to do a dma_map_single() if self.uses_dma, else if HCD_LOCAL_MEM is set then do a hcd_alloc_coherent().
I should have been more clear on that.

>> 	- this case covers normal kernel memory used as a buffer and not already DMA mapped by a USB driver
>>
>>  	Otherwise if HCD_NO_COHERENT_MEM is set _and_ num_sgs == 0 _and_ transfer_buffer_length > 0 then do the mapping too
>> 	- this case covers USB buffers allocated via usb_buffer_alloc() and marked URB_NO_TRANSFER_DMA_MAP by a USB driver, which are allocated from normal kernel memory when HCD_NO_COHERENT_MEM is set (we avoid bouncing buffers here too, at least if they sit already within MEM2 in the Wii, but anyway that's part of the platform DMA mapping code)
>>
>> 	s-g urbs do not need a mapping as they have already been mapped, marked URB_NO_TRANSFER_DMA_MAP and have num_sgs > 0
> 
> Actually the test for transfer_buffer_length == 0 should be done first, 
> since obviously no mapping is needed if there's no data.  (And in fact 
> the current code does do this; I was wrong earlier when I said it 
> doesn't.)
> 
> So let's make things a little easier by first testing the conditions 
> under which no mapping is needed:
> 
> 	If transfer_buffer_length is 0 then do nothing.
> 	Otherwise if num_sgs > 0 then do nothing.
> 	Otherwise if URB_NO_TRANSFER_DMA_MAP and transfer_dma
> 		are both set (this avoids your HCD_NO_COHERENT_MEM
> 		case) then do nothing.
> 

I see. This case would include the PIO case too (for which dma_handle is set to all 1s).

So this assumes that transfer_dma should be set initially to 0 when allocating USB buffers for HCD_NO_COHERENT_MEM.

> The remaining cases all need mapping and/or bounce buffers:
> 
> 	Otherwise if HCD_LOCAL_MEM is set then call hcd_alloc_coherent.
> 	Otherwise call dma_map_single.
> 
> Finally, the unmapping tests can be simplified greatly if the kind of
> mapping is recorded in the URB flags.
> 

Good point.

> Alan Stern
> 

Thanks for your comments,
Albert

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-01 22:55             ` Albert Herranz
@ 2010-03-02 15:50               ` Alan Stern
  2010-03-02 17:02                 ` Albert Herranz
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Stern @ 2010-03-02 15:50 UTC (permalink / raw)
  To: Albert Herranz; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

On Mon, 1 Mar 2010, Albert Herranz wrote:

> > 	If transfer_buffer_length is 0 then do nothing.
> > 	Otherwise if num_sgs > 0 then do nothing.
> > 	Otherwise if URB_NO_TRANSFER_DMA_MAP and transfer_dma
> > 		are both set (this avoids your HCD_NO_COHERENT_MEM
> > 		case) then do nothing.
> > 
> 
> I see. This case would include the PIO case too (for which dma_handle
> is set to all 1s).

The test above should be transfer_dma != ~0, not transfer_dma != 0,
since ~0 means the DMA address isn't set.  In fact I forgot to 
include the PIO case; it should be handled by changing the remaining 
tests as follows:

	Otherwise if hcd->self.uses_dma is set then
		If this URB doesn't require PIO then call dma_map_single
	Otherwise if HCD_LOCAL_MEM is set then call hcd_alloc_coherent
	Otherwise do nothing (PIO case).

Currently "this URB doesn't require PIO" is always true, but in the 
future it won't be.

> So this assumes that transfer_dma should be set initially to 0 when
> allocating USB buffers for HCD_NO_COHERENT_MEM.

No, it should be set to ~0, the same as when buffers are allocated for 
a PIO-based controller.

Alan Stern

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-02 15:50               ` Alan Stern
@ 2010-03-02 17:02                 ` Albert Herranz
  2010-03-02 17:43                   ` Alan Stern
  0 siblings, 1 reply; 21+ messages in thread
From: Albert Herranz @ 2010-03-02 17:02 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

Alan Stern wrote:
> On Mon, 1 Mar 2010, Albert Herranz wrote:
> 
>>> 	If transfer_buffer_length is 0 then do nothing.
>>> 	Otherwise if num_sgs > 0 then do nothing.
>>> 	Otherwise if URB_NO_TRANSFER_DMA_MAP and transfer_dma
>>> 		are both set (this avoids your HCD_NO_COHERENT_MEM
>>> 		case) then do nothing.
>>>
>> I see. This case would include the PIO case too (for which dma_handle
>> is set to all 1s).
> 
> The test above should be transfer_dma != ~0, not transfer_dma != 0,
> since ~0 means the DMA address isn't set.  In fact I forgot to 
> include the PIO case; it should be handled by changing the remaining 
> tests as follows:
> 
> 	Otherwise if hcd->self.uses_dma is set then
> 		If this URB doesn't require PIO then call dma_map_single
> 	Otherwise if HCD_LOCAL_MEM is set then call hcd_alloc_coherent
> 	Otherwise do nothing (PIO case).
> 
> Currently "this URB doesn't require PIO" is always true, but in the 
> future it won't be.
> 

Can this be currently tested?
Should I make provisions for this check now too?

>> So this assumes that transfer_dma should be set initially to 0 when
>> allocating USB buffers for HCD_NO_COHERENT_MEM.
> 
> No, it should be set to ~0, the same as when buffers are allocated for 
> a PIO-based controller.
> 

This logic now resembles more the one in my v2 proposal although with different formal checks.
I think I'll code and post another iteration of the 8/9 patch with your proposed checks and then we can continue further discussion on it.
I'll try to add explanatory comments for each check.

> Alan Stern
> 

Thanks a lot for your input,
Albert

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag
  2010-03-02 17:02                 ` Albert Herranz
@ 2010-03-02 17:43                   ` Alan Stern
  0 siblings, 0 replies; 21+ messages in thread
From: Alan Stern @ 2010-03-02 17:43 UTC (permalink / raw)
  To: Albert Herranz; +Cc: linux-usb, linuxppc-dev, linux-arm-kernel

On Tue, 2 Mar 2010, Albert Herranz wrote:

> > Currently "this URB doesn't require PIO" is always true, but in the 
> > future it won't be.
> > 
> 
> Can this be currently tested?

Not yet, but soon.  You can follow the gory details in this email 
thread:

	http://marc.info/?l=linux-usb&m=126477569707174&w=2

See especially this email:

	http://marc.info/?l=linux-usb&m=126630714002200&w=2

> Should I make provisions for this check now too?

Don't worry about it.  If you structure the code as I described, it 
will be easy to insert the check later on.

Alan Stern

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-03-02 17:43 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-28 14:07 [RFC PATCH v2 0/9] wii: add usb 2.0 support Albert Herranz
2010-02-28 14:07 ` [RFC PATCH v2 1/9] powerpc: add per-device dma coherent support Albert Herranz
2010-02-28 14:07 ` [RFC PATCH v2 2/9] wii: have generic dma coherent Albert Herranz
2010-02-28 14:07 ` [RFC PATCH v2 3/9] dma-coherent: fix bitmap access races Albert Herranz
2010-02-28 14:07 ` [RFC PATCH v2 4/9] add generic dmabounce support Albert Herranz
2010-02-28 14:20   ` Russell King - ARM Linux
2010-02-28 16:37     ` Albert Herranz
2010-02-28 14:07 ` [RFC PATCH v2 5/9] arm: use " Albert Herranz
2010-02-28 14:07 ` [RFC PATCH v2 6/9] powerpc: add optional per-device " Albert Herranz
2010-02-28 14:08 ` [RFC PATCH v2 7/9] wii: add mem2 dma mapping ops Albert Herranz
2010-02-28 14:08 ` [RFC PATCH v2 8/9] USB: add HCD_NO_COHERENT_MEM host controller driver flag Albert Herranz
2010-03-01 14:49   ` Alan Stern
2010-03-01 18:38     ` Albert Herranz
2010-03-01 19:23       ` Alan Stern
2010-03-01 20:11         ` Albert Herranz
2010-03-01 20:45           ` Alan Stern
2010-03-01 22:55             ` Albert Herranz
2010-03-02 15:50               ` Alan Stern
2010-03-02 17:02                 ` Albert Herranz
2010-03-02 17:43                   ` Alan Stern
2010-02-28 14:08 ` [RFC PATCH v2 9/9] wii: hollywood ehci controller support Albert Herranz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).