linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 0/13] enable swiotlb-xen on arm and arm64
@ 2013-08-29 18:31 Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 01/13] arm: make SWIOTLB available Stefano Stabellini
                   ` (12 more replies)
  0 siblings, 13 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:31 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, Konrad Rzeszutek Wilk,
	Stefano Stabellini, Ian Campbell

Hi all,
this patch series enables xen-swiotlb on arm and arm64.

On Xen on ARM all guests, including dom0, run with second stage
translation enabled, so guest "physical" addresses differ from real
machine addresses; it follows that without an IOMMU no guest could
actually drive the hardware.

The solution for platforms without an IOMMU is to use swiotlb-xen,
adapted to autotranslate guests. swiotlb-xen provides a set of dma_ops
that Linux can use to set up a contiguous buffer in stage-2 addresses
and use it for DMA operations.
Basically Linux asks Xen to make a buffer machine-contiguous and gets
the machine address for it. This buffer is then used by lib/swiotlb.c
to allocate bounce buffers.


Feedback is very welcome.
Cheers,

Stefano



Changes in v5:
- dropped the first two patches, already in the Xen tree;
- implement dma_mark_clean using dmac_flush_range on arm;
- add "arm64: define DMA_ERROR_CODE"
- better comment for XENMEM_exchange_and_pin return codes;
- fix xen_dma_add_entry error path;
- remove the spin_lock: the red-black tree is not modified at run time;
- add "swiotlb-xen: introduce xen_swiotlb_set_dma_mask";
- add "xen: introduce xen_alloc/free_coherent_pages";
- add "swiotlb-xen: use xen_alloc/free_coherent_pages";
- add "swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent".

Changes in v4:
- rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin;
- rename XENMEM_put_dma_buf to XENMEM_unpin;
- improve the documentation of the new hypercalls;
- add a note about out.address_bits for XENMEM_exchange;
- code style fixes;
- add err_out label in xen_dma_add_entry;
- remove INVALID_ADDRESS, use DMA_ERROR_CODE instead;
- add in-code comments regarding the usage of xen_dma_seg[0].dma_addr.

Changes in v3:
- add a patch to compile SWIOTLB without CONFIG_NEED_SG_DMA_LENGTH;
- add a patch to compile SWIOTLB_XEN without CONFIG_NEED_SG_DMA_LENGTH;
- arm/dma_capable: do not treat dma_mask as a limit;
- arm/dmabounce: keep using arm_dma_ops;
- add missing __init in xen_early_init declaration;
- many code style and name changes in swiotlb-xen.c;
- improve error checks in xen_dma_add_entry;
- warn on XENMEM_put_dma_buf failures.

Changes in v2:
- fixed a couple of errors in xen_bus_to_phys, xen_phys_to_bus and
xen_swiotlb_fixup.



Stefano Stabellini (13):
      arm: make SWIOTLB available
      arm: introduce a global dma_ops pointer
      arm64: define DMA_ERROR_CODE
      arm64: do not initialize arm64_swiotlb if dma_ops is already set
      xen/arm,arm64: move Xen initialization earlier
      xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin
      xen: make xen_create_contiguous_region return the dma address
      swiotlb-xen: support autotranslate guests
      xen/arm,arm64: enable SWIOTLB_XEN
      swiotlb-xen: introduce xen_swiotlb_set_dma_mask
      xen: introduce xen_alloc/free_coherent_pages
      swiotlb-xen: use xen_alloc/free_coherent_pages
      swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent

 arch/arm/Kconfig                           |    7 +
 arch/arm/include/asm/dma-mapping.h         |   35 +++++-
 arch/arm/include/asm/xen/hypervisor.h      |    8 +
 arch/arm/include/asm/xen/page-coherent.h   |   22 +++
 arch/arm/include/asm/xen/page.h            |    2 +
 arch/arm/kernel/setup.c                    |    2 +
 arch/arm/mm/dma-mapping.c                  |    3 +
 arch/arm/xen/Makefile                      |    2 +-
 arch/arm/xen/enlighten.c                   |   24 +++-
 arch/arm/xen/mm.c                          |  117 ++++++++++++++++
 arch/arm64/Kconfig                         |    1 +
 arch/arm64/include/asm/dma-mapping.h       |    1 +
 arch/arm64/include/asm/xen/page-coherent.h |   24 +++
 arch/arm64/kernel/setup.c                  |    2 +
 arch/arm64/mm/dma-mapping.c                |    2 +
 arch/arm64/xen/Makefile                    |    2 +-
 arch/ia64/include/asm/xen/page-coherent.h  |   24 +++
 arch/x86/include/asm/xen/page-coherent.h   |   24 +++
 arch/x86/xen/mmu.c                         |   11 +-
 drivers/xen/Kconfig                        |    1 -
 drivers/xen/swiotlb-xen.c                  |  209 ++++++++++++++++++++++++----
 include/xen/interface/memory.h             |   37 +++++
 include/xen/swiotlb-xen.h                  |    2 +
 include/xen/xen-ops.h                      |    7 +-
 lib/swiotlb.c                              |   12 ++-
 25 files changed, 535 insertions(+), 46 deletions(-)

git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git swiotlb-xen-5

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v5 01/13] arm: make SWIOTLB available
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 02/13] arm: introduce a global dma_ops pointer Stefano Stabellini
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini, will.deacon, linux

IOMMU_HELPER is needed because SWIOTLB calls iommu_is_span_boundary,
provided by lib/iommu-helper.c.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: will.deacon@arm.com
CC: linux@arm.linux.org.uk


Changes in v5:
- implement dma_mark_clean using dmac_flush_range.

Changes in v3:
- dma_capable: do not treat dma_mask as a limit;
- remove SWIOTLB dependency on NEED_SG_DMA_LENGTH.
---
 arch/arm/Kconfig                   |    6 ++++++
 arch/arm/include/asm/dma-mapping.h |   32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index ba412e0..c0bfb33 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1832,6 +1832,12 @@ config CC_STACKPROTECTOR
 	  neutralized via a kernel panic.
 	  This feature requires gcc version 4.2 or above.
 
+config SWIOTLB
+	def_bool y
+
+config IOMMU_HELPER
+	def_bool SWIOTLB
+
 config XEN_DOM0
 	def_bool y
 	depends on XEN
diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..0982206 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -10,6 +10,7 @@
 
 #include <asm-generic/dma-coherent.h>
 #include <asm/memory.h>
+#include <asm/cacheflush.h>
 
 #define DMA_ERROR_CODE	(~0)
 extern struct dma_map_ops arm_dma_ops;
@@ -86,6 +87,37 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
 }
 #endif
 
+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+	unsigned int offset = paddr & ~PAGE_MASK;
+	return pfn_to_dma(dev, paddr >> PAGE_SHIFT) + offset;
+}
+
+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr)
+{
+	unsigned int offset = dev_addr & ~PAGE_MASK;
+	return (dma_to_pfn(dev, dev_addr) << PAGE_SHIFT) + offset;
+}
+
+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+	u64 limit, mask = *dev->dma_mask;
+
+	limit = (mask + 1) & ~mask;
+	if (limit && size > limit)
+		return 0;
+
+	if ((addr | (addr + size - 1)) & ~mask)
+		return 0;
+
+	return 1;
+}
+
+static inline void dma_mark_clean(void *addr, size_t size)
+{
+	dmac_flush_range(addr, addr + size);
+}
+
 /*
  * DMA errors are defined by all-bits-set in the DMA address.
  */
-- 
1.7.2.5



* [PATCH v5 02/13] arm: introduce a global dma_ops pointer
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 01/13] arm: make SWIOTLB available Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-09-02 14:45   ` [Xen-devel] " Julien Grall
  2013-08-29 18:32 ` [PATCH v5 03/13] arm64: define DMA_ERROR_CODE Stefano Stabellini
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini, will.deacon, linux

Introduce a global dma_ops pointer, returned by get_dma_ops() whenever
a device has no per-device dma_ops set. Initially set it to
arm_dma_ops, so that it can be overridden later (e.g. by Xen).


Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: will.deacon@arm.com
CC: linux@arm.linux.org.uk


Changes in v3:
-  keep using arm_dma_ops in dmabounce.
---
 arch/arm/include/asm/dma-mapping.h |    3 ++-
 arch/arm/mm/dma-mapping.c          |    3 +++
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 0982206..7d6e4f9 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -13,6 +13,7 @@
 #include <asm/cacheflush.h>
 
 #define DMA_ERROR_CODE	(~0)
+extern struct dma_map_ops *dma_ops;
 extern struct dma_map_ops arm_dma_ops;
 extern struct dma_map_ops arm_coherent_dma_ops;
 
@@ -20,7 +21,7 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
 {
 	if (dev && dev->archdata.dma_ops)
 		return dev->archdata.dma_ops;
-	return &arm_dma_ops;
+	return dma_ops;
 }
 
 static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 7f9b179..870b12c 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -141,6 +141,9 @@ struct dma_map_ops arm_dma_ops = {
 };
 EXPORT_SYMBOL(arm_dma_ops);
 
+struct dma_map_ops *dma_ops = &arm_dma_ops;
+EXPORT_SYMBOL(dma_ops);
+
 static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
 	dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs);
 static void arm_coherent_dma_free(struct device *dev, size_t size, void *cpu_addr,
-- 
1.7.2.5



* [PATCH v5 03/13] arm64: define DMA_ERROR_CODE
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 01/13] arm: make SWIOTLB available Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 02/13] arm: introduce a global dma_ops pointer Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-09-05 16:05   ` Catalin Marinas
  2013-08-29 18:32 ` [PATCH v5 04/13] arm64: do not initialize arm64_swiotlb if dma_ops is already set Stefano Stabellini
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini, catalin.marinas, will.deacon

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: catalin.marinas@arm.com
CC: will.deacon@arm.com
---
 arch/arm64/include/asm/dma-mapping.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index 8d18100..c2cb8a0 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -25,6 +25,7 @@
 
 #define ARCH_HAS_DMA_GET_REQUIRED_MASK
 
+#define DMA_ERROR_CODE	(~0)
 extern struct dma_map_ops *dma_ops;
 
 static inline struct dma_map_ops *get_dma_ops(struct device *dev)
-- 
1.7.2.5



* [PATCH v5 04/13] arm64: do not initialize arm64_swiotlb if dma_ops is already set
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (2 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 03/13] arm64: define DMA_ERROR_CODE Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier Stefano Stabellini
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini, will.deacon

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
CC: will.deacon@arm.com
---
 arch/arm64/mm/dma-mapping.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 4bd7579..a006e84 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -63,6 +63,8 @@ static struct dma_map_ops arm64_swiotlb_dma_ops = {
 
 void __init arm64_swiotlb_init(void)
 {
+	if (dma_ops != NULL)
+		return;
 	dma_ops = &arm64_swiotlb_dma_ops;
 	swiotlb_init(1);
 }
-- 
1.7.2.5



* [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (3 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 04/13] arm64: do not initialize arm64_swiotlb if dma_ops is already set Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-09-05 16:20   ` Catalin Marinas
  2013-08-29 18:32 ` [PATCH v5 06/13] xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin Stefano Stabellini
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

Move Xen initialization earlier, before any DMA requests can be made.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


Changes in v3:
- add missing __init in xen_early_init declaration.
---
 arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
 arch/arm/kernel/setup.c               |    2 ++
 arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
 arch/arm64/kernel/setup.c             |    2 ++
 4 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/xen/hypervisor.h b/arch/arm/include/asm/xen/hypervisor.h
index d7ab99a..2782591 100644
--- a/arch/arm/include/asm/xen/hypervisor.h
+++ b/arch/arm/include/asm/xen/hypervisor.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_ARM_XEN_HYPERVISOR_H
 #define _ASM_ARM_XEN_HYPERVISOR_H
 
+#include <linux/init.h>
+
 extern struct shared_info *HYPERVISOR_shared_info;
 extern struct start_info *xen_start_info;
 
@@ -16,4 +18,10 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
 	return PARAVIRT_LAZY_NONE;
 }
 
+#ifdef CONFIG_XEN
+void __init xen_early_init(void);
+#else
+static inline void xen_early_init(void) { return; }
+#endif
+
 #endif /* _ASM_ARM_XEN_HYPERVISOR_H */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 63af9a7..cb7b8e2 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -45,6 +45,7 @@
 #include <asm/cacheflush.h>
 #include <asm/cachetype.h>
 #include <asm/tlbflush.h>
+#include <asm/xen/hypervisor.h>
 
 #include <asm/prom.h>
 #include <asm/mach/arch.h>
@@ -889,6 +890,7 @@ void __init setup_arch(char **cmdline_p)
 
 	arm_dt_init_cpu_maps();
 	psci_init();
+	xen_early_init();
 #ifdef CONFIG_SMP
 	if (is_smp()) {
 		if (!mdesc->smp_init || !mdesc->smp_init()) {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index c9770ba..14d17ab 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -195,21 +195,19 @@ static void xen_power_off(void)
  * documentation of the Xen Device Tree format.
  */
 #define GRANT_TABLE_PHYSADDR 0
-static int __init xen_guest_init(void)
+void __init xen_early_init(void)
 {
-	struct xen_add_to_physmap xatp;
-	static struct shared_info *shared_info_page = 0;
+	struct resource res;
 	struct device_node *node;
 	int len;
 	const char *s = NULL;
 	const char *version = NULL;
 	const char *xen_prefix = "xen,xen-";
-	struct resource res;
 
 	node = of_find_compatible_node(NULL, NULL, "xen,xen");
 	if (!node) {
 		pr_debug("No Xen support\n");
-		return 0;
+		return;
 	}
 	s = of_get_property(node, "compatible", &len);
 	if (strlen(xen_prefix) + 3  < len &&
@@ -217,10 +215,10 @@ static int __init xen_guest_init(void)
 		version = s + strlen(xen_prefix);
 	if (version == NULL) {
 		pr_debug("Xen version not found\n");
-		return 0;
+		return;
 	}
 	if (of_address_to_resource(node, GRANT_TABLE_PHYSADDR, &res))
-		return 0;
+		return;
 	xen_hvm_resume_frames = res.start >> PAGE_SHIFT;
 	xen_events_irq = irq_of_parse_and_map(node, 0);
 	pr_info("Xen %s support found, events_irq=%d gnttab_frame_pfn=%lx\n",
@@ -232,6 +230,15 @@ static int __init xen_guest_init(void)
 		xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
 	else
 		xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+}
+
+static int __init xen_guest_init(void)
+{
+	struct xen_add_to_physmap xatp;
+	static struct shared_info *shared_info_page = 0;
+
+	if (!xen_domain())
+		return 0;
 
 	if (!shared_info_page)
 		shared_info_page = (struct shared_info *)
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index add6ea6..e0d438a 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -53,6 +53,7 @@
 #include <asm/traps.h>
 #include <asm/memblock.h>
 #include <asm/psci.h>
+#include <asm/xen/hypervisor.h>
 
 unsigned int processor_id;
 EXPORT_SYMBOL(processor_id);
@@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
 	unflatten_device_tree();
 
 	psci_init();
+	xen_early_init();
 
 	cpu_logical_map(0) = read_cpuid_mpidr() & MPIDR_HWID_BITMASK;
 #ifdef CONFIG_SMP
-- 
1.7.2.5



* [PATCH v5 06/13] xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (4 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 07/13] xen: make xen_create_contiguous_region return the dma address Stefano Stabellini
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

XENMEM_exchange can't be used by autotranslate guests because of two
severe limitations:

- it does not copy back the mfns into the out field for autotranslate
  guests;

- it does not guarantee that the hypervisor won't change the p2m
  mappings for the exchanged pages while the guest is using them. In
  general Xen never promises to keep the p2m mapping stable for
  autotranslate guests; in practice the mapping won't change unless
  uncommon features like memory sharing or paging are used.

To overcome these problems I am introducing two new hypercalls.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


Changes in v5:
- better comment for XENMEM_exchange_and_pin return codes;

Changes in v4:
- rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin;
- rename XENMEM_put_dma_buf to XENMEM_unpin;
- improve the documentation of the new hypercalls;
- add a note about out.address_bits for XENMEM_exchange.
---
 include/xen/interface/memory.h |   37 +++++++++++++++++++++++++++++++++++++
 1 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
index 2ecfe4f..fb0c6aa 100644
--- a/include/xen/interface/memory.h
+++ b/include/xen/interface/memory.h
@@ -66,6 +66,7 @@ struct xen_memory_exchange {
     /*
      * [IN] Details of memory extents to be exchanged (GMFN bases).
      * Note that @in.address_bits is ignored and unused.
+     * @out.address_bits should contain the address mask for the new pages.
      */
     struct xen_memory_reservation in;
 
@@ -263,4 +264,40 @@ struct xen_remove_from_physmap {
 };
 DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap);
 
+#define XENMEM_exchange_and_pin             26
+/*
+ * This hypercall is similar to XENMEM_exchange: it takes the same
+ * struct as an argument and it exchanges the pages passed in with a new
+ * set of pages. The new pages are going to be "pinned": it's guaranteed
+ * that their p2m mapping won't be changed until explicitly "unpinned".
+ * The content of the exchanged pages is lost.
+ * Only normal guest r/w memory can be pinned: no granted pages or
+ * ballooned pages.
+ * If return code is zero then @out.extent_list provides the DMA frame
+ * numbers of the newly-allocated memory.
+ * Returns zero on complete success, otherwise a negative error code:
+ *   -ENOSYS if not implemented
+ *   -EINVAL if the page is already pinned
+ *   -EFAULT if an internal error occurs
+ * On complete success then always @nr_exchanged == @in.nr_extents.  On
+ * partial success @nr_exchanged indicates how much work was done and a
+ * negative error code is returned.
+ */
+
+#define XENMEM_unpin             27
+/*
+ * XENMEM_unpin unpins a set of pages, previously pinned by
+ * XENMEM_exchange_and_pin. After this call the p2m mapping of the pages can
+ * be transparently changed by the hypervisor, as usual. The pages are
+ * still accessible from the guest.
+ */
+struct xen_unpin {
+    /*
+     * [IN] Details of memory extents to be unpinned (GMFN bases).
+     * Note that @in.address_bits is ignored and unused.
+     */
+    struct xen_memory_reservation in;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_unpin);
+
 #endif /* __XEN_PUBLIC_MEMORY_H__ */
-- 
1.7.2.5



* [PATCH v5 07/13] xen: make xen_create_contiguous_region return the dma address
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (5 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 06/13] xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-30 13:45   ` Konrad Rzeszutek Wilk
  2013-08-29 18:32 ` [PATCH v5 08/13] swiotlb-xen: support autotranslate guests Stefano Stabellini
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

Modify xen_create_contiguous_region to return the dma address of the
newly contiguous buffer.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>


Changes in v4:
- use virt_to_machine instead of virt_to_bus.
---
 arch/x86/xen/mmu.c        |    4 +++-
 drivers/xen/swiotlb-xen.c |    6 +++---
 include/xen/xen-ops.h     |    3 ++-
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index fdc3ba2..6c34d7c 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2329,7 +2329,8 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
 }
 
 int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
-				 unsigned int address_bits)
+				 unsigned int address_bits,
+				 dma_addr_t *dma_handle)
 {
 	unsigned long *in_frames = discontig_frames, out_frame;
 	unsigned long  flags;
@@ -2368,6 +2369,7 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
 
 	spin_unlock_irqrestore(&xen_reservation_lock, flags);
 
+	*dma_handle = virt_to_machine(vstart).maddr;
 	return success ? 0 : -ENOMEM;
 }
 EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 1b2277c..b72f31c 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -126,6 +126,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 {
 	int i, rc;
 	int dma_bits;
+	dma_addr_t dma_handle;
 
 	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
 
@@ -137,7 +138,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 			rc = xen_create_contiguous_region(
 				(unsigned long)buf + (i << IO_TLB_SHIFT),
 				get_order(slabs << IO_TLB_SHIFT),
-				dma_bits);
+				dma_bits, &dma_handle);
 		} while (rc && dma_bits++ < max_dma_bits);
 		if (rc)
 			return rc;
@@ -294,11 +295,10 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 		*dma_handle = dev_addr;
 	else {
 		if (xen_create_contiguous_region(vstart, order,
-						 fls64(dma_mask)) != 0) {
+						 fls64(dma_mask), dma_handle) != 0) {
 			free_pages(vstart, order);
 			return NULL;
 		}
-		*dma_handle = virt_to_machine(ret).maddr;
 	}
 	memset(ret, 0, size);
 	return ret;
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index d6fe062..9ef704d 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -20,7 +20,8 @@ int xen_setup_shutdown_event(void);
 
 extern unsigned long *xen_contiguous_bitmap;
 int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
-				unsigned int address_bits);
+				unsigned int address_bits,
+				dma_addr_t *dma_handle);
 
 void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
 
-- 
1.7.2.5



* [PATCH v5 08/13] swiotlb-xen: support autotranslate guests
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (6 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 07/13] xen: make xen_create_contiguous_region return the dma address Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-30 13:46   ` Konrad Rzeszutek Wilk
  2013-08-29 18:32 ` [PATCH v5 09/13] xen/arm,arm64: enable SWIOTLB_XEN Stefano Stabellini
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

Support autotranslate guests in swiotlb-xen by keeping track of the
phys-to-bus and bus-to-phys mappings of the swiotlb buffer
(xen_io_tlb_start-xen_io_tlb_end).

Use a simple direct access on a pre-allocated array for phys-to-bus
queries. Use a red-black tree for bus-to-phys queries.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>


Changes in v5:
- fix xen_dma_add_entry error path;
- remove the spin_lock: the red-black tree is not modified at run time.

Changes in v4:
- add err_out label in xen_dma_add_entry;
- remove INVALID_ADDRESS, use DMA_ERROR_CODE instead;
- code style fixes;
- add in-code comments regarding the usage of xen_dma_seg[0].dma_addr.

Changes in v3:
- many code style and name changes;
- improve error checks in xen_dma_add_entry.
---
 drivers/xen/swiotlb-xen.c |  155 ++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 139 insertions(+), 16 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b72f31c..7bb99ae 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -38,32 +38,131 @@
 #include <linux/bootmem.h>
 #include <linux/dma-mapping.h>
 #include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/rbtree.h>
 #include <xen/swiotlb-xen.h>
 #include <xen/page.h>
 #include <xen/xen-ops.h>
 #include <xen/hvc-console.h>
+#include <xen/features.h>
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
  * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
  * API.
  */
 
+#define NR_DMA_SEGS  ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE)
 static char *xen_io_tlb_start, *xen_io_tlb_end;
 static unsigned long xen_io_tlb_nslabs;
 /*
  * Quick lookup value of the bus address of the IOTLB.
  */
 
-static u64 start_dma_addr;
+struct xen_dma_info {
+	dma_addr_t dma_addr;
+	phys_addr_t phys_addr;
+	size_t size;
+	struct rb_node rbnode;
+};
+
+/*
+ * This array of struct xen_dma_info is indexed by physical addresses,
+ * starting from virt_to_phys(xen_io_tlb_start). Each entry maps
+ * (IO_TLB_SEGSIZE << IO_TLB_SHIFT) bytes, except the last one that is
+ * smaller. Getting the dma address corresponding to a given physical
+ * address can be done by direct access with the right index on the
+ * array.
+ */
+static struct xen_dma_info *xen_dma_seg;
+/* 
+ * This tree keeps track of bus address to physical address
+ * mappings.
+ */
+static struct rb_root bus_to_phys = RB_ROOT;
+
+static int xen_dma_add_entry(struct xen_dma_info *new)
+{
+	struct rb_node **link = &bus_to_phys.rb_node;
+	struct rb_node *parent = NULL;
+	struct xen_dma_info *entry;
+	int rc = 0;
+
+	while (*link) {
+		parent = *link;
+		entry = rb_entry(parent, struct xen_dma_info, rbnode);
+
+		if (new->dma_addr == entry->dma_addr)
+			goto err_out;
+		if (new->phys_addr == entry->phys_addr)
+			goto err_out;
+
+		if (new->dma_addr < entry->dma_addr)
+			link = &(*link)->rb_left;
+		else
+			link = &(*link)->rb_right;
+	}
+	rb_link_node(&new->rbnode, parent, link);
+	rb_insert_color(&new->rbnode, &bus_to_phys);
+	goto out;
+
+err_out:
+	rc = -EINVAL;
+	pr_warn("%s: cannot add phys=0x%pa -> dma=0x%pa: phys=0x%pa -> dma=0x%pa already exists\n",
+			__func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr);
+out:
+	return rc;
+}
+
+static struct xen_dma_info *xen_get_dma_info(dma_addr_t dma_addr)
+{
+	struct rb_node *n = bus_to_phys.rb_node;
+	struct xen_dma_info *entry;
+
+	while (n) {
+		entry = rb_entry(n, struct xen_dma_info, rbnode);
+		if (entry->dma_addr <= dma_addr &&
+				entry->dma_addr + entry->size > dma_addr) {
+			return entry;
+		}
+		if (dma_addr < entry->dma_addr)
+			n = n->rb_left;
+		else
+			n = n->rb_right;
+	}
+
+	return NULL;
+}
 
 static dma_addr_t xen_phys_to_bus(phys_addr_t paddr)
 {
-	return phys_to_machine(XPADDR(paddr)).maddr;
+	int nr_seg;
+	unsigned long offset;
+	char *vaddr;
+
+	if (!xen_feature(XENFEAT_auto_translated_physmap))
+		return phys_to_machine(XPADDR(paddr)).maddr;
+
+	vaddr = (char *)phys_to_virt(paddr);
+	if (vaddr >= xen_io_tlb_end || vaddr < xen_io_tlb_start)
+		return DMA_ERROR_CODE;
+
+	offset = vaddr - xen_io_tlb_start;
+	nr_seg = offset / (IO_TLB_SEGSIZE << IO_TLB_SHIFT);
+
+	return xen_dma_seg[nr_seg].dma_addr +
+		(paddr - xen_dma_seg[nr_seg].phys_addr);
 }
 
 static phys_addr_t xen_bus_to_phys(dma_addr_t baddr)
 {
-	return machine_to_phys(XMADDR(baddr)).paddr;
+	if (xen_feature(XENFEAT_auto_translated_physmap)) {
+		struct xen_dma_info *dma = xen_get_dma_info(baddr);
+		if (dma == NULL)
+			return DMA_ERROR_CODE;
+		else
+			return dma->phys_addr + (baddr - dma->dma_addr);
+	} else
+		return machine_to_phys(XMADDR(baddr)).paddr;
 }
 
 static dma_addr_t xen_virt_to_bus(void *address)
@@ -107,6 +206,9 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr)
 	unsigned long pfn = mfn_to_local_pfn(mfn);
 	phys_addr_t paddr;
 
+	if (xen_feature(XENFEAT_auto_translated_physmap))
+		return 1;
+
 	/* If the address is outside our domain, it CAN
 	 * have the same virtual address as another address
 	 * in our domain. Therefore _only_ check address within our domain.
@@ -124,13 +226,12 @@ static int max_dma_bits = 32;
 static int
 xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 {
-	int i, rc;
+	int i, j, rc;
 	int dma_bits;
-	dma_addr_t dma_handle;
 
 	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
 
-	i = 0;
+	i = j = 0;
 	do {
 		int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
 
@@ -138,12 +239,18 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 			rc = xen_create_contiguous_region(
 				(unsigned long)buf + (i << IO_TLB_SHIFT),
 				get_order(slabs << IO_TLB_SHIFT),
-				dma_bits, &dma_handle);
+				dma_bits, &xen_dma_seg[j].dma_addr);
 		} while (rc && dma_bits++ < max_dma_bits);
 		if (rc)
 			return rc;
 
+		xen_dma_seg[j].phys_addr = virt_to_phys(buf + (i << IO_TLB_SHIFT));
+		xen_dma_seg[j].size = slabs << IO_TLB_SHIFT;
+		rc = xen_dma_add_entry(&xen_dma_seg[j]);
+		if (rc != 0)
+			return rc;
 		i += slabs;
+		j++;
 	} while (i < nslabs);
 	return 0;
 }
@@ -193,9 +300,10 @@ retry:
 	/*
 	 * Get IO TLB memory from any location.
 	 */
-	if (early)
+	if (early) {
 		xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes));
-	else {
+		xen_dma_seg = alloc_bootmem(sizeof(struct xen_dma_info) * NR_DMA_SEGS);
+	} else {
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
@@ -210,6 +318,8 @@ retry:
 			xen_io_tlb_nslabs = SLABS_PER_PAGE << order;
 			bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
 		}
+		xen_dma_seg = kzalloc(sizeof(struct xen_dma_info) * NR_DMA_SEGS,
+				GFP_KERNEL);
 	}
 	if (!xen_io_tlb_start) {
 		m_ret = XEN_SWIOTLB_ENOMEM;
@@ -232,7 +342,6 @@ retry:
 		m_ret = XEN_SWIOTLB_EFIXUP;
 		goto error;
 	}
-	start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
 	if (early) {
 		if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
 			 verbose))
@@ -290,7 +399,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 
 	phys = virt_to_phys(ret);
 	dev_addr = xen_phys_to_bus(phys);
-	if (((dev_addr + size - 1 <= dma_mask)) &&
+	if (!xen_feature(XENFEAT_auto_translated_physmap) &&
+	    ((dev_addr + size - 1 <= dma_mask)) &&
 	    !range_straddles_page_boundary(phys, size))
 		*dma_handle = dev_addr;
 	else {
@@ -321,8 +431,9 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
 
 	phys = virt_to_phys(vaddr);
 
-	if (((dev_addr + size - 1 > dma_mask)) ||
-	    range_straddles_page_boundary(phys, size))
+	if (xen_feature(XENFEAT_auto_translated_physmap) ||
+		(((dev_addr + size - 1 > dma_mask)) ||
+		 range_straddles_page_boundary(phys, size)))
 		xen_destroy_contiguous_region((unsigned long)vaddr, order);
 
 	free_pages((unsigned long)vaddr, order);
@@ -351,14 +462,19 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
 	 * we can safely return the device addr and not worry about bounce
 	 * buffering it.
 	 */
-	if (dma_capable(dev, dev_addr, size) &&
+	if (!xen_feature(XENFEAT_auto_translated_physmap) &&
+	    dma_capable(dev, dev_addr, size) &&
 	    !range_straddles_page_boundary(phys, size) && !swiotlb_force)
 		return dev_addr;
 
 	/*
 	 * Oh well, have to allocate and map a bounce buffer.
+	 * Pass the dma_addr of the first slab in the iotlb buffer as
+	 * argument so that swiotlb_tbl_map_single is free to allocate
+	 * the bounce buffer anywhere appropriate in io_tlb_start -
+	 * io_tlb_end.
 	 */
-	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
+	map = swiotlb_tbl_map_single(dev, xen_dma_seg[0].dma_addr, phys, size, dir);
 	if (map == SWIOTLB_MAP_ERROR)
 		return DMA_ERROR_CODE;
 
@@ -494,10 +610,17 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
 		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
 
 		if (swiotlb_force ||
+		    xen_feature(XENFEAT_auto_translated_physmap) ||
 		    !dma_capable(hwdev, dev_addr, sg->length) ||
 		    range_straddles_page_boundary(paddr, sg->length)) {
+			/*
+			 * Pass the dma_addr of the first slab in the iotlb buffer as
+			 * argument so that swiotlb_tbl_map_single is free to allocate
+			 * the bounce buffer anywhere appropriate in io_tlb_start -
+			 * io_tlb_end.
+			 */
 			phys_addr_t map = swiotlb_tbl_map_single(hwdev,
-								 start_dma_addr,
+								 xen_dma_seg[0].dma_addr,
 								 sg_phys(sg),
 								 sg->length,
 								 dir);
-- 
1.7.2.5


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v5 09/13] xen/arm,arm64: enable SWIOTLB_XEN
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (7 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 08/13] swiotlb-xen: support autotranslate guests Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 10/13] swiotlb-xen: introduce xen_swiotlb_set_dma_mask Stefano Stabellini
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

Xen on arm and arm64 needs SWIOTLB_XEN: when running on Xen we need to
program the hardware with mfns rather than pfns for dma addresses.
Remove SWIOTLB_XEN dependency on X86 and PCI and make XEN select
SWIOTLB_XEN on arm and arm64.

At the moment we always rely on swiotlb-xen, but once Xen starts
supporting hardware IOMMUs we will be able to avoid it, depending on
whether an IOMMU is present on the platform.

Implement xen_create_contiguous_region on arm and arm64 by using
XENMEM_exchange_and_pin.

Initialize xen-swiotlb from xen_early_init (before the native dma_ops
are initialized) and set dma_ops to &xen_swiotlb_dma_ops if we are
running on Xen.
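The frame arithmetic behind xen_create_contiguous_region can be sketched
in isolation. In the sketch below, mock_exchange is a hypothetical
stand-in for the XENMEM_exchange_and_pin hypercall (it identity-maps
frames purely for illustration), while the shift logic mirrors the
patch:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical stand-in for XENMEM_exchange_and_pin: a real hypervisor
 * would hand back a (possibly different) contiguous machine frame; here
 * we identity-map, which is enough to exercise the frame arithmetic. */
static int mock_exchange(uint64_t in_frame, unsigned int order,
			 uint64_t *out_frame)
{
	(void)order;
	*out_frame = in_frame;
	return 1; /* success */
}

/* Mirrors the shape of xen_create_contiguous_region: convert the
 * physical start address to a frame number, exchange it, and shift the
 * returned machine frame back up into a DMA address. */
static int create_contiguous(uint64_t pstart, unsigned int order,
			     uint64_t *dma_handle)
{
	uint64_t in_frame = pstart >> PAGE_SHIFT;
	uint64_t out_frame;

	if (!mock_exchange(in_frame, order, &out_frame))
		return -1; /* -ENOMEM in the kernel */

	*dma_handle = out_frame << PAGE_SHIFT;
	return 0;
}
```

With the identity-mapping mock, the returned DMA address equals the
input physical address and is always page-aligned.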

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


Changes in v4:
- remove redefinition of DMA_ERROR_CODE;
- update the code to use XENMEM_exchange_and_pin and XENMEM_unpin;
- add a note about hardware IOMMU in the commit message.

Changes in v3:
- code style changes;
- warn on XENMEM_put_dma_buf failures.
---
 arch/arm/Kconfig                |    1 +
 arch/arm/include/asm/xen/page.h |    2 +
 arch/arm/xen/Makefile           |    2 +-
 arch/arm/xen/enlighten.c        |    3 +
 arch/arm/xen/mm.c               |  117 +++++++++++++++++++++++++++++++++++++++
 arch/arm64/Kconfig              |    1 +
 arch/arm64/xen/Makefile         |    2 +-
 drivers/xen/Kconfig             |    1 -
 drivers/xen/swiotlb-xen.c       |   16 +++++
 9 files changed, 142 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/xen/mm.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index c0bfb33..2c9d112 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1848,6 +1848,7 @@ config XEN
 	depends on CPU_V7 && !CPU_V6
 	depends on !GENERIC_ATOMIC64
 	select ARM_PSCI
+	select SWIOTLB_XEN
 	help
 	  Say Y if you want to run Linux in a Virtual Machine on Xen on ARM.
 
diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 359a7b5..b0f7150 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -6,12 +6,14 @@
 
 #include <linux/pfn.h>
 #include <linux/types.h>
+#include <linux/dma-mapping.h>
 
 #include <xen/interface/grant_table.h>
 
 #define pfn_to_mfn(pfn)			(pfn)
 #define phys_to_machine_mapping_valid(pfn) (1)
 #define mfn_to_pfn(mfn)			(mfn)
+#define mfn_to_local_pfn(m)             (mfn_to_pfn(m))
 #define mfn_to_virt(m)			(__va(mfn_to_pfn(m) << PAGE_SHIFT))
 
 #define pte_mfn	    pte_pfn
diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile
index 4384103..66fc35d 100644
--- a/arch/arm/xen/Makefile
+++ b/arch/arm/xen/Makefile
@@ -1 +1 @@
-obj-y		:= enlighten.o hypercall.o grant-table.o
+obj-y		:= enlighten.o hypercall.o grant-table.o mm.o
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 14d17ab..06a6953 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -195,6 +195,7 @@ static void xen_power_off(void)
  * documentation of the Xen Device Tree format.
  */
 #define GRANT_TABLE_PHYSADDR 0
+extern int xen_mm_init(void);
 void __init xen_early_init(void)
 {
 	struct resource res;
@@ -230,6 +231,8 @@ void __init xen_early_init(void)
 		xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
 	else
 		xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+
+	xen_mm_init();
 }
 
 static int __init xen_guest_init(void)
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
new file mode 100644
index 0000000..916deb2
--- /dev/null
+++ b/arch/arm/xen/mm.c
@@ -0,0 +1,117 @@
+#include <linux/bootmem.h>
+#include <linux/gfp.h>
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/dma-mapping.h>
+#include <linux/vmalloc.h>
+#include <linux/swiotlb.h>
+
+#include <xen/xen.h>
+#include <xen/interface/memory.h>
+#include <xen/swiotlb-xen.h>
+
+#include <asm/cacheflush.h>
+#include <asm/xen/page.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/interface.h>
+
+static int xen_exchange_memory(xen_ulong_t extents_in,
+			       unsigned int order_in,
+			       xen_pfn_t *pfns_in,
+			       xen_ulong_t extents_out,
+			       unsigned int order_out,
+			       xen_pfn_t *mfns_out,
+			       unsigned int address_bits)
+{
+	long rc;
+	int success;
+
+	struct xen_memory_exchange exchange = {
+		.in = {
+			.nr_extents   = extents_in,
+			.extent_order = order_in,
+			.domid        = DOMID_SELF
+		},
+		.out = {
+			.nr_extents   = extents_out,
+			.extent_order = order_out,
+			.address_bits = address_bits,
+			.domid        = DOMID_SELF
+		}
+	};
+	set_xen_guest_handle(exchange.in.extent_start, pfns_in);
+	set_xen_guest_handle(exchange.out.extent_start, mfns_out);
+
+	BUG_ON(extents_in << order_in != extents_out << order_out);
+
+
+	rc = HYPERVISOR_memory_op(XENMEM_exchange_and_pin, &exchange);
+	success = (exchange.nr_exchanged == extents_in);
+
+	BUG_ON(!success && ((exchange.nr_exchanged != 0) || (rc == 0)));
+	BUG_ON(success && (rc != 0));
+
+	return success;
+}
+
+int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+				 unsigned int address_bits,
+				 dma_addr_t *dma_handle)
+{
+	phys_addr_t pstart = __pa(vstart);
+	xen_pfn_t in_frame, out_frame;
+	int success;
+
+	/* Get a new contiguous memory extent. */
+	in_frame = out_frame = pstart >> PAGE_SHIFT;
+	success = xen_exchange_memory(1, order, &in_frame,
+				      1, order, &out_frame,
+				      address_bits);
+
+	if (!success)
+		return -ENOMEM;
+
+	*dma_handle = out_frame << PAGE_SHIFT;
+
+	return success ? 0 : -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
+
+void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+{
+	xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT;
+	struct xen_unpin unpin = {
+		.in = {
+			.nr_extents   = 1,
+			.extent_order = order,
+			.domid        = DOMID_SELF
+		},
+	};
+	set_xen_guest_handle(unpin.in.extent_start, &in_frame);
+
+	WARN_ON(HYPERVISOR_memory_op(XENMEM_unpin, &unpin));
+}
+EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
+
+static struct dma_map_ops xen_swiotlb_dma_ops = {
+	.mapping_error = xen_swiotlb_dma_mapping_error,
+	.alloc = xen_swiotlb_alloc_coherent,
+	.free = xen_swiotlb_free_coherent,
+	.sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu,
+	.sync_single_for_device = xen_swiotlb_sync_single_for_device,
+	.sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu,
+	.sync_sg_for_device = xen_swiotlb_sync_sg_for_device,
+	.map_sg = xen_swiotlb_map_sg_attrs,
+	.unmap_sg = xen_swiotlb_unmap_sg_attrs,
+	.map_page = xen_swiotlb_map_page,
+	.unmap_page = xen_swiotlb_unmap_page,
+	.dma_supported = xen_swiotlb_dma_supported,
+};
+
+int __init xen_mm_init(void)
+{
+	xen_swiotlb_init(1, true);
+	dma_ops = &xen_swiotlb_dma_ops;
+	return 0;
+}
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9737e97..aa1f6fb 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -209,6 +209,7 @@ config XEN_DOM0
 config XEN
 	bool "Xen guest support on ARM64 (EXPERIMENTAL)"
 	depends on ARM64 && OF
+	select SWIOTLB_XEN
 	help
 	  Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64.
 
diff --git a/arch/arm64/xen/Makefile b/arch/arm64/xen/Makefile
index be24040..0ef9637 100644
--- a/arch/arm64/xen/Makefile
+++ b/arch/arm64/xen/Makefile
@@ -1,2 +1,2 @@
-xen-arm-y	+= $(addprefix ../../arm/xen/, enlighten.o grant-table.o)
+xen-arm-y	+= $(addprefix ../../arm/xen/, enlighten.o grant-table.o mm.o)
 obj-y		:= xen-arm.o hypercall.o
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 9e02d60..7e83688 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -140,7 +140,6 @@ config XEN_GRANT_DEV_ALLOC
 
 config SWIOTLB_XEN
 	def_bool y
-	depends on PCI && X86
 	select SWIOTLB
 
 config XEN_TMEM
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 7bb99ae..b11a11f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -45,6 +45,8 @@
 #include <xen/xen-ops.h>
 #include <xen/hvc-console.h>
 #include <xen/features.h>
+#include <asm/dma-mapping.h>
+
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
  * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
@@ -58,6 +60,20 @@ static unsigned long xen_io_tlb_nslabs;
  * Quick lookup value of the bus address of the IOTLB.
  */
 
+#ifndef CONFIG_X86
+static unsigned long dma_alloc_coherent_mask(struct device *dev,
+					    gfp_t gfp)
+{
+	unsigned long dma_mask = 0;
+
+	dma_mask = dev->coherent_dma_mask;
+	if (!dma_mask)
+		dma_mask = (gfp & GFP_DMA) ? DMA_BIT_MASK(24) : DMA_BIT_MASK(32);
+
+	return dma_mask;
+}
+#endif
+
 struct xen_dma_info {
 	dma_addr_t dma_addr;
 	phys_addr_t phys_addr;
-- 
1.7.2.5



* [PATCH v5 10/13] swiotlb-xen: introduce xen_swiotlb_set_dma_mask
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (8 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 09/13] xen/arm,arm64: enable SWIOTLB_XEN Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages Stefano Stabellini
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

Implement xen_swiotlb_set_dma_mask, use it for set_dma_mask on arm.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/arm/xen/mm.c         |    1 +
 drivers/xen/swiotlb-xen.c |   12 ++++++++++++
 include/xen/swiotlb-xen.h |    2 ++
 3 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 916deb2..1477ace 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -107,6 +107,7 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {
 	.map_page = xen_swiotlb_map_page,
 	.unmap_page = xen_swiotlb_unmap_page,
 	.dma_supported = xen_swiotlb_dma_supported,
+	.set_dma_mask = xen_swiotlb_set_dma_mask,
 };
 
 int __init xen_mm_init(void)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b11a11f..010da31 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -732,3 +732,15 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
 	return xen_virt_to_bus(xen_io_tlb_end - 1) <= mask;
 }
 EXPORT_SYMBOL_GPL(xen_swiotlb_dma_supported);
+
+int
+xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask)
+{
+	if (!dev->dma_mask || !xen_swiotlb_dma_supported(dev, dma_mask))
+		return -EIO;
+
+	*dev->dma_mask = dma_mask;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xen_swiotlb_set_dma_mask);
diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
index de8bcc6..7b64465 100644
--- a/include/xen/swiotlb-xen.h
+++ b/include/xen/swiotlb-xen.h
@@ -55,4 +55,6 @@ xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
 extern int
 xen_swiotlb_dma_supported(struct device *hwdev, u64 mask);
 
+extern int
+xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask);
 #endif /* __LINUX_SWIOTLB_XEN_H */
-- 
1.7.2.5



* [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (9 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 10/13] swiotlb-xen: introduce xen_swiotlb_set_dma_mask Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-09-05 16:09   ` Catalin Marinas
  2013-08-29 18:32 ` [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages Stefano Stabellini
  2013-08-29 18:32 ` [PATCH v5 13/13] swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent Stefano Stabellini
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for the
cpu and for devices. On native x86 and ARMv8 it is sufficient to call
__get_free_pages in order to get a coherent buffer, while on ARM we need
to call arm_dma_ops.alloc.

Introduce xen_alloc_coherent_pages to abstract the arch specific buffer
allocation.

Similarly, introduce xen_free_coherent_pages to free a coherent buffer:
on x86 and ARM64 it is simply a call to free_pages, while on ARM it is
arm_dma_ops.free.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/arm/include/asm/xen/page-coherent.h   |   22 ++++++++++++++++++++++
 arch/arm64/include/asm/xen/page-coherent.h |   24 ++++++++++++++++++++++++
 arch/ia64/include/asm/xen/page-coherent.h  |   24 ++++++++++++++++++++++++
 arch/x86/include/asm/xen/page-coherent.h   |   24 ++++++++++++++++++++++++
 4 files changed, 94 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/include/asm/xen/page-coherent.h
 create mode 100644 arch/arm64/include/asm/xen/page-coherent.h
 create mode 100644 arch/ia64/include/asm/xen/page-coherent.h
 create mode 100644 arch/x86/include/asm/xen/page-coherent.h

diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..af2cf8d
--- /dev/null
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -0,0 +1,22 @@
+#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H
+#define _ASM_ARM_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	return arm_dma_ops.free(hwdev, size, cpu_addr, dma_handle, attrs);
+}
+
+#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */
diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..0d6ad25
--- /dev/null
+++ b/arch/arm64/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H
+#define _ASM_ARM64_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	void *vstart = (void*)__get_free_pages(flags, get_order(size));
+	*dma_handle = virt_to_phys(vstart);
+	return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
diff --git a/arch/ia64/include/asm/xen/page-coherent.h b/arch/ia64/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..37b929c
--- /dev/null
+++ b/arch/ia64/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_IA64_XEN_PAGE_COHERENT_H
+#define _ASM_IA64_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	void *vstart = (void*)__get_free_pages(flags, get_order(size));
+	*dma_handle = virt_to_phys(vstart);
+	return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_IA64_XEN_PAGE_COHERENT_H */
diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h
new file mode 100644
index 0000000..31de2e0
--- /dev/null
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_X86_XEN_PAGE_COHERENT_H
+#define _ASM_X86_XEN_PAGE_COHERENT_H
+
+#include <asm/page.h>
+#include <linux/dma-attrs.h>
+#include <linux/dma-mapping.h>
+
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+		dma_addr_t *dma_handle, gfp_t flags,
+		struct dma_attrs *attrs)
+{
+	void *vstart = (void*)__get_free_pages(flags, get_order(size));
+	*dma_handle = virt_to_phys(vstart);
+	return vstart;
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+		void *cpu_addr, dma_addr_t dma_handle,
+		struct dma_attrs *attrs)
+{
+	free_pages((unsigned long) cpu_addr, get_order(size));
+}
+
+#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */
-- 
1.7.2.5



* [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (10 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  2013-08-30 13:53   ` Konrad Rzeszutek Wilk
  2013-08-29 18:32 ` [PATCH v5 13/13] swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent Stefano Stabellini
  12 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
free coherent pages.

We need to be careful handling the pointer returned by
xen_alloc_coherent_pages, because on ARM the pointer is not equal to
phys_to_virt(*dma_handle). In fact, calling virt_to_phys on the returned
pointer does not yield a valid physical address.

Make xen_create_contiguous_region take a phys_addr_t as start parameter to
avoid the virt_to_phys calls which would be incorrect.
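The pitfall can be illustrated with a small test double: assume an
allocator whose returned CPU pointer is a remapping, so naively
converting it back (as a linear virt_to_phys would) yields a different
address than the handle the allocator filled in. The addresses and the
REMAP_OFFSET below are synthetic; the safe pattern, as in the patch, is
to read the physical address out of *dma_handle.

```c
#include <assert.h>
#include <stdint.h>

#define REMAP_OFFSET 0x40000000ULL /* pretend vmalloc-style remapping */

/* Test double for arm_dma_ops.alloc: returns a "remapped" CPU address
 * while reporting the true physical address via the handle, mimicking
 * the ARM behavior described above. */
static uint64_t fake_alloc(uint64_t phys, uint64_t *dma_handle)
{
	*dma_handle = phys;
	return phys + REMAP_OFFSET; /* CPU-visible remapped address */
}

/* What a naive conversion would compute if the mapping were linear. */
static uint64_t naive_virt_to_phys(uint64_t vaddr)
{
	return vaddr;
}
```

Trusting naive_virt_to_phys on the returned address gives the wrong
answer; only the handle carries the real physical address.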

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/arm/xen/mm.c         |    7 +++----
 arch/x86/xen/mmu.c        |    7 +++++--
 drivers/xen/swiotlb-xen.c |   24 +++++++++++++++---------
 include/xen/xen-ops.h     |    4 ++--
 4 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 1477ace..9163d4c 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -55,11 +55,10 @@ static int xen_exchange_memory(xen_ulong_t extents_in,
 	return success;
 }
 
-int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				 unsigned int address_bits,
 				 dma_addr_t *dma_handle)
 {
-	phys_addr_t pstart = __pa(vstart);
 	xen_pfn_t in_frame, out_frame;
 	int success;
 
@@ -78,9 +77,9 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
 }
 EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
 
-void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
 {
-	xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT;
+	xen_pfn_t in_frame = pstart >> PAGE_SHIFT;
 	struct xen_unpin unpin = {
 		.in = {
 			.nr_extents   = 1,
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 6c34d7c..07ee0e8 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2328,13 +2328,14 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
 	return success;
 }
 
-int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				 unsigned int address_bits,
 				 dma_addr_t *dma_handle)
 {
 	unsigned long *in_frames = discontig_frames, out_frame;
 	unsigned long  flags;
 	int            success;
+	unsigned long vstart = (unsigned long) phys_to_virt(pstart);
 
 	/*
 	 * Currently an auto-translated guest will not perform I/O, nor will
@@ -2374,11 +2375,12 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
 }
 EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
 
-void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
+void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
 {
 	unsigned long *out_frames = discontig_frames, in_frame;
 	unsigned long  flags;
 	int success;
+	unsigned long vstart;
 
 	if (xen_feature(XENFEAT_auto_translated_physmap))
 		return;
@@ -2386,6 +2388,7 @@ void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
 	if (unlikely(order > MAX_CONTIG_ORDER))
 		return;
 
+	vstart = (unsigned long) phys_to_virt(pstart);
 	memset((void *) vstart, 0, PAGE_SIZE << order);
 
 	spin_lock_irqsave(&xen_reservation_lock, flags);
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 010da31..c96b928 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -46,6 +46,7 @@
 #include <xen/hvc-console.h>
 #include <xen/features.h>
 #include <asm/dma-mapping.h>
+#include <asm/xen/page-coherent.h>
 
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
@@ -244,6 +245,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 {
 	int i, j, rc;
 	int dma_bits;
+	phys_addr_t p = virt_to_phys(buf);
 
 	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
 
@@ -253,7 +255,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 
 		do {
 			rc = xen_create_contiguous_region(
-				(unsigned long)buf + (i << IO_TLB_SHIFT),
+				p + (i << IO_TLB_SHIFT),
 				get_order(slabs << IO_TLB_SHIFT),
 				dma_bits, &xen_dma_seg[j].dma_addr);
 		} while (rc && dma_bits++ < max_dma_bits);
@@ -389,7 +391,6 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 	void *ret;
 	int order = get_order(size);
 	u64 dma_mask = DMA_BIT_MASK(32);
-	unsigned long vstart;
 	phys_addr_t phys;
 	dma_addr_t dev_addr;
 
@@ -404,8 +405,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 	if (dma_alloc_from_coherent(hwdev, size, dma_handle, &ret))
 		return ret;
 
-	vstart = __get_free_pages(flags, order);
-	ret = (void *)vstart;
+	ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
 
 	if (!ret)
 		return ret;
@@ -413,16 +413,20 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 	if (hwdev && hwdev->coherent_dma_mask)
 		dma_mask = dma_alloc_coherent_mask(hwdev, flags);
 
-	phys = virt_to_phys(ret);
+	/* At this point dma_handle is the physical address, next we are
+	 * going to set it to the machine address.
+	 * Do not use virt_to_phys(ret) because on ARM it doesn't correspond
+	 * to *dma_handle. */
+	phys = *dma_handle;
 	dev_addr = xen_phys_to_bus(phys);
 	if (!xen_feature(XENFEAT_auto_translated_physmap) &&
 	    ((dev_addr + size - 1 <= dma_mask)) &&
 	    !range_straddles_page_boundary(phys, size))
 		*dma_handle = dev_addr;
 	else {
-		if (xen_create_contiguous_region(vstart, order,
+		if (xen_create_contiguous_region(phys, order,
 						 fls64(dma_mask), dma_handle) != 0) {
-			free_pages(vstart, order);
+			xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
 			return NULL;
 		}
 	}
@@ -445,14 +449,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
 	if (hwdev && hwdev->coherent_dma_mask)
 		dma_mask = hwdev->coherent_dma_mask;
 
-	phys = virt_to_phys(vaddr);
+	/* do not use virt_to_phys because on ARM it doesn't return you the
+	 * physical address */
+	phys = xen_bus_to_phys(dev_addr);
 
 	if (xen_feature(XENFEAT_auto_translated_physmap) ||
 		(((dev_addr + size - 1 > dma_mask)) ||
 		 range_straddles_page_boundary(phys, size)))
 		xen_destroy_contiguous_region((unsigned long)vaddr, order);
 
-	free_pages((unsigned long)vaddr, order);
+	xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
 }
 EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent);
 
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 9ef704d..fb2ea8f 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -19,11 +19,11 @@ void xen_arch_resume(void);
 int xen_setup_shutdown_event(void);
 
 extern unsigned long *xen_contiguous_bitmap;
-int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
+int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				unsigned int address_bits,
 				dma_addr_t *dma_handle);
 
-void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
+void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
 
 struct vm_area_struct;
 int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
-- 
1.7.2.5



* [PATCH v5 13/13] swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent
  2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
                   ` (11 preceding siblings ...)
  2013-08-29 18:32 ` [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages Stefano Stabellini
@ 2013-08-29 18:32 ` Stefano Stabellini
  12 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-08-29 18:32 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-kernel, linux-arm-kernel, konrad.wilk, Stefano.Stabellini,
	Ian.Campbell, Stefano Stabellini

The swiotlb code has appropriate calls to dma_mark_clean in place for
buffers passed to swiotlb_map_page as an argument. However it assumes
that the swiotlb bounce buffer (io_tlb_start-io_tlb_end) is already
coherent and doesn't need any calls to dma_mark_clean.

On ARM the swiotlb bounce buffer is not coherent (the memory is
writealloc while it should be bufferable) and therefore we need to call
dma_mark_clean appropriately on the bounce buffer code paths too.

Note that most architectures have an empty dma_mark_clean
implementation anyway.
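The resulting ordering on the two sync paths can be sketched as follows;
the logging functions are stand-ins for dma_mark_clean and
swiotlb_bounce so the sequence can be checked:

```c
#include <assert.h>
#include <string.h>

enum sync_target { SYNC_FOR_CPU, SYNC_FOR_DEVICE };

static const char *calls[2];
static int ncalls;

static void mark_clean(void) { calls[ncalls++] = "clean"; }
static void bounce(void)     { calls[ncalls++] = "bounce"; }

/* Mirrors the ordering the patch establishes in swiotlb_tbl_sync_single:
 * before handing data to the CPU the bounce buffer is marked clean and
 * then copied out; before handing it to the device it is copied in and
 * then marked clean. */
static void sync_single(enum sync_target target)
{
	ncalls = 0;
	if (target == SYNC_FOR_CPU) {
		mark_clean();
		bounce();
	} else {
		bounce();
		mark_clean();
	}
}
```

On architectures where dma_mark_clean is empty, both paths collapse to
a plain bounce, matching the behavior before this patch.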

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 lib/swiotlb.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 4e8686c..b1f43f6 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -547,7 +547,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	 * First, sync the memory before unmapping the entry
 	 */
 	if (orig_addr && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
+	{
+		dma_mark_clean(phys_to_virt(tlb_addr), size);
 		swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE);
+	}
 
 	/*
 	 * Return the buffer to the free list by setting the corresponding
@@ -587,17 +590,20 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
 
 	switch (target) {
 	case SYNC_FOR_CPU:
-		if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
+		if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) {
+			dma_mark_clean(phys_to_virt(tlb_addr), size);
 			swiotlb_bounce(orig_addr, tlb_addr,
 				       size, DMA_FROM_DEVICE);
+		}
 		else
 			BUG_ON(dir != DMA_TO_DEVICE);
 		break;
 	case SYNC_FOR_DEVICE:
-		if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
+		if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) {
 			swiotlb_bounce(orig_addr, tlb_addr,
 				       size, DMA_TO_DEVICE);
-		else
+			dma_mark_clean(phys_to_virt(tlb_addr), size);
+		} else
 			BUG_ON(dir != DMA_FROM_DEVICE);
 		break;
 	default:
-- 
1.7.2.5



* Re: [PATCH v5 07/13] xen: make xen_create_contiguous_region return the dma address
  2013-08-29 18:32 ` [PATCH v5 07/13] xen: make xen_create_contiguous_region return the dma address Stefano Stabellini
@ 2013-08-30 13:45   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-08-30 13:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux-kernel, linux-arm-kernel, Ian.Campbell

On Thu, Aug 29, 2013 at 07:32:28PM +0100, Stefano Stabellini wrote:
> Modify xen_create_contiguous_region to return the dma address of the
> newly contiguous buffer.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Reviewed-by: David Vrabel <david.vrabel@citrix.com>

Acked-by or Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> 
> Changes in v4:
> - use virt_to_machine instead of virt_to_bus.
> ---
>  arch/x86/xen/mmu.c        |    4 +++-
>  drivers/xen/swiotlb-xen.c |    6 +++---
>  include/xen/xen-ops.h     |    3 ++-
>  3 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index fdc3ba2..6c34d7c 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -2329,7 +2329,8 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
>  }
>  
>  int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> -				 unsigned int address_bits)
> +				 unsigned int address_bits,
> +				 dma_addr_t *dma_handle)
>  {
>  	unsigned long *in_frames = discontig_frames, out_frame;
>  	unsigned long  flags;
> @@ -2368,6 +2369,7 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
>  
>  	spin_unlock_irqrestore(&xen_reservation_lock, flags);
>  
> +	*dma_handle = virt_to_machine(vstart).maddr;
>  	return success ? 0 : -ENOMEM;
>  }
>  EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 1b2277c..b72f31c 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -126,6 +126,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  {
>  	int i, rc;
>  	int dma_bits;
> +	dma_addr_t dma_handle;
>  
>  	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
>  
> @@ -137,7 +138,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  			rc = xen_create_contiguous_region(
>  				(unsigned long)buf + (i << IO_TLB_SHIFT),
>  				get_order(slabs << IO_TLB_SHIFT),
> -				dma_bits);
> +				dma_bits, &dma_handle);
>  		} while (rc && dma_bits++ < max_dma_bits);
>  		if (rc)
>  			return rc;
> @@ -294,11 +295,10 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  		*dma_handle = dev_addr;
>  	else {
>  		if (xen_create_contiguous_region(vstart, order,
> -						 fls64(dma_mask)) != 0) {
> +						 fls64(dma_mask), dma_handle) != 0) {
>  			free_pages(vstart, order);
>  			return NULL;
>  		}
> -		*dma_handle = virt_to_machine(ret).maddr;
>  	}
>  	memset(ret, 0, size);
>  	return ret;
> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
> index d6fe062..9ef704d 100644
> --- a/include/xen/xen-ops.h
> +++ b/include/xen/xen-ops.h
> @@ -20,7 +20,8 @@ int xen_setup_shutdown_event(void);
>  
>  extern unsigned long *xen_contiguous_bitmap;
>  int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> -				unsigned int address_bits);
> +				unsigned int address_bits,
> +				dma_addr_t *dma_handle);
>  
>  void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
>  
> -- 
> 1.7.2.5
> 


* Re: [PATCH v5 08/13] swiotlb-xen: support autotranslate guests
  2013-08-29 18:32 ` [PATCH v5 08/13] swiotlb-xen: support autotranslate guests Stefano Stabellini
@ 2013-08-30 13:46   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-08-30 13:46 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux-kernel, linux-arm-kernel, Ian.Campbell

On Thu, Aug 29, 2013 at 07:32:29PM +0100, Stefano Stabellini wrote:
> Support autotranslate guests in swiotlb-xen by keeping track of the
> phys-to-bus and bus-to-phys mappings of the swiotlb buffer
> (xen_io_tlb_start-xen_io_tlb_end).
> 
> Use a simple direct access on a pre-allocated array for phys-to-bus
> queries. Use a red-black tree for bus-to-phys queries.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Reviewed-by: David Vrabel <david.vrabel@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> 
> Changes in v5:
> - fix xen_dma_add_entry error path;
> - remove the spin_lock: the red-black tree is not modified at run time.
> 
> Changes in v4:
> - add err_out label in xen_dma_add_entry;
> - remove INVALID_ADDRESS, use DMA_ERROR_CODE instead;
> - code style fixes;
> - add in-code comments regarding the usage of xen_dma_seg[0].dma_addr.
> 
> Changes in v3:
> - many code style and name changes;
> - improve error checks in xen_dma_add_entry.
> ---
>  drivers/xen/swiotlb-xen.c |  155 ++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 139 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b72f31c..7bb99ae 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -38,32 +38,131 @@
>  #include <linux/bootmem.h>
>  #include <linux/dma-mapping.h>
>  #include <linux/export.h>
> +#include <linux/slab.h>
> +#include <linux/rbtree.h>
>  #include <xen/swiotlb-xen.h>
>  #include <xen/page.h>
>  #include <xen/xen-ops.h>
>  #include <xen/hvc-console.h>
> +#include <xen/features.h>
>  /*
>   * Used to do a quick range check in swiotlb_tbl_unmap_single and
>   * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
>   * API.
>   */
>  
> +#define NR_DMA_SEGS  ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE)
>  static char *xen_io_tlb_start, *xen_io_tlb_end;
>  static unsigned long xen_io_tlb_nslabs;
>  /*
>   * Quick lookup value of the bus address of the IOTLB.
>   */
>  
> -static u64 start_dma_addr;
> +struct xen_dma_info {
> +	dma_addr_t dma_addr;
> +	phys_addr_t phys_addr;
> +	size_t size;
> +	struct rb_node rbnode;
> +};
> +
> +/*
> + * This array of struct xen_dma_info is indexed by physical addresses,
> + * starting from virt_to_phys(xen_io_tlb_start). Each entry maps
> + * (IO_TLB_SEGSIZE << IO_TLB_SHIFT) bytes, except the last one that is
> + * smaller. Getting the dma address corresponding to a given physical
> + * address can be done by direct access with the right index on the
> + * array.
> + */
> +static struct xen_dma_info *xen_dma_seg;
> +/* 
> + * This tree keeps track of bus address to physical address
> + * mappings.
> + */
> +static struct rb_root bus_to_phys = RB_ROOT;
> +
> +static int xen_dma_add_entry(struct xen_dma_info *new)
> +{
> +	struct rb_node **link = &bus_to_phys.rb_node;
> +	struct rb_node *parent = NULL;
> +	struct xen_dma_info *entry;
> +	int rc = 0;
> +
> +	while (*link) {
> +		parent = *link;
> +		entry = rb_entry(parent, struct xen_dma_info, rbnode);
> +
> +		if (new->dma_addr == entry->dma_addr)
> +			goto err_out;
> +		if (new->phys_addr == entry->phys_addr)
> +			goto err_out;
> +
> +		if (new->dma_addr < entry->dma_addr)
> +			link = &(*link)->rb_left;
> +		else
> +			link = &(*link)->rb_right;
> +	}
> +	rb_link_node(&new->rbnode, parent, link);
> +	rb_insert_color(&new->rbnode, &bus_to_phys);
> +	goto out;
> +
> +err_out:
> +	rc = -EINVAL;
> +	pr_warn("%s: cannot add phys=0x%pa -> dma=0x%pa: phys=0x%pa -> dma=0x%pa already exists\n",
> +			__func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr);
> +out:
> +	return rc;
> +}
> +
> +static struct xen_dma_info *xen_get_dma_info(dma_addr_t dma_addr)
> +{
> +	struct rb_node *n = bus_to_phys.rb_node;
> +	struct xen_dma_info *entry;
> +
> +	while (n) {
> +		entry = rb_entry(n, struct xen_dma_info, rbnode);
> +		if (entry->dma_addr <= dma_addr &&
> +				entry->dma_addr + entry->size > dma_addr) {
> +			return entry;
> +		}
> +		if (dma_addr < entry->dma_addr)
> +			n = n->rb_left;
> +		else
> +			n = n->rb_right;
> +	}
> +
> +	return NULL;
> +}
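[Editorial aside: the interval match in xen_get_dma_info above does not
depend on the rb-tree; a linear-scan stand-in over a plain array (toy_*
names are illustrative, not the patch's symbols) shows the same range test
and offset arithmetic used for the bus-to-phys translation.]

```c
/* Stand-in for the bus-to-phys lookup: the patch walks an rb-tree, but
 * the range test is the same over any container. Illustrative names. */
#include <stddef.h>

struct toy_dma_info {
	unsigned long long dma_addr; /* machine (bus) address of the segment */
	unsigned long long phys_addr;
	size_t size;
};

static struct toy_dma_info *toy_get_dma_info(struct toy_dma_info *seg,
					     int nr,
					     unsigned long long dma_addr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (seg[i].dma_addr <= dma_addr &&
		    dma_addr < seg[i].dma_addr + seg[i].size)
			return &seg[i];
	return NULL; /* not a swiotlb bus address */
}

/* bus-to-phys: the offset within the matching segment carries over */
static unsigned long long toy_bus_to_phys(struct toy_dma_info *seg, int nr,
					  unsigned long long baddr)
{
	struct toy_dma_info *dma = toy_get_dma_info(seg, nr, baddr);

	return dma ? dma->phys_addr + (baddr - dma->dma_addr) : ~0ULL;
}
```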
>  
>  static dma_addr_t xen_phys_to_bus(phys_addr_t paddr)
>  {
> -	return phys_to_machine(XPADDR(paddr)).maddr;
> +	int nr_seg;
> +	unsigned long offset;
> +	char *vaddr;
> +
> +	if (!xen_feature(XENFEAT_auto_translated_physmap))
> +		return phys_to_machine(XPADDR(paddr)).maddr;
> +
> +	vaddr = (char *)phys_to_virt(paddr);
> +	if (vaddr >= xen_io_tlb_end || vaddr < xen_io_tlb_start)
> +		return DMA_ERROR_CODE;
> +
> +	offset = vaddr - xen_io_tlb_start;
> +	nr_seg = offset / (IO_TLB_SEGSIZE << IO_TLB_SHIFT);
> +
> +	return xen_dma_seg[nr_seg].dma_addr +
> +		(paddr - xen_dma_seg[nr_seg].phys_addr);
>  }
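[Editorial aside: the phys-to-bus direction needs no search at all —
every segment maps a fixed (IO_TLB_SEGSIZE << IO_TLB_SHIFT)-byte chunk,
so the segment index is a plain division on the offset into the bounce
buffer. A sketch, assuming the kernel's usual constants (IO_TLB_SHIFT =
11, IO_TLB_SEGSIZE = 128, i.e. 256 KB per segment); these values are not
taken from this patch.]

```c
/* Sketch of the phys-to-bus segment lookup: fixed-size segments make the
 * index a division. Constants mirror the kernel's usual values (an
 * assumption, not taken from this patch). */
#define TOY_IO_TLB_SHIFT   11  /* 2 KB slabs */
#define TOY_IO_TLB_SEGSIZE 128 /* slabs per segment */
#define TOY_SEG_BYTES (TOY_IO_TLB_SEGSIZE << TOY_IO_TLB_SHIFT) /* 256 KB */

static unsigned long toy_seg_index(unsigned long offset_in_tlb)
{
	return offset_in_tlb / TOY_SEG_BYTES;
}
```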
>  
>  static phys_addr_t xen_bus_to_phys(dma_addr_t baddr)
>  {
> -	return machine_to_phys(XMADDR(baddr)).paddr;
> +	if (xen_feature(XENFEAT_auto_translated_physmap)) {
> +		struct xen_dma_info *dma = xen_get_dma_info(baddr);
> +		if (dma == NULL)
> +			return DMA_ERROR_CODE;
> +		else
> +			return dma->phys_addr + (baddr - dma->dma_addr);
> +	} else
> +		return machine_to_phys(XMADDR(baddr)).paddr;
>  }
>  
>  static dma_addr_t xen_virt_to_bus(void *address)
> @@ -107,6 +206,9 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr)
>  	unsigned long pfn = mfn_to_local_pfn(mfn);
>  	phys_addr_t paddr;
>  
> +	if (xen_feature(XENFEAT_auto_translated_physmap))
> +		return 1;
> +
>  	/* If the address is outside our domain, it CAN
>  	 * have the same virtual address as another address
>  	 * in our domain. Therefore _only_ check address within our domain.
> @@ -124,13 +226,12 @@ static int max_dma_bits = 32;
>  static int
>  xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  {
> -	int i, rc;
> +	int i, j, rc;
>  	int dma_bits;
> -	dma_addr_t dma_handle;
>  
>  	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
>  
> -	i = 0;
> +	i = j = 0;
>  	do {
>  		int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE);
>  
> @@ -138,12 +239,18 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  			rc = xen_create_contiguous_region(
>  				(unsigned long)buf + (i << IO_TLB_SHIFT),
>  				get_order(slabs << IO_TLB_SHIFT),
> -				dma_bits, &dma_handle);
> +				dma_bits, &xen_dma_seg[j].dma_addr);
>  		} while (rc && dma_bits++ < max_dma_bits);
>  		if (rc)
>  			return rc;
>  
> +		xen_dma_seg[j].phys_addr = virt_to_phys(buf + (i << IO_TLB_SHIFT));
> +		xen_dma_seg[j].size = slabs << IO_TLB_SHIFT;
> +		rc = xen_dma_add_entry(&xen_dma_seg[j]);
> +		if (rc != 0)
> +			return rc;
>  		i += slabs;
> +		j++;
>  	} while (i < nslabs);
>  	return 0;
>  }
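[Editorial aside: the retry loop in xen_swiotlb_fixup above — ask for a
contiguous region addressable below 2^dma_bits and widen dma_bits on
failure — can be sketched standalone; toy_* names below are illustrative
stand-ins, and the simulated hypercall's behaviour is an assumption for
the example.]

```c
/* Sketch of the widening retry: request a contiguous region below
 * 2^dma_bits and relax the constraint until the simulated exchange
 * succeeds or max_dma_bits is exhausted. Illustrative names only. */
static int toy_exchange_ok_bits = 40; /* pretend Xen needs >= 40 bits */
static int toy_last_bits;             /* records the final attempt */

static int toy_create_contiguous(int dma_bits)
{
	toy_last_bits = dma_bits;
	return dma_bits >= toy_exchange_ok_bits ? 0 : -1; /* 0 == success */
}

static int toy_fixup_segment(int dma_bits, int max_dma_bits)
{
	int rc;

	/* same shape as the do/while in xen_swiotlb_fixup */
	do {
		rc = toy_create_contiguous(dma_bits);
	} while (rc && dma_bits++ < max_dma_bits);
	return rc;
}
```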
> @@ -193,9 +300,10 @@ retry:
>  	/*
>  	 * Get IO TLB memory from any location.
>  	 */
> -	if (early)
> +	if (early) {
>  		xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes));
> -	else {
> +		xen_dma_seg = alloc_bootmem(sizeof(struct xen_dma_info) * NR_DMA_SEGS);
> +	} else {
>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> @@ -210,6 +318,8 @@ retry:
>  			xen_io_tlb_nslabs = SLABS_PER_PAGE << order;
>  			bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
>  		}
> +		xen_dma_seg = kzalloc(sizeof(struct xen_dma_info) * NR_DMA_SEGS,
> +				GFP_KERNEL);
>  	}
>  	if (!xen_io_tlb_start) {
>  		m_ret = XEN_SWIOTLB_ENOMEM;
> @@ -232,7 +342,6 @@ retry:
>  		m_ret = XEN_SWIOTLB_EFIXUP;
>  		goto error;
>  	}
> -	start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
>  	if (early) {
>  		if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs,
>  			 verbose))
> @@ -290,7 +399,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  
>  	phys = virt_to_phys(ret);
>  	dev_addr = xen_phys_to_bus(phys);
> -	if (((dev_addr + size - 1 <= dma_mask)) &&
> +	if (!xen_feature(XENFEAT_auto_translated_physmap) &&
> +	    ((dev_addr + size - 1 <= dma_mask)) &&
>  	    !range_straddles_page_boundary(phys, size))
>  		*dma_handle = dev_addr;
>  	else {
> @@ -321,8 +431,9 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>  
>  	phys = virt_to_phys(vaddr);
>  
> -	if (((dev_addr + size - 1 > dma_mask)) ||
> -	    range_straddles_page_boundary(phys, size))
> +	if (xen_feature(XENFEAT_auto_translated_physmap) ||
> +		(((dev_addr + size - 1 > dma_mask)) ||
> +		 range_straddles_page_boundary(phys, size)))
>  		xen_destroy_contiguous_region((unsigned long)vaddr, order);
>  
>  	free_pages((unsigned long)vaddr, order);
> @@ -351,14 +462,19 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
>  	 * we can safely return the device addr and not worry about bounce
>  	 * buffering it.
>  	 */
> -	if (dma_capable(dev, dev_addr, size) &&
> +	if (!xen_feature(XENFEAT_auto_translated_physmap) &&
> +	    dma_capable(dev, dev_addr, size) &&
>  	    !range_straddles_page_boundary(phys, size) && !swiotlb_force)
>  		return dev_addr;
>  
>  	/*
>  	 * Oh well, have to allocate and map a bounce buffer.
> +	 * Pass the dma_addr of the first slab in the iotlb buffer as
> +	 * argument so that swiotlb_tbl_map_single is free to allocate
> +	 * the bounce buffer anywhere appropriate in io_tlb_start -
> +	 * io_tlb_end.
>  	 */
> -	map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir);
> +	map = swiotlb_tbl_map_single(dev, xen_dma_seg[0].dma_addr, phys, size, dir);
>  	if (map == SWIOTLB_MAP_ERROR)
>  		return DMA_ERROR_CODE;
>  
> @@ -494,10 +610,17 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl,
>  		dma_addr_t dev_addr = xen_phys_to_bus(paddr);
>  
>  		if (swiotlb_force ||
> +		    xen_feature(XENFEAT_auto_translated_physmap) ||
>  		    !dma_capable(hwdev, dev_addr, sg->length) ||
>  		    range_straddles_page_boundary(paddr, sg->length)) {
> +			/*
> +			 * Pass the dma_addr of the first slab in the iotlb buffer as
> +			 * argument so that swiotlb_tbl_map_single is free to allocate
> +			 * the bounce buffer anywhere appropriate in io_tlb_start -
> +			 * io_tlb_end.
> +			 */
>  			phys_addr_t map = swiotlb_tbl_map_single(hwdev,
> -								 start_dma_addr,
> +								 xen_dma_seg[0].dma_addr,
>  								 sg_phys(sg),
>  								 sg->length,
>  								 dir);
> -- 
> 1.7.2.5
> 


* Re: [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages
  2013-08-29 18:32 ` [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages Stefano Stabellini
@ 2013-08-30 13:53   ` Konrad Rzeszutek Wilk
  2013-09-05 16:50     ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-08-30 13:53 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux-kernel, linux-arm-kernel, Ian.Campbell

On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> free coherent pages.
> 
> We need to be careful handling the pointer returned by
> xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> doesn't return a valid physical address.

Why is it called 'virt_to_phys'? What does it return then?

> 
> Make xen_create_contiguous_region take a phys_addr_t as start parameter to
> avoid the virt_to_phys calls which would be incorrect.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  arch/arm/xen/mm.c         |    7 +++----
>  arch/x86/xen/mmu.c        |    7 +++++--
>  drivers/xen/swiotlb-xen.c |   24 +++++++++++++++---------
>  include/xen/xen-ops.h     |    4 ++--
>  4 files changed, 25 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index 1477ace..9163d4c 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -55,11 +55,10 @@ static int xen_exchange_memory(xen_ulong_t extents_in,
>  	return success;
>  }
>  
> -int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
>  				 unsigned int address_bits,
>  				 dma_addr_t *dma_handle)
>  {
> -	phys_addr_t pstart = __pa(vstart);
>  	xen_pfn_t in_frame, out_frame;
>  	int success;
>  
> @@ -78,9 +77,9 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
>  }
>  EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
>  
> -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
> +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
>  {
> -	xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT;
> +	xen_pfn_t in_frame = pstart >> PAGE_SHIFT;
>  	struct xen_unpin unpin = {
>  		.in = {
>  			.nr_extents   = 1,
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 6c34d7c..07ee0e8 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -2328,13 +2328,14 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in,
>  	return success;
>  }
>  
> -int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
>  				 unsigned int address_bits,
>  				 dma_addr_t *dma_handle)
>  {
>  	unsigned long *in_frames = discontig_frames, out_frame;
>  	unsigned long  flags;
>  	int            success;
> +	unsigned long vstart = (unsigned long) phys_to_virt(pstart);

Extra space.
>  
>  	/*
>  	 * Currently an auto-translated guest will not perform I/O, nor will
> @@ -2374,11 +2375,12 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
>  }
>  EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
>  
> -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
> +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
>  {
>  	unsigned long *out_frames = discontig_frames, in_frame;
>  	unsigned long  flags;
>  	int success;
> +	unsigned long vstart;
>  
>  	if (xen_feature(XENFEAT_auto_translated_physmap))
>  		return;
> @@ -2386,6 +2388,7 @@ void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order)
>  	if (unlikely(order > MAX_CONTIG_ORDER))
>  		return;
>  
> +	vstart = (unsigned long) phys_to_virt(pstart);

Ditto
>  	memset((void *) vstart, 0, PAGE_SIZE << order);
>  
>  	spin_lock_irqsave(&xen_reservation_lock, flags);
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 010da31..c96b928 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -46,6 +46,7 @@
>  #include <xen/hvc-console.h>
>  #include <xen/features.h>
>  #include <asm/dma-mapping.h>
> +#include <asm/xen/page-coherent.h>
>  
>  /*
>   * Used to do a quick range check in swiotlb_tbl_unmap_single and
> @@ -244,6 +245,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  {
>  	int i, j, rc;
>  	int dma_bits;
> +	phys_addr_t p = virt_to_phys(buf);
>  
>  	dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
>  
> @@ -253,7 +255,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  
>  		do {
>  			rc = xen_create_contiguous_region(
> -				(unsigned long)buf + (i << IO_TLB_SHIFT),
> +				p + (i << IO_TLB_SHIFT),
>  				get_order(slabs << IO_TLB_SHIFT),
>  				dma_bits, &xen_dma_seg[j].dma_addr);
>  		} while (rc && dma_bits++ < max_dma_bits);
> @@ -389,7 +391,6 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  	void *ret;
>  	int order = get_order(size);
>  	u64 dma_mask = DMA_BIT_MASK(32);
> -	unsigned long vstart;
>  	phys_addr_t phys;
>  	dma_addr_t dev_addr;
>  
> @@ -404,8 +405,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  	if (dma_alloc_from_coherent(hwdev, size, dma_handle, &ret))
>  		return ret;
>  
> -	vstart = __get_free_pages(flags, order);
> -	ret = (void *)vstart;
> +	ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
>  
>  	if (!ret)
>  		return ret;
> @@ -413,16 +413,20 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>  	if (hwdev && hwdev->coherent_dma_mask)
>  		dma_mask = dma_alloc_coherent_mask(hwdev, flags);
>  
> -	phys = virt_to_phys(ret);
> +	/* At this point dma_handle is the physical address, next we are
> +	 * going to set it to the machine address.
> +	 * Do not use virt_to_phys(ret) because on ARM it doesn't correspond
> +	 * to *dma_handle. */
> +	phys = *dma_handle;
>  	dev_addr = xen_phys_to_bus(phys);
>  	if (!xen_feature(XENFEAT_auto_translated_physmap) &&
>  	    ((dev_addr + size - 1 <= dma_mask)) &&
>  	    !range_straddles_page_boundary(phys, size))
>  		*dma_handle = dev_addr;
>  	else {
> -		if (xen_create_contiguous_region(vstart, order,
> +		if (xen_create_contiguous_region(phys, order,
>  						 fls64(dma_mask), dma_handle) != 0) {
> -			free_pages(vstart, order);
> +			xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
>  			return NULL;
>  		}
>  	}
> @@ -445,14 +449,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>  	if (hwdev && hwdev->coherent_dma_mask)
>  		dma_mask = hwdev->coherent_dma_mask;
>  
> -	phys = virt_to_phys(vaddr);
> +	/* do not use virt_to_phys because on ARM it doesn't return you the
> +	 * physical address */
> +	phys = xen_bus_to_phys(dev_addr);
>  
>  	if (xen_feature(XENFEAT_auto_translated_physmap) ||
>  		(((dev_addr + size - 1 > dma_mask)) ||
>  		 range_straddles_page_boundary(phys, size)))
>  		xen_destroy_contiguous_region((unsigned long)vaddr, order);
>  
> -	free_pages((unsigned long)vaddr, order);
> +	xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
>  }
>  EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent);
>  
> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
> index 9ef704d..fb2ea8f 100644
> --- a/include/xen/xen-ops.h
> +++ b/include/xen/xen-ops.h
> @@ -19,11 +19,11 @@ void xen_arch_resume(void);
>  int xen_setup_shutdown_event(void);
>  
>  extern unsigned long *xen_contiguous_bitmap;
> -int xen_create_contiguous_region(unsigned long vstart, unsigned int order,
> +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
>  				unsigned int address_bits,
>  				dma_addr_t *dma_handle);
>  
> -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order);
> +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
>  
>  struct vm_area_struct;
>  int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
> -- 
> 1.7.2.5
> 


* Re: [Xen-devel] [PATCH v5 02/13] arm: introduce a global dma_ops pointer
  2013-08-29 18:32 ` [PATCH v5 02/13] arm: introduce a global dma_ops pointer Stefano Stabellini
@ 2013-09-02 14:45   ` Julien Grall
  2013-09-05 16:39     ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Julien Grall @ 2013-09-02 14:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux, Ian.Campbell, will.deacon, linux-kernel,
	linux-arm-kernel, ben-linux, kgene.kim, linux-samsung-soc

On 08/29/2013 07:32 PM, Stefano Stabellini wrote:
> Initially set dma_ops to arm_dma_ops.
> 
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> CC: will.deacon@arm.com
> CC: linux@arm.linux.org.uk
> 
> 
> Changes in v3:
> -  keep using arm_dma_ops in dmabounce.
> ---
>  arch/arm/include/asm/dma-mapping.h |    3 ++-
>  arch/arm/mm/dma-mapping.c          |    3 +++
>  2 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 0982206..7d6e4f9 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -13,6 +13,7 @@
>  #include <asm/cacheflush.h>
>  
>  #define DMA_ERROR_CODE	(~0)
> +extern struct dma_map_ops *dma_ops;

Hi,

I tried to build your swiotlb patch series for the Arndale. I get a compilation
error because dma_ops is already used in the Samsung sound driver (sound/soc/samsung/dma.c).

This small fix allows me to build this series for the Arndale.
Do I need to send it separately?

=======================================================================
commit 73d4ceded87f52fa958b92d8d8d65be485e90857
Author: Julien Grall <julien.grall@linaro.org>
Date:   Mon Sep 2 15:36:35 2013 +0100

    ASoC: Samsung: Rename dma_ops by samsung_dma_ops
    
    The commit "arm: introduce a global dma_ops pointer" introduces a compilation issue
    when CONFIG_SND_SOC_SAMSUNG is enabled.
    
    sound/soc/samsung/dma.c:345:27: error: conflicting types for 'dma_ops'
    /local/home/julien/works/arndale/linux/arch/arm/include/asm/dma-mapping.h:16:28:
    note: previous declaration of 'dma_ops' was here
    
    Signed-off-by: Julien Grall <julien.grall@linaro.org>

diff --git a/sound/soc/samsung/dma.c b/sound/soc/samsung/dma.c
index ddea134..c341603 100644
--- a/sound/soc/samsung/dma.c
+++ b/sound/soc/samsung/dma.c
@@ -342,7 +342,7 @@ static int dma_mmap(struct snd_pcm_substream *substream,
                                     runtime->dma_bytes);
 }
 
-static struct snd_pcm_ops dma_ops = {
+static struct snd_pcm_ops samsung_dma_ops = {
        .open           = dma_open,
        .close          = dma_close,
        .ioctl          = snd_pcm_lib_ioctl,
@@ -429,7 +429,7 @@ out:
 }
 
 static struct snd_soc_platform_driver samsung_asoc_platform = {
-       .ops            = &dma_ops,
+       .ops            = &samsung_dma_ops,
        .pcm_new        = dma_new,
        .pcm_free       = dma_free_dma_buffers,
 };

-- 
Julien Grall


* Re: [PATCH v5 03/13] arm64: define DMA_ERROR_CODE
  2013-08-29 18:32 ` [PATCH v5 03/13] arm64: define DMA_ERROR_CODE Stefano Stabellini
@ 2013-09-05 16:05   ` Catalin Marinas
  2013-09-05 16:54     ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-05 16:05 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux-kernel, linux-arm-kernel, konrad.wilk,
	Ian.Campbell, Will Deacon

On Thu, Aug 29, 2013 at 07:32:24PM +0100, Stefano Stabellini wrote:
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> CC: catalin.marinas@arm.com
> CC: will.deacon@arm.com
> ---
>  arch/arm64/include/asm/dma-mapping.h |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> index 8d18100..c2cb8a0 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -25,6 +25,7 @@
>  
>  #define ARCH_HAS_DMA_GET_REQUIRED_MASK
>  
> +#define DMA_ERROR_CODE	(~0)

I wonder why this isn't more generic. It looks like most (all?)
architectures define it in the same way.

-- 
Catalin


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-08-29 18:32 ` [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages Stefano Stabellini
@ 2013-09-05 16:09   ` Catalin Marinas
  2013-09-05 16:43     ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-05 16:09 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for the
> cpu and devices. On native x86 and ARMv8 it is sufficient to call
> __get_free_pages in order to get a coherent buffer, while on ARM we need
> to call arm_dma_ops.alloc.

Don't bet on this for ARMv8. It's not mandated for the architecture, so
at some point some SoC will require non-cacheable buffers for coherency.

-- 
Catalin


* Re: [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-08-29 18:32 ` [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier Stefano Stabellini
@ 2013-09-05 16:20   ` Catalin Marinas
  2013-09-05 16:59     ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-05 16:20 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> Move Xen initialization earlier, before any DMA requests can be made.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

I guess you should cc the corresponding maintainers here.

>  arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
>  arch/arm/kernel/setup.c               |    2 ++
>  arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
>  arch/arm64/kernel/setup.c             |    2 ++
>  4 files changed, 26 insertions(+), 7 deletions(-)

[...]

> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -53,6 +53,7 @@
>  #include <asm/traps.h>
>  #include <asm/memblock.h>
>  #include <asm/psci.h>
> +#include <asm/xen/hypervisor.h>
>  
>  unsigned int processor_id;
>  EXPORT_SYMBOL(processor_id);
> @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
>  	unflatten_device_tree();
>  
>  	psci_init();
> +	xen_early_init();

So Xen guests don't have any hope for single Image? Basically you set
dma_ops unconditionally in xen_early_init(), even if the kernel is not
intended to run under Xen.

-- 
Catalin


* Re: [Xen-devel] [PATCH v5 02/13] arm: introduce a global dma_ops pointer
  2013-09-02 14:45   ` [Xen-devel] " Julien Grall
@ 2013-09-05 16:39     ` Stefano Stabellini
  0 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-05 16:39 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, linux, Ian.Campbell, will.deacon,
	linux-kernel, linux-arm-kernel, ben-linux, kgene.kim,
	linux-samsung-soc

On Mon, 2 Sep 2013, Julien Grall wrote:
> On 08/29/2013 07:32 PM, Stefano Stabellini wrote:
> > Initially set dma_ops to arm_dma_ops.
> > 
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > CC: will.deacon@arm.com
> > CC: linux@arm.linux.org.uk
> > 
> > 
> > Changes in v3:
> > -  keep using arm_dma_ops in dmabounce.
> > ---
> >  arch/arm/include/asm/dma-mapping.h |    3 ++-
> >  arch/arm/mm/dma-mapping.c          |    3 +++
> >  2 files changed, 5 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index 0982206..7d6e4f9 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -13,6 +13,7 @@
> >  #include <asm/cacheflush.h>
> >  
> >  #define DMA_ERROR_CODE	(~0)
> > +extern struct dma_map_ops *dma_ops;
> 
> Hi,
> 
> I tried to build your swiotlb patch series for the Arndale. I have a compilation
> error because dma_ops is already used in samsung sound driver (sound/soc/samsung/dma.c).
> 
> This small fix allows me to build this series for the Arndale.
> Do I need to send it separately?

This looks like a good fix, I'll append it to the series.


> =======================================================================
> commit 73d4ceded87f52fa958b92d8d8d65be485e90857
> Author: Julien Grall <julien.grall@linaro.org>
> Date:   Mon Sep 2 15:36:35 2013 +0100
> 
>     ASoC: Samsung: Rename dma_ops by samsung_dma_ops
>     
>     The commit "arm: introduce a global dma_ops pointer" introduces a compilation issue
>     when CONFIG_SND_SOC_SAMSUNG is enabled.
>     
>     sound/soc/samsung/dma.c:345:27: error: conflicting types for 'dma_ops'
>     /local/home/julien/works/arndale/linux/arch/arm/include/asm/dma-mapping.h:16:28:
>     note: previous declaration of 'dma_ops' was here
>     
>     Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> diff --git a/sound/soc/samsung/dma.c b/sound/soc/samsung/dma.c
> index ddea134..c341603 100644
> --- a/sound/soc/samsung/dma.c
> +++ b/sound/soc/samsung/dma.c
> @@ -342,7 +342,7 @@ static int dma_mmap(struct snd_pcm_substream *substream,
>                                      runtime->dma_bytes);
>  }
>  
> -static struct snd_pcm_ops dma_ops = {
> +static struct snd_pcm_ops samsung_dma_ops = {
>         .open           = dma_open,
>         .close          = dma_close,
>         .ioctl          = snd_pcm_lib_ioctl,
> @@ -429,7 +429,7 @@ out:
>  }
>  
>  static struct snd_soc_platform_driver samsung_asoc_platform = {
> -       .ops            = &dma_ops,
> +       .ops            = &samsung_dma_ops,
>         .pcm_new        = dma_new,
>         .pcm_free       = dma_free_dma_buffers,
>  };
> 
> -- 
> Julien Grall
> 


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-05 16:09   ` Catalin Marinas
@ 2013-09-05 16:43     ` Stefano Stabellini
  2013-09-06 14:14       ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-05 16:43 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > and devices. On native x86 and ARMv8 is sufficient to call
> > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > to call arm_dma_ops.alloc.
> 
> Don't bet on this for ARMv8. It's not mandated for the architecture, so
> at some point some SoC will require non-cacheable buffers for coherency.

I see.
Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
calling arm64_swiotlb_dma_ops.alloc?


* Re: [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages
  2013-08-30 13:53   ` Konrad Rzeszutek Wilk
@ 2013-09-05 16:50     ` Stefano Stabellini
  2013-09-06 14:17       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-05 16:50 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, xen-devel, linux-kernel, linux-arm-kernel,
	Ian.Campbell

On Fri, 30 Aug 2013, Konrad Rzeszutek Wilk wrote:
> On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> > Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> > free coherent pages.
> > 
> > We need to be careful handling the pointer returned by
> > xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> > phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> > doesn't return a valid physical address.
> 
> Why is it called 'virt_to_phys'? What does it return then?

virt_to_phys only works for kernel direct mapped RAM memory.
In this case the virtual address could be an ioremap address, therefore
passing it to virt_to_phys would give you another physical address that
doesn't correspond to it.


* Re: [PATCH v5 03/13] arm64: define DMA_ERROR_CODE
  2013-09-05 16:05   ` Catalin Marinas
@ 2013-09-05 16:54     ` Stefano Stabellini
  0 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-05 16:54 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, linux-kernel, linux-arm-kernel,
	konrad.wilk, Ian.Campbell, Will Deacon

On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:24PM +0100, Stefano Stabellini wrote:
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > CC: catalin.marinas@arm.com
> > CC: will.deacon@arm.com
> > ---
> >  arch/arm64/include/asm/dma-mapping.h |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> > index 8d18100..c2cb8a0 100644
> > --- a/arch/arm64/include/asm/dma-mapping.h
> > +++ b/arch/arm64/include/asm/dma-mapping.h
> > @@ -25,6 +25,7 @@
> >  
> >  #define ARCH_HAS_DMA_GET_REQUIRED_MASK
> >  
> > +#define DMA_ERROR_CODE	(~0)
> 
> I wonder why this isn't more generic. It looks like most (all?)
> architectures define it in the same way.

Actually it's defined as 0 on x86 and ia64.
All the others define it as ~0.


* Re: [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-09-05 16:20   ` Catalin Marinas
@ 2013-09-05 16:59     ` Stefano Stabellini
  2013-09-06  8:58       ` Ian Campbell
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-05 16:59 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Thu, 5 Sep 2013, Catalin Marinas wrote:
> On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > Move Xen initialization earlier, before any DMA requests can be made.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> I guess you should cc the corresponding maintainers here.

Thanks for the reminder, I'll do that.


> >  arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
> >  arch/arm/kernel/setup.c               |    2 ++
> >  arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
> >  arch/arm64/kernel/setup.c             |    2 ++
> >  4 files changed, 26 insertions(+), 7 deletions(-)
> 
> [...]
> 
> > --- a/arch/arm64/kernel/setup.c
> > +++ b/arch/arm64/kernel/setup.c
> > @@ -53,6 +53,7 @@
> >  #include <asm/traps.h>
> >  #include <asm/memblock.h>
> >  #include <asm/psci.h>
> > +#include <asm/xen/hypervisor.h>
> >  
> >  unsigned int processor_id;
> >  EXPORT_SYMBOL(processor_id);
> > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> >  	unflatten_device_tree();
> >  
> >  	psci_init();
> > +	xen_early_init();
> 
> So Xen guests don't have any hope for single Image? Basically you set
> dma_ops unconditionally in xen_early_init(), even if the kernel is not
> intended to run under Xen.

That should not happen: if we are not running on Xen xen_early_init
returns early, before calling xen_mm_init.


* Re: [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-09-05 16:59     ` Stefano Stabellini
@ 2013-09-06  8:58       ` Ian Campbell
  2013-09-06 14:09         ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Ian Campbell @ 2013-09-06  8:58 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Catalin Marinas, xen-devel, linux-kernel, linux-arm-kernel, konrad.wilk

On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > Move Xen initialization earlier, before any DMA requests can be made.
> > > 
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > 
> > I guess you should cc the corresponding maintainers here.
> 
> Thanks for the reminder, I'll do that.
> 
> 
> > >  arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
> > >  arch/arm/kernel/setup.c               |    2 ++
> > >  arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
> > >  arch/arm64/kernel/setup.c             |    2 ++
> > >  4 files changed, 26 insertions(+), 7 deletions(-)
> > 
> > [...]
> > 
> > > --- a/arch/arm64/kernel/setup.c
> > > +++ b/arch/arm64/kernel/setup.c
> > > @@ -53,6 +53,7 @@
> > >  #include <asm/traps.h>
> > >  #include <asm/memblock.h>
> > >  #include <asm/psci.h>
> > > +#include <asm/xen/hypervisor.h>
> > >  
> > >  unsigned int processor_id;
> > >  EXPORT_SYMBOL(processor_id);
> > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > >  	unflatten_device_tree();
> > >  
> > >  	psci_init();
> > > +	xen_early_init();
> > 
> > So Xen guests don't have any hope for single Image? Basically you set
> > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > intended to run under Xen.
> 
> That should not happen: if we are not running on Xen xen_early_init
> returns early, before calling xen_mm_init.

x86 has a call to init_hypervisor_platform() at approximately this
location, which detects and calls the init function for any of Xen, KVM,
hyperv and vmware. I guess only Xen and KVM are currently relevant on
Linux ARM(64), so perhaps adding similar infrastructure on ARM would be
overkill at this point. I don't know if KVM needs such an early C-land
hook, I suppose it needs it even earlier so it can set up the hyp mode
trampoline from head.S?

Ian.



* Re: [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-09-06  8:58       ` Ian Campbell
@ 2013-09-06 14:09         ` Catalin Marinas
  2013-09-06 14:23           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-06 14:09 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, xen-devel, linux-kernel, linux-arm-kernel,
	konrad.wilk

On Fri, Sep 06, 2013 at 09:58:59AM +0100, Ian Campbell wrote:
> On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > > Move Xen initialization earlier, before any DMA requests can be made.
> > > > 
> > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > 
> > > I guess you should cc the corresponding maintainers here.
> > 
> > Thanks for the reminder, I'll do that.
> > 
> > 
> > > >  arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
> > > >  arch/arm/kernel/setup.c               |    2 ++
> > > >  arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
> > > >  arch/arm64/kernel/setup.c             |    2 ++
> > > >  4 files changed, 26 insertions(+), 7 deletions(-)
> > > 
> > > [...]
> > > 
> > > > --- a/arch/arm64/kernel/setup.c
> > > > +++ b/arch/arm64/kernel/setup.c
> > > > @@ -53,6 +53,7 @@
> > > >  #include <asm/traps.h>
> > > >  #include <asm/memblock.h>
> > > >  #include <asm/psci.h>
> > > > +#include <asm/xen/hypervisor.h>
> > > >  
> > > >  unsigned int processor_id;
> > > >  EXPORT_SYMBOL(processor_id);
> > > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > >  	unflatten_device_tree();
> > > >  
> > > >  	psci_init();
> > > > +	xen_early_init();
> > > 
> > > So Xen guests don't have any hope for single Image? Basically you set
> > > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > > intended to run under Xen.
> > 
> > That should not happen: if we are not running on Xen xen_early_init
> > returns early, before calling xen_mm_init.
> 
> x86 has a call to init_hypervisor_platform() at approximately this
> location, which detects and calls the init function for any of Xen, KVM,
> hyperv and vmware.

I would rather have a core_initcall(xen_early_init()) if possible,
rather than hard-coded calls in setup_arch(). This early stuff is
DT-driven, so in theory you don't need a specific xen call. The only
thing is that you end up with swiotlb_init() and 64MB wasted if the Xen
guest does not plan to use them.

> I guess only Xen and KVM are currently relevant on Linux ARM(64), so
> perhaps adding similar infrastructure on ARM would be overkill at this
> point. I don't know if KVM needs such an early C-land hook, I suppose
> it needs it even earlier so it can set up the hyp mode trampoline from
> head.S?

head.S installs a Hyp stub if it starts in that mode and then forgets
about it. Later, when KVM is initialised, it installs its own code by doing
an HVC call.

-- 
Catalin


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-05 16:43     ` Stefano Stabellini
@ 2013-09-06 14:14       ` Catalin Marinas
  2013-09-06 14:59         ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-06 14:14 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > and devices. On native x86 and ARMv8 it is sufficient to call
> > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > to call arm_dma_ops.alloc.
> > 
> > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > at some point some SoC will require non-cacheable buffers for coherency.
> 
> I see.
> Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> calling arm64_swiotlb_dma_ops.alloc?

What does this buffer do exactly? Is it allocated by guests?

Currently arm64_swiotlb_dma_ops assume cache-coherent DMA. I have a
patch which introduces new ops for non-coherent DMA but this should
really be orthogonal to swiotlb. You can basically have 4 combinations
of coherent/non-coherent and swiotlb/iommu. Mark Rutland is currently
looking into how best to describe this via DT as it may not even be per
SoC but per bus or device.

-- 
Catalin


* Re: [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages
  2013-09-05 16:50     ` Stefano Stabellini
@ 2013-09-06 14:17       ` Konrad Rzeszutek Wilk
  2013-09-06 15:04         ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-09-06 14:17 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux-kernel, linux-arm-kernel, Ian.Campbell

On Thu, Sep 05, 2013 at 05:50:45PM +0100, Stefano Stabellini wrote:
> On Fri, 30 Aug 2013, Konrad Rzeszutek Wilk wrote:
> > On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> > > Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> > > free coherent pages.
> > > 
> > > We need to be careful handling the pointer returned by
> > > xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> > > phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> > > doesn't return a valid physical address.
> > 
> > Why is it called 'virt_to_phys'? What does it return then?
> 
> virt_to_phys only works for kernel direct mapped RAM memory.
> In this case the virtual address could be an ioremap address, therefore
> passing it to virt_to_phys would give you another physical address that
> doesn't correspond to it.

Ah, so very much ARM (and in some way SPARC) specific. I think you need to add some comments
in the code (and the git commit) to explain this. Thank you.


* Re: [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-09-06 14:09         ` Catalin Marinas
@ 2013-09-06 14:23           ` Konrad Rzeszutek Wilk
  2013-09-06 14:53             ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-09-06 14:23 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel,
	linux-arm-kernel

On Fri, Sep 06, 2013 at 03:09:21PM +0100, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 09:58:59AM +0100, Ian Campbell wrote:
> > On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > > > Move Xen initialization earlier, before any DMA requests can be made.
> > > > > 
> > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > > 
> > > > I guess you should cc the corresponding maintainers here.
> > > 
> > > Thanks for the reminder, I'll do that.
> > > 
> > > 
> > > > >  arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
> > > > >  arch/arm/kernel/setup.c               |    2 ++
> > > > >  arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
> > > > >  arch/arm64/kernel/setup.c             |    2 ++
> > > > >  4 files changed, 26 insertions(+), 7 deletions(-)
> > > > 
> > > > [...]
> > > > 
> > > > > --- a/arch/arm64/kernel/setup.c
> > > > > +++ b/arch/arm64/kernel/setup.c
> > > > > @@ -53,6 +53,7 @@
> > > > >  #include <asm/traps.h>
> > > > >  #include <asm/memblock.h>
> > > > >  #include <asm/psci.h>
> > > > > +#include <asm/xen/hypervisor.h>
> > > > >  
> > > > >  unsigned int processor_id;
> > > > >  EXPORT_SYMBOL(processor_id);
> > > > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > > >  	unflatten_device_tree();
> > > > >  
> > > > >  	psci_init();
> > > > > +	xen_early_init();
> > > > 
> > > > So Xen guests don't have any hope for single Image? Basically you set
> > > > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > > > intended to run under Xen.
> > > 
> > > That should not happen: if we are not running on Xen xen_early_init
> > > returns early, before calling xen_mm_init.
> > 
> > x86 has a call to init_hypervisor_platform() at approximately this
> > location, which detects and calls the init function for any of Xen, KVM,
> > hyperv and vmware.
> 
> I would rather have a core_initcall(xen_early_init()) if possible,
> rather than hard-coded calls in setup_arch(). This early stuff is
> DT-driven, so in theory you don't need a specific xen call. The only
> thing is that you end up with swiotlb_init() and 64MB wasted if the Xen
> guest does not plan to use them.

There is a swiotlb_free mechanism in case the allocation was not
necessary.
> 
> > I guess only Xen and KVM are currently relevant on Linux ARM(64), so
> > perhaps adding similar infrastructure on ARM would be overkill at this
> > point. I don't know if KVM needs such an early C-land hook, I suppose
> > it needs it even earlier so it can set up the hyp mode trampoline from
> > head.S?
> 
> head.S installs a Hyp stub if it starts in that mode and then forgets
> about it. Later, when KVM is initialised, it installs its own code by doing
> an HVC call.
> 
> -- 
> Catalin


* Re: [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier
  2013-09-06 14:23           ` Konrad Rzeszutek Wilk
@ 2013-09-06 14:53             ` Stefano Stabellini
  0 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-06 14:53 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Catalin Marinas, Ian Campbell, Stefano Stabellini, xen-devel,
	linux-kernel, linux-arm-kernel

On Fri, 6 Sep 2013, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 06, 2013 at 03:09:21PM +0100, Catalin Marinas wrote:
> > On Fri, Sep 06, 2013 at 09:58:59AM +0100, Ian Campbell wrote:
> > > On Thu, 2013-09-05 at 17:59 +0100, Stefano Stabellini wrote:
> > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Aug 29, 2013 at 07:32:26PM +0100, Stefano Stabellini wrote:
> > > > > > Move Xen initialization earlier, before any DMA requests can be made.
> > > > > > 
> > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > > > 
> > > > > I guess you should cc the corresponding maintainers here.
> > > > 
> > > > Thanks for the reminder, I'll do that.
> > > > 
> > > > 
> > > > > >  arch/arm/include/asm/xen/hypervisor.h |    8 ++++++++
> > > > > >  arch/arm/kernel/setup.c               |    2 ++
> > > > > >  arch/arm/xen/enlighten.c              |   21 ++++++++++++++-------
> > > > > >  arch/arm64/kernel/setup.c             |    2 ++
> > > > > >  4 files changed, 26 insertions(+), 7 deletions(-)
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > --- a/arch/arm64/kernel/setup.c
> > > > > > +++ b/arch/arm64/kernel/setup.c
> > > > > > @@ -53,6 +53,7 @@
> > > > > >  #include <asm/traps.h>
> > > > > >  #include <asm/memblock.h>
> > > > > >  #include <asm/psci.h>
> > > > > > +#include <asm/xen/hypervisor.h>
> > > > > >  
> > > > > >  unsigned int processor_id;
> > > > > >  EXPORT_SYMBOL(processor_id);
> > > > > > @@ -267,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> > > > > >  	unflatten_device_tree();
> > > > > >  
> > > > > >  	psci_init();
> > > > > > +	xen_early_init();
> > > > > 
> > > > > So Xen guests don't have any hope for single Image? Basically you set
> > > > > dma_ops unconditionally in xen_early_init(), even if the kernel is not
> > > > > intended to run under Xen.
> > > > 
> > > > That should not happen: if we are not running on Xen xen_early_init
> > > > returns early, before calling xen_mm_init.
> > > 
> > > x86 has a call to init_hypervisor_platform() at approximately this
> > > location, which detects and calls the init function for any of Xen, KVM,
> > > hyperv and vmware.
> > 
> > I would rather have a core_initcall(xen_early_init()) if possible,
> > rather than hard-coded calls in setup_arch(). This early stuff is
> > DT-driven, so in theory you don't need a specific xen call. The only
> > thing is that you end up with swiotlb_init() and 64MB wasted if the Xen
> > guest does not plan to use them.
> 
> There is a swiotlb_free mechanism in case the allocation was not
> necessary.

I think I'll just use the late swiotlb initialization.


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-06 14:14       ` Catalin Marinas
@ 2013-09-06 14:59         ` Stefano Stabellini
  2013-09-06 15:59           ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-06 14:59 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > and devices. On native x86 and ARMv8 it is sufficient to call
> > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > to call arm_dma_ops.alloc.
> > > 
> > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > at some point some SoC will require non-cacheable buffers for coherency.
> > 
> > I see.
> > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > calling arm64_swiotlb_dma_ops.alloc?
> 
> What does this buffer do exactly? Is it allocated by guests?

It is allocated by Dom0 to do DMA to/from a device.
It is the buffer that is going to be returned by dma_map_ops.alloc to
the caller:

On x86:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages

On ARM:
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc

On ARM64
dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????



> Currently arm64_swiotlb_dma_ops assume cache-coherent DMA. I have a
> patch which introduces new ops for non-coherent DMA but this should
> really be orthogonal to swiotlb. You can basically have 4 combinations
> of coherent/non-coherent and swiotlb/iommu. Mark Rutland is currently
> looking into how best to describe this via DT as it may not even be per
> SoC but per bus or device.

It seems to me that calling arm64_swiotlb_dma_ops.alloc would ensure
that we allocate the right buffer for the right device?


* Re: [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages
  2013-09-06 14:17       ` Konrad Rzeszutek Wilk
@ 2013-09-06 15:04         ` Stefano Stabellini
  0 siblings, 0 replies; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-06 15:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, xen-devel, linux-kernel, linux-arm-kernel,
	Ian.Campbell

On Fri, 6 Sep 2013, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 05, 2013 at 05:50:45PM +0100, Stefano Stabellini wrote:
> > On Fri, 30 Aug 2013, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Aug 29, 2013 at 07:32:33PM +0100, Stefano Stabellini wrote:
> > > > Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or
> > > > free coherent pages.
> > > > 
> > > > We need to be careful handling the pointer returned by
> > > > xen_alloc_coherent_pages, because on ARM the pointer is not equal to
> > > > phys_to_virt(*dma_handle). In fact virt_to_phys on the returned pointer
> > > > doesn't return a valid physical address.
> > > 
> > > Why is it called 'virt_to_phys'? What does it return then?
> > 
> > virt_to_phys only works for kernel direct mapped RAM memory.
> > In this case the virtual address could be an ioremap address, therefore
> > passing it to virt_to_phys would give you another physical address that
> > doesn't correspond to it.
> 
> Ah, so very much ARM (and in some way SPARC) specific. I think you need to add some comments
> in the code (and the git commit) to explain this. Thank you.

OK


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-06 14:59         ` Stefano Stabellini
@ 2013-09-06 15:59           ` Catalin Marinas
  2013-09-06 16:09             ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-06 15:59 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > and devices. On native x86 and ARMv8 it is sufficient to call
> > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > to call arm_dma_ops.alloc.
> > > > 
> > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > 
> > > I see.
> > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > calling arm64_swiotlb_dma_ops.alloc?
> > 
> > What does this buffer do exactly? Is it allocated by guests?
> 
> It is allocated by Dom0 to do DMA to/from a device.
> It is the buffer that is going to be returned by dma_map_ops.alloc to
> the caller:
> 
> On x86:
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> 
> On ARM:
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> 
> On ARM64
> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????

OK, I'm getting more confused. Do all the above calls happen in the
guest, Dom0, or a mix?

-- 
Catalin


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-06 15:59           ` Catalin Marinas
@ 2013-09-06 16:09             ` Stefano Stabellini
  2013-09-06 16:20               ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-06 16:09 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > and devices. On native x86 and ARMv8 it is sufficient to call
> > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > to call arm_dma_ops.alloc.
> > > > > 
> > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > 
> > > > I see.
> > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > calling arm64_swiotlb_dma_ops.alloc?
> > > 
> > > What does this buffer do exactly? Is it allocated by guests?
> > 
> > It is allocated by Dom0 to do DMA to/from a device.
> > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > the caller:
> > 
> > On x86:
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > 
> > On ARM:
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > 
> > On ARM64
> > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> 
> OK, I'm getting more confused. Do all the above calls happen in the
> guest, Dom0, or a mix?

I guess the confusion comes from a difference in terminology: dom0 is a
guest like the others, just a bit more privileged. We usually call domU
a normal unprivileged guest.

The above calls would happen in Dom0 (when an SMMU is not available).
They could also happen in a DomU if we assign a physical device to it
(and an SMMU is not available).

So in general they would happen in any guest that needs to program a
real device.


* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-06 16:09             ` Stefano Stabellini
@ 2013-09-06 16:20               ` Catalin Marinas
  2013-09-06 16:52                 ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-06 16:20 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > > and devices. On native x86 and ARMv8 it is sufficient to call
> > > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > > to call arm_dma_ops.alloc.
> > > > > > 
> > > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > > 
> > > > > I see.
> > > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > > calling arm64_swiotlb_dma_ops.alloc?
> > > > 
> > > > What does this buffer do exactly? Is it allocated by guests?
> > > 
> > > It is allocated by Dom0 to do DMA to/from a device.
> > > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > > the caller:
> > > 
> > > On x86:
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > > 
> > > On ARM:
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > > 
> > > On ARM64
> > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> > 
> > OK, I'm getting more confused. Do all the above calls happen in the
> > guest, Dom0, or a mix?
> 
> I guess the confusion comes from a difference in terminology: dom0 is a
> guest like the others, just a bit more privileged. We usually call domU
> a normal unprivileged guest.

Thanks for the explanation.

> The above calls would happen in Dom0 (when an SMMU is not available).

So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
not always use the default dma_alloc_coherent()?

> They could also happen in a DomU if we assign a physical device to it
> (and an SMMU is not available).

The problem is that you don't necessarily know what kind of coherency you
need for a physical device. As I said, we plan to do this DT-driven.

-- 
Catalin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-06 16:20               ` Catalin Marinas
@ 2013-09-06 16:52                 ` Stefano Stabellini
  2013-09-09 15:51                   ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-06 16:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Fri, 6 Sep 2013, Catalin Marinas wrote:
> On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
> > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
> > > > On Fri, 6 Sep 2013, Catalin Marinas wrote:
> > > > > On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
> > > > > > On Thu, 5 Sep 2013, Catalin Marinas wrote:
> > > > > > > On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
> > > > > > > > xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
> > > > > > > > > and devices. On native x86 and ARMv8 it is sufficient to call
> > > > > > > > __get_free_pages in order to get a coherent buffer, while on ARM we need
> > > > > > > > to call arm_dma_ops.alloc.
> > > > > > > 
> > > > > > > Don't bet on this for ARMv8. It's not mandated for the architecture, so
> > > > > > > at some point some SoC will require non-cacheable buffers for coherency.
> > > > > > 
> > > > > > I see.
> > > > > > Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
> > > > > > calling arm64_swiotlb_dma_ops.alloc?
> > > > > 
> > > > > What does this buffer do exactly? Is it allocated by guests?
> > > > 
> > > > It is allocated by Dom0 to do DMA to/from a device.
> > > > It is the buffer that is going to be returned by dma_map_ops.alloc to
> > > > the caller:
> > > > 
> > > > On x86:
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
> > > > 
> > > > On ARM:
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
> > > > 
> > > > On ARM64
> > > > dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
> > > 
> > > OK, I'm getting more confused. Do all the above calls happen in the
> > > guest, Dom0, or a mix?
> > 
> > I guess the confusion comes from a difference in terminology: dom0 is a
> > guest like the others, just a bit more privileged. We usually call domU
> > a normal unprivileged guest.
> 
> Thanks for the explanation.
> 
> > The above calls would happen in Dom0 (when an SMMU is not available).
> 
> So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
> and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
> not always use the default dma_alloc_coherent()?

Keep in mind that dom0 runs with second stage translation enabled.  This
means that what dom0 thinks is a physical address (a machine address in
Xen terminology) is actually just an intermediate physical address.
For the same reason, what dom0 thinks is a contiguous buffer is
actually only contiguous in the intermediate physical address space.

So every time dom0 wants to allocate a dma-capable buffer it needs to go
via swiotlb-xen, which makes the buffer contiguous in the physical address
space (machine address space in Xen terminology) by issuing a hypercall.
swiotlb-xen also returns the physical address (machine address in Xen
terminology) to the caller.

To answer your question: in the absence of an SMMU, all the
dma_alloc_coherent calls in dom0 need to go via xen_swiotlb_alloc_coherent.

xen_swiotlb_alloc_coherent cannot allocate a contiguous buffer in physical
address space (see above), but it has to allocate a buffer that is
coherent from the point of view of caching attributes. The hypervisor is
going to take care of making the allocated buffer really contiguous in
physical address space.
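The two-step flow just described (allocate a buffer that is coherent with respect to caching attributes, then have the hypervisor make it machine-contiguous and hand back the machine address) can be modeled as a small sketch. This is userspace C with a fake hypercall standing in for the real XENMEM_exchange_and_pin path, not the actual swiotlb-xen code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t dma_addr_t;

/* Fake hypercall: pretend the hypervisor exchanged the backing pages for
 * machine-contiguous ones and returned the machine address.  The XOR is an
 * arbitrary stand-in for the guest-physical to machine translation. */
static dma_addr_t fake_exchange_and_pin(void *vaddr)
{
	return (dma_addr_t)((uintptr_t)vaddr ^ 0x40000000u);
}

/* Sketch of the dom0 flow: allocate a cache-coherent buffer (calloc stands
 * in for the per-architecture coherent allocator), then ask the hypervisor
 * to make it contiguous in machine address space. */
static void *xen_swiotlb_alloc_coherent(size_t size, dma_addr_t *dma_handle)
{
	void *vaddr = calloc(1, size);
	if (!vaddr)
		return NULL;
	*dma_handle = fake_exchange_and_pin(vaddr);
	return vaddr;
}
```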

So now the problem is: how is xen_swiotlb_alloc_coherent going to
allocate a coherent buffer?
On x86 I can just call __get_free_pages.
On ARM I have to call arm_dma_ops.alloc.
On ARM64 ???

BTW if the Matrix is your kind of fun, I wrote a blog post explaining
swiotlb, Morpheus style:
http://blog.xen.org/index.php/2013/08/14/swiotlb-by-morpheus/


> > They could also happen in a DomU if we assign a physical device to it
> > (and an SMMU is not available).
> 
> The problem is that you don't necessarily know what kind of coherency you
> need for a physical device. As I said, we plan to do this DT-driven.
 
OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
arguments to it, I should be able to get the right coherency for the
right device, correct?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-06 16:52                 ` Stefano Stabellini
@ 2013-09-09 15:51                   ` Catalin Marinas
  2013-09-09 16:46                     ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-09 15:51 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On 6 Sep 2013, at 17:52, Stefano Stabellini <Stefano.Stabellini@eu.citrix.com> wrote:
> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>> On Fri, Sep 06, 2013 at 05:09:52PM +0100, Stefano Stabellini wrote:
>>> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>>>> On Fri, Sep 06, 2013 at 03:59:02PM +0100, Stefano Stabellini wrote:
>>>>> On Fri, 6 Sep 2013, Catalin Marinas wrote:
>>>>>> On Thu, Sep 05, 2013 at 05:43:33PM +0100, Stefano Stabellini wrote:
>>>>>>> On Thu, 5 Sep 2013, Catalin Marinas wrote:
>>>>>>>> On Thu, Aug 29, 2013 at 07:32:32PM +0100, Stefano Stabellini wrote:
>>>>>>>>> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu
>>>>>>>>> and devices. On native x86 and ARMv8 it is sufficient to call
>>>>>>>>> __get_free_pages in order to get a coherent buffer, while on ARM we need
>>>>>>>>> to call arm_dma_ops.alloc.
>>>>>>>> 
>>>>>>>> Don't bet on this for ARMv8. It's not mandated for the architecture, so
>>>>>>>> at some point some SoC will require non-cacheable buffers for coherency.
>>>>>>> 
>>>>>>> I see.
>>>>>>> Would it be better if I implemented xen_alloc_coherent_pages on armv8 by
>>>>>>> calling arm64_swiotlb_dma_ops.alloc?
>>>>>> 
>>>>>> What does this buffer do exactly? Is it allocated by guests?
>>>>> 
>>>>> It is allocated by Dom0 to do DMA to/from a device.
>>>>> It is the buffer that is going to be returned by dma_map_ops.alloc to
>>>>> the caller:
>>>>> 
>>>>> On x86:
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> __get_free_pages
>>>>> 
>>>>> On ARM:
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> arm_dma_ops.alloc
>>>>> 
>>>>> On ARM64
>>>>> dma_map_ops.alloc -> xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages -> ????
>>>> 
>>>> OK, I'm getting more confused. Do all the above calls happen in the
>>>> guest, Dom0, or a mix?
>>> 
>>> I guess the confusion comes from a difference in terminology: dom0 is a
>>> guest like the others, just a bit more privileged. We usually call domU
>>> a normal unprivileged guest.
>> 
>> Thanks for the explanation.
>> 
>>> The above calls would happen in Dom0 (when an SMMU is not available).
>> 
>> So for Dom0, are there cases when it needs xen_swiotlb_alloc_coherent()
>> and other cases when it needs the arm_dma_ops.alloc? In Dom0 could we
>> not always use the default dma_alloc_coherent()?
> 
> Keep in mind that dom0 runs with second stage translation enabled.  This
> means that what dom0 thinks is a physical address (a machine address in
> Xen terminology) is actually just an intermediate physical address.
> For the same reason, what dom0 thinks is a contiguous buffer is
> actually only contiguous in the intermediate physical address space.

OK, it makes sense now.  I thought dom0 was like the KVM host, where
stage 2 is disabled (or just flat).

> BTW if the Matrix is your kind of fun, I wrote a blog post explaining
> swiotlb, Morpheus style:
> http://blog.xen.org/index.php/2013/08/14/swiotlb-by-morpheus/

That was easier to understand ;)

>>> They could also happen in a DomU if we assign a physical device to it
>>> (and an SMMU is not available).
>> 
>> The problem is that you don't necessarily know what kind of coherency you
>> need for a physical device. As I said, we plan to do this DT-driven.
> 
> OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> arguments to it, I should be able to get the right coherency for the
> right device, correct?

I think it needs a bit more work on the Xen part.  Basically
dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
for a device.  arm64_swiotlb_dma_ops is just the default implementation
and I'll add a _noncoherent variant as well.  Default dma_ops will be
set to one of these during boot.  But a device is also allowed to have
its own dev->archdata.dma_ops, set via set_dma_ops().

So even if you set the default dma_ops to Xen ops, you may not get them
via dma_alloc_coherent().  I don't see any easier solution other than
patching the dma_alloc_attrs() function to issue a Hyp call after the
memory has been allocated with the get_dma_ops()->alloc().  But I don't
like this either.

Catalin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-09 15:51                   ` Catalin Marinas
@ 2013-09-09 16:46                     ` Stefano Stabellini
  2013-09-11  9:36                       ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-09 16:46 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Mon, 9 Sep 2013, Catalin Marinas wrote:
> >>> They could also happen in a DomU if we assign a physical device to it
> >>> (and an SMMU is not available).
> >> 
> >> The problem is that you don't necessarily know what kind of coherency you
> >> need for a physical device. As I said, we plan to do this DT-driven.
> > 
> > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > arguments to it, I should be able to get the right coherency for the
> > right device, correct?
> 
> I think it needs a bit more work on the Xen part.  Basically
> dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> for a device.  arm64_swiotlb_dma_ops is just the default implementation
> and I'll add a _noncoherent variant as well.  Default dma_ops will be
> set to one of these during boot.  But a device is also allowed to have
> its own dev->archdata.dma_ops, set via set_dma_ops().
> 
> So even if you set the default dma_ops to Xen ops, you may not get them
> via dma_alloc_coherent().  I don't see any easier solution other than
> patching the dma_alloc_attrs() function to issue a Hyp call after the
> memory has been allocated with the get_dma_ops()->alloc().  But I don't
> like this either.

I see. This problem affects arch/arm as well.
Either we add an if (!xen_domain()) in get_dma_ops, or we could make
get_dma_ops a function pointer and let people overwrite it.
See below the first option implemented for arch/arm on top of the
swiotlb series:


diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 7d6e4f9..0b8b5e4 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -12,6 +12,8 @@
 #include <asm/memory.h>
 #include <asm/cacheflush.h>
 
+#include <xen/xen.h>
+
 #define DMA_ERROR_CODE	(~0)
 extern struct dma_map_ops *dma_ops;
 extern struct dma_map_ops arm_dma_ops;
@@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
 
 static inline struct dma_map_ops *get_dma_ops(struct device *dev)
 {
-	if (dev && dev->archdata.dma_ops)
+	if (!xen_domain() && dev && dev->archdata.dma_ops)
 		return dev->archdata.dma_ops;
 	return dma_ops;
 }
diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
index af2cf8d..c2232fe 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
 		dma_addr_t *dma_handle, gfp_t flags,
 		struct dma_attrs *attrs)
 {
+	if (hwdev && hwdev->archdata.dma_ops)
+		return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
 	return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
 }
 
@@ -16,6 +18,8 @@ static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
 		void *cpu_addr, dma_addr_t dma_handle,
 		struct dma_attrs *attrs)
 {
+	if (hwdev && hwdev->archdata.dma_ops)
+		return hwdev->archdata.dma_ops->free(hwdev, size, cpu_addr, dma_handle, attrs);
 	return arm_dma_ops.free(hwdev, size, cpu_addr, dma_handle, attrs);
 }
 

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-09 16:46                     ` Stefano Stabellini
@ 2013-09-11  9:36                       ` Catalin Marinas
  2013-09-11 17:34                         ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-11  9:36 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Mon, Sep 09, 2013 at 05:46:59PM +0100, Stefano Stabellini wrote:
> On Mon, 9 Sep 2013, Catalin Marinas wrote:
> > >>> They could also happen in a DomU if we assign a physical device to it
> > >>> (and an SMMU is not available).
> > >> 
> > >> The problem is that you don't necessarily know what kind of coherency you
> > >> need for a physical device. As I said, we plan to do this DT-driven.
> > > 
> > > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > > arguments to it, I should be able to get the right coherency for the
> > > right device, correct?
> > 
> > I think it needs a bit more work on the Xen part.  Basically
> > dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> > for a device.  arm64_swiotlb_dma_ops is just the default implementation
> > and I'll add a _noncoherent variant as well.  Default dma_ops will be
> > set to one of these during boot.  But a device is also allowed to have
> > its own dev->archdata.dma_ops, set via set_dma_ops().
> > 
> > So even if you set the default dma_ops to Xen ops, you may not get them
> > via dma_alloc_coherent().  I don't see any easier solution other than
> > patching the dma_alloc_attrs() function to issue a Hyp call after the
> > memory has been allocated with the get_dma_ops()->alloc().  But I don't
> > like this either.
> 
> I see. This problem affects arch/arm as well.
> Either we add an if (!xen_domain()) in get_dma_ops, or we could make
> get_dma_ops a function pointer and let people overwrite it.
> See below the first option implemented for arch/arm on top of the
> swiotlb series:
> 
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 7d6e4f9..0b8b5e4 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -12,6 +12,8 @@
>  #include <asm/memory.h>
>  #include <asm/cacheflush.h>
>  
> +#include <xen/xen.h>
> +
>  #define DMA_ERROR_CODE	(~0)
>  extern struct dma_map_ops *dma_ops;
>  extern struct dma_map_ops arm_dma_ops;
> @@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
>  
>  static inline struct dma_map_ops *get_dma_ops(struct device *dev)
>  {
> -	if (dev && dev->archdata.dma_ops)
> +	if (!xen_domain() && dev && dev->archdata.dma_ops)
>  		return dev->archdata.dma_ops;
>  	return dma_ops;
>  }
> diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
> index af2cf8d..c2232fe 100644
> --- a/arch/arm/include/asm/xen/page-coherent.h
> +++ b/arch/arm/include/asm/xen/page-coherent.h
> @@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
>  		dma_addr_t *dma_handle, gfp_t flags,
>  		struct dma_attrs *attrs)
>  {
> +	if (hwdev && hwdev->archdata.dma_ops)
> +		return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
>  	return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);

I still don't like xen digging into the archdata internals. What about:

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..5fa472c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -15,13 +15,20 @@
 extern struct dma_map_ops arm_dma_ops;
 extern struct dma_map_ops arm_coherent_dma_ops;
 
-static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+static inline struct dma_map_ops *__get_dma_ops(struct device *dev)
 {
 	if (dev && dev->archdata.dma_ops)
 		return dev->archdata.dma_ops;
 	return &arm_dma_ops;
 }
 
+static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+{
+	if (xen_domain())
+		return xen_dma_ops;
+	return __get_dma_ops(dev);
+}
+
 static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
 {
 	BUG_ON(!dev);
@@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
 
 static inline int dma_set_mask(struct device *dev, u64 mask)
 {
-	return get_dma_ops(dev)->set_dma_mask(dev, mask);
+	return __get_dma_ops(dev)->set_dma_mask(dev, mask);
 }
 
 #ifdef __arch_page_to_dma

And in xen_alloc_coherent_pages():

	return __get_dma_ops(dev)->alloc(...);

Alternatively, add the xen_domain() check in dma_alloc_attrs() instead
of get_dma_ops() (and other functions like map_sg etc.)
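The get_dma_ops()/__get_dma_ops() split proposed above can be exercised with a minimal userspace model. The struct definitions and the in_xen_domain flag below are stand-ins for the kernel's struct dma_map_ops, dev->archdata.dma_ops and xen_domain(); only the dispatch logic is modeled:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types used in the diff above. */
struct dma_map_ops { const char *name; };

static struct dma_map_ops arm_dma_ops    = { "arm_dma_ops" };
static struct dma_map_ops xen_dma_ops    = { "xen_dma_ops" };
static struct dma_map_ops per_device_ops = { "per-device ops" };

struct device { struct dma_map_ops *dma_ops; };	/* models archdata.dma_ops */

static int in_xen_domain;			/* models xen_domain() */

/* __get_dma_ops: per-device ops if set, otherwise the arch default. */
static struct dma_map_ops *__get_dma_ops(struct device *dev)
{
	if (dev && dev->dma_ops)
		return dev->dma_ops;
	return &arm_dma_ops;
}

/* get_dma_ops: under Xen, everything is routed through xen_dma_ops;
 * natively it behaves exactly like __get_dma_ops. */
static struct dma_map_ops *get_dma_ops(struct device *dev)
{
	if (in_xen_domain)
		return &xen_dma_ops;
	return __get_dma_ops(dev);
}
```

With this split, xen_alloc_coherent_pages() would use __get_dma_ops(dev)->alloc() to reach the right native allocator, while ordinary dma_alloc_coherent() callers going through get_dma_ops() are transparently routed to the Xen ops.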

-- 
Catalin

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-11  9:36                       ` Catalin Marinas
@ 2013-09-11 17:34                         ` Stefano Stabellini
  2013-09-12 13:53                           ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-11 17:34 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Wed, 11 Sep 2013, Catalin Marinas wrote:
> On Mon, Sep 09, 2013 at 05:46:59PM +0100, Stefano Stabellini wrote:
> > On Mon, 9 Sep 2013, Catalin Marinas wrote:
> > > >>> They could also happen in a DomU if we assign a physical device to it
> > > >>> (and an SMMU is not available).
> > > >> 
> > > >> The problem is that you don't necessarily know what kind of coherency you
> > > >> need for a physical device. As I said, we plan to do this DT-driven.
> > > > 
> > > > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > > > arguments to it, I should be able to get the right coherency for the
> > > > right device, correct?
> > > 
> > > I think it needs a bit more work on the Xen part.  Basically
> > > dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> > > for a device.  arm64_swiotlb_dma_ops is just the default implementation
> > > and I'll add a _noncoherent variant as well.  Default dma_ops will be
> > > set to one of these during boot.  But a device is also allowed to have
> > > its own dev->archdata.dma_ops, set via set_dma_ops().
> > > 
> > > So even if you set the default dma_ops to Xen ops, you may not get them
> > > via dma_alloc_coherent().  I don't see any easier solution other than
> > > patching the dma_alloc_attrs() function to issue a Hyp call after the
> > > memory has been allocated with the get_dma_ops()->alloc().  But I don't
> > > like this either.
> > 
> > I see. This problem affects arch/arm as well.
> > Either we add an if (!xen_domain()) in get_dma_ops, or we could make
> > get_dma_ops a function pointer and let people overwrite it.
> > See below the first option implemented for arch/arm on top of the
> > swiotlb series:
> > 
> > 
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index 7d6e4f9..0b8b5e4 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -12,6 +12,8 @@
> >  #include <asm/memory.h>
> >  #include <asm/cacheflush.h>
> >  
> > +#include <xen/xen.h>
> > +
> >  #define DMA_ERROR_CODE	(~0)
> >  extern struct dma_map_ops *dma_ops;
> >  extern struct dma_map_ops arm_dma_ops;
> > @@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
> >  
> >  static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> >  {
> > -	if (dev && dev->archdata.dma_ops)
> > +	if (!xen_domain() && dev && dev->archdata.dma_ops)
> >  		return dev->archdata.dma_ops;
> >  	return dma_ops;
> >  }
> > diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
> > index af2cf8d..c2232fe 100644
> > --- a/arch/arm/include/asm/xen/page-coherent.h
> > +++ b/arch/arm/include/asm/xen/page-coherent.h
> > @@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
> >  		dma_addr_t *dma_handle, gfp_t flags,
> >  		struct dma_attrs *attrs)
> >  {
> > +	if (hwdev && hwdev->archdata.dma_ops)
> > +		return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
> >  	return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
> 
> I still don't like xen digging into the archdata internals.

Me neither.


> What about:
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 5b579b9..5fa472c 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -15,13 +15,20 @@
>  extern struct dma_map_ops arm_dma_ops;
>  extern struct dma_map_ops arm_coherent_dma_ops;
>  
> -static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +static inline struct dma_map_ops *__get_dma_ops(struct device *dev)
>  {
>  	if (dev && dev->archdata.dma_ops)
>  		return dev->archdata.dma_ops;
>  	return &arm_dma_ops;
>  }
>  
> +static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +{
> +	if (xen_domain())
> +		return xen_dma_ops;
> +	return __get_dma_ops(dev);
> +}

I agree that this is better.


>  static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
>  {
>  	BUG_ON(!dev);
> @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
>  
>  static inline int dma_set_mask(struct device *dev, u64 mask)
>  {
> -	return get_dma_ops(dev)->set_dma_mask(dev, mask);
> +	return __get_dma_ops(dev)->set_dma_mask(dev, mask);
>  }
>  
>  #ifdef __arch_page_to_dma

I don't understand the reason for this change though: shouldn't
set_dma_mask go via the "default" (whatever that is), like the others?

On native it won't make a difference, in any case it will end up calling
arm_set_dma_mask.

On Xen it would make a difference, because
get_dma_ops(dev)->set_dma_mask  would end up calling
xen_swiotlb_set_dma_mask, that checks whether the mask is supported by
the swiotlb buffer before setting the mask for the device, while
obviously arm_set_dma_mask doesn't do that.
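The behavioural difference can be sketched in a userspace model. The io_tlb_end value and the -5 (-EIO) return code are hypothetical stand-ins chosen for the example; the real check lives in drivers/xen/swiotlb-xen.c:xen_swiotlb_set_dma_mask:

```c
#include <assert.h>
#include <stdint.h>

struct device { uint64_t dma_mask; };

/* Pretend the swiotlb bounce buffer ends at this machine address
 * (hypothetical value for the sake of the example). */
static const uint64_t io_tlb_end = 1ULL << 32;

/* Model of arm_set_dma_mask: just record the mask, no buffer check. */
static int arm_set_dma_mask(struct device *dev, uint64_t mask)
{
	dev->dma_mask = mask;
	return 0;
}

/* Model of xen_swiotlb_set_dma_mask: refuse masks that cannot address
 * the whole bounce buffer, as described above. */
static int xen_swiotlb_set_dma_mask(struct device *dev, uint64_t mask)
{
	if (io_tlb_end - 1 > mask)	/* buffer not reachable with this mask */
		return -5;		/* stand-in for -EIO */
	dev->dma_mask = mask;
	return 0;
}
```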


> And in xen_alloc_coherent_pages():
> 
> 	return __get_dma_ops(dev)->alloc(...);

Right.


> Alternatively, add the xen_domain() check in dma_alloc_attrs() instead
> of get_dma_ops() (and other functions like map_sg etc.)

I prefer this approach because it is going to be more concise.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-11 17:34                         ` Stefano Stabellini
@ 2013-09-12 13:53                           ` Catalin Marinas
  2013-09-12 14:44                             ` Stefano Stabellini
  0 siblings, 1 reply; 45+ messages in thread
From: Catalin Marinas @ 2013-09-12 13:53 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Wed, Sep 11, 2013 at 06:34:17PM +0100, Stefano Stabellini wrote:
> On Wed, 11 Sep 2013, Catalin Marinas wrote:
> >  static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> >  {
> >  	BUG_ON(!dev);
> > @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> >  
> >  static inline int dma_set_mask(struct device *dev, u64 mask)
> >  {
> > -	return get_dma_ops(dev)->set_dma_mask(dev, mask);
> > +	return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> >  }
> >  
> >  #ifdef __arch_page_to_dma
> 
> I don't understand the reason for this change though: shouldn't
> set_dma_mask go via the "default" (whatever that is), like the others?

Under Xen, this would mean xen_dma_ops->set_dma_mask(). I thought you
wanted to set the default dma mask.

I can see why you may want set_dma_mask() for xen_dma_ops but it also
needs to be passed to __get_dma_ops(dev)->set_dma_mask(), otherwise the
actual dma_alloc_attrs() will miss the mask (and on arm64 it does
bouncing via swiotlb).

-- 
Catalin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-12 13:53                           ` Catalin Marinas
@ 2013-09-12 14:44                             ` Stefano Stabellini
  2013-09-12 15:04                               ` Catalin Marinas
  0 siblings, 1 reply; 45+ messages in thread
From: Stefano Stabellini @ 2013-09-12 14:44 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Stefano Stabellini, xen-devel, Ian.Campbell, linux-kernel,
	linux-arm-kernel, konrad.wilk

On Thu, 12 Sep 2013, Catalin Marinas wrote:
> On Wed, Sep 11, 2013 at 06:34:17PM +0100, Stefano Stabellini wrote:
> > On Wed, 11 Sep 2013, Catalin Marinas wrote:
> > >  static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > >  {
> > >  	BUG_ON(!dev);
> > > @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > >  
> > >  static inline int dma_set_mask(struct device *dev, u64 mask)
> > >  {
> > > -	return get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > +	return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> > >  }
> > >  
> > >  #ifdef __arch_page_to_dma
> > 
> > I don't understand the reason for this change though: shouldn't
> > set_dma_mask go via the "default" (whatever that is), like the others?
> 
> Under Xen, this would mean xen_dma_ops->set_dma_mask(). I thought you
> wanted to set the default dma mask.
> 
> I can see why you may want set_dma_mask() for xen_dma_ops but it also
> needs to be passed to __get_dma_ops(dev)->set_dma_mask(), otherwise the
> actual dma_alloc_attrs() will miss the mask (and on arm64 it does
> bouncing via swiotlb).
 
xen_dma_ops->set_dma_mask() is implemented by
drivers/xen/swiotlb-xen.c:xen_swiotlb_set_dma_mask (provided by this
patch series). The function takes care of setting:

*dev->dma_mask = dma_mask;

I think that should be enough to make dma_alloc_attrs happy.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages
  2013-09-12 14:44                             ` Stefano Stabellini
@ 2013-09-12 15:04                               ` Catalin Marinas
  0 siblings, 0 replies; 45+ messages in thread
From: Catalin Marinas @ 2013-09-12 15:04 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Ian.Campbell, linux-kernel, linux-arm-kernel, konrad.wilk

On Thu, Sep 12, 2013 at 03:44:50PM +0100, Stefano Stabellini wrote:
> On Thu, 12 Sep 2013, Catalin Marinas wrote:
> > On Wed, Sep 11, 2013 at 06:34:17PM +0100, Stefano Stabellini wrote:
> > > On Wed, 11 Sep 2013, Catalin Marinas wrote:
> > > >  static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > > >  {
> > > >  	BUG_ON(!dev);
> > > > @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> > > >  
> > > >  static inline int dma_set_mask(struct device *dev, u64 mask)
> > > >  {
> > > > -	return get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > > +	return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> > > >  }
> > > >  
> > > >  #ifdef __arch_page_to_dma
> > > 
> > > I don't understand the reason for this change though: shouldn't
> > > set_dma_mask go via the "default" (whatever that is), like the others?
> > 
> > Under Xen, this would mean xen_dma_ops->set_dma_mask(). I thought you
> > wanted to set the default dma mask.
> > 
> > I can see why you may want set_dma_mask() for xen_dma_ops but it also
> > needs to be passed to __get_dma_ops(dev)->set_dma_mask(), otherwise the
> > actual dma_alloc_attrs() will miss the mask (and on arm64 it does
> > bouncing via swiotlb).
>  
> xen_dma_ops->set_dma_mask() is implemented by
> drivers/xen/swiotlb-xen.c:xen_swiotlb_set_dma_mask (provided by this
> patch series). The function takes care of setting:
> 
> *dev->dma_mask = dma_mask;
> 
> I think that should be enough to make dma_alloc_attrs happy.

I think it would be safer to call __get_dma_ops()->set_dma_mask() in
case the driver overrides that and does more than setting dev->dma_mask.
I don't know whether this is the case though.

-- 
Catalin

^ permalink raw reply	[flat|nested] 45+ messages in thread

Thread overview: 45+ messages
2013-08-29 18:31 [PATCH v5 0/13] enable swiotlb-xen on arm and arm64 Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 01/13] arm: make SWIOTLB available Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 02/13] arm: introduce a global dma_ops pointer Stefano Stabellini
2013-09-02 14:45   ` [Xen-devel] " Julien Grall
2013-09-05 16:39     ` Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 03/13] arm64: define DMA_ERROR_CODE Stefano Stabellini
2013-09-05 16:05   ` Catalin Marinas
2013-09-05 16:54     ` Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 04/13] arm64: do not initialize arm64_swiotlb if dma_ops is already set Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 05/13] xen/arm,arm64: move Xen initialization earlier Stefano Stabellini
2013-09-05 16:20   ` Catalin Marinas
2013-09-05 16:59     ` Stefano Stabellini
2013-09-06  8:58       ` Ian Campbell
2013-09-06 14:09         ` Catalin Marinas
2013-09-06 14:23           ` Konrad Rzeszutek Wilk
2013-09-06 14:53             ` Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 06/13] xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 07/13] xen: make xen_create_contiguous_region return the dma address Stefano Stabellini
2013-08-30 13:45   ` Konrad Rzeszutek Wilk
2013-08-29 18:32 ` [PATCH v5 08/13] swiotlb-xen: support autotranslate guests Stefano Stabellini
2013-08-30 13:46   ` Konrad Rzeszutek Wilk
2013-08-29 18:32 ` [PATCH v5 09/13] xen/arm,arm64: enable SWIOTLB_XEN Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 10/13] swiotlb-xen: introduce xen_swiotlb_set_dma_mask Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages Stefano Stabellini
2013-09-05 16:09   ` Catalin Marinas
2013-09-05 16:43     ` Stefano Stabellini
2013-09-06 14:14       ` Catalin Marinas
2013-09-06 14:59         ` Stefano Stabellini
2013-09-06 15:59           ` Catalin Marinas
2013-09-06 16:09             ` Stefano Stabellini
2013-09-06 16:20               ` Catalin Marinas
2013-09-06 16:52                 ` Stefano Stabellini
2013-09-09 15:51                   ` Catalin Marinas
2013-09-09 16:46                     ` Stefano Stabellini
2013-09-11  9:36                       ` Catalin Marinas
2013-09-11 17:34                         ` Stefano Stabellini
2013-09-12 13:53                           ` Catalin Marinas
2013-09-12 14:44                             ` Stefano Stabellini
2013-09-12 15:04                               ` Catalin Marinas
2013-08-29 18:32 ` [PATCH v5 12/13] swiotlb-xen: use xen_alloc/free_coherent_pages Stefano Stabellini
2013-08-30 13:53   ` Konrad Rzeszutek Wilk
2013-09-05 16:50     ` Stefano Stabellini
2013-09-06 14:17       ` Konrad Rzeszutek Wilk
2013-09-06 15:04         ` Stefano Stabellini
2013-08-29 18:32 ` [PATCH v5 13/13] swiotlb: don't assume that io_tlb_start-io_tlb_end is coherent Stefano Stabellini
