iommu.lists.linux-foundation.org archive mirror
* [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips
@ 2020-07-15 14:35 Jim Quinlan via iommu
  2020-07-15 14:35 ` [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset Jim Quinlan via iommu
  2020-07-20 23:27 ` [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips Florian Fainelli
  0 siblings, 2 replies; 5+ messages in thread
From: Jim Quinlan via iommu @ 2020-07-15 14:35 UTC (permalink / raw)
  To: linux-pci, Nicolas Saenz Julienne, Christoph Hellwig,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Heikki Krogerus, open list:SUPERH,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM Serial and Parallel ATA drivers,
	Julien Grall, H. Peter Anvin, open list:STAGING SUBSYSTEM,
	Rob Herring, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, open list:ACPI FOR ARM64 ACPI/arm64,
	Alan Stern, open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Oliver Neukum, Hans de Goede, Stefano Stabellini,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Robin Murphy,
	Suzuki K Poulose

Patchset Summary:
  Enhance a PCIe host controller driver.  Because of its unusual design
  we are forced to change dev->dma_pfn_offset into a more general role
  allowing multiple offsets.  See the 'v1' notes below for more info.

v8:
  Commit: "device core: Introduce DMA range map, supplanting ..."
  -- To satisfy a specific m68k compile configuration, I moved the 'struct
     bus_dma_region' definition out of #ifdef CONFIG_HAS_DMA and also defined
     three inline functions for !CONFIG_HAS_DMA (kernel test robot).
  -- The sunXi drivers -- sun4i_csi, sun6i_csi, cedrus_hw -- set
     a pfn_offset outside of_dma_configure() but the code offers no 
     insight on the size of the translation window.  V7 had me using
     SIZE_MAX as the size.  I have since contacted the sunXi maintainer and
     he said that using a size of SZ_4G would cover sunXi configurations.

v7:
  Commit: "device core: Introduce DMA range map, supplanting ..."
  -- remove second kcalloc/copy in device.c (AndyS)
  -- use PTR_ERR_OR_ZERO() and PHYS_PFN() (AndyS)
  -- indentation, sizeof(struct ...) => sizeof(*r) (AndyS)
  -- add pfn.h definitions: PFN_DMA_ADDR(), DMA_ADDR_PFN() (AndyS)
  -- Fixed compile error in "sun6i_csi.c" (kernel test robot)
  Commit "ata: ahci_brcm: Fix use of BCM7216 reset controller"
  -- correct name of function in the commit msg (SergeiS)
  
v6:
  Commit "device core: Introduce DMA range map":
  -- of_dma_get_range() now takes a single argument and returns either
     NULL, a valid map, or an ERR_PTR. (Robin)
  -- offsets are no longer a PFN value but an actual address. (Robin)
  -- the bus_dma_region struct stores the range size instead of
     the cpu_end and pci_end values. (Robin)
  -- devices that were setting a single offset with no boundaries
     have been modified to have boundaries; in a few places
     where this information was unavailable a /* FIXME: ... */
     comment was added. (Robin)
  -- dma_attach_offset_range() can be called when an offset
     map already exists; if its range is already present
     nothing is done and success is returned. (Robin)
  All commits:
  -- Many name/style/corrections/etc changed (Bjorn)
  -- rebase to Torvalds master

v5:
  Commit "device core: Introduce multiple dma pfn offsets"
  -- in of/address.c: "map_size = 0" => "*map_size = 0"
  -- use kcalloc instead of kzalloc (AndyS)
  -- use PHYS_ADDR_MAX instead of "~(phys_addr_t)0"
  Commit "PCI: brcmstb: Set internal memory viewport sizes"
  -- now gives error on missing dma-ranges property.
  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- removed "Allof:" from brcm,scb-sizes definition (RobH)
  All Commits:
  -- indentation style, use max chars 100 (AndyS)
  -- rebased to torvalds master

v4:
  Commit "device core: Introduce multiple dma pfn offsets"
  -- of_dma_get_range() does not take a dev param but instead
     takes two "out" params: map and map_size.  We do this so
     that the code that parses dma-ranges is separate from
     the code that modifies 'dev'.   (Nicolas)
  -- the separate case of having a single pfn offset has
     been removed and is now processed by going through the
     map array. (Nicolas)
  -- move attach_uniform_dma_pfn_offset() from of/address.c to
     dma/mapping.c so that it does not depend on CONFIG_OF. (Nicolas)
  -- devm_kcalloc => devm_kzalloc (DanC)
  -- add/fix assignment to dev->dma_pfn_offset_map for func
     attach_uniform_dma_pfn_offset() (DanC, Nicolas)
  -- s/struct dma_pfn_offset_region/struct bus_dma_region/ (Nicolas)
  -- s/attach_uniform_dma_pfn_offset/dma_attach_uniform_pfn_offset/
  -- s/attach_dma_pfn_offset_map/dma_attach_pfn_offset_map/
  -- More use of PFN_{PHYS,DOWN,UP}. (AndyS)
  Commit "of: Include a dev param in of_dma_get_range()"
  -- this commit was squashed with "device core: Introduce ..."

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Christoph)
  o Helper funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared()
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspberry Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/
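
As a rough illustration of why one scalar offset is not enough, the
following user-space sketch models a device whose memory is visible
through two windows with different CPU-to-DMA offsets.  The type
range_model, the function model_phys_to_dma(), and the example addresses
are hypothetical; they are not the kernel structures from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of one dma-ranges window. */
struct range_model {
	uint64_t cpu_start;	/* first CPU physical address in the window */
	uint64_t dma_start;	/* matching address as seen by the device */
	uint64_t size;		/* window length in bytes */
};

#define MODEL_ADDR_INVALID UINT64_MAX

/*
 * Translate a CPU physical address to a device address by finding the
 * window that contains it.  Returns MODEL_ADDR_INVALID when no window
 * covers the address.
 */
static uint64_t model_phys_to_dma(const struct range_model *map, size_t n,
				  uint64_t paddr)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < n; i++) {
		if (paddr >= map[i].cpu_start &&
		    paddr - map[i].cpu_start < map[i].size)
			return map[i].dma_start + (paddr - map[i].cpu_start);
	}
	return MODEL_ADDR_INVALID;
}

/* Two made-up windows whose offsets differ (0x0 vs. 0xc0000000). */
static const struct range_model model_map[] = {
	{ .cpu_start = 0x00000000ULL, .dma_start = 0x00000000ULL,
	  .size = 0x40000000ULL },
	{ .cpu_start = 0x100000000ULL, .dma_start = 0x40000000ULL,
	  .size = 0x40000000ULL },
};
```

With a single dev->dma_pfn_offset only one of these two windows could be
translated correctly, and an address outside both windows would be
translated silently instead of being rejected.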

Jim Quinlan (12):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 register info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST# support
  PCI: brcmstb: Add control of rescal reset
  device core: Introduce DMA range map, supplanting dma_pfn_offset
  PCI: brcmstb: Set additional internal memory DMA viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  56 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |  17 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   9 +-
 arch/sh/kernel/dma-coherent.c                 |  16 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  11 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   5 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   4 +-
 drivers/of/address.c                          |  95 ++--
 drivers/of/device.c                           |  47 +-
 drivers/of/of_private.h                       |   9 +-
 drivers/of/unittest.c                         |  35 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  10 +-
 include/linux/dma-mapping.h                   |  43 ++
 include/linux/pfn.h                           |   2 +
 kernel/dma/coherent.c                         |  10 +-
 kernel/dma/mapping.c                          |  53 +++
 28 files changed, 683 insertions(+), 197 deletions(-)

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset
  2020-07-15 14:35 [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips Jim Quinlan via iommu
@ 2020-07-15 14:35 ` Jim Quinlan via iommu
  2020-07-21 12:51   ` Christoph Hellwig
  2020-07-20 23:27 ` [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips Florian Fainelli
  1 sibling, 1 reply; 5+ messages in thread
From: Jim Quinlan via iommu @ 2020-07-15 14:35 UTC (permalink / raw)
  To: linux-pci, Nicolas Saenz Julienne, Christoph Hellwig,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rich Felker, open list:SUPERH, David Airlie, Hanjun Guo,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM, Andy Shevchenko,
	Julien Grall, Heikki Krogerus, H. Peter Anvin, Will Deacon,
	Dan Williams, open list:STAGING SUBSYSTEM, Yoshinori Sato,
	Frank Rowand, maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT,
	Russell King, open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai,
	Ingo Molnar, Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Maxime Ripard, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS,
	open list:USB SUBSYSTEM, Stefano Stabellini, Daniel Vetter,
	Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER, Robin Murphy

The new field 'dma_range_map' in struct device is used to facilitate the
use of single or multiple offsets between mapping regions of cpu addrs and
dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
capable of holding a single uniform offset and had no region bounds
checking.

The function of_dma_get_range() has been modified so that it takes a single
argument -- the device node -- and returns a map, NULL, or an error code.
The map is an array that holds the information regarding the DMA regions.
Each range entry contains the address offset, the cpu_start address, the
dma_start address, and the size of the region.

of_dma_configure() is the typical manner to set range offsets but there are
a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
driver code.  These cases now invoke the function
dma_attach_offset_range(dev, cpu_addr, dma_addr, size).

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 arch/arm/include/asm/dma-mapping.h            |  9 +-
 arch/arm/mach-keystone/keystone.c             | 17 ++--
 arch/sh/drivers/pci/pcie-sh7786.c             |  9 +-
 arch/sh/kernel/dma-coherent.c                 | 16 ++--
 arch/x86/pci/sta2x11-fixup.c                  |  7 +-
 drivers/acpi/arm64/iort.c                     |  5 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |  5 +-
 drivers/iommu/io-pgtable-arm.c                |  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  4 +-
 drivers/of/address.c                          | 95 ++++++++++---------
 drivers/of/device.c                           | 47 +++++----
 drivers/of/of_private.h                       |  9 +-
 drivers/of/unittest.c                         | 35 +++++--
 drivers/remoteproc/remoteproc_core.c          |  2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
 drivers/usb/core/message.c                    |  4 +-
 drivers/usb/core/usb.c                        |  2 +-
 include/linux/device.h                        |  4 +-
 include/linux/dma-direct.h                    | 10 +-
 include/linux/dma-mapping.h                   | 43 +++++++++
 include/linux/pfn.h                           |  2 +
 kernel/dma/coherent.c                         | 10 +-
 kernel/dma/mapping.c                          | 53 +++++++++++
 24 files changed, 278 insertions(+), 124 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca34..b7cdde9fb83d 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -35,8 +35,9 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
+	if (dev && dev->dma_range_map)
+		pfn -= DMA_ADDR_PFN(dma_offset_from_phys_addr(dev, PFN_PHYS(pfn)));
+
 	return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
 	unsigned long pfn = __bus_to_pfn(addr);
 
-	if (dev)
-		pfn += dev->dma_pfn_offset;
+	if (dev && dev->dma_range_map)
+		pfn += DMA_ADDR_PFN(dma_offset_from_dma_addr(dev, addr));
 
 	return pfn;
 }
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 638808c4e122..a1a19781983b 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include <linux/io.h>
 #include <linux/of.h>
+#include <linux/dma-mapping.h>
 #include <linux/init.h>
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
@@ -24,8 +25,6 @@
 
 #include "keystone.h"
 
-static unsigned long keystone_dma_pfn_offset __read_mostly;
-
 static int keystone_platform_notifier(struct notifier_block *nb,
 				      unsigned long event, void *data)
 {
@@ -38,9 +37,12 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 		return NOTIFY_BAD;
 
 	if (!dev->of_node) {
-		dev->dma_pfn_offset = keystone_dma_pfn_offset;
-		dev_err(dev, "set dma_pfn_offset%08lx\n",
-			dev->dma_pfn_offset);
+		int ret = dma_attach_offset_range(dev, KEYSTONE_HIGH_PHYS_START,
+						  KEYSTONE_LOW_PHYS_START,
+						  KEYSTONE_HIGH_PHYS_SIZE);
+		dev_err(dev, "set dma_offset%08llx%s\n",
+			KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
+			ret ? " failed" : "");
 	}
 	return NOTIFY_OK;
 }
@@ -51,11 +53,8 @@ static struct notifier_block platform_nb = {
 
 static void __init keystone_init(void)
 {
-	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
-		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
-						   KEYSTONE_LOW_PHYS_START);
+	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START)
 		bus_register_notifier(&platform_bus_type, &platform_nb);
-	}
 	keystone_pm_runtime_init();
 }
 
diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
index e0b568aaa701..716bb99022c6 100644
--- a/arch/sh/drivers/pci/pcie-sh7786.c
+++ b/arch/sh/drivers/pci/pcie-sh7786.c
@@ -12,6 +12,7 @@
 #include <linux/io.h>
 #include <linux/async.h>
 #include <linux/delay.h>
+#include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
 #include <linux/sh_clk.h>
@@ -31,6 +32,8 @@ struct sh7786_pcie_port {
 static struct sh7786_pcie_port *sh7786_pcie_ports;
 static unsigned int nr_ports;
 static unsigned long dma_pfn_offset;
+size_t memsize;
+u64 memstart;
 
 static struct sh7786_pcie_hwops {
 	int (*core_init)(void);
@@ -301,7 +304,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
 	struct pci_channel *chan = port->hose;
 	unsigned int data;
 	phys_addr_t memstart, memend;
-	size_t memsize;
 	int ret, i, win;
 
 	/* Begin initialization */
@@ -368,8 +370,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
 	memstart = ALIGN_DOWN(memstart, memsize);
 	memsize = roundup_pow_of_two(memend - memstart);
 
-	dma_pfn_offset = memstart >> PAGE_SHIFT;
-
 	/*
 	 * If there's more than 512MB of memory, we need to roll over to
 	 * LAR1/LAMR1.
@@ -487,7 +487,8 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
 
 void pcibios_bus_add_device(struct pci_dev *pdev)
 {
-	pdev->dev.dma_pfn_offset = dma_pfn_offset;
+	dma_attach_offset_range(&pdev->dev, __pa(memory_start),
+				__pa(memory_start) - memstart, memsize);
 }
 
 static int __init sh7786_pcie_core_init(void)
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index d4811691b93c..e00f29c7c443 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -14,6 +14,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 {
 	void *ret, *ret_nocache;
 	int order = get_order(size);
+	phys_addr_t phys;
 
 	gfp |= __GFP_ZERO;
 
@@ -34,11 +35,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		return NULL;
 	}
 
-	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
+	phys = virt_to_phys(ret);
+	split_page(pfn_to_page(PHYS_PFN(phys)), order);
 
-	*dma_handle = virt_to_phys(ret);
-	if (!WARN_ON(!dev))
-		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
+	*dma_handle = (dma_addr_t)phys;
+	if (!WARN_ON(!dev) && dev->dma_range_map)
+		*dma_handle -= dma_offset_from_phys_addr(dev, phys);
 
 	return ret_nocache;
 }
@@ -47,11 +49,11 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
 	int order = get_order(size);
-	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
+	unsigned long pfn = PHYS_PFN(dma_handle);
 	int k;
 
-	if (!WARN_ON(!dev))
-		pfn += dev->dma_pfn_offset;
+	if (!WARN_ON(!dev) && dev->dma_range_map)
+		pfn += DMA_ADDR_PFN(dma_offset_from_dma_addr(dev, dma_handle));
 
 	for (k = 0; k < (1 << order); k++)
 		__free_pages(pfn_to_page(pfn + k), 0);
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index c313d784efab..74633ccf622e 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/list.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-mapping.h>
 #include <asm/iommu.h>
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
@@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
 	struct device *dev = &pdev->dev;
 	u32 amba_base, max_amba_addr;
-	int i;
+	int i, ret;
 
 	if (!instance)
 		return;
@@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
 	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
 
-	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
+	ret = dma_attach_offset_range(dev, 0, amba_base, STA2X11_AMBA_SIZE);
+	if (ret)
+		dev_err(dev, "sta2x11: could not set DMA offset\n");
 
 	dev->bus_dma_limit = max_amba_addr;
 	pci_set_consistent_dma_mask(pdev, max_amba_addr);
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 28a6b387e80e..41c2d861ce43 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	*dma_addr = dmaaddr;
 	*dma_size = size;
 
-	dev->dma_pfn_offset = PFN_DOWN(offset);
-	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
+	ret = dma_attach_offset_range(dev, dmaaddr + offset, dmaaddr, size);
+
+	dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : "");
 }
 
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 072ea113e6be..cbe49a07983c 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/of_graph.h>
+#include <linux/dma-mapping.h>
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 
@@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 		 * on our device since the RAM mapping is at 0 for the DMA bus,
 		 * unlike the CPU.
 		 */
-		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = dma_attach_offset_range(drm->dev, PHYS_OFFSET, 0, SZ_4G);
+		if (ret)
+			return ret;
 	}
 
 	backend->engine.node = dev->of_node;
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9..d5542df9aacc 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
 		return NULL;
 
-	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
+	if (!selftest_running && cfg->iommu_dev->dma_range_map) {
 		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
 		return NULL;
 	}
diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
index eff34ded6305..95a5d5655056 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
 			return ret;
 	} else {
 #ifdef PHYS_PFN_OFFSET
-		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = dma_attach_offset_range(csi->dev, PHYS_OFFSET, 0, SZ_4G);
+		if (ret)
+			return ret;
 #endif
 	}
 
diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
index 055eb0b8e396..c26fc1cdd4d2 100644
--- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
+++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
@@ -898,7 +898,9 @@ static int sun6i_csi_probe(struct platform_device *pdev)
 
 	sdev->dev = &pdev->dev;
 	/* The DMA bus has the memory mapped at 0 */
-	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
+	ret = dma_attach_offset_range(sdev->dev, PHYS_OFFSET, 0, SZ_4G);
+	if (ret)
+		return ret;
 
 	ret = sun6i_csi_resource_request(sdev, pdev);
 	if (ret)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 8eea3f6e29a4..5d9117a1cb16 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -918,33 +918,65 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
 }
 EXPORT_SYMBOL(of_io_request_and_map);
 
+static const struct bus_dma_region *dma_create_offset_map(struct device_node *node,
+							  int num_ranges)
+{
+	struct of_range_parser parser;
+	struct of_range range;
+	struct bus_dma_region *map, *r;
+	int ret;
+
+	r = kcalloc(num_ranges + 1, sizeof(*r), GFP_KERNEL);
+	if (!r)
+		return ERR_PTR(-ENOMEM);
+
+	map = r;
+	ret = of_dma_range_parser_init(&parser, node);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/*
+	 * Record all info for DMA ranges array.  We use our
+	 * own struct (bus_dma_region) so it is not dependent
+	 * on CONFIG_OF.
+	 */
+	for_each_of_range(&parser, &range) {
+		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+			 range.bus_addr, range.cpu_addr, range.size);
+		r->cpu_start = range.cpu_addr;
+		r->dma_start = range.bus_addr;
+		r->size = range.size;
+		r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;
+		r++;
+	}
+	return map;
+}
+
 /**
- * of_dma_get_range - Get DMA range info
+ * of_dma_get_range - Get DMA range info and put it into a map array
  * @np:		device node to get DMA range info
- * @dma_addr:	pointer to store initial DMA address of DMA range
- * @paddr:	pointer to store initial CPU address of DMA range
- * @size:	pointer to store size of DMA range
  *
  * Look in bottom up direction for the first "dma-ranges" property
- * and parse it.
- *  dma-ranges format:
+ * and parse it.  Put the information into a DMA offset map array.
+ *
+ * dma-ranges format:
  *	DMA addr (dma_addr)	: naddr cells
  *	CPU addr (phys_addr_t)	: pna cells
  *	size			: nsize cells
  *
- * It returns -ENODEV if "dma-ranges" property was not found
- * for this device in DT.
+ * It returns -ENODEV if "dma-ranges" property was not found for this
+ * device in the DT.
  */
-int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+const struct bus_dma_region *of_dma_get_range(struct device_node *np)
 {
+	const struct bus_dma_region *map = NULL;
 	struct device_node *node = of_node_get(np);
+	struct of_range_parser parser;
 	const __be32 *ranges = NULL;
-	int len;
-	int ret = 0;
 	bool found_dma_ranges = false;
-	struct of_range_parser parser;
 	struct of_range range;
-	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
+	int len, num_ranges = 0;
+	int ret = 0;
 
 	while (node) {
 		ranges = of_get_property(node, "dma-ranges", &len);
@@ -971,42 +1003,13 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 
 	of_dma_range_parser_init(&parser, node);
 
-	for_each_of_range(&parser, &range) {
-		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
-			 range.bus_addr, range.cpu_addr, range.size);
-
-		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
-			pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
-			/* Don't error out as we'd break some existing DTs */
-			continue;
-		}
-		dma_offset = range.cpu_addr - range.bus_addr;
-
-		/* Take lower and upper limits */
-		if (range.bus_addr < dma_start)
-			dma_start = range.bus_addr;
-		if (range.bus_addr + range.size > dma_end)
-			dma_end = range.bus_addr + range.size;
-	}
-
-	if (dma_start >= dma_end) {
-		ret = -EINVAL;
-		pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
-			 node);
-		goto out;
-	}
-
-	*dma_addr = dma_start;
-	*size = dma_end - dma_start;
-	*paddr = dma_start + dma_offset;
-
-	pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
-		 *dma_addr, *paddr, *size);
+	for_each_of_range(&parser, &range)
+		num_ranges++;
 
+	map = dma_create_offset_map(node, num_ranges);
 out:
 	of_node_put(node);
-
-	return ret;
+	return map ? map : ERR_PTR(ret);
 }
 
 /**
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 27203bfd0b22..fea2f31d4245 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -88,14 +88,15 @@ int of_device_add(struct platform_device *ofdev)
  */
 int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 {
-	u64 dma_addr, paddr, size = 0;
-	int ret;
-	bool coherent;
-	unsigned long offset;
 	const struct iommu_ops *iommu;
-	u64 mask, end;
+	const struct bus_dma_region *map;
+	dma_addr_t dma_start = 0;
+	u64 mask, end, size = 0;
+	bool coherent;
+	int ret;
 
-	ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+	map = of_dma_get_range(np);
+	ret = PTR_ERR_OR_ZERO(map);
 	if (ret < 0) {
 		/*
 		 * For legacy reasons, we have to assume some devices need
@@ -105,25 +106,36 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		if (!force_dma)
 			return ret == -ENODEV ? 0 : ret;
 
-		dma_addr = offset = 0;
-	} else {
-		offset = PFN_DOWN(paddr - dma_addr);
+		dma_start = 0;
+		map = NULL;
+	} else if (map) {
+		const struct bus_dma_region *r = map;
+		dma_addr_t dma_end = 0;
+
+		/* Determine the overall bounds of all DMA regions */
+		for (dma_start = ~(dma_addr_t)0; r->size; r++) {
+			/* Take lower and upper limits */
+			if (r->dma_start < dma_start)
+				dma_start = r->dma_start;
+			if (r->dma_start + r->size > dma_end)
+				dma_end = r->dma_start + r->size;
+		}
+		size = dma_end - dma_start;
 
 		/*
 		 * Add a work around to treat the size as mask + 1 in case
 		 * it is defined in DT as a mask.
 		 */
 		if (size & 1) {
-			dev_warn(dev, "Invalid size 0x%llx for dma-range\n",
-				 size);
+			dev_warn(dev, "Invalid size 0x%llx for dma-range(s)\n", size);
 			size = size + 1;
 		}
 
 		if (!size) {
 			dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
+			kfree(map);
 			return -EINVAL;
 		}
-		dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
 	}
 
 	/*
@@ -142,13 +154,11 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	else if (!size)
 		size = 1ULL << 32;
 
-	dev->dma_pfn_offset = offset;
-
 	/*
 	 * Limit coherent and dma mask based on size and default mask
 	 * set by the driver.
 	 */
-	end = dma_addr + size - 1;
+	end = dma_start + size - 1;
 	mask = DMA_BIT_MASK(ilog2(end) + 1);
 	dev->coherent_dma_mask &= mask;
 	*dev->dma_mask &= mask;
@@ -161,14 +171,17 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		coherent ? " " : " not ");
 
 	iommu = of_iommu_configure(dev, np);
-	if (PTR_ERR(iommu) == -EPROBE_DEFER)
+	if (PTR_ERR(iommu) == -EPROBE_DEFER) {
+		kfree(map);
 		return -EPROBE_DEFER;
+	}
 
 	dev_dbg(dev, "device is%sbehind an iommu\n",
 		iommu ? " " : " not ");
 
-	arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
+	arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
+	dev->dma_range_map = map;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(of_dma_configure);
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index edc682249c00..876149e721c5 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -157,14 +157,13 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
 extern int of_bus_n_addr_cells(struct device_node *np);
 extern int of_bus_n_size_cells(struct device_node *np);
 
+struct bus_dma_region;
 #ifdef CONFIG_OF_ADDRESS
-extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
-			    u64 *paddr, u64 *size);
+extern const struct bus_dma_region *of_dma_get_range(struct device_node *np);
 #else
-static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
-				   u64 *paddr, u64 *size)
+static inline const struct bus_dma_region *of_dma_get_range(struct device_node *np)
 {
-	return -ENODEV;
+	return ERR_PTR(-ENODEV);
 }
 #endif
 
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 398de04fd19c..542d092f19c2 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -7,6 +7,7 @@
 
 #include <linux/memblock.h>
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/errno.h>
 #include <linux/hashtable.h>
@@ -869,10 +870,10 @@ static void __init of_unittest_changeset(void)
 }
 
 static void __init of_unittest_dma_ranges_one(const char *path,
-		u64 expect_dma_addr, u64 expect_paddr, u64 expect_size)
+		u64 expect_dma_addr, u64 expect_paddr)
 {
 	struct device_node *np;
-	u64 dma_addr, paddr, size;
+	const struct bus_dma_region *map = NULL;
 	int rc;
 
 	np = of_find_node_by_path(path);
@@ -881,16 +882,27 @@ static void __init of_unittest_dma_ranges_one(const char *path,
 		return;
 	}
 
-	rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
-
+	map = of_dma_get_range(np);
+	rc = PTR_ERR_OR_ZERO(map);
 	unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
-	if (!rc) {
-		unittest(size == expect_size,
-			 "of_dma_get_range wrong size on node %pOF size=%llx\n", np, size);
+
+	if (!rc && map) {
+		phys_addr_t	paddr;
+		dma_addr_t	dma_addr;
+		struct device	dev_bogus;
+
+		dev_bogus.dma_range_map = map;
+		paddr = (phys_addr_t)expect_dma_addr
+			+ dma_offset_from_dma_addr(&dev_bogus, expect_dma_addr);
+		dma_addr = (dma_addr_t)expect_paddr
+			- dma_offset_from_phys_addr(&dev_bogus, expect_paddr);
+
 		unittest(paddr == expect_paddr,
 			 "of_dma_get_range wrong phys addr (%llx) on node %pOF", paddr, np);
 		unittest(dma_addr == expect_dma_addr,
 			 "of_dma_get_range wrong DMA addr (%llx) on node %pOF", dma_addr, np);
+
+		kfree(map);
 	}
 	of_node_put(np);
 }
@@ -898,11 +910,14 @@ static void __init of_unittest_dma_ranges_one(const char *path,
 static void __init of_unittest_parse_dma_ranges(void)
 {
 	of_unittest_dma_ranges_one("/testcase-data/address-tests/device@70000000",
-		0x0, 0x20000000, 0x40000000);
+		0x0, 0x20000000);
 	of_unittest_dma_ranges_one("/testcase-data/address-tests/bus@80000000/device@1000",
-		0x10000000, 0x20000000, 0x40000000);
+		0x10000000, 0x20000000);
+	/* pci@90000000 has two ranges in the dma-range property */
+	of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
+		0x80000000, 0x20000000);
 	of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
-		0x80000000, 0x20000000, 0x10000000);
+		0xc0000000, 0x40000000);
 }
 
 static void __init of_unittest_pci_dma_ranges(void)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 9f04c30c4aaf..49242dd6176e 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -519,7 +519,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 	/* Initialise vdev subdevice */
 	snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
 	rvdev->dev.parent = &rproc->dev;
-	rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
+	rvdev->dev.dma_range_map = rproc->dev.parent->dma_range_map;
 	rvdev->dev.release = rproc_rvdev_release;
 	dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
 	dev_set_drvdata(&rvdev->dev, rvdev);
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index 1744e6fcc999..720b41eca7a3 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -230,8 +230,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
 	 */
 
 #ifdef PHYS_PFN_OFFSET
-	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
-		dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
+		ret = dma_attach_offset_range(dev->dev, PHYS_OFFSET, 0, SZ_4G);
+		if (ret)
+			return ret;
+	}
 #endif
 
 	ret = of_reserved_mem_device_init(dev->dev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d..376ca258e510 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
 		intf->dev.groups = usb_interface_groups;
 		/*
 		 * Please refer to usb_alloc_dev() to see why we set
-		 * dma_mask and dma_pfn_offset.
+		 * dma_mask and dma_range_map.
 		 */
 		intf->dev.dma_mask = dev->dev.dma_mask;
-		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
+		intf->dev.dma_range_map = dev->dev.dma_range_map;
 		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
 		intf->minor = -1;
 		device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index f16c26dc079d..1f167a2c095e 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
 	 * mask for the entire HCD, so don't do that.
 	 */
 	dev->dev.dma_mask = bus->sysdev->dma_mask;
-	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
+	dev->dev.dma_range_map = bus->sysdev->dma_range_map;
 	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
 	dev->state = USB_STATE_ATTACHED;
 	dev->lpm_disable_count = 1;
diff --git a/include/linux/device.h b/include/linux/device.h
index 15460a5ac024..feddefcf3e5c 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -492,7 +492,7 @@ struct dev_links_info {
  * 		such descriptors.
  * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
  *		DMA limit than the device itself supports.
- * @dma_pfn_offset: offset of DMA memory range relatively of RAM
+ * @dma_range_map: map for DMA memory ranges relative to that of RAM
  * @dma_parms:	A low level driver may set these to teach IOMMU code about
  * 		segment limitations.
  * @dma_pools:	Dma pools (if dma'ble device).
@@ -577,7 +577,7 @@ struct device {
 					     64 bit addresses for consistent
 					     allocations such descriptors. */
 	u64		bus_dma_limit;	/* upstream dma constraint */
-	unsigned long	dma_pfn_offset;
+	const struct bus_dma_region *dma_range_map;
 
 	struct device_dma_parameters *dma_parms;
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index cdfa400f89b3..182784d28cfd 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -15,14 +15,20 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
 	dma_addr_t dev_addr = (dma_addr_t)paddr;
 
-	return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_range_map)
+		dev_addr -= dma_offset_from_phys_addr(dev, paddr);
+
+	return dev_addr;
 }
 
 static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 {
 	phys_addr_t paddr = (phys_addr_t)dev_addr;
 
-	return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_range_map)
+		paddr += dma_offset_from_dma_addr(dev, dev_addr);
+
+	return paddr;
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 78f677cf45ab..7c8fcac30e74 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -255,7 +255,37 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 
 size_t dma_direct_max_mapping_size(struct device *dev);
 
+struct bus_dma_region {
+	phys_addr_t	cpu_start;
+	dma_addr_t	dma_start;
+	u64		size;
+	u64		offset;
+};
+
 #ifdef CONFIG_HAS_DMA
+int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
+		dma_addr_t dma_start, u64 size);
+
+static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
+{
+	const struct bus_dma_region *m = dev->dma_range_map;
+
+	for (; m->size; m++)
+		if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
+			return m->offset;
+	return 0;
+}
+
+static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
+{
+	const struct bus_dma_region *m = dev->dma_range_map;
+
+	for (; m->size; m++)
+		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
+			return m->offset;
+	return 0;
+}
+
 #include <asm/dma-mapping.h>
 
 static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
@@ -463,6 +493,19 @@ u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
 unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
+static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
+{
+	return (u64)0;
+}
+static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
+{
+	return (u64)0;
+}
+static inline int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
+		dma_addr_t dma_start, u64 size)
+{
+	return -EIO;
+}
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
 		enum dma_data_direction dir, unsigned long attrs)
diff --git a/include/linux/pfn.h b/include/linux/pfn.h
index 14bc053c53d8..eddb535075a0 100644
--- a/include/linux/pfn.h
+++ b/include/linux/pfn.h
@@ -20,5 +20,7 @@ typedef struct {
 #define PFN_DOWN(x)	((x) >> PAGE_SHIFT)
 #define PFN_PHYS(x)	((phys_addr_t)(x) << PAGE_SHIFT)
 #define PHYS_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))
+#define PFN_DMA_ADDR(x)	((dma_addr_t)(x) << PAGE_SHIFT)
+#define DMA_ADDR_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))
 
 #endif
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 2a0c4985f38e..66b1ac611c61 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -31,10 +31,12 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 static inline dma_addr_t dma_get_device_base(struct device *dev,
 					     struct dma_coherent_mem * mem)
 {
-	if (mem->use_dev_dma_pfn_offset)
-		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
-	else
-		return mem->device_base;
+	if (mem->use_dev_dma_pfn_offset && dev->dma_range_map) {
+		u64 dma_offset = dma_offset_from_phys_addr(dev, PFN_PHYS(mem->pfn_base));
+
+		return PFN_DMA_ADDR(mem->pfn_base) - dma_offset;
+	}
+	return mem->device_base;
 }
 
 static int dma_init_coherent_memory(phys_addr_t phys_addr,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 98e3d873792e..2c08c4991bfa 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -11,6 +11,7 @@
 #include <linux/dma-noncoherent.h>
 #include <linux/export.h>
 #include <linux/gfp.h>
+#include <linux/limits.h>
 #include <linux/of_device.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
@@ -407,3 +408,55 @@ unsigned long dma_get_merge_boundary(struct device *dev)
 	return ops->get_merge_boundary(dev);
 }
 EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
+
+/**
+ * dma_attach_offset_range - Assign scalar offset for a single DMA range.
+ * @dev:	device pointer; needed to "own" the alloced memory.
+ * @cpu_start:  beginning of memory region covered by this offset.
+ * @dma_start:  beginning of DMA/PCI region covered by this offset.
+ * @size:	size of the region.
+ *
+ * This is for the simple case of a uniform offset which cannot
+ * be discovered by "dma-ranges".
+ *
+ * Returns 0 on success, -ENOMEM if out of memory, -ENODEV if dev == NULL, or
+ * -EINVAL if the range conflicts with an existing map.
+ */
+int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
+			    dma_addr_t dma_start, u64 size)
+{
+	struct bus_dma_region *map;
+	u64 offset = (u64)cpu_start - (u64)dma_start;
+
+	if (!dev)
+		return -ENODEV;
+
+	/* See if a map already exists and we already encompass the new range */
+	if (dev->dma_range_map) {
+		const struct bus_dma_region *m = dev->dma_range_map;
+
+		for (; m->size; m++)
+			if (offset == m->offset && cpu_start >= m->cpu_start
+			    && size <= m->size && cpu_start - m->cpu_start <= m->size - size)
+				return 0;
+
+		dev_err(dev, "attempt to add conflicting DMA range to existing map\n");
+		return -EINVAL;
+	}
+
+	if (!offset)
+		return 0;
+
+	/* Don't use devm_kcalloc() since this may be called as a bus notifier */
+	map = kcalloc(2, sizeof(*map), GFP_KERNEL);
+	if (!map)
+		return -ENOMEM;
+	dev->dma_range_map = map;
+
+	map->cpu_start = cpu_start;
+	map->dma_start = dma_start;
+	map->offset = offset;
+	map->size = size;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dma_attach_offset_range);
-- 
2.17.1
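
For readers following the of_dma_configure() hunk above, here is a minimal
user-space sketch of how the overall DMA bounds are derived from a
zero-terminated range map. It is not kernel code: struct span_region,
dma_span() and the example addresses and sizes are stand-ins invented for
illustration, and only the loop itself follows the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the patch's struct bus_dma_region. */
struct span_region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;		/* size == 0 terminates the array */
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/*
 * Walk the map once, tracking the lowest DMA start and the highest DMA end,
 * as the loop in of_dma_configure() does. Returns the covered span in bytes
 * and stores the lowest DMA start in *dma_start_out.
 */
static uint64_t dma_span(const struct span_region *r, uint64_t *dma_start_out)
{
	uint64_t dma_start = UINT64_MAX;
	uint64_t dma_end = 0;

	assert(r != NULL && dma_start_out != NULL);

	for (; r->size; r++) {
		if (r->dma_start < dma_start)
			dma_start = r->dma_start;
		if (r->dma_start + r->size > dma_end)
			dma_end = r->dma_start + r->size;
	}
	*dma_start_out = dma_start;
	return dma_end - dma_start;
}

/* Illustrative two-entry map; the sizes are assumptions, not from the patch. */
static const struct span_region span_example_map[] = {
	{ 0x20000000ULL, 0x80000000ULL, 0x10000000ULL,
	  0x20000000ULL - 0x80000000ULL },
	{ 0x40000000ULL, 0xc0000000ULL, 0x10000000ULL,
	  0x40000000ULL - 0xc0000000ULL },
	{ 0 },
};
```

Note that the span includes any hole between regions, so the resulting mask
is an upper bound on what the device may address rather than an exact
description of the valid windows.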

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips
  2020-07-15 14:35 [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips Jim Quinlan via iommu
  2020-07-15 14:35 ` [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset Jim Quinlan via iommu
@ 2020-07-20 23:27 ` Florian Fainelli
  1 sibling, 0 replies; 5+ messages in thread
From: Florian Fainelli @ 2020-07-20 23:27 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Nicolas Saenz Julienne,
	Christoph Hellwig, bcm-kernel-feedback-list, Robin Murphy
  Cc: Heikki Krogerus, open list:SUPERH,
	open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	Julien Grall, H. Peter Anvin, open list:STAGING SUBSYSTEM,
	Rob Herring, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, open list:ACPI FOR ARM64 (ACPI/arm64),
	Alan Stern, open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Oliver Neukum, Hans de Goede, Stefano Stabellini,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Suzuki K Poulose

On 7/15/20 7:35 AM, Jim Quinlan wrote:
> Patchset Summary:
>   Enhance a PCIe host controller driver.  Because of its unusual design
>   we are forced to change dev->dma_pfn_offset into a more general role
>   allowing multiple offsets.  See the 'v1' notes below for more info.

Christoph, Robin, are you happy with this version?

> 
> v8:
>   Commit: "device core: Introduce DMA range map, supplanting ..."
>   -- To satisfy a specific m68 compile configuration, I moved the 'struct
>      bus_dma_region' definition out of #ifdef CONFIG_HAS_DMA and also defined
>      three inline functions for !CONFIG_HAS_DMA (kernel test robot).
>   -- The sunXi drivers -- sun4i_csi, sun6i_csi, cedrus_hw -- set
>      a pfn_offset outside of_dma_configure() but the code offers no 
>      insight on the size of the translation window.  V7 had me using
>      SIZE_MAX as the size.  I have since contacted the sunXi maintainer and
>      he said that using a size of SZ_4G would cover sunXi configurations.
> 
> v7:
>   Commit: "device core: Introduce DMA range map, supplanting ..."
>   -- remove second kcalloc/copy in device.c (AndyS)
>   -- use PTR_ERR_OR_ZERO() and PHYS_PFN() (AndyS)
>   -- indentation, sizeof(struct ...) => sizeof(*r) (AndyS)
>   -- add pfn.h definitions: PFN_DMA_ADDR(), DMA_ADDR_PFN() (AndyS)
>   -- Fixed compile error in "sun6i_csi.c" (kernel test robot)
>   Commit "ata: ahci_brcm: Fix use of BCM7216 reset controller"
>   -- correct name of function in the commit msg (SergeiS)
>   
> v6:
>   Commit "device core: Introduce DMA range map":
>   -- of_dma_get_range() now takes a single argument and returns either
>      NULL, a valid map, or an ERR_PTR. (Robin)
>   -- offsets are no longer a PFN value but an actual address. (Robin)
>   -- the bus_dma_region struct stores the range size instead of
>      the cpu_end and pci_end values. (Robin)
>   -- devices that were setting a single offset with no boundaries
>      have been modified to have boundaries; in a few places
>      where this information was unavailable a /* FIXME: ... */
>      comment was added. (Robin)
>   -- dma_attach_offset_range() can be called when an offset
>      map already exists; if its range is already present
>      nothing is done and success is returned. (Robin)
>   All commits:
>   -- Man name/style/corrections/etc changed (Bjorn)
>   -- rebase to Torvalds master
> 
> v5:
>   Commit "device core: Introduce multiple dma pfn offsets"
>   -- in of/address.c: "map_size = 0" => "*map_size = 0"
>   -- use kcalloc instead of kzalloc (AndyS)
>   -- use PHYS_ADDR_MAX instead of "~(phys_addr_t)0"
>   Commit "PCI: brcmstb: Set internal memory viewport sizes"
>   -- now gives error on missing dma-ranges property.
>   Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
>   -- removed "Allof:" from brcm,scb-sizes definition (RobH)
>   All Commits:
>   -- indentation style, use max chars 100 (AndyS)
>   -- rebased to torvalds master
> 
> v4:
>   Commit "device core: Introduce multiple dma pfn offsets"
>   -- of_dma_get_range() does not take a dev param but instead
>      takes two "out" params: map and map_size.  We do this so
>      that the code that parses dma-ranges is separate from
>      the code that modifies 'dev'.   (Nicolas)
>   -- the separate case of having a single pfn offset has
>      been removed and is now processed by going through the
>      map array. (Nicolas)
>   -- move attach_uniform_dma_pfn_offset() from of/address.c to
>      dma/mapping.c so that it does not depend on CONFIG_OF. (Nicolas)
>   -- devm_kcalloc => devm_kzalloc (DanC)
>   -- add/fix assignment to dev->dma_pfn_offset_map for func
>      attach_uniform_dma_pfn_offset() (DanC, Nicolas)
>   -- s/struct dma_pfn_offset_region/struct bus_dma_region/ (Nicolas)
>   -- s/attach_uniform_dma_pfn_offset/dma_attach_uniform_pfn_offset/
>   -- s/attach_dma_pfn_offset_map/dma_attach_pfn_offset_map/
>   -- More use of PFN_{PHYS,DOWN,UP}. (AndyS)
>   Commit "of: Include a dev param in of_dma_get_range()"
>   -- this commit was squashed with "device core: Introduce ..."
> 
> v3:
>   Commit "device core: Introduce multiple dma pfn offsets"
>   Commit "arm: dma-mapping: Invoke dma offset func if needed"
>   -- The above two commits have been squashed.  More importantly,
>      the code has been modified so that the functionality for
>      multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
>      In fact, dma_pfn_offset is removed and supplanted by
>      dma_pfn_offset_map, which is a pointer to an array.  The
>      more common case of a uniform offset is now handled as
>      a map with a single entry, while cases requiring multiple
>      pfn offsets use a map with multiple entries.  Code paths
>      that used to do this:
> 
>          dev->dma_pfn_offset = mydrivers_pfn_offset;
> 
>      have been changed to do this:
> 
>          attach_uniform_dma_pfn_offset(dev, pfn_offset);
> 
>   Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
>   -- Add if/then clause for required props: resets, reset-names (RobH)
>   -- Change compatible list from const to enum (RobH)
>   -- Change list of u32-tuples to u64 (RobH)
> 
>   Commit "of: Include a dev param in of_dma_get_range()"
>   -- modify of/unittests.c to add NULL param in of_dma_get_range() call.
> 
>   Commit "device core: Add ability to handle multiple dma offsets"
>   -- align comment in device.h (AndyS).
>   -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
>      dma_pfn_offset_region (AndyS).
> 
> v2:
> Commit: "device core: Add ability to handle multiple dma offsets"
>   o Added helper func attach_dma_pfn_offset_map() in address.c (Christoph)
>   o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
>   o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
>   o dev->dma_pfn_map => dev->dma_pfn_offset_map
>   o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
>   o In device.h: s/const void */const struct dma_pfn_offset_region */
>   o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
>     guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
>   o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
>     dev->dma_pfn_offset_map is copied as well.
>   o Merged two of the DMA commits into one (Christoph).
> 
> Commit "arm: dma-mapping: Invoke dma offset func if needed":
>   o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET
> 
> Other commits' changes:
>   o Removed need for carrying of_id var in priv (Nicolas)
>   o Commit message rewordings (Bjorn)
>   o Commit log messages filled to 75 chars (Bjorn)
>   o devm_reset_control_get_shared()
>     => devm_reset_control_get_optional_shared (Philipp)
>   o Add call to reset_control_assert() in PCIe remove routines (Philipp)
> 
> v1:
> This patchset expands the usefulness of the Broadcom Settop Box PCIe
> controller by building upon the PCIe driver used currently by the
> Raspberry Pi.  Other forms of this patchset were submitted by me years
> ago and not accepted; the major sticking point was the code required
> for the DMA remapping needed for the PCIe driver to work [1].
> 
> There have been many changes to the DMA and OF subsystems since that
> time, making a cleaner and less intrusive patchset possible.  This
> patchset implements a generalization of "dev->dma_pfn_offset", except
> that instead of a single scalar offset it provides for multiple
> offsets via a function which depends upon the "dma-ranges" property of
> the PCIe host controller.  This is required for proper functionality
> of the BrcmSTB PCIe controller and possibly some other devices.
> 
> [1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/
> 
> Jim Quinlan (12):
>   PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
>   ata: ahci_brcm: Fix use of BCM7216 reset controller
>   dt-bindings: PCI: Add bindings for more Brcmstb chips
>   PCI: brcmstb: Add bcm7278 register info
>   PCI: brcmstb: Add suspend and resume pm_ops
>   PCI: brcmstb: Add bcm7278 PERST# support
>   PCI: brcmstb: Add control of rescal reset
>   device core: Introduce DMA range map, supplanting dma_pfn_offset
>   PCI: brcmstb: Set additional internal memory DMA viewport sizes
>   PCI: brcmstb: Accommodate MSI for older chips
>   PCI: brcmstb: Set bus max burst size by chip type
>   PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list
> 
>  .../bindings/pci/brcm,stb-pcie.yaml           |  56 ++-
>  arch/arm/include/asm/dma-mapping.h            |   9 +-
>  arch/arm/mach-keystone/keystone.c             |  17 +-
>  arch/sh/drivers/pci/pcie-sh7786.c             |   9 +-
>  arch/sh/kernel/dma-coherent.c                 |  16 +-
>  arch/x86/pci/sta2x11-fixup.c                  |   7 +-
>  drivers/acpi/arm64/iort.c                     |   5 +-
>  drivers/ata/ahci_brcm.c                       |  11 +-
>  drivers/gpu/drm/sun4i/sun4i_backend.c         |   5 +-
>  drivers/iommu/io-pgtable-arm.c                |   2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   4 +-
>  drivers/of/address.c                          |  95 ++--
>  drivers/of/device.c                           |  47 +-
>  drivers/of/of_private.h                       |   9 +-
>  drivers/of/unittest.c                         |  35 +-
>  drivers/pci/controller/Kconfig                |   3 +-
>  drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
>  drivers/remoteproc/remoteproc_core.c          |   2 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
>  drivers/usb/core/message.c                    |   4 +-
>  drivers/usb/core/usb.c                        |   2 +-
>  include/linux/device.h                        |   4 +-
>  include/linux/dma-direct.h                    |  10 +-
>  include/linux/dma-mapping.h                   |  43 ++
>  include/linux/pfn.h                           |   2 +
>  kernel/dma/coherent.c                         |  10 +-
>  kernel/dma/mapping.c                          |  53 +++
>  28 files changed, 683 insertions(+), 197 deletions(-)
> 
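
The v6 note about dma_attach_offset_range() succeeding when the requested
range is already present can be sketched in user space as follows. This is
only a model under stated assumptions: struct cover_region,
range_already_covered() and the example map are hypothetical names and values
made up for illustration; the comparison itself follows the check in the
v8 patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the patch's struct bus_dma_region. */
struct cover_region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;		/* size == 0 terminates the array */
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/*
 * Return true if [cpu_start, cpu_start + size) with the same offset lies
 * entirely inside one existing entry. The last comparison is arranged so
 * that cpu_start + size is never computed and therefore cannot wrap.
 */
static bool range_already_covered(const struct cover_region *m,
				  uint64_t cpu_start, uint64_t dma_start,
				  uint64_t size)
{
	uint64_t offset = cpu_start - dma_start;

	assert(m != NULL);

	for (; m->size; m++) {
		if (offset == m->offset && cpu_start >= m->cpu_start &&
		    size <= m->size &&
		    cpu_start - m->cpu_start <= m->size - size)
			return true;
	}
	return false;
}

/* Illustrative single-entry map: a 4 GiB window, values are assumptions. */
static const struct cover_region cover_example_map[] = {
	{ 0x40000000ULL, 0x0ULL, 0x100000000ULL, 0x40000000ULL },
	{ 0 },
};
```

In the patch, a request that fails this test against an existing map is
rejected with -EINVAL rather than merged into the map.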


-- 
Florian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset
  2020-07-15 14:35 ` [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset Jim Quinlan via iommu
@ 2020-07-21 12:51   ` Christoph Hellwig
  2020-07-22 22:37     ` Jim Quinlan via iommu
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2020-07-21 12:51 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Rich Felker, open list:SUPERH, David Airlie, linux-pci,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Julien Grall, Heikki Krogerus, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, open list:STAGING SUBSYSTEM,
	Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar, bcm-kernel-feedback-list, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS,
	open list:USB SUBSYSTEM, Stefano Stabellini, Daniel Vetter,
	Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER, Robin Murphy

On Wed, Jul 15, 2020 at 10:35:11AM -0400, Jim Quinlan wrote:
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> capable of holding a single uniform offset and had no region bounds
> checking.
> 
> The function of_dma_get_range() has been modified so that it takes a single
> argument -- the device node -- and returns a map, NULL, or an error code.
> The map is an array that holds the information regarding the DMA regions.
> Each range entry contains the address offset, the cpu_start address, the
> dma_start address, and the size of the region.
> 
> of_dma_configure() is the typical manner to set range offsets but there are
> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> driver code.  These cases now invoke the function
> dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
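
To make the map description above concrete, here is a small user-space
sketch of the per-region lookup and the resulting address translation. It is
not the kernel implementation: struct xlate_region and the helper names are
hypothetical, and the region sizes in the example map are assumed for
illustration. The CPU/DMA address pairs are the ones used by the updated
unittest for pci@90000000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the patch's struct bus_dma_region. */
struct xlate_region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;		/* size == 0 terminates the array */
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/* Offset of the region containing paddr, or 0 if no region matches. */
static uint64_t offset_from_phys(const struct xlate_region *m, uint64_t paddr)
{
	assert(m != NULL);
	for (; m->size; m++)
		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
			return m->offset;
	return 0;
}

/* Offset of the region containing dma_addr, or 0 if no region matches. */
static uint64_t offset_from_dma(const struct xlate_region *m, uint64_t dma_addr)
{
	assert(m != NULL);
	for (; m->size; m++)
		if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
			return m->offset;
	return 0;
}

/* Unsigned arithmetic wraps modulo 2^64, so "negative" offsets work too. */
static uint64_t model_phys_to_dma(const struct xlate_region *m, uint64_t paddr)
{
	return paddr - offset_from_phys(m, paddr);
}

static uint64_t model_dma_to_phys(const struct xlate_region *m, uint64_t dma_addr)
{
	return dma_addr + offset_from_dma(m, dma_addr);
}

/* Two regions with different offsets; the sizes are assumptions. */
static const struct xlate_region xlate_example_map[] = {
	{ 0x20000000ULL, 0x80000000ULL, 0x10000000ULL,
	  0x20000000ULL - 0x80000000ULL },
	{ 0x40000000ULL, 0xc0000000ULL, 0x10000000ULL,
	  0x40000000ULL - 0xc0000000ULL },
	{ 0 },
};
```

An address that falls outside every region is returned unchanged in this
model, which matches the helpers in the patch returning an offset of 0 when
no region matches.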

So my main higher level issue here is the dma_attach_offset_range
function.  I think it should keep the old functionality and just
set a global range from 0 to (phys_addr_t)-1, and bail out if there
are DMA ranges already:

	int dma_set_global_offset(struct device *dev, u64 offset);

Otherwise there are all kinds of minor nitpicks that aren't too
substantial.  Let me know what you think of something like this
hacked-up version:


diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca3451..2405afeb79573a 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -35,8 +35,11 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
+	if (dev) {
+		phys_addr_t paddr = PFN_PHYS(pfn);
+
+		pfn -= (dma_offset_from_phys_addr(dev, paddr) >> PAGE_SHIFT);
+	}
 	return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -45,8 +48,7 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 	unsigned long pfn = __bus_to_pfn(addr);
 
 	if (dev)
-		pfn += dev->dma_pfn_offset;
-
+		pfn += (dma_offset_from_dma_addr(dev, addr) >> PAGE_SHIFT);
 	return pfn;
 }
 
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 638808c4e12247..7539679205fbf7 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include <linux/io.h>
 #include <linux/of.h>
+#include <linux/dma-mapping.h>
 #include <linux/init.h>
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
@@ -24,8 +25,6 @@
 
 #include "keystone.h"
 
-static unsigned long keystone_dma_pfn_offset __read_mostly;
-
 static int keystone_platform_notifier(struct notifier_block *nb,
 				      unsigned long event, void *data)
 {
@@ -38,9 +37,12 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 		return NOTIFY_BAD;
 
 	if (!dev->of_node) {
-		dev->dma_pfn_offset = keystone_dma_pfn_offset;
-		dev_err(dev, "set dma_pfn_offset%08lx\n",
-			dev->dma_pfn_offset);
+		int ret = dma_set_offset_range(dev, KEYSTONE_HIGH_PHYS_START,
+						    KEYSTONE_LOW_PHYS_START,
+						    KEYSTONE_HIGH_PHYS_SIZE);
+		dev_err(dev, "set dma_offset%08llx%s\n",
+			KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
+			ret ? " failed" : "");
 	}
 	return NOTIFY_OK;
 }
@@ -51,11 +53,8 @@ static struct notifier_block platform_nb = {
 
 static void __init keystone_init(void)
 {
-	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
-		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
-						   KEYSTONE_LOW_PHYS_START);
+	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START)
 		bus_register_notifier(&platform_bus_type, &platform_nb);
-	}
 	keystone_pm_runtime_init();
 }
 
diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
index e0b568aaa7014c..e929f85c503852 100644
--- a/arch/sh/drivers/pci/pcie-sh7786.c
+++ b/arch/sh/drivers/pci/pcie-sh7786.c
@@ -12,6 +12,7 @@
 #include <linux/io.h>
 #include <linux/async.h>
 #include <linux/delay.h>
+#include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
 #include <linux/sh_clk.h>
@@ -31,6 +32,8 @@ struct sh7786_pcie_port {
 static struct sh7786_pcie_port *sh7786_pcie_ports;
 static unsigned int nr_ports;
 static unsigned long dma_pfn_offset;
+size_t memsize;
+u64 memstart;
 
 static struct sh7786_pcie_hwops {
 	int (*core_init)(void);
@@ -301,7 +304,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
 	struct pci_channel *chan = port->hose;
 	unsigned int data;
 	phys_addr_t memstart, memend;
-	size_t memsize;
 	int ret, i, win;
 
 	/* Begin initialization */
@@ -368,8 +370,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
 	memstart = ALIGN_DOWN(memstart, memsize);
 	memsize = roundup_pow_of_two(memend - memstart);
 
-	dma_pfn_offset = memstart >> PAGE_SHIFT;
-
 	/*
 	 * If there's more than 512MB of memory, we need to roll over to
 	 * LAR1/LAMR1.
@@ -487,7 +487,8 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
 
 void pcibios_bus_add_device(struct pci_dev *pdev)
 {
-	pdev->dev.dma_pfn_offset = dma_pfn_offset;
+	dma_set_offset_range(&pdev->dev, __pa(memory_start),
+			     __pa(memory_start) - memstart, memsize);
 }
 
 static int __init sh7786_pcie_core_init(void)
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index d4811691b93cc1..003a91719b3794 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -14,6 +14,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 {
 	void *ret, *ret_nocache;
 	int order = get_order(size);
+	phys_addr_t phys;
 
 	gfp |= __GFP_ZERO;
 
@@ -34,12 +35,10 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		return NULL;
 	}
 
-	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
-
-	*dma_handle = virt_to_phys(ret);
-	if (!WARN_ON(!dev))
-		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
+	phys = virt_to_phys(ret);
+	split_page(pfn_to_page(PHYS_PFN(phys)), order);
 
+	*dma_handle = (dma_addr_t)phys - dma_offset_from_phys_addr(dev, phys);
 	return ret_nocache;
 }
 
@@ -47,12 +46,10 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 		dma_addr_t dma_handle, unsigned long attrs)
 {
 	int order = get_order(size);
-	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
+	unsigned long pfn;
 	int k;
 
-	if (!WARN_ON(!dev))
-		pfn += dev->dma_pfn_offset;
-
+	pfn = PHYS_PFN(dma_handle + dma_offset_from_dma_addr(dev, dma_handle));
 	for (k = 0; k < (1 << order); k++)
 		__free_pages(pfn_to_page(pfn + k), 0);
 
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index c313d784efabb9..ea3a58323f81d1 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/list.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-mapping.h>
 #include <asm/iommu.h>
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
@@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
 	struct device *dev = &pdev->dev;
 	u32 amba_base, max_amba_addr;
-	int i;
+	int i, ret;
 
 	if (!instance)
 		return;
@@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
 	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
 
-	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
+	ret = dma_set_offset_range(dev, 0, amba_base, STA2X11_AMBA_SIZE);
+	if (ret)
+		dev_err(dev, "sta2x11: could not set DMA offset\n");
 
 	dev->bus_dma_limit = max_amba_addr;
 	pci_set_consistent_dma_mask(pdev, max_amba_addr);
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 28a6b387e80e28..a3e04c003a2187 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	*dma_addr = dmaaddr;
 	*dma_size = size;
 
-	dev->dma_pfn_offset = PFN_DOWN(offset);
-	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
+	ret = dma_set_offset_range(dev, dmaaddr + offset, dmaaddr, size);
+
+	dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : "");
 }
 
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 072ea113e6be55..48a4adf1f04edc 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/of_graph.h>
+#include <linux/dma-mapping.h>
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 
@@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 		 * on our device since the RAM mapping is at 0 for the DMA bus,
 		 * unlike the CPU.
 		 */
-		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = dma_set_offset_range(drm->dev, PHYS_OFFSET, 0, SZ_4G);
+		if (ret)
+			return ret;
 	}
 
 	backend->engine.node = dev->of_node;
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9fd..d5542df9aacc01 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
 		return NULL;
 
-	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
+	if (!selftest_running && cfg->iommu_dev->dma_range_map) {
 		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
 		return NULL;
 	}
diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
index eff34ded63055d..d6eda02fd3fc93 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
 			return ret;
 	} else {
 #ifdef PHYS_PFN_OFFSET
-		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = dma_set_offset_range(csi->dev, PHYS_OFFSET, 0, SZ_4G);
+		if (ret)
+			return ret;
 #endif
 	}
 
diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
index 055eb0b8e39692..450fce6cd8d21b 100644
--- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
+++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
@@ -898,7 +898,9 @@ static int sun6i_csi_probe(struct platform_device *pdev)
 
 	sdev->dev = &pdev->dev;
 	/* The DMA bus has the memory mapped at 0 */
-	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
+	ret = dma_set_offset_range(sdev->dev, PHYS_OFFSET, 0, SZ_4G);
+	if (ret)
+		return ret;
 
 	ret = sun6i_csi_resource_request(sdev, pdev);
 	if (ret)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 8eea3f6e29a441..083ec3531bcceb 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -918,33 +918,33 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
 }
 EXPORT_SYMBOL(of_io_request_and_map);
 
+#ifdef CONFIG_HAS_DMA
 /**
- * of_dma_get_range - Get DMA range info
+ * of_dma_get_range - Get DMA range info and put it into a map array
  * @np:		device node to get DMA range info
- * @dma_addr:	pointer to store initial DMA address of DMA range
- * @paddr:	pointer to store initial CPU address of DMA range
- * @size:	pointer to store size of DMA range
+ * @map:	dma range structure to return
  *
  * Look in bottom up direction for the first "dma-ranges" property
- * and parse it.
- *  dma-ranges format:
+ * and parse it.  Put the information into a DMA offset map array.
+ *
+ * dma-ranges format:
  *	DMA addr (dma_addr)	: naddr cells
  *	CPU addr (phys_addr_t)	: pna cells
  *	size			: nsize cells
  *
- * It returns -ENODEV if "dma-ranges" property was not found
- * for this device in DT.
+ * It returns -ENODEV if "dma-ranges" property was not found for this
+ * device in the DT.
  */
-int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map)
 {
 	struct device_node *node = of_node_get(np);
 	const __be32 *ranges = NULL;
-	int len;
-	int ret = 0;
 	bool found_dma_ranges = false;
 	struct of_range_parser parser;
 	struct of_range range;
-	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
+	struct bus_dma_region *r;
+	int len, num_ranges = 0;
+	int ret;
 
 	while (node) {
 		ranges = of_get_property(node, "dma-ranges", &len);
@@ -970,44 +970,34 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
 	}
 
 	of_dma_range_parser_init(&parser, node);
+	for_each_of_range(&parser, &range)
+		num_ranges++;
+
+	of_dma_range_parser_init(&parser, node);
+
+	ret = -ENOMEM;
+	r = kcalloc(num_ranges + 1, sizeof(*r), GFP_KERNEL);
+	if (!r)
+		goto out;
 
+	/*
+	 * Record all info in the generic DMA ranges array for struct device.
+	 */
+	*map = r;
 	for_each_of_range(&parser, &range) {
 		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 			 range.bus_addr, range.cpu_addr, range.size);
-
-		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
-			pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
-			/* Don't error out as we'd break some existing DTs */
-			continue;
-		}
-		dma_offset = range.cpu_addr - range.bus_addr;
-
-		/* Take lower and upper limits */
-		if (range.bus_addr < dma_start)
-			dma_start = range.bus_addr;
-		if (range.bus_addr + range.size > dma_end)
-			dma_end = range.bus_addr + range.size;
+		r->cpu_start = range.cpu_addr;
+		r->dma_start = range.bus_addr;
+		r->size = range.size;
+		r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;
+		r++;
 	}
-
-	if (dma_start >= dma_end) {
-		ret = -EINVAL;
-		pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
-			 node);
-		goto out;
-	}
-
-	*dma_addr = dma_start;
-	*size = dma_end - dma_start;
-	*paddr = dma_start + dma_offset;
-
-	pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
-		 *dma_addr, *paddr, *size);
-
 out:
 	of_node_put(node);
-
 	return ret;
 }
+#endif
 
 /**
  * of_dma_is_coherent - Check if device is coherent
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 27203bfd0b22dc..0c84f42a23e42e 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -88,14 +88,14 @@ int of_device_add(struct platform_device *ofdev)
  */
 int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 {
-	u64 dma_addr, paddr, size = 0;
-	int ret;
-	bool coherent;
-	unsigned long offset;
 	const struct iommu_ops *iommu;
-	u64 mask, end;
+	const struct bus_dma_region *map = NULL;
+	dma_addr_t dma_start = 0;
+	u64 mask, end, size = 0;
+	bool coherent;
+	int ret;
 
-	ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+	ret = of_dma_get_range(np, &map);
 	if (ret < 0) {
 		/*
 		 * For legacy reasons, we have to assume some devices need
@@ -104,26 +104,34 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		 */
 		if (!force_dma)
 			return ret == -ENODEV ? 0 : ret;
-
-		dma_addr = offset = 0;
 	} else {
-		offset = PFN_DOWN(paddr - dma_addr);
+		const struct bus_dma_region *r = map;
+		dma_addr_t dma_end = 0;
+
+		/* Determine the overall bounds of all DMA regions */
+		for (dma_start = ~(dma_addr_t)0; r->size; r++) {
+			/* Take lower and upper limits */
+			if (r->dma_start < dma_start)
+				dma_start = r->dma_start;
+			if (r->dma_start + r->size > dma_end)
+				dma_end = r->dma_start + r->size;
+		}
+		size = dma_end - dma_start;
 
 		/*
 		 * Add a work around to treat the size as mask + 1 in case
 		 * it is defined in DT as a mask.
 		 */
 		if (size & 1) {
-			dev_warn(dev, "Invalid size 0x%llx for dma-range\n",
-				 size);
+			dev_warn(dev, "Invalid size 0x%llx for dma-range(s)\n", size);
 			size = size + 1;
 		}
 
 		if (!size) {
 			dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
+			kfree(map);
 			return -EINVAL;
 		}
-		dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
 	}
 
 	/*
@@ -142,13 +150,11 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	else if (!size)
 		size = 1ULL << 32;
 
-	dev->dma_pfn_offset = offset;
-
 	/*
 	 * Limit coherent and dma mask based on size and default mask
 	 * set by the driver.
 	 */
-	end = dma_addr + size - 1;
+	end = dma_start + size - 1;
 	mask = DMA_BIT_MASK(ilog2(end) + 1);
 	dev->coherent_dma_mask &= mask;
 	*dev->dma_mask &= mask;
@@ -161,14 +167,17 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		coherent ? " " : " not ");
 
 	iommu = of_iommu_configure(dev, np);
-	if (PTR_ERR(iommu) == -EPROBE_DEFER)
+	if (PTR_ERR(iommu) == -EPROBE_DEFER) {
+		kfree(map);
 		return -EPROBE_DEFER;
+	}
 
 	dev_dbg(dev, "device is%sbehind an iommu\n",
 		iommu ? " " : " not ");
 
-	arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
+	arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
+	dev->dma_range_map = map;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(of_dma_configure);
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index edc682249c0015..768406b4156b21 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -157,12 +157,12 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
 extern int of_bus_n_addr_cells(struct device_node *np);
 extern int of_bus_n_size_cells(struct device_node *np);
 
-#ifdef CONFIG_OF_ADDRESS
-extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
-			    u64 *paddr, u64 *size);
+struct bus_dma_region;
+#if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
+int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map);
 #else
-static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
-				   u64 *paddr, u64 *size)
+static inline int of_dma_get_range(struct device_node *np,
+		const struct bus_dma_region **map)
 {
 	return -ENODEV;
 }
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 398de04fd19c94..8d0c9bf495d2ef 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -7,6 +7,7 @@
 
 #include <linux/memblock.h>
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/errno.h>
 #include <linux/hashtable.h>
@@ -869,10 +870,10 @@ static void __init of_unittest_changeset(void)
 }
 
 static void __init of_unittest_dma_ranges_one(const char *path,
-		u64 expect_dma_addr, u64 expect_paddr, u64 expect_size)
+		u64 expect_dma_addr, u64 expect_paddr)
 {
 	struct device_node *np;
-	u64 dma_addr, paddr, size;
+	const struct bus_dma_region *map = NULL;
 	int rc;
 
 	np = of_find_node_by_path(path);
@@ -881,16 +882,26 @@ static void __init of_unittest_dma_ranges_one(const char *path,
 		return;
 	}
 
-	rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
-
+	rc = of_dma_get_range(np, &map);
 	unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
+
 	if (!rc) {
-		unittest(size == expect_size,
-			 "of_dma_get_range wrong size on node %pOF size=%llx\n", np, size);
+		phys_addr_t	paddr;
+		dma_addr_t	dma_addr;
+		struct device	dev_bogus;
+
+		dev_bogus.dma_range_map = map;
+		paddr = (phys_addr_t)expect_dma_addr +
+			dma_offset_from_dma_addr(&dev_bogus, expect_dma_addr);
+		dma_addr = (dma_addr_t)expect_paddr -
+			dma_offset_from_phys_addr(&dev_bogus, expect_paddr);
+
 		unittest(paddr == expect_paddr,
 			 "of_dma_get_range wrong phys addr (%llx) on node %pOF", paddr, np);
 		unittest(dma_addr == expect_dma_addr,
 			 "of_dma_get_range wrong DMA addr (%llx) on node %pOF", dma_addr, np);
+
+		kfree(map);
 	}
 	of_node_put(np);
 }
@@ -898,11 +909,14 @@ static void __init of_unittest_dma_ranges_one(const char *path,
 static void __init of_unittest_parse_dma_ranges(void)
 {
 	of_unittest_dma_ranges_one("/testcase-data/address-tests/device@70000000",
-		0x0, 0x20000000, 0x40000000);
+		0x0, 0x20000000);
 	of_unittest_dma_ranges_one("/testcase-data/address-tests/bus@80000000/device@1000",
-		0x10000000, 0x20000000, 0x40000000);
+		0x10000000, 0x20000000);
+	/* pci@90000000 has two ranges in the dma-ranges property */
+	of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
+		0x80000000, 0x20000000);
 	of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
-		0x80000000, 0x20000000, 0x10000000);
+		0xc0000000, 0x40000000);
 }
 
 static void __init of_unittest_pci_dma_ranges(void)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index 9f04c30c4aaf7a..49242dd6176e30 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -519,7 +519,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 	/* Initialise vdev subdevice */
 	snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
 	rvdev->dev.parent = &rproc->dev;
-	rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
+	rvdev->dev.dma_range_map = rproc->dev.parent->dma_range_map;
 	rvdev->dev.release = rproc_rvdev_release;
 	dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
 	dev_set_drvdata(&rvdev->dev, rvdev);
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index 1744e6fcc99980..249e4bddaa4014 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -230,8 +230,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
 	 */
 
 #ifdef PHYS_PFN_OFFSET
-	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
-		dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
+		ret = dma_set_offset_range(dev->dev, PHYS_OFFSET, 0, SZ_4G);
+		if (ret)
+			return ret;
+	}
 #endif
 
 	ret = of_reserved_mem_device_init(dev->dev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d8f..376ca258e510bf 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
 		intf->dev.groups = usb_interface_groups;
 		/*
 		 * Please refer to usb_alloc_dev() to see why we set
-		 * dma_mask and dma_pfn_offset.
+		 * dma_mask and dma_range_map.
 		 */
 		intf->dev.dma_mask = dev->dev.dma_mask;
-		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
+		intf->dev.dma_range_map = dev->dev.dma_range_map;
 		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
 		intf->minor = -1;
 		device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index f16c26dc079d79..1f167a2c095e9a 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
 	 * mask for the entire HCD, so don't do that.
 	 */
 	dev->dev.dma_mask = bus->sysdev->dma_mask;
-	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
+	dev->dev.dma_range_map = bus->sysdev->dma_range_map;
 	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
 	dev->state = USB_STATE_ATTACHED;
 	dev->lpm_disable_count = 1;
diff --git a/include/linux/device.h b/include/linux/device.h
index 15460a5ac024a1..feddefcf3e5c20 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -492,7 +492,7 @@ struct dev_links_info {
  * 		such descriptors.
  * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
  *		DMA limit than the device itself supports.
- * @dma_pfn_offset: offset of DMA memory range relatively of RAM
+ * @dma_range_map: map for DMA memory ranges relative to that of RAM
  * @dma_parms:	A low level driver may set these to teach IOMMU code about
  * 		segment limitations.
  * @dma_pools:	Dma pools (if dma'ble device).
@@ -577,7 +577,7 @@ struct device {
 					     64 bit addresses for consistent
 					     allocations such descriptors. */
 	u64		bus_dma_limit;	/* upstream dma constraint */
-	unsigned long	dma_pfn_offset;
+	const struct bus_dma_region *dma_range_map;
 
 	struct device_dma_parameters *dma_parms;
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 5184735a0fe8eb..810d27692674bc 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -13,16 +13,12 @@ extern unsigned int zone_dma_bits;
 #else
 static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
-	dma_addr_t dev_addr = (dma_addr_t)paddr;
-
-	return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	return (dma_addr_t)paddr - dma_offset_from_phys_addr(dev, paddr);
 }
 
 static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 {
-	phys_addr_t paddr = (phys_addr_t)dev_addr;
-
-	return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	return (phys_addr_t)dev_addr + dma_offset_from_dma_addr(dev, dev_addr);
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index a33ed3954ed465..5938c7ca2abcce 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -255,7 +255,38 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 
 size_t dma_direct_max_mapping_size(struct device *dev);
 
+struct bus_dma_region {
+	phys_addr_t	cpu_start;
+	dma_addr_t	dma_start;
+	u64		size;
+	u64		offset;
+};
+
 #ifdef CONFIG_HAS_DMA
+static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
+{
+	const struct bus_dma_region *m = dev->dma_range_map;
+
+	if (!m)
+		return 0;
+	for (; m->size; m++)
+		if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
+			return m->offset;
+	return 0;
+}
+
+static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
+{
+	const struct bus_dma_region *m = dev->dma_range_map;
+
+	if (!m)
+		return 0;
+	for (; m->size; m++)
+		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
+			return m->offset;
+	return 0;
+}
+
 #include <asm/dma-mapping.h>
 
 static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
@@ -801,6 +832,9 @@ static inline void arch_teardown_dma_ops(struct device *dev)
 }
 #endif /* CONFIG_ARCH_HAS_TEARDOWN_DMA_OPS */
 
+int dma_set_offset_range(struct device *dev, phys_addr_t cpu_start,
+		dma_addr_t dma_start, u64 size);
+
 static inline unsigned int dma_get_max_seg_size(struct device *dev)
 {
 	if (dev->dma_parms && dev->dma_parms->max_segment_size)
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 2a0c4985f38e41..751969d6185325 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -31,10 +31,12 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 static inline dma_addr_t dma_get_device_base(struct device *dev,
 					     struct dma_coherent_mem * mem)
 {
-	if (mem->use_dev_dma_pfn_offset)
-		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
-	else
-		return mem->device_base;
+	if (mem->use_dev_dma_pfn_offset) {
+		u64 base_addr = (u64)mem->pfn_base << PAGE_SHIFT;
+
+		return base_addr - dma_offset_from_phys_addr(dev, base_addr);
+	}
+	return mem->device_base;
 }
 
 static int dma_init_coherent_memory(phys_addr_t phys_addr,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index a8c18c9a796fdc..dc8017a106fd55 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -11,6 +11,7 @@
 #include <linux/dma-noncoherent.h>
 #include <linux/export.h>
 #include <linux/gfp.h>
+#include <linux/limits.h>
 #include <linux/of_device.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
@@ -417,3 +418,62 @@ unsigned long dma_get_merge_boundary(struct device *dev)
 	return ops->get_merge_boundary(dev);
 }
 EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
+
+static bool dma_range_overlaps(struct device *dev, phys_addr_t cpu_start,
+		dma_addr_t dma_start, u64 size, u64 offset)
+{
+	const struct bus_dma_region *m;
+
+	for (m = dev->dma_range_map; m->size; m++) {
+		if (offset == m->offset &&
+		    cpu_start >= m->cpu_start &&
+		    size <= m->size &&
+		    cpu_start - m->cpu_start <= m->size - size)
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * dma_set_offset_range - Assign scalar offset for a single DMA range.
+ * @dev:	device pointer; needed to "own" the allocated memory.
+ * @cpu_start:  beginning of memory region covered by this offset.
+ * @dma_start:  beginning of DMA/PCI region covered by this offset.
+ * @size:	size of the region.
+ *
+ * This is for the simple case of a uniform offset which cannot
+ * be discovered by "dma-ranges".
+ *
+ * It returns -ENOMEM if out of memory, -EINVAL if a conflicting map exists, otherwise 0.
+ */
+int dma_set_offset_range(struct device *dev, phys_addr_t cpu_start,
+			    dma_addr_t dma_start, u64 size)
+{
+	struct bus_dma_region *map;
+	u64 offset = (u64)cpu_start - (u64)dma_start;
+
+	if (!offset)
+		return 0;
+
+	/*
+	 * See if a map already exists and we already encompass the new range:
+	 */
+	if (dev->dma_range_map) {
+		if (dma_range_overlaps(dev, cpu_start, dma_start, size, offset))
+			return 0;
+		dev_err(dev, "attempt to add conflicting DMA range to existing map\n");
+		return -EINVAL;
+	}
+
+	map = kcalloc(2, sizeof(*map), GFP_KERNEL);
+	if (!map)
+		return -ENOMEM;
+	map[0].cpu_start = cpu_start;
+	map[0].dma_start = dma_start;
+	map[0].offset = offset;
+	map[0].size = size;
+	dev->dma_range_map = map;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dma_set_offset_range);
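
For readers following the range lookup outside a kernel tree, here is a
minimal userspace sketch of how dma_offset_from_phys_addr() and
dma_offset_from_dma_addr() pick a per-region offset. It is not part of
the patch. The struct and function names are hypothetical stand-ins, and
the two regions only mirror the addresses used by the pci@90000000
unittest above; the 0x10000000 size of the second region is an
assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the patch's struct bus_dma_region (hypothetical). */
struct region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/* Example map; an entry with size == 0 terminates the array. */
static const struct region demo_map[] = {
	{ 0x20000000ULL, 0x80000000ULL, 0x10000000ULL,
	  0x20000000ULL - 0x80000000ULL },
	{ 0x40000000ULL, 0xc0000000ULL, 0x10000000ULL,	/* size assumed */
	  0x40000000ULL - 0xc0000000ULL },
	{ 0 },
};

/* Offset of the region whose CPU window contains paddr, else 0. */
static uint64_t offset_from_phys(const struct region *m, uint64_t paddr)
{
	if (!m)
		return 0;
	for (; m->size; m++)
		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
			return m->offset;
	return 0;
}

/* Offset of the region whose DMA window contains dma_addr, else 0. */
static uint64_t offset_from_dma(const struct region *m, uint64_t dma_addr)
{
	if (!m)
		return 0;
	for (; m->size; m++)
		if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
			return m->offset;
	return 0;
}

/* Same arithmetic as __phys_to_dma() in the patch. */
static uint64_t phys_to_dma(const struct region *m, uint64_t paddr)
{
	return paddr - offset_from_phys(m, paddr);
}

/* Same arithmetic as __dma_to_phys() in the patch. */
static uint64_t dma_to_phys(const struct region *m, uint64_t dma_addr)
{
	return dma_addr + offset_from_dma(m, dma_addr);
}

/* Small self-check: translations round-trip inside each region. */
static void range_map_selfcheck(void)
{
	assert(phys_to_dma(demo_map, 0x20000000ULL) == 0x80000000ULL);
	assert(dma_to_phys(demo_map, 0xc0000100ULL) == 0x40000100ULL);
	/* Addresses outside every region are passed through unchanged. */
	assert(phys_to_dma(demo_map, 0x10000000ULL) == 0x10000000ULL);
}
```

The offset is kept as an unsigned 64-bit difference, so a DMA address
above the CPU address wraps around and the subtraction in phys_to_dma()
still yields the intended result. Calling range_map_selfcheck() from a
test harness exercises both regions and the pass-through case.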

> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> ---
>  arch/arm/include/asm/dma-mapping.h            |  9 +-
>  arch/arm/mach-keystone/keystone.c             | 17 ++--
>  arch/sh/drivers/pci/pcie-sh7786.c             |  9 +-
>  arch/sh/kernel/dma-coherent.c                 | 16 ++--
>  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
>  drivers/acpi/arm64/iort.c                     |  5 +-
>  drivers/gpu/drm/sun4i/sun4i_backend.c         |  5 +-
>  drivers/iommu/io-pgtable-arm.c                |  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  4 +-
>  drivers/of/address.c                          | 95 ++++++++++---------
>  drivers/of/device.c                           | 47 +++++----
>  drivers/of/of_private.h                       |  9 +-
>  drivers/of/unittest.c                         | 35 +++++--
>  drivers/remoteproc/remoteproc_core.c          |  2 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
>  drivers/usb/core/message.c                    |  4 +-
>  drivers/usb/core/usb.c                        |  2 +-
>  include/linux/device.h                        |  4 +-
>  include/linux/dma-direct.h                    | 10 +-
>  include/linux/dma-mapping.h                   | 43 +++++++++
>  include/linux/pfn.h                           |  2 +
>  kernel/dma/coherent.c                         | 10 +-
>  kernel/dma/mapping.c                          | 53 +++++++++++
>  24 files changed, 278 insertions(+), 124 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index bdd80ddbca34..b7cdde9fb83d 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -35,8 +35,9 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> -	if (dev)
> -		pfn -= dev->dma_pfn_offset;
> +	if (dev && dev->dma_range_map)
> +		pfn -= DMA_ADDR_PFN(dma_offset_from_phys_addr(dev, PFN_PHYS(pfn)));
> +
>  	return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>  
> @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
>  {
>  	unsigned long pfn = __bus_to_pfn(addr);
>  
> -	if (dev)
> -		pfn += dev->dma_pfn_offset;
> +	if (dev && dev->dma_range_map)
> +		pfn += DMA_ADDR_PFN(dma_offset_from_dma_addr(dev, addr));
>  
>  	return pfn;
>  }
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
> index 638808c4e122..a1a19781983b 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/io.h>
>  #include <linux/of.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/init.h>
>  #include <linux/of_platform.h>
>  #include <linux/of_address.h>
> @@ -24,8 +25,6 @@
>  
>  #include "keystone.h"
>  
> -static unsigned long keystone_dma_pfn_offset __read_mostly;
> -
>  static int keystone_platform_notifier(struct notifier_block *nb,
>  				      unsigned long event, void *data)
>  {
> @@ -38,9 +37,12 @@ static int keystone_platform_notifier(struct notifier_block *nb,
>  		return NOTIFY_BAD;
>  
>  	if (!dev->of_node) {
> -		dev->dma_pfn_offset = keystone_dma_pfn_offset;
> -		dev_err(dev, "set dma_pfn_offset%08lx\n",
> -			dev->dma_pfn_offset);
> +		int ret = dma_attach_offset_range(dev, KEYSTONE_HIGH_PHYS_START,
> +						  KEYSTONE_LOW_PHYS_START,
> +						  KEYSTONE_HIGH_PHYS_SIZE);
> +		dev_err(dev, "set dma_offset%08llx%s\n",
> +			KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
> +			ret ? " failed" : "");
>  	}
>  	return NOTIFY_OK;
>  }
> @@ -51,11 +53,8 @@ static struct notifier_block platform_nb = {
>  
>  static void __init keystone_init(void)
>  {
> -	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
> -		keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
> -						   KEYSTONE_LOW_PHYS_START);
> +	if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START)
>  		bus_register_notifier(&platform_bus_type, &platform_nb);
> -	}
>  	keystone_pm_runtime_init();
>  }
>  
> diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
> index e0b568aaa701..716bb99022c6 100644
> --- a/arch/sh/drivers/pci/pcie-sh7786.c
> +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> @@ -12,6 +12,7 @@
>  #include <linux/io.h>
>  #include <linux/async.h>
>  #include <linux/delay.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/slab.h>
>  #include <linux/clk.h>
>  #include <linux/sh_clk.h>
> @@ -31,6 +32,8 @@ struct sh7786_pcie_port {
>  static struct sh7786_pcie_port *sh7786_pcie_ports;
>  static unsigned int nr_ports;
>  static unsigned long dma_pfn_offset;
> +size_t memsize;
> +u64 memstart;
>  
>  static struct sh7786_pcie_hwops {
>  	int (*core_init)(void);
> @@ -301,7 +304,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
>  	struct pci_channel *chan = port->hose;
>  	unsigned int data;
>  	phys_addr_t memstart, memend;
> -	size_t memsize;
>  	int ret, i, win;
>  
>  	/* Begin initialization */
> @@ -368,8 +370,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
>  	memstart = ALIGN_DOWN(memstart, memsize);
>  	memsize = roundup_pow_of_two(memend - memstart);
>  
> -	dma_pfn_offset = memstart >> PAGE_SHIFT;
> -
>  	/*
>  	 * If there's more than 512MB of memory, we need to roll over to
>  	 * LAR1/LAMR1.
> @@ -487,7 +487,8 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
>  
>  void pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> -	pdev->dev.dma_pfn_offset = dma_pfn_offset;
> +	dma_attach_offset_range(&pdev->dev, __pa(memory_start),
> +				__pa(memory_start) - memstart, memsize);
>  }
>  
>  static int __init sh7786_pcie_core_init(void)
> diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> index d4811691b93c..e00f29c7c443 100644
> --- a/arch/sh/kernel/dma-coherent.c
> +++ b/arch/sh/kernel/dma-coherent.c
> @@ -14,6 +14,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  {
>  	void *ret, *ret_nocache;
>  	int order = get_order(size);
> +	phys_addr_t phys;
>  
>  	gfp |= __GFP_ZERO;
>  
> @@ -34,11 +35,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  		return NULL;
>  	}
>  
> -	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> +	phys = virt_to_phys(ret);
> +	split_page(pfn_to_page(PHYS_PFN(phys)), order);
>  
> -	*dma_handle = virt_to_phys(ret);
> -	if (!WARN_ON(!dev))
> -		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> +	*dma_handle = (dma_addr_t)phys;
> +	if (!WARN_ON(!dev) && dev->dma_range_map)
> +		*dma_handle -= dma_offset_from_phys_addr(dev, phys);
>  
>  	return ret_nocache;
>  }
> @@ -47,11 +49,11 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>  		dma_addr_t dma_handle, unsigned long attrs)
>  {
>  	int order = get_order(size);
> -	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
> +	unsigned long pfn = PHYS_PFN(dma_handle);
>  	int k;
>  
> -	if (!WARN_ON(!dev))
> -		pfn += dev->dma_pfn_offset;
> +	if (!WARN_ON(!dev) && dev->dma_range_map)
> +		pfn += DMA_ADDR_PFN(dma_offset_from_dma_addr(dev, dma_handle));
>  
>  	for (k = 0; k < (1 << order); k++)
>  		__free_pages(pfn_to_page(pfn + k), 0);
> diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> index c313d784efab..74633ccf622e 100644
> --- a/arch/x86/pci/sta2x11-fixup.c
> +++ b/arch/x86/pci/sta2x11-fixup.c
> @@ -12,6 +12,7 @@
>  #include <linux/export.h>
>  #include <linux/list.h>
>  #include <linux/dma-direct.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/iommu.h>
>  
>  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
>  	struct device *dev = &pdev->dev;
>  	u32 amba_base, max_amba_addr;
> -	int i;
> +	int i, ret;
>  
>  	if (!instance)
>  		return;
> @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
>  	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> +	ret = dma_attach_offset_range(dev, 0, amba_base, STA2X11_AMBA_SIZE);
> +	if (ret)
> +		dev_err(dev, "sta2x11: could not set DMA offset\n");
>  
>  	dev->bus_dma_limit = max_amba_addr;
>  	pci_set_consistent_dma_mask(pdev, max_amba_addr);
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 28a6b387e80e..41c2d861ce43 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
>  	*dma_addr = dmaaddr;
>  	*dma_size = size;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(offset);
> -	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> +	ret = dma_attach_offset_range(dev, dmaaddr + offset, dmaaddr, size);
> +
> +	dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : "");
>  }
>  
>  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
> index 072ea113e6be..cbe49a07983c 100644
> --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/of_device.h>
>  #include <linux/of_graph.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>  
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = dma_attach_offset_range(drm->dev, PHYS_OFFSET, 0, SZ_4G);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..d5542df9aacc 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_range_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..95a5d5655056 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = dma_attach_offset_range(csi->dev, PHYS_OFFSET, 0, SZ_4G);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..c26fc1cdd4d2 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,9 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = dma_attach_offset_range(sdev->dev, PHYS_OFFSET, 0, SZ_4G);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 8eea3f6e29a4..5d9117a1cb16 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,33 +918,65 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static const struct bus_dma_region *dma_create_offset_map(struct device_node *node,
> +							  int num_ranges)
> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct bus_dma_region *map, *r;
> +	int ret;
> +
> +	r = kcalloc(num_ranges + 1, sizeof(*r), GFP_KERNEL);
> +	if (!r)
> +		return ERR_PTR(-ENOMEM);
> +
> +	map = r;
> +	ret = of_dma_range_parser_init(&parser, node);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	/*
> +	 * Record all info for the DMA ranges array.  We use our own
> +	 * struct (bus_dma_region) so it is not dependent on
> +	 * CONFIG_OF.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> +			 range.bus_addr, range.cpu_addr, range.size);
> +		r->cpu_start = range.cpu_addr;
> +		r->dma_start = range.bus_addr;
> +		r->size = range.size;
> +		r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;
> +		r++;
> +	}
> +	return map;
> +}
> +
>  /**
> - * of_dma_get_range - Get DMA range info
> + * of_dma_get_range - Get DMA range info and put it into a map array
>   * @np:		device node to get DMA range info
> - * @dma_addr:	pointer to store initial DMA address of DMA range
> - * @paddr:	pointer to store initial CPU address of DMA range
> - * @size:	pointer to store size of DMA range
>   *
>   * Look in bottom up direction for the first "dma-ranges" property
> - * and parse it.
> - *  dma-ranges format:
> + * and parse it.  Put the information into a DMA offset map array.
> + *
> + * dma-ranges format:
>   *	DMA addr (dma_addr)	: naddr cells
>   *	CPU addr (phys_addr_t)	: pna cells
>   *	size			: nsize cells
>   *
> - * It returns -ENODEV if "dma-ranges" property was not found
> - * for this device in DT.
> + * It returns -ENODEV if "dma-ranges" property was not found for this
> + * device in the DT.
>   */
> -int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
> +const struct bus_dma_region *of_dma_get_range(struct device_node *np)
>  {
> +	const struct bus_dma_region *map = NULL;
>  	struct device_node *node = of_node_get(np);
> +	struct of_range_parser parser;
>  	const __be32 *ranges = NULL;
> -	int len;
> -	int ret = 0;
>  	bool found_dma_ranges = false;
> -	struct of_range_parser parser;
>  	struct of_range range;
> -	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> +	int len, num_ranges = 0;
> +	int ret = 0;
>  
>  	while (node) {
>  		ranges = of_get_property(node, "dma-ranges", &len);
> @@ -971,42 +1003,13 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
>  
>  	of_dma_range_parser_init(&parser, node);
>  
> -	for_each_of_range(&parser, &range) {
> -		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> -			 range.bus_addr, range.cpu_addr, range.size);
> -
> -		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
> -			pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
> -			/* Don't error out as we'd break some existing DTs */
> -			continue;
> -		}
> -		dma_offset = range.cpu_addr - range.bus_addr;
> -
> -		/* Take lower and upper limits */
> -		if (range.bus_addr < dma_start)
> -			dma_start = range.bus_addr;
> -		if (range.bus_addr + range.size > dma_end)
> -			dma_end = range.bus_addr + range.size;
> -	}
> -
> -	if (dma_start >= dma_end) {
> -		ret = -EINVAL;
> -		pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
> -			 node);
> -		goto out;
> -	}
> -
> -	*dma_addr = dma_start;
> -	*size = dma_end - dma_start;
> -	*paddr = dma_start + dma_offset;
> -
> -	pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> -		 *dma_addr, *paddr, *size);
> +	for_each_of_range(&parser, &range)
> +		num_ranges++;
>  
> +	map = dma_create_offset_map(node, num_ranges);
>  out:
>  	of_node_put(node);
> -
> -	return ret;
> +	return map ? map : ERR_PTR(ret);
>  }
>  
>  /**
> diff --git a/drivers/of/device.c b/drivers/of/device.c
> index 27203bfd0b22..fea2f31d4245 100644
> --- a/drivers/of/device.c
> +++ b/drivers/of/device.c
> @@ -88,14 +88,15 @@ int of_device_add(struct platform_device *ofdev)
>   */
>  int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>  {
> -	u64 dma_addr, paddr, size = 0;
> -	int ret;
> -	bool coherent;
> -	unsigned long offset;
>  	const struct iommu_ops *iommu;
> -	u64 mask, end;
> +	const struct bus_dma_region *map;
> +	dma_addr_t dma_start = 0;
> +	u64 mask, end, size = 0;
> +	bool coherent;
> +	int ret;
>  
> -	ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
> +	map = of_dma_get_range(np);
> +	ret = PTR_ERR_OR_ZERO(map);
>  	if (ret < 0) {
>  		/*
>  		 * For legacy reasons, we have to assume some devices need
> @@ -105,25 +106,36 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>  		if (!force_dma)
>  			return ret == -ENODEV ? 0 : ret;
>  
> -		dma_addr = offset = 0;
> -	} else {
> -		offset = PFN_DOWN(paddr - dma_addr);
> +		dma_start = 0;
> +		map = NULL;
> +	} else if (map) {
> +		const struct bus_dma_region *r = map;
> +		dma_addr_t dma_end = 0;
> +
> +		/* Determine the overall bounds of all DMA regions */
> +		for (dma_start = ~(dma_addr_t)0; r->size; r++) {
> +			/* Take lower and upper limits */
> +			if (r->dma_start < dma_start)
> +				dma_start = r->dma_start;
> +			if (r->dma_start + r->size > dma_end)
> +				dma_end = r->dma_start + r->size;
> +		}
> +		size = dma_end - dma_start;
>  
>  		/*
>  		 * Add a work around to treat the size as mask + 1 in case
>  		 * it is defined in DT as a mask.
>  		 */
>  		if (size & 1) {
> -			dev_warn(dev, "Invalid size 0x%llx for dma-range\n",
> -				 size);
> +			dev_warn(dev, "Invalid size 0x%llx for dma-range(s)\n", size);
>  			size = size + 1;
>  		}
>  
>  		if (!size) {
>  			dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
> +			kfree(map);
>  			return -EINVAL;
>  		}
> -		dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>  	}
>  
>  	/*
> @@ -142,13 +154,11 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>  	else if (!size)
>  		size = 1ULL << 32;
>  
> -	dev->dma_pfn_offset = offset;
> -
>  	/*
>  	 * Limit coherent and dma mask based on size and default mask
>  	 * set by the driver.
>  	 */
> -	end = dma_addr + size - 1;
> +	end = dma_start + size - 1;
>  	mask = DMA_BIT_MASK(ilog2(end) + 1);
>  	dev->coherent_dma_mask &= mask;
>  	*dev->dma_mask &= mask;
> @@ -161,14 +171,17 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>  		coherent ? " " : " not ");
>  
>  	iommu = of_iommu_configure(dev, np);
> -	if (PTR_ERR(iommu) == -EPROBE_DEFER)
> +	if (PTR_ERR(iommu) == -EPROBE_DEFER) {
> +		kfree(map);
>  		return -EPROBE_DEFER;
> +	}
>  
>  	dev_dbg(dev, "device is%sbehind an iommu\n",
>  		iommu ? " " : " not ");
>  
> -	arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
> +	arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
>  
> +	dev->dma_range_map = map;
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(of_dma_configure);
> diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> index edc682249c00..876149e721c5 100644
> --- a/drivers/of/of_private.h
> +++ b/drivers/of/of_private.h
> @@ -157,14 +157,13 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
>  extern int of_bus_n_addr_cells(struct device_node *np);
>  extern int of_bus_n_size_cells(struct device_node *np);
>  
> +struct bus_dma_region;
>  #ifdef CONFIG_OF_ADDRESS
> -extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> -			    u64 *paddr, u64 *size);
> +extern const struct bus_dma_region *of_dma_get_range(struct device_node *np);
>  #else
> -static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> -				   u64 *paddr, u64 *size)
> +static inline const struct bus_dma_region *of_dma_get_range(struct device_node *np)
>  {
> -	return -ENODEV;
> +	return ERR_PTR(-ENODEV);
>  }
>  #endif
>  
> diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
> index 398de04fd19c..542d092f19c2 100644
> --- a/drivers/of/unittest.c
> +++ b/drivers/of/unittest.c
> @@ -7,6 +7,7 @@
>  
>  #include <linux/memblock.h>
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/err.h>
>  #include <linux/errno.h>
>  #include <linux/hashtable.h>
> @@ -869,10 +870,10 @@ static void __init of_unittest_changeset(void)
>  }
>  
>  static void __init of_unittest_dma_ranges_one(const char *path,
> -		u64 expect_dma_addr, u64 expect_paddr, u64 expect_size)
> +		u64 expect_dma_addr, u64 expect_paddr)
>  {
>  	struct device_node *np;
> -	u64 dma_addr, paddr, size;
> +	const struct bus_dma_region *map = NULL;
>  	int rc;
>  
>  	np = of_find_node_by_path(path);
> @@ -881,16 +882,27 @@ static void __init of_unittest_dma_ranges_one(const char *path,
>  		return;
>  	}
>  
> -	rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
> -
> +	map = of_dma_get_range(np);
> +	rc = PTR_ERR_OR_ZERO(map);
>  	unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
> -	if (!rc) {
> -		unittest(size == expect_size,
> -			 "of_dma_get_range wrong size on node %pOF size=%llx\n", np, size);
> +
> +	if (!rc && map) {
> +		phys_addr_t	paddr;
> +		dma_addr_t	dma_addr;
> +		struct device	dev_bogus;
> +
> +		dev_bogus.dma_range_map = map;
> +		paddr = (phys_addr_t)expect_dma_addr
> +			+ dma_offset_from_dma_addr(&dev_bogus, expect_dma_addr);
> +		dma_addr = (dma_addr_t)expect_paddr
> +			- dma_offset_from_phys_addr(&dev_bogus, expect_paddr);
> +
>  		unittest(paddr == expect_paddr,
>  			 "of_dma_get_range wrong phys addr (%llx) on node %pOF", paddr, np);
>  		unittest(dma_addr == expect_dma_addr,
>  			 "of_dma_get_range wrong DMA addr (%llx) on node %pOF", dma_addr, np);
> +
> +		kfree(map);
>  	}
>  	of_node_put(np);
>  }
> @@ -898,11 +910,14 @@ static void __init of_unittest_dma_ranges_one(const char *path,
>  static void __init of_unittest_parse_dma_ranges(void)
>  {
>  	of_unittest_dma_ranges_one("/testcase-data/address-tests/device@70000000",
> -		0x0, 0x20000000, 0x40000000);
> +		0x0, 0x20000000);
>  	of_unittest_dma_ranges_one("/testcase-data/address-tests/bus@80000000/device@1000",
> -		0x10000000, 0x20000000, 0x40000000);
> +		0x10000000, 0x20000000);
> +	/* pci@90000000 has two ranges in the dma-range property */
> +	of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
> +		0x80000000, 0x20000000);
>  	of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
> -		0x80000000, 0x20000000, 0x10000000);
> +		0xc0000000, 0x40000000);
>  }
>  
>  static void __init of_unittest_pci_dma_ranges(void)
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 9f04c30c4aaf..49242dd6176e 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -519,7 +519,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
>  	/* Initialise vdev subdevice */
>  	snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
>  	rvdev->dev.parent = &rproc->dev;
> -	rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
> +	rvdev->dev.dma_range_map = rproc->dev.parent->dma_range_map;
>  	rvdev->dev.release = rproc_rvdev_release;
>  	dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
>  	dev_set_drvdata(&rvdev->dev, rvdev);
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> index 1744e6fcc999..720b41eca7a3 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> @@ -230,8 +230,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
>  	 */
>  
>  #ifdef PHYS_PFN_OFFSET
> -	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
> -		dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
> +		ret = dma_attach_offset_range(dev->dev, PHYS_OFFSET, 0, SZ_4G);
> +		if (ret)
> +			return ret;
> +	}
>  #endif
>  
>  	ret = of_reserved_mem_device_init(dev->dev);
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 6197938dcc2d..376ca258e510 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
>  		intf->dev.groups = usb_interface_groups;
>  		/*
>  		 * Please refer to usb_alloc_dev() to see why we set
> -		 * dma_mask and dma_pfn_offset.
> +		 * dma_mask and dma_range_map.
>  		 */
>  		intf->dev.dma_mask = dev->dev.dma_mask;
> -		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> +		intf->dev.dma_range_map = dev->dev.dma_range_map;
>  		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
>  		intf->minor = -1;
>  		device_initialize(&intf->dev);
> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> index f16c26dc079d..1f167a2c095e 100644
> --- a/drivers/usb/core/usb.c
> +++ b/drivers/usb/core/usb.c
> @@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
>  	 * mask for the entire HCD, so don't do that.
>  	 */
>  	dev->dev.dma_mask = bus->sysdev->dma_mask;
> -	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
> +	dev->dev.dma_range_map = bus->sysdev->dma_range_map;
>  	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
>  	dev->state = USB_STATE_ATTACHED;
>  	dev->lpm_disable_count = 1;
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 15460a5ac024..feddefcf3e5c 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -492,7 +492,7 @@ struct dev_links_info {
>   * 		such descriptors.
>   * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
>   *		DMA limit than the device itself supports.
> - * @dma_pfn_offset: offset of DMA memory range relatively of RAM
> + * @dma_range_map: map for DMA memory ranges relative to that of RAM
>   * @dma_parms:	A low level driver may set these to teach IOMMU code about
>   * 		segment limitations.
>   * @dma_pools:	Dma pools (if dma'ble device).
> @@ -577,7 +577,7 @@ struct device {
>  					     64 bit addresses for consistent
>  					     allocations such descriptors. */
>  	u64		bus_dma_limit;	/* upstream dma constraint */
> -	unsigned long	dma_pfn_offset;
> +	const struct bus_dma_region *dma_range_map;
>  
>  	struct device_dma_parameters *dma_parms;
>  
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index cdfa400f89b3..182784d28cfd 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -15,14 +15,20 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
>  {
>  	dma_addr_t dev_addr = (dma_addr_t)paddr;
>  
> -	return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
> +	if (dev->dma_range_map)
> +		dev_addr -= dma_offset_from_phys_addr(dev, paddr);
> +
> +	return dev_addr;
>  }
>  
>  static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
>  {
>  	phys_addr_t paddr = (phys_addr_t)dev_addr;
>  
> -	return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
> +	if (dev->dma_range_map)
> +		paddr += dma_offset_from_dma_addr(dev, dev_addr);
> +
> +	return paddr;
>  }
>  #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
>  
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 78f677cf45ab..7c8fcac30e74 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -255,7 +255,37 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
>  
>  size_t dma_direct_max_mapping_size(struct device *dev);
>  
> +struct bus_dma_region {
> +	phys_addr_t	cpu_start;
> +	dma_addr_t	dma_start;
> +	u64		size;
> +	u64		offset;
> +};
> +
>  #ifdef CONFIG_HAS_DMA
> +int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
> +		dma_addr_t dma_start, u64 size);
> +
> +static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
> +{
> +	const struct bus_dma_region *m = dev->dma_range_map;
> +
> +	for (; m->size; m++)
> +		if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
> +			return m->offset;
> +	return 0;
> +}
> +
> +static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
> +{
> +	const struct bus_dma_region *m = dev->dma_range_map;
> +
> +	for (; m->size; m++)
> +		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
> +			return m->offset;
> +	return 0;
> +}
> +
>  #include <asm/dma-mapping.h>
>  
>  static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
> @@ -463,6 +493,19 @@ u64 dma_get_required_mask(struct device *dev);
>  size_t dma_max_mapping_size(struct device *dev);
>  unsigned long dma_get_merge_boundary(struct device *dev);
>  #else /* CONFIG_HAS_DMA */
> +static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
> +{
> +	return (u64)0;
> +}
> +static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
> +{
> +	return (u64)0;
> +}
> +static inline int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
> +		dma_addr_t dma_start, u64 size)
> +{
> +	return -EIO;
> +}
>  static inline dma_addr_t dma_map_page_attrs(struct device *dev,
>  		struct page *page, size_t offset, size_t size,
>  		enum dma_data_direction dir, unsigned long attrs)
> diff --git a/include/linux/pfn.h b/include/linux/pfn.h
> index 14bc053c53d8..eddb535075a0 100644
> --- a/include/linux/pfn.h
> +++ b/include/linux/pfn.h
> @@ -20,5 +20,7 @@ typedef struct {
>  #define PFN_DOWN(x)	((x) >> PAGE_SHIFT)
>  #define PFN_PHYS(x)	((phys_addr_t)(x) << PAGE_SHIFT)
>  #define PHYS_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))
> +#define PFN_DMA_ADDR(x)	((dma_addr_t)(x) << PAGE_SHIFT)
> +#define DMA_ADDR_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))
>  
>  #endif
> diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
> index 2a0c4985f38e..66b1ac611c61 100644
> --- a/kernel/dma/coherent.c
> +++ b/kernel/dma/coherent.c
> @@ -31,10 +31,12 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
>  static inline dma_addr_t dma_get_device_base(struct device *dev,
>  					     struct dma_coherent_mem * mem)
>  {
> -	if (mem->use_dev_dma_pfn_offset)
> -		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
> -	else
> -		return mem->device_base;
> +	if (mem->use_dev_dma_pfn_offset && dev->dma_range_map) {
> +		u64 dma_offset = dma_offset_from_phys_addr(dev, PFN_PHYS(mem->pfn_base));
> +
> +		return PFN_DMA_ADDR(mem->pfn_base) - dma_offset;
> +	}
> +	return mem->device_base;
>  }
>  
>  static int dma_init_coherent_memory(phys_addr_t phys_addr,
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 98e3d873792e..2c08c4991bfa 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -11,6 +11,7 @@
>  #include <linux/dma-noncoherent.h>
>  #include <linux/export.h>
>  #include <linux/gfp.h>
> +#include <linux/limits.h>
>  #include <linux/of_device.h>
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
> @@ -407,3 +408,55 @@ unsigned long dma_get_merge_boundary(struct device *dev)
>  	return ops->get_merge_boundary(dev);
>  }
>  EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
> +
> +/**
> + * dma_attach_offset_range - Assign scalar offset for a single DMA range.
> + * @dev:	device pointer; needed to "own" the alloced memory.
> + * @cpu_start:  beginning of memory region covered by this offset.
> + * @dma_start:  beginning of DMA/PCI region covered by this offset.
> + * @size:	size of the region.
> + *
> + * This is for the simple case of a uniform offset which cannot
> + * be discovered by "dma-ranges".
> + *
> + * Returns 0, or -ENODEV (dev == NULL), -EINVAL (conflicting range), -ENOMEM.
> + */
> +int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
> +			    dma_addr_t dma_start, u64 size)
> +{
> +	struct bus_dma_region *map;
> +	u64 offset = (u64)cpu_start - (u64)dma_start;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	/* See if a map already exists and we already encompass the new range */
> +	if (dev->dma_range_map) {
> +		const struct bus_dma_region *m = dev->dma_range_map;
> +
> +		for (; m->size; m++)
> +			if (offset == m->offset && cpu_start >= m->cpu_start
> +			    && size <= m->size && cpu_start - m->cpu_start <= m->size - size)
> +				return 0;
> +
> +		dev_err(dev, "attempt to add conflicting DMA range to existing map\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!offset)
> +		return 0;
> +
> +	/* Don't use devm_kcalloc() since this may be called as a bus notifier */
> +	map = kcalloc(2, sizeof(*map), GFP_KERNEL);
> +	if (!map)
> +		return -ENOMEM;
> +	dev->dma_range_map = map;
> +
> +	map->cpu_start = cpu_start;
> +	map->dma_start = dma_start;
> +	map->offset = offset;
> +	map->size = size;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(dma_attach_offset_range);
> -- 
> 2.17.1
---end quoted text---
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

* Re: [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset
  2020-07-21 12:51   ` Christoph Hellwig
@ 2020-07-22 22:37     ` Jim Quinlan via iommu
  0 siblings, 0 replies; 5+ messages in thread
From: Jim Quinlan via iommu @ 2020-07-22 22:37 UTC (permalink / raw)
  To: Christoph Hellwig, Robin Murphy
  Cc: Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Julien Grall, Heikki Krogerus, H. Peter Anvin,
	Will Deacon, Dan Williams, open list:STAGING SUBSYSTEM,
	Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Maxime Ripard, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS,
	open list:USB SUBSYSTEM, Stefano Stabellini, Daniel Vetter,
	Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER

On Tue, Jul 21, 2020 at 8:51 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Jul 15, 2020 at 10:35:11AM -0400, Jim Quinlan wrote:
> > The new field 'dma_range_map' in struct device is used to facilitate the
> > use of single or multiple offsets between mapping regions of cpu addrs and
> > dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> > capable of holding a single uniform offset and had no region bounds
> > checking.
> >
> > The function of_dma_get_range() has been modified so that it takes a single
> > argument -- the device node -- and returns a map, NULL, or an error code.
> > The map is an array that holds the information regarding the DMA regions.
> > Each range entry contains the address offset, the cpu_start address, the
> > dma_start address, and the size of the region.
> >
> > of_dma_configure() is the typical manner to set range offsets but there are
> > a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> > driver code.  These cases now invoke the function
> > dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
>
> So my main higher level issue here is the dma_attach_offset_range
> function.  I think it should keep the old functionality and just
> set a global range from 0 to (phys_addr_t)-1, and bail out if there
> are DMA ranges already:
>
>         int dma_set_global_offset(struct device *dev, u64 offset);

Hi Christoph,

I had it this way in [V1...V5], but for V6 Robin requested that I
change this function to
    o add bounds to the call
    o if a mapping already exists, check whether the requested range is
      already covered and, if so, return success.

Can you and Robin please discuss this and let me know which way to move forward?
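
For reference, the two behaviours under discussion (a bounded lookup and
the "already covered" check) can be sketched in plain userspace C.  This
is only an illustration under stated assumptions: struct region,
offset_from_cpu(), range_is_covered() and example_map are stand-in names
made up for this sketch, not kernel API.  The conditions are meant to
mirror the ones in the v8 dma_offset_from_phys_addr() and
dma_attach_offset_range() quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for struct bus_dma_region (hypothetical name). */
struct region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
	uint64_t offset;	/* cpu_start - dma_start */
};

/* One 1 GiB region: CPU 0x40000000 maps to DMA 0x0.  A zero-size entry ends the map. */
static const struct region example_map[] = {
	{ 0x40000000ULL, 0x0ULL, 0x40000000ULL, 0x40000000ULL },
	{ 0, 0, 0, 0 },
};

/* Return the offset of the region containing paddr, or 0 if no region matches. */
static uint64_t offset_from_cpu(const struct region *m, uint64_t paddr)
{
	for (; m->size; m++)
		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
			return m->offset;
	return 0;
}

/*
 * Return 1 if the requested range has the same offset as an existing
 * region and lies entirely inside it, else 0.  Written so that no sum
 * of start + size can overflow.
 */
static int range_is_covered(const struct region *m, uint64_t cpu_start,
			    uint64_t dma_start, uint64_t size)
{
	uint64_t offset = cpu_start - dma_start;

	for (; m->size; m++)
		if (offset == m->offset && cpu_start >= m->cpu_start &&
		    size <= m->size &&
		    cpu_start - m->cpu_start <= m->size - size)
			return 1;
	return 0;
}

/* Small self-check showing the intended behaviour on example_map. */
static void example_self_check(void)
{
	/* Inside the region: the region's offset applies. */
	assert(offset_from_cpu(example_map, 0x40000000ULL) == 0x40000000ULL);
	/* One byte past the end: no region matches, so the offset is 0. */
	assert(offset_from_cpu(example_map, 0x80000000ULL) == 0);
	/* A sub-range with the same offset is already covered. */
	assert(range_is_covered(example_map, 0x50000000ULL, 0x10000000ULL, 0x1000ULL));
	/* A range that runs past the end of the region is not. */
	assert(!range_is_covered(example_map, 0x7ffff000ULL, 0x3ffff000ULL, 0x2000ULL));
}
```

With a single global offset (the dma_set_global_offset() variant) the
second function would not be needed at all; with bounded ranges, a
request that is not covered is the case that currently returns -EINVAL.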

>
>
> otherwise there is all kinds of minor nitpicks that aren't too
> substantial, let me know what you think of something like this
> hacked up version:
It is kind of hard to see what you have changed, but I will diff our
two diffs and make the changes.

Thanks,
Jim Quinlan
Broadcom STB

>
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index bdd80ddbca3451..2405afeb79573a 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -35,8 +35,11 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> -       if (dev)
> -               pfn -= dev->dma_pfn_offset;
> +       if (dev) {
> +               phys_addr_t paddr = PFN_PHYS(pfn);
> +
> +               pfn -= (dma_offset_from_phys_addr(dev, paddr) >> PAGE_SHIFT);
> +       }
>         return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>
> @@ -45,8 +48,7 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
>         unsigned long pfn = __bus_to_pfn(addr);
>
>         if (dev)
> -               pfn += dev->dma_pfn_offset;
> -
> +               pfn += (dma_offset_from_dma_addr(dev, addr) >> PAGE_SHIFT);
>         return pfn;
>  }
>
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
> index 638808c4e12247..7539679205fbf7 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/io.h>
>  #include <linux/of.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/init.h>
>  #include <linux/of_platform.h>
>  #include <linux/of_address.h>
> @@ -24,8 +25,6 @@
>
>  #include "keystone.h"
>
> -static unsigned long keystone_dma_pfn_offset __read_mostly;
> -
>  static int keystone_platform_notifier(struct notifier_block *nb,
>                                       unsigned long event, void *data)
>  {
> @@ -38,9 +37,12 @@ static int keystone_platform_notifier(struct notifier_block *nb,
>                 return NOTIFY_BAD;
>
>         if (!dev->of_node) {
> -               dev->dma_pfn_offset = keystone_dma_pfn_offset;
> -               dev_err(dev, "set dma_pfn_offset%08lx\n",
> -                       dev->dma_pfn_offset);
> +               int ret = dma_set_offset_range(dev, KEYSTONE_HIGH_PHYS_START,
> +                                                   KEYSTONE_LOW_PHYS_START,
> +                                                   KEYSTONE_HIGH_PHYS_SIZE);
> +               dev_err(dev, "set dma_offset%08llx%s\n",
> +                       KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
> +                       ret ? " failed" : "");
>         }
>         return NOTIFY_OK;
>  }
> @@ -51,11 +53,8 @@ static struct notifier_block platform_nb = {
>
>  static void __init keystone_init(void)
>  {
> -       if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
> -               keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
> -                                                  KEYSTONE_LOW_PHYS_START);
> +       if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START)
>                 bus_register_notifier(&platform_bus_type, &platform_nb);
> -       }
>         keystone_pm_runtime_init();
>  }
>
> diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
> index e0b568aaa7014c..e929f85c503852 100644
> --- a/arch/sh/drivers/pci/pcie-sh7786.c
> +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> @@ -12,6 +12,7 @@
>  #include <linux/io.h>
>  #include <linux/async.h>
>  #include <linux/delay.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/slab.h>
>  #include <linux/clk.h>
>  #include <linux/sh_clk.h>
> @@ -31,6 +32,8 @@ struct sh7786_pcie_port {
>  static struct sh7786_pcie_port *sh7786_pcie_ports;
>  static unsigned int nr_ports;
>  static unsigned long dma_pfn_offset;
> +size_t memsize;
> +u64 memstart;
>
>  static struct sh7786_pcie_hwops {
>         int (*core_init)(void);
> @@ -301,7 +304,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
>         struct pci_channel *chan = port->hose;
>         unsigned int data;
> -       phys_addr_t memstart, memend;
> -       size_t memsize;
> +       phys_addr_t memend;
>         int ret, i, win;
>
>         /* Begin initialization */
> @@ -368,8 +370,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
>         memstart = ALIGN_DOWN(memstart, memsize);
>         memsize = roundup_pow_of_two(memend - memstart);
>
> -       dma_pfn_offset = memstart >> PAGE_SHIFT;
> -
>         /*
>          * If there's more than 512MB of memory, we need to roll over to
>          * LAR1/LAMR1.
> @@ -487,7 +487,8 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
>
>  void pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> -       pdev->dev.dma_pfn_offset = dma_pfn_offset;
> +       dma_set_offset_range(&pdev->dev, __pa(memory_start),
> +                            __pa(memory_start) - memstart, memsize);
>  }
>
>  static int __init sh7786_pcie_core_init(void)
> diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> index d4811691b93cc1..003a91719b3794 100644
> --- a/arch/sh/kernel/dma-coherent.c
> +++ b/arch/sh/kernel/dma-coherent.c
> @@ -14,6 +14,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>  {
>         void *ret, *ret_nocache;
>         int order = get_order(size);
> +       phys_addr_t phys;
>
>         gfp |= __GFP_ZERO;
>
> @@ -34,12 +35,10 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>                 return NULL;
>         }
>
> -       split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> -
> -       *dma_handle = virt_to_phys(ret);
> -       if (!WARN_ON(!dev))
> -               *dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> +       phys = virt_to_phys(ret);
> +       split_page(pfn_to_page(PHYS_PFN(phys)), order);
>
> +       *dma_handle = (dma_addr_t)phys - dma_offset_from_phys_addr(dev, phys);
>         return ret_nocache;
>  }
>
> @@ -47,12 +46,10 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>                 dma_addr_t dma_handle, unsigned long attrs)
>  {
>         int order = get_order(size);
> -       unsigned long pfn = (dma_handle >> PAGE_SHIFT);
> +       unsigned long pfn;
>         int k;
>
> -       if (!WARN_ON(!dev))
> -               pfn += dev->dma_pfn_offset;
> -
> +       pfn = PHYS_PFN(dma_handle + dma_offset_from_dma_addr(dev, dma_handle));
>         for (k = 0; k < (1 << order); k++)
>                 __free_pages(pfn_to_page(pfn + k), 0);
>
> diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> index c313d784efabb9..ea3a58323f81d1 100644
> --- a/arch/x86/pci/sta2x11-fixup.c
> +++ b/arch/x86/pci/sta2x11-fixup.c
> @@ -12,6 +12,7 @@
>  #include <linux/export.h>
>  #include <linux/list.h>
>  #include <linux/dma-direct.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/iommu.h>
>
>  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>         struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
>         struct device *dev = &pdev->dev;
>         u32 amba_base, max_amba_addr;
> -       int i;
> +       int i, ret;
>
>         if (!instance)
>                 return;
> @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>         pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
>         max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
>
> -       dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> +       ret = dma_set_offset_range(dev, 0, amba_base, STA2X11_AMBA_SIZE);
> +       if (ret)
> +               dev_err(dev, "sta2x11: could not set DMA offset\n");
>
>         dev->bus_dma_limit = max_amba_addr;
>         pci_set_consistent_dma_mask(pdev, max_amba_addr);
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 28a6b387e80e28..a3e04c003a2187 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
>         *dma_addr = dmaaddr;
>         *dma_size = size;
>
> -       dev->dma_pfn_offset = PFN_DOWN(offset);
> -       dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> +       ret = dma_set_offset_range(dev, dmaaddr + offset, dmaaddr, size);
> +
> +       dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : "");
>  }
>
>  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
> index 072ea113e6be55..48a4adf1f04edc 100644
> --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/of_device.h>
>  #include <linux/of_graph.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>                  * on our device since the RAM mapping is at 0 for the DMA bus,
>                  * unlike the CPU.
>                  */
> -               drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +               ret = dma_set_offset_range(drm->dev, PHYS_OFFSET, 0, SZ_4G);
> +               if (ret)
> +                       return ret;
>         }
>
>         backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9fd..d5542df9aacc01 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>         if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>                 return NULL;
>
> -       if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +       if (!selftest_running && cfg->iommu_dev->dma_range_map) {
>                 dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
>                 return NULL;
>         }
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded63055d..d6eda02fd3fc93 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>                         return ret;
>         } else {
>  #ifdef PHYS_PFN_OFFSET
> -               csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +               ret = dma_set_offset_range(csi->dev, PHYS_OFFSET, 0, SZ_4G);
> +               if (ret)
> +                       return ret;
>  #endif
>         }
>
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e39692..450fce6cd8d21b 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,9 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>
>         sdev->dev = &pdev->dev;
>         /* The DMA bus has the memory mapped at 0 */
> -       sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +       ret = dma_set_offset_range(sdev->dev, PHYS_OFFSET, 0, SZ_4G);
> +       if (ret)
> +               return ret;
>
>         ret = sun6i_csi_resource_request(sdev, pdev);
>         if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 8eea3f6e29a441..083ec3531bcceb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,33 +918,33 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>
> +#ifdef CONFIG_HAS_DMA
>  /**
> - * of_dma_get_range - Get DMA range info
> + * of_dma_get_range - Get DMA range info and put it into a map array
>   * @np:                device node to get DMA range info
> - * @dma_addr:  pointer to store initial DMA address of DMA range
> - * @paddr:     pointer to store initial CPU address of DMA range
> - * @size:      pointer to store size of DMA range
> + * @map:       dma range structure to return
>   *
>   * Look in bottom up direction for the first "dma-ranges" property
> - * and parse it.
> - *  dma-ranges format:
> + * and parse it.  Put the information into a DMA offset map array.
> + *
> + * dma-ranges format:
>   *     DMA addr (dma_addr)     : naddr cells
>   *     CPU addr (phys_addr_t)  : pna cells
>   *     size                    : nsize cells
>   *
> - * It returns -ENODEV if "dma-ranges" property was not found
> - * for this device in DT.
> + * It returns -ENODEV if "dma-ranges" property was not found for this
> + * device in the DT.
>   */
> -int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
> +int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map)
>  {
>         struct device_node *node = of_node_get(np);
>         const __be32 *ranges = NULL;
> -       int len;
> -       int ret = 0;
>         bool found_dma_ranges = false;
>         struct of_range_parser parser;
>         struct of_range range;
> -       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> +       struct bus_dma_region *r;
> +       int len, num_ranges = 0;
> +       int ret;
>
>         while (node) {
>                 ranges = of_get_property(node, "dma-ranges", &len);
> @@ -970,44 +970,34 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
>         }
>
>         of_dma_range_parser_init(&parser, node);
> +       for_each_of_range(&parser, &range)
> +               num_ranges++;
> +
> +       of_dma_range_parser_init(&parser, node);
> +
> +       r = kcalloc(num_ranges + 1, sizeof(*r), GFP_KERNEL);
> +       ret = r ? 0 : -ENOMEM;
> +       if (!r)
> +               goto out;
>
> +       /*
> +        * Record all info in the generic DMA ranges array for struct device.
> +        */
> +       *map = r;
>         for_each_of_range(&parser, &range) {
>                 pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
>                          range.bus_addr, range.cpu_addr, range.size);
> -
> -               if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
> -                       pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
> -                       /* Don't error out as we'd break some existing DTs */
> -                       continue;
> -               }
> -               dma_offset = range.cpu_addr - range.bus_addr;
> -
> -               /* Take lower and upper limits */
> -               if (range.bus_addr < dma_start)
> -                       dma_start = range.bus_addr;
> -               if (range.bus_addr + range.size > dma_end)
> -                       dma_end = range.bus_addr + range.size;
> +               r->cpu_start = range.cpu_addr;
> +               r->dma_start = range.bus_addr;
> +               r->size = range.size;
> +               r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;
> +               r++;
>         }
> -
> -       if (dma_start >= dma_end) {
> -               ret = -EINVAL;
> -               pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
> -                        node);
> -               goto out;
> -       }
> -
> -       *dma_addr = dma_start;
> -       *size = dma_end - dma_start;
> -       *paddr = dma_start + dma_offset;
> -
> -       pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> -                *dma_addr, *paddr, *size);
> -
>  out:
>         of_node_put(node);
> -
>         return ret;
>  }
> +#endif
>
>  /**
>   * of_dma_is_coherent - Check if device is coherent
> diff --git a/drivers/of/device.c b/drivers/of/device.c
> index 27203bfd0b22dc..0c84f42a23e42e 100644
> --- a/drivers/of/device.c
> +++ b/drivers/of/device.c
> @@ -88,14 +88,14 @@ int of_device_add(struct platform_device *ofdev)
>   */
>  int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>  {
> -       u64 dma_addr, paddr, size = 0;
> -       int ret;
> -       bool coherent;
> -       unsigned long offset;
>         const struct iommu_ops *iommu;
> -       u64 mask, end;
> +       const struct bus_dma_region *map = NULL;
> +       dma_addr_t dma_start = 0;
> +       u64 mask, end, size = 0;
> +       bool coherent;
> +       int ret;
>
> -       ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
> +       ret = of_dma_get_range(np, &map);
>         if (ret < 0) {
>                 /*
>                  * For legacy reasons, we have to assume some devices need
> @@ -104,26 +104,34 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>                  */
>                 if (!force_dma)
>                         return ret == -ENODEV ? 0 : ret;
> -
> -               dma_addr = offset = 0;
>         } else {
> -               offset = PFN_DOWN(paddr - dma_addr);
> +               const struct bus_dma_region *r = map;
> +               dma_addr_t dma_end = 0;
> +
> +               /* Determine the overall bounds of all DMA regions */
> +               for (dma_start = ~(dma_addr_t)0; r->size; r++) {
> +                       /* Take lower and upper limits */
> +                       if (r->dma_start < dma_start)
> +                               dma_start = r->dma_start;
> +                       if (r->dma_start + r->size > dma_end)
> +                               dma_end = r->dma_start + r->size;
> +               }
> +               size = dma_end - dma_start;
>
>                 /*
>                  * Add a work around to treat the size as mask + 1 in case
>                  * it is defined in DT as a mask.
>                  */
>                 if (size & 1) {
> -                       dev_warn(dev, "Invalid size 0x%llx for dma-range\n",
> -                                size);
> +                       dev_warn(dev, "Invalid size 0x%llx for dma-range(s)\n", size);
>                         size = size + 1;
>                 }
>
>                 if (!size) {
>                         dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
> +                       kfree(map);
>                         return -EINVAL;
>                 }
> -               dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>         }
>
>         /*
> @@ -142,13 +150,11 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>         else if (!size)
>                 size = 1ULL << 32;
>
> -       dev->dma_pfn_offset = offset;
> -
>         /*
>          * Limit coherent and dma mask based on size and default mask
>          * set by the driver.
>          */
> -       end = dma_addr + size - 1;
> +       end = dma_start + size - 1;
>         mask = DMA_BIT_MASK(ilog2(end) + 1);
>         dev->coherent_dma_mask &= mask;
>         *dev->dma_mask &= mask;
> @@ -161,14 +167,17 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>                 coherent ? " " : " not ");
>
>         iommu = of_iommu_configure(dev, np);
> -       if (PTR_ERR(iommu) == -EPROBE_DEFER)
> +       if (PTR_ERR(iommu) == -EPROBE_DEFER) {
> +               kfree(map);
>                 return -EPROBE_DEFER;
> +       }
>
>         dev_dbg(dev, "device is%sbehind an iommu\n",
>                 iommu ? " " : " not ");
>
> -       arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
> +       arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
>
> +       dev->dma_range_map = map;
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(of_dma_configure);
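
To make the bounds logic in of_dma_configure() easier to check in isolation,
here is a small userspace sketch. It is not kernel code: struct region,
dma_span(), covering_mask() and the two-entry example_map are stand-ins made
up for illustration, with field names borrowed from struct bus_dma_region.
covering_mask() is meant to match DMA_BIT_MASK(ilog2(end) + 1) for non-zero
end, and unlike the hunk above the sketch reports an empty map as a zero span.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for struct bus_dma_region (illustrative only). */
struct region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/* Hypothetical two-range map, terminated by a zero-size sentinel. */
static const struct region example_map[] = {
	{ 0x20000000ULL, 0x80000000ULL, 0x10000000ULL,
	  0x20000000ULL - 0x80000000ULL },
	{ 0x40000000ULL, 0xc0000000ULL, 0x10000000ULL,
	  0x40000000ULL - 0xc0000000ULL },
	{ 0 }
};

/*
 * Find the lowest DMA start and the total span covered by all regions.
 * Returns the span, or 0 (leaving *dma_start untouched) for an empty map.
 */
static inline uint64_t dma_span(const struct region *r, uint64_t *dma_start)
{
	uint64_t start = UINT64_MAX, end = 0;

	for (; r->size; r++) {
		if (r->dma_start < start)
			start = r->dma_start;
		if (r->dma_start + r->size > end)
			end = r->dma_start + r->size;
	}
	if (start >= end)
		return 0;
	*dma_start = start;
	return end - start;
}

/* Smallest all-ones mask that is >= end. */
static inline uint64_t covering_mask(uint64_t end)
{
	uint64_t mask = 0;

	while (mask < end)
		mask = (mask << 1) | 1;
	return mask;
}
```

With the example map the span runs from 0x80000000 to 0xd0000000, so the
last addressable DMA byte is 0xcfffffff and a 32-bit mask is sufficient.
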
> diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> index edc682249c0015..768406b4156b21 100644
> --- a/drivers/of/of_private.h
> +++ b/drivers/of/of_private.h
> @@ -157,12 +157,12 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
>  extern int of_bus_n_addr_cells(struct device_node *np);
>  extern int of_bus_n_size_cells(struct device_node *np);
>
> -#ifdef CONFIG_OF_ADDRESS
> -extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> -                           u64 *paddr, u64 *size);
> +struct bus_dma_region;
> +#if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
> +int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map);
>  #else
> -static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> -                                  u64 *paddr, u64 *size)
> +static inline int of_dma_get_range(struct device_node *np,
> +               const struct bus_dma_region **map)
>  {
>         return -ENODEV;
>  }
> diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
> index 398de04fd19c94..8d0c9bf495d2ef 100644
> --- a/drivers/of/unittest.c
> +++ b/drivers/of/unittest.c
> @@ -7,6 +7,7 @@
>
>  #include <linux/memblock.h>
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/err.h>
>  #include <linux/errno.h>
>  #include <linux/hashtable.h>
> @@ -869,10 +870,10 @@ static void __init of_unittest_changeset(void)
>  }
>
>  static void __init of_unittest_dma_ranges_one(const char *path,
> -               u64 expect_dma_addr, u64 expect_paddr, u64 expect_size)
> +               u64 expect_dma_addr, u64 expect_paddr)
>  {
>         struct device_node *np;
> -       u64 dma_addr, paddr, size;
> +       const struct bus_dma_region *map = NULL;
>         int rc;
>
>         np = of_find_node_by_path(path);
> @@ -881,16 +882,26 @@ static void __init of_unittest_dma_ranges_one(const char *path,
>                 return;
>         }
>
> -       rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
> -
> +       rc = of_dma_get_range(np, &map);
>         unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
> +
>         if (!rc) {
> -               unittest(size == expect_size,
> -                        "of_dma_get_range wrong size on node %pOF size=%llx\n", np, size);
> +               phys_addr_t     paddr;
> +               dma_addr_t      dma_addr;
> +               struct device   dev_bogus;
> +
> +               dev_bogus.dma_range_map = map;
> +               paddr = (phys_addr_t)expect_dma_addr +
> +                       dma_offset_from_dma_addr(&dev_bogus, expect_dma_addr);
> +               dma_addr = (dma_addr_t)expect_paddr -
> +                       dma_offset_from_phys_addr(&dev_bogus, expect_paddr);
> +
>                 unittest(paddr == expect_paddr,
>                          "of_dma_get_range wrong phys addr (%llx) on node %pOF", paddr, np);
>                 unittest(dma_addr == expect_dma_addr,
>                          "of_dma_get_range wrong DMA addr (%llx) on node %pOF", dma_addr, np);
> +
> +               kfree(map);
>         }
>         of_node_put(np);
>  }
> @@ -898,11 +909,14 @@ static void __init of_unittest_dma_ranges_one(const char *path,
>  static void __init of_unittest_parse_dma_ranges(void)
>  {
>         of_unittest_dma_ranges_one("/testcase-data/address-tests/device@70000000",
> -               0x0, 0x20000000, 0x40000000);
> +               0x0, 0x20000000);
>         of_unittest_dma_ranges_one("/testcase-data/address-tests/bus@80000000/device@1000",
> -               0x10000000, 0x20000000, 0x40000000);
> +               0x10000000, 0x20000000);
> +       /* pci@90000000 has two ranges in the dma-range property */
> +       of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
> +               0x80000000, 0x20000000);
>         of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
> -               0x80000000, 0x20000000, 0x10000000);
> +               0xc0000000, 0x40000000);
>  }
>
>  static void __init of_unittest_pci_dma_ranges(void)
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 9f04c30c4aaf7a..49242dd6176e30 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -519,7 +519,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
>         /* Initialise vdev subdevice */
>         snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
>         rvdev->dev.parent = &rproc->dev;
> -       rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
> +       rvdev->dev.dma_range_map = rproc->dev.parent->dma_range_map;
>         rvdev->dev.release = rproc_rvdev_release;
>         dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
>         dev_set_drvdata(&rvdev->dev, rvdev);
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> index 1744e6fcc99980..249e4bddaa4014 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> @@ -230,8 +230,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
>          */
>
>  #ifdef PHYS_PFN_OFFSET
> -       if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
> -               dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +       if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
> +               ret = dma_set_offset_range(dev->dev, PHYS_OFFSET, 0, SZ_4G);
> +               if (ret)
> +                       return ret;
> +       }
>  #endif
>
>         ret = of_reserved_mem_device_init(dev->dev);
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 6197938dcc2d8f..376ca258e510bf 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
>                 intf->dev.groups = usb_interface_groups;
>                 /*
>                  * Please refer to usb_alloc_dev() to see why we set
> -                * dma_mask and dma_pfn_offset.
> +                * dma_mask and dma_range_map.
>                  */
>                 intf->dev.dma_mask = dev->dev.dma_mask;
> -               intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> +               intf->dev.dma_range_map = dev->dev.dma_range_map;
>                 INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
>                 intf->minor = -1;
>                 device_initialize(&intf->dev);
> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> index f16c26dc079d79..1f167a2c095e9a 100644
> --- a/drivers/usb/core/usb.c
> +++ b/drivers/usb/core/usb.c
> @@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
>          * mask for the entire HCD, so don't do that.
>          */
>         dev->dev.dma_mask = bus->sysdev->dma_mask;
> -       dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
> +       dev->dev.dma_range_map = bus->sysdev->dma_range_map;
>         set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
>         dev->state = USB_STATE_ATTACHED;
>         dev->lpm_disable_count = 1;
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 15460a5ac024a1..feddefcf3e5c20 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -492,7 +492,7 @@ struct dev_links_info {
>   *             such descriptors.
>   * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
>   *             DMA limit than the device itself supports.
> - * @dma_pfn_offset: offset of DMA memory range relatively of RAM
> + * @dma_range_map: map for DMA memory ranges relative to that of RAM
>   * @dma_parms: A low level driver may set these to teach IOMMU code about
>   *             segment limitations.
>   * @dma_pools: Dma pools (if dma'ble device).
> @@ -577,7 +577,7 @@ struct device {
>                                              64 bit addresses for consistent
>                                              allocations such descriptors. */
>         u64             bus_dma_limit;  /* upstream dma constraint */
> -       unsigned long   dma_pfn_offset;
> +       const struct bus_dma_region *dma_range_map;
>
>         struct device_dma_parameters *dma_parms;
>
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index 5184735a0fe8eb..810d27692674bc 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -13,16 +13,12 @@ extern unsigned int zone_dma_bits;
>  #else
>  static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
>  {
> -       dma_addr_t dev_addr = (dma_addr_t)paddr;
> -
> -       return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
> +       return (dma_addr_t)paddr - dma_offset_from_phys_addr(dev, paddr);
>  }
>
>  static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
>  {
> -       phys_addr_t paddr = (phys_addr_t)dev_addr;
> -
> -       return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
> +       return (phys_addr_t)dev_addr + dma_offset_from_dma_addr(dev, dev_addr);
>  }
>  #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index a33ed3954ed465..5938c7ca2abcce 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -255,7 +255,38 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
>
>  size_t dma_direct_max_mapping_size(struct device *dev);
>
> +struct bus_dma_region {
> +       phys_addr_t     cpu_start;
> +       dma_addr_t      dma_start;
> +       u64             size;
> +       u64             offset;
> +};
> +
>  #ifdef CONFIG_HAS_DMA
> +static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
> +{
> +       const struct bus_dma_region *m = dev->dma_range_map;
> +
> +       if (!m)
> +               return 0;
> +       for (; m->size; m++)
> +               if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
> +                       return m->offset;
> +       return 0;
> +}
> +
> +static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
> +{
> +       const struct bus_dma_region *m = dev->dma_range_map;
> +
> +       if (!m)
> +               return 0;
> +       for (; m->size; m++)
> +               if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
> +                       return m->offset;
> +       return 0;
> +}
> +
>  #include <asm/dma-mapping.h>
>
>  static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
> @@ -801,6 +832,9 @@ static inline void arch_teardown_dma_ops(struct device *dev)
>  }
>  #endif /* CONFIG_ARCH_HAS_TEARDOWN_DMA_OPS */
>
> +int dma_set_offset_range(struct device *dev, phys_addr_t cpu_start,
> +               dma_addr_t dma_start, u64 size);
> +
>  static inline unsigned int dma_get_max_seg_size(struct device *dev)
>  {
>         if (dev->dma_parms && dev->dma_parms->max_segment_size)
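
For readers who want to see the two lookup helpers work outside the kernel,
a minimal userspace sketch follows. The types and names (struct region,
offset_from_phys(), offset_from_dma(), phys_to_dma(), dma_to_phys(),
example_map) are assumptions for illustration; only the loop structure and
the zero-size sentinel convention are taken from the hunk above. The offset
is kept as an unsigned 64-bit value, so a "negative" offset relies on
wraparound arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct bus_dma_region (illustrative only). */
struct region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/* Hypothetical two-range map, terminated by a zero-size sentinel. */
static const struct region example_map[] = {
	{ 0x20000000ULL, 0x80000000ULL, 0x10000000ULL,
	  0x20000000ULL - 0x80000000ULL },
	{ 0x40000000ULL, 0xc0000000ULL, 0x10000000ULL,
	  0x40000000ULL - 0xc0000000ULL },
	{ 0 }
};

/* Offset of the region containing CPU address paddr, or 0 if none. */
static inline uint64_t offset_from_phys(const struct region *m, uint64_t paddr)
{
	if (!m)
		return 0;
	for (; m->size; m++)
		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
			return m->offset;
	return 0;
}

/* Offset of the region containing DMA address dma, or 0 if none. */
static inline uint64_t offset_from_dma(const struct region *m, uint64_t dma)
{
	if (!m)
		return 0;
	for (; m->size; m++)
		if (dma >= m->dma_start && dma - m->dma_start < m->size)
			return m->offset;
	return 0;
}

static inline uint64_t phys_to_dma(const struct region *m, uint64_t paddr)
{
	return paddr - offset_from_phys(m, paddr);
}

static inline uint64_t dma_to_phys(const struct region *m, uint64_t dma)
{
	return dma + offset_from_dma(m, dma);
}
```

In this sketch an address outside every region, or a NULL map, translates
to itself, which mirrors the "return 0" fallthrough in both helpers.
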
> diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
> index 2a0c4985f38e41..751969d6185325 100644
> --- a/kernel/dma/coherent.c
> +++ b/kernel/dma/coherent.c
> @@ -31,10 +31,12 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
>  static inline dma_addr_t dma_get_device_base(struct device *dev,
>                                              struct dma_coherent_mem * mem)
>  {
> -       if (mem->use_dev_dma_pfn_offset)
> -               return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
> -       else
> -               return mem->device_base;
> +       if (mem->use_dev_dma_pfn_offset) {
> +               u64 base_addr = (u64)mem->pfn_base << PAGE_SHIFT;
> +
> +               return base_addr - dma_offset_from_phys_addr(dev, base_addr);
> +       }
> +       return mem->device_base;
>  }
>
>  static int dma_init_coherent_memory(phys_addr_t phys_addr,
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index a8c18c9a796fdc..dc8017a106fd55 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -11,6 +11,7 @@
>  #include <linux/dma-noncoherent.h>
>  #include <linux/export.h>
>  #include <linux/gfp.h>
> +#include <linux/limits.h>
>  #include <linux/of_device.h>
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
> @@ -417,3 +418,62 @@ unsigned long dma_get_merge_boundary(struct device *dev)
>         return ops->get_merge_boundary(dev);
>  }
>  EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
> +
> +static bool dma_range_overlaps(struct device *dev, phys_addr_t cpu_start,
> +               dma_addr_t dma_start, u64 size, u64 offset)
> +{
> +       const struct bus_dma_region *m;
> +
> +       for (m = dev->dma_range_map; m->size; m++) {
> +               if (offset == m->offset &&
> +                   cpu_start >= m->cpu_start &&
> +                   size <= m->size &&
> +                   cpu_start - m->cpu_start <= m->size - size)
> +                       return true;
> +       }
> +
> +       return false;
> +}
> +
> +/**
> + * dma_set_offset_range - Assign scalar offset for a single DMA range.
> + * @dev:       device pointer; needed to "own" the alloced memory.
> + * @cpu_start:  beginning of memory region covered by this offset.
> + * @dma_start:  beginning of DMA/PCI region covered by this offset.
> + * @size:      size of the region.
> + *
> + * This is for the simple case of a uniform offset which cannot
> + * be discovered by "dma-ranges".
> + *
> + * Returns 0 on success, -ENOMEM if out of memory, -EINVAL if a conflicting map exists.
> + */
> +int dma_set_offset_range(struct device *dev, phys_addr_t cpu_start,
> +                           dma_addr_t dma_start, u64 size)
> +{
> +       struct bus_dma_region *map;
> +       u64 offset = (u64)cpu_start - (u64)dma_start;
> +
> +       if (!offset)
> +               return 0;
> +
> +       /*
> +        * See if a map already exists and we already encompass the new range:
> +        */
> +       if (dev->dma_range_map) {
> +               if (dma_range_overlaps(dev, cpu_start, dma_start, size, offset))
> +                       return 0;
> +               dev_err(dev, "attempt to add conflicting DMA range to existing map\n");
> +               return -EINVAL;
> +       }
> +
> +       map = kcalloc(2, sizeof(*map), GFP_KERNEL);
> +       if (!map)
> +               return -ENOMEM;
> +       map[0].cpu_start = cpu_start;
> +       map[0].dma_start = dma_start;
> +       map[0].offset = offset;
> +       map[0].size = size;
> +       dev->dma_range_map = map;
> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(dma_set_offset_range);
>
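
The behaviour of dma_set_offset_range() when a map already exists is easy to
misread, so here is a userspace sketch of the same decision logic. Everything
in it is a stand-in: struct fake_dev, range_contained() and
set_offset_range() are hypothetical names, and calloc() replaces kcalloc().
Note that the helper the patch calls dma_range_overlaps() is really a
containment test: it only accepts a new range that lies wholly inside an
existing region with an identical offset.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for struct bus_dma_region (illustrative only). */
struct region {
	uint64_t cpu_start;
	uint64_t dma_start;
	uint64_t size;
	uint64_t offset;	/* cpu_start - dma_start, modulo 2^64 */
};

/* Hypothetical stand-in for the one struct device field used here. */
struct fake_dev {
	const struct region *map;
};

/* True if [cpu_start, cpu_start + size) sits inside a same-offset region. */
static inline bool range_contained(const struct region *m, uint64_t cpu_start,
				   uint64_t size, uint64_t offset)
{
	for (; m->size; m++)
		if (offset == m->offset &&
		    cpu_start >= m->cpu_start &&
		    size <= m->size &&
		    cpu_start - m->cpu_start <= m->size - size)
			return true;
	return false;
}

static inline int set_offset_range(struct fake_dev *dev, uint64_t cpu_start,
				   uint64_t dma_start, uint64_t size)
{
	uint64_t offset = cpu_start - dma_start;
	struct region *map;

	if (!offset)		/* identity mapping: nothing to record */
		return 0;

	if (dev->map)		/* existing map must already cover the range */
		return range_contained(dev->map, cpu_start, size, offset) ?
			0 : -EINVAL;

	map = calloc(2, sizeof(*map));	/* one entry plus zeroed sentinel */
	if (!map)
		return -ENOMEM;
	map[0].cpu_start = cpu_start;
	map[0].dma_start = dma_start;
	map[0].offset = offset;
	map[0].size = size;
	dev->map = map;
	return 0;
}
```

Under these assumptions, a second call with a sub-range and the same offset
is a no-op returning 0, while a call with a different offset returns -EINVAL
and leaves the existing map untouched.
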
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> > ---
> >  arch/arm/include/asm/dma-mapping.h            |  9 +-
> >  arch/arm/mach-keystone/keystone.c             | 17 ++--
> >  arch/sh/drivers/pci/pcie-sh7786.c             |  9 +-
> >  arch/sh/kernel/dma-coherent.c                 | 16 ++--
> >  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
> >  drivers/acpi/arm64/iort.c                     |  5 +-
> >  drivers/gpu/drm/sun4i/sun4i_backend.c         |  5 +-
> >  drivers/iommu/io-pgtable-arm.c                |  2 +-
> >  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
> >  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  4 +-
> >  drivers/of/address.c                          | 95 ++++++++++---------
> >  drivers/of/device.c                           | 47 +++++----
> >  drivers/of/of_private.h                       |  9 +-
> >  drivers/of/unittest.c                         | 35 +++++--
> >  drivers/remoteproc/remoteproc_core.c          |  2 +-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
> >  drivers/usb/core/message.c                    |  4 +-
> >  drivers/usb/core/usb.c                        |  2 +-
> >  include/linux/device.h                        |  4 +-
> >  include/linux/dma-direct.h                    | 10 +-
> >  include/linux/dma-mapping.h                   | 43 +++++++++
> >  include/linux/pfn.h                           |  2 +
> >  kernel/dma/coherent.c                         | 10 +-
> >  kernel/dma/mapping.c                          | 53 +++++++++++
> >  24 files changed, 278 insertions(+), 124 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index bdd80ddbca34..b7cdde9fb83d 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -35,8 +35,9 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
> >  #ifndef __arch_pfn_to_dma
> >  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> >  {
> > -     if (dev)
> > -             pfn -= dev->dma_pfn_offset;
> > +     if (dev && dev->dma_range_map)
> > +             pfn -= DMA_ADDR_PFN(dma_offset_from_phys_addr(dev, PFN_PHYS(pfn)));
> > +
> >       return (dma_addr_t)__pfn_to_bus(pfn);
> >  }
> >
> > @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
> >  {
> >       unsigned long pfn = __bus_to_pfn(addr);
> >
> > -     if (dev)
> > -             pfn += dev->dma_pfn_offset;
> > +     if (dev && dev->dma_range_map)
> > +             pfn += DMA_ADDR_PFN(dma_offset_from_dma_addr(dev, addr));
> >
> >       return pfn;
> >  }
> > diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
> > index 638808c4e122..a1a19781983b 100644
> > --- a/arch/arm/mach-keystone/keystone.c
> > +++ b/arch/arm/mach-keystone/keystone.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include <linux/io.h>
> >  #include <linux/of.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/init.h>
> >  #include <linux/of_platform.h>
> >  #include <linux/of_address.h>
> > @@ -24,8 +25,6 @@
> >
> >  #include "keystone.h"
> >
> > -static unsigned long keystone_dma_pfn_offset __read_mostly;
> > -
> >  static int keystone_platform_notifier(struct notifier_block *nb,
> >                                     unsigned long event, void *data)
> >  {
> > @@ -38,9 +37,12 @@ static int keystone_platform_notifier(struct notifier_block *nb,
> >               return NOTIFY_BAD;
> >
> >       if (!dev->of_node) {
> > -             dev->dma_pfn_offset = keystone_dma_pfn_offset;
> > -             dev_err(dev, "set dma_pfn_offset%08lx\n",
> > -                     dev->dma_pfn_offset);
> > +             int ret = dma_attach_offset_range(dev, KEYSTONE_HIGH_PHYS_START,
> > +                                               KEYSTONE_LOW_PHYS_START,
> > +                                               KEYSTONE_HIGH_PHYS_SIZE);
> > +             dev_err(dev, "set dma_offset%08llx%s\n",
> > +                     KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
> > +                     ret ? " failed" : "");
> >       }
> >       return NOTIFY_OK;
> >  }
> > @@ -51,11 +53,8 @@ static struct notifier_block platform_nb = {
> >
> >  static void __init keystone_init(void)
> >  {
> > -     if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
> > -             keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
> > -                                                KEYSTONE_LOW_PHYS_START);
> > +     if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START)
> >               bus_register_notifier(&platform_bus_type, &platform_nb);
> > -     }
> >       keystone_pm_runtime_init();
> >  }
> >
> > diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
> > index e0b568aaa701..716bb99022c6 100644
> > --- a/arch/sh/drivers/pci/pcie-sh7786.c
> > +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/io.h>
> >  #include <linux/async.h>
> >  #include <linux/delay.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/slab.h>
> >  #include <linux/clk.h>
> >  #include <linux/sh_clk.h>
> > @@ -31,6 +32,8 @@ struct sh7786_pcie_port {
> >  static struct sh7786_pcie_port *sh7786_pcie_ports;
> >  static unsigned int nr_ports;
> >  static unsigned long dma_pfn_offset;
> > +size_t memsize;
> > +u64 memstart;
> >
> >  static struct sh7786_pcie_hwops {
> >       int (*core_init)(void);
> > @@ -301,7 +304,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
> >       struct pci_channel *chan = port->hose;
> >       unsigned int data;
> >       phys_addr_t memstart, memend;
> > -     size_t memsize;
> >       int ret, i, win;
> >
> >       /* Begin initialization */
> > @@ -368,8 +370,6 @@ static int __init pcie_init(struct sh7786_pcie_port *port)
> >       memstart = ALIGN_DOWN(memstart, memsize);
> >       memsize = roundup_pow_of_two(memend - memstart);
> >
> > -     dma_pfn_offset = memstart >> PAGE_SHIFT;
> > -
> >       /*
> >        * If there's more than 512MB of memory, we need to roll over to
> >        * LAR1/LAMR1.
> > @@ -487,7 +487,8 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
> >
> >  void pcibios_bus_add_device(struct pci_dev *pdev)
> >  {
> > -     pdev->dev.dma_pfn_offset = dma_pfn_offset;
> > +     dma_attach_offset_range(&pdev->dev, __pa(memory_start),
> > +                             __pa(memory_start) - memstart, memsize);
> >  }
> >
> >  static int __init sh7786_pcie_core_init(void)
> > diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> > index d4811691b93c..e00f29c7c443 100644
> > --- a/arch/sh/kernel/dma-coherent.c
> > +++ b/arch/sh/kernel/dma-coherent.c
> > @@ -14,6 +14,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> >  {
> >       void *ret, *ret_nocache;
> >       int order = get_order(size);
> > +     phys_addr_t phys;
> >
> >       gfp |= __GFP_ZERO;
> >
> > @@ -34,11 +35,12 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> >               return NULL;
> >       }
> >
> > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > +     phys = virt_to_phys(ret);
> > +     split_page(pfn_to_page(PHYS_PFN(phys)), order);
> >
> > -     *dma_handle = virt_to_phys(ret);
> > -     if (!WARN_ON(!dev))
> > -             *dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> > +     *dma_handle = (dma_addr_t)phys;
> > +     if (!WARN_ON(!dev) && dev->dma_range_map)
> > +             *dma_handle -= dma_offset_from_phys_addr(dev, phys);
> >
> >       return ret_nocache;
> >  }
> > @@ -47,11 +49,11 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
> >               dma_addr_t dma_handle, unsigned long attrs)
> >  {
> >       int order = get_order(size);
> > -     unsigned long pfn = (dma_handle >> PAGE_SHIFT);
> > +     unsigned long pfn = PHYS_PFN(dma_handle);
> >       int k;
> >
> > -     if (!WARN_ON(!dev))
> > -             pfn += dev->dma_pfn_offset;
> > +     if (!WARN_ON(!dev) && dev->dma_range_map)
> > +             pfn += DMA_ADDR_PFN(dma_offset_from_dma_addr(dev, dma_handle));
> >
> >       for (k = 0; k < (1 << order); k++)
> >               __free_pages(pfn_to_page(pfn + k), 0);
> > diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> > index c313d784efab..74633ccf622e 100644
> > --- a/arch/x86/pci/sta2x11-fixup.c
> > +++ b/arch/x86/pci/sta2x11-fixup.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/export.h>
> >  #include <linux/list.h>
> >  #include <linux/dma-direct.h>
> > +#include <linux/dma-mapping.h>
> >  #include <asm/iommu.h>
> >
> >  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> > @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
> >       struct device *dev = &pdev->dev;
> >       u32 amba_base, max_amba_addr;
> > -     int i;
> > +     int i, ret;
> >
> >       if (!instance)
> >               return;
> > @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
> >       max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> > +     ret = dma_attach_offset_range(dev, 0, amba_base, STA2X11_AMBA_SIZE);
> > +     if (ret)
> > +             dev_err(dev, "sta2x11: could not set DMA offset\n");
> >
> >       dev->bus_dma_limit = max_amba_addr;
> >       pci_set_consistent_dma_mask(pdev, max_amba_addr);
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 28a6b387e80e..41c2d861ce43 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
> >       *dma_addr = dmaaddr;
> >       *dma_size = size;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(offset);
> > -     dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> > +     ret = dma_attach_offset_range(dev, dmaaddr + offset, dmaaddr, size);
> > +
> > +     dev_dbg(dev, "dma_offset(%#08llx)%s\n", offset, ret ? " failed!" : "");
> >  }
> >
> >  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> > diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > index 072ea113e6be..cbe49a07983c 100644
> > --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/module.h>
> >  #include <linux/of_device.h>
> >  #include <linux/of_graph.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/platform_device.h>
> >  #include <linux/reset.h>
> >
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = dma_attach_offset_range(drm->dev, PHYS_OFFSET, 0, SZ_4G);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..d5542df9aacc 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_range_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..95a5d5655056 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = dma_attach_offset_range(csi->dev, PHYS_OFFSET, 0, SZ_4G);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..c26fc1cdd4d2 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,9 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = dma_attach_offset_range(sdev->dev, PHYS_OFFSET, 0, SZ_4G);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 8eea3f6e29a4..5d9117a1cb16 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,33 +918,65 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static const struct bus_dma_region *dma_create_offset_map(struct device_node *node,
> > +                                                       int num_ranges)
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct bus_dma_region *map, *r;
> > +     int ret;
> > +
> > +     r = kcalloc(num_ranges + 1, sizeof(*r), GFP_KERNEL);
> > +     if (!r)
> > +             return ERR_PTR(-ENOMEM);
> > +
> > +     map = r;
> > +     ret = of_dma_range_parser_init(&parser, node);
> > +     if (ret)
> > +             return ERR_PTR(ret);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We use our
> > +      * own struct (bus_dma_region) so it is not dependent
> > +      * on CONFIG_OF.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > +                      range.bus_addr, range.cpu_addr, range.size);
> > +             r->cpu_start = range.cpu_addr;
> > +             r->dma_start = range.bus_addr;
> > +             r->size = range.size;
> > +             r->offset = (u64)range.cpu_addr - (u64)range.bus_addr;
> > +             r++;
> > +     }
> > +     return map;
> > +}
> > +
> >  /**
> > - * of_dma_get_range - Get DMA range info
> > + * of_dma_get_range - Get DMA range info and put it into a map array
> >   * @np:              device node to get DMA range info
> > - * @dma_addr:        pointer to store initial DMA address of DMA range
> > - * @paddr:   pointer to store initial CPU address of DMA range
> > - * @size:    pointer to store size of DMA range
> >   *
> >   * Look in bottom up direction for the first "dma-ranges" property
> > - * and parse it.
> > - *  dma-ranges format:
> > + * and parse it.  Put the information into a DMA offset map array.
> > + *
> > + * dma-ranges format:
> >   *   DMA addr (dma_addr)     : naddr cells
> >   *   CPU addr (phys_addr_t)  : pna cells
> >   *   size                    : nsize cells
> >   *
> > - * It returns -ENODEV if "dma-ranges" property was not found
> > - * for this device in DT.
> > + * It returns -ENODEV if "dma-ranges" property was not found for this
> > + * device in the DT.
> >   */
> > -int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
> > +const struct bus_dma_region *of_dma_get_range(struct device_node *np)
> >  {
> > +     const struct bus_dma_region *map = NULL;
> >       struct device_node *node = of_node_get(np);
> > +     struct of_range_parser parser;
> >       const __be32 *ranges = NULL;
> > -     int len;
> > -     int ret = 0;
> >       bool found_dma_ranges = false;
> > -     struct of_range_parser parser;
> >       struct of_range range;
> > -     u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > +     int len, num_ranges = 0;
> > +     int ret = 0;
> >
> >       while (node) {
> >               ranges = of_get_property(node, "dma-ranges", &len);
> > @@ -971,42 +1003,13 @@ int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *siz
> >
> >       of_dma_range_parser_init(&parser, node);
> >
> > -     for_each_of_range(&parser, &range) {
> > -             pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > -                      range.bus_addr, range.cpu_addr, range.size);
> > -
> > -             if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
> > -                     pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
> > -                     /* Don't error out as we'd break some existing DTs */
> > -                     continue;
> > -             }
> > -             dma_offset = range.cpu_addr - range.bus_addr;
> > -
> > -             /* Take lower and upper limits */
> > -             if (range.bus_addr < dma_start)
> > -                     dma_start = range.bus_addr;
> > -             if (range.bus_addr + range.size > dma_end)
> > -                     dma_end = range.bus_addr + range.size;
> > -     }
> > -
> > -     if (dma_start >= dma_end) {
> > -             ret = -EINVAL;
> > -             pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
> > -                      node);
> > -             goto out;
> > -     }
> > -
> > -     *dma_addr = dma_start;
> > -     *size = dma_end - dma_start;
> > -     *paddr = dma_start + dma_offset;
> > -
> > -     pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > -              *dma_addr, *paddr, *size);
> > +     for_each_of_range(&parser, &range)
> > +             num_ranges++;
> >
> > +     map = dma_create_offset_map(node, num_ranges);
> >  out:
> >       of_node_put(node);
> > -
> > -     return ret;
> > +     return map ? map : ERR_PTR(ret);
> >  }
> >
> >  /**
> > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > index 27203bfd0b22..fea2f31d4245 100644
> > --- a/drivers/of/device.c
> > +++ b/drivers/of/device.c
> > @@ -88,14 +88,15 @@ int of_device_add(struct platform_device *ofdev)
> >   */
> >  int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
> >  {
> > -     u64 dma_addr, paddr, size = 0;
> > -     int ret;
> > -     bool coherent;
> > -     unsigned long offset;
> >       const struct iommu_ops *iommu;
> > -     u64 mask, end;
> > +     const struct bus_dma_region *map;
> > +     dma_addr_t dma_start = 0;
> > +     u64 mask, end, size = 0;
> > +     bool coherent;
> > +     int ret;
> >
> > -     ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
> > +     map = of_dma_get_range(np);
> > +     ret = PTR_ERR_OR_ZERO(map);
> >       if (ret < 0) {
> >               /*
> >                * For legacy reasons, we have to assume some devices need
> > @@ -105,25 +106,36 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
> >               if (!force_dma)
> >                       return ret == -ENODEV ? 0 : ret;
> >
> > -             dma_addr = offset = 0;
> > -     } else {
> > -             offset = PFN_DOWN(paddr - dma_addr);
> > +             dma_start = 0;
> > +             map = NULL;
> > +     } else if (map) {
> > +             const struct bus_dma_region *r = map;
> > +             dma_addr_t dma_end = 0;
> > +
> > +             /* Determine the overall bounds of all DMA regions */
> > +             for (dma_start = ~(dma_addr_t)0; r->size; r++) {
> > +                     /* Take lower and upper limits */
> > +                     if (r->dma_start < dma_start)
> > +                             dma_start = r->dma_start;
> > +                     if (r->dma_start + r->size > dma_end)
> > +                             dma_end = r->dma_start + r->size;
> > +             }
> > +             size = dma_end - dma_start;
> >
> >               /*
> >                * Add a work around to treat the size as mask + 1 in case
> >                * it is defined in DT as a mask.
> >                */
> >               if (size & 1) {
> > -                     dev_warn(dev, "Invalid size 0x%llx for dma-range\n",
> > -                              size);
> > +                     dev_warn(dev, "Invalid size 0x%llx for dma-range(s)\n", size);
> >                       size = size + 1;
> >               }
> >
> >               if (!size) {
> >                       dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
> > +                     kfree(map);
> >                       return -EINVAL;
> >               }
> > -             dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
> >       }
> >
> >       /*
> > @@ -142,13 +154,11 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
> >       else if (!size)
> >               size = 1ULL << 32;
> >
> > -     dev->dma_pfn_offset = offset;
> > -
> >       /*
> >        * Limit coherent and dma mask based on size and default mask
> >        * set by the driver.
> >        */
> > -     end = dma_addr + size - 1;
> > +     end = dma_start + size - 1;
> >       mask = DMA_BIT_MASK(ilog2(end) + 1);
> >       dev->coherent_dma_mask &= mask;
> >       *dev->dma_mask &= mask;
> > @@ -161,14 +171,17 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
> >               coherent ? " " : " not ");
> >
> >       iommu = of_iommu_configure(dev, np);
> > -     if (PTR_ERR(iommu) == -EPROBE_DEFER)
> > +     if (PTR_ERR(iommu) == -EPROBE_DEFER) {
> > +             kfree(map);
> >               return -EPROBE_DEFER;
> > +     }
> >
> >       dev_dbg(dev, "device is%sbehind an iommu\n",
> >               iommu ? " " : " not ");
> >
> > -     arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
> > +     arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
> >
> > +     dev->dma_range_map = map;
> >       return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(of_dma_configure);
> > diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> > index edc682249c00..876149e721c5 100644
> > --- a/drivers/of/of_private.h
> > +++ b/drivers/of/of_private.h
> > @@ -157,14 +157,13 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
> >  extern int of_bus_n_addr_cells(struct device_node *np);
> >  extern int of_bus_n_size_cells(struct device_node *np);
> >
> > +struct bus_dma_region;
> >  #ifdef CONFIG_OF_ADDRESS
> > -extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> > -                         u64 *paddr, u64 *size);
> > +extern const struct bus_dma_region *of_dma_get_range(struct device_node *np);
> >  #else
> > -static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
> > -                                u64 *paddr, u64 *size)
> > +static inline const struct bus_dma_region *of_dma_get_range(struct device_node *np)
> >  {
> > -     return -ENODEV;
> > +     return ERR_PTR(-ENODEV);
> >  }
> >  #endif
> >
> > diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
> > index 398de04fd19c..542d092f19c2 100644
> > --- a/drivers/of/unittest.c
> > +++ b/drivers/of/unittest.c
> > @@ -7,6 +7,7 @@
> >
> >  #include <linux/memblock.h>
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/err.h>
> >  #include <linux/errno.h>
> >  #include <linux/hashtable.h>
> > @@ -869,10 +870,10 @@ static void __init of_unittest_changeset(void)
> >  }
> >
> >  static void __init of_unittest_dma_ranges_one(const char *path,
> > -             u64 expect_dma_addr, u64 expect_paddr, u64 expect_size)
> > +             u64 expect_dma_addr, u64 expect_paddr)
> >  {
> >       struct device_node *np;
> > -     u64 dma_addr, paddr, size;
> > +     const struct bus_dma_region *map = NULL;
> >       int rc;
> >
> >       np = of_find_node_by_path(path);
> > @@ -881,16 +882,27 @@ static void __init of_unittest_dma_ranges_one(const char *path,
> >               return;
> >       }
> >
> > -     rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
> > -
> > +     map = of_dma_get_range(np);
> > +     rc = PTR_ERR_OR_ZERO(map);
> >       unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
> > -     if (!rc) {
> > -             unittest(size == expect_size,
> > -                      "of_dma_get_range wrong size on node %pOF size=%llx\n", np, size);
> > +
> > +     if (!rc && map) {
> > +             phys_addr_t     paddr;
> > +             dma_addr_t      dma_addr;
> > +             struct device   dev_bogus;
> > +
> > +             dev_bogus.dma_range_map = map;
> > +             paddr = (phys_addr_t)expect_dma_addr
> > +                     + dma_offset_from_dma_addr(&dev_bogus, expect_dma_addr);
> > +             dma_addr = (dma_addr_t)expect_paddr
> > +                     - dma_offset_from_phys_addr(&dev_bogus, expect_paddr);
> > +
> >               unittest(paddr == expect_paddr,
> >                        "of_dma_get_range wrong phys addr (%llx) on node %pOF", paddr, np);
> >               unittest(dma_addr == expect_dma_addr,
> >                        "of_dma_get_range wrong DMA addr (%llx) on node %pOF", dma_addr, np);
> > +
> > +             kfree(map);
> >       }
> >       of_node_put(np);
> >  }
> > @@ -898,11 +910,14 @@ static void __init of_unittest_dma_ranges_one(const char *path,
> >  static void __init of_unittest_parse_dma_ranges(void)
> >  {
> >       of_unittest_dma_ranges_one("/testcase-data/address-tests/device@70000000",
> > -             0x0, 0x20000000, 0x40000000);
> > +             0x0, 0x20000000);
> >       of_unittest_dma_ranges_one("/testcase-data/address-tests/bus@80000000/device@1000",
> > -             0x10000000, 0x20000000, 0x40000000);
> > +             0x10000000, 0x20000000);
> > +     /* pci@90000000 has two ranges in the dma-range property */
> > +     of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
> > +             0x80000000, 0x20000000);
> >       of_unittest_dma_ranges_one("/testcase-data/address-tests/pci@90000000",
> > -             0x80000000, 0x20000000, 0x10000000);
> > +             0xc0000000, 0x40000000);
> >  }
> >
> >  static void __init of_unittest_pci_dma_ranges(void)
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 9f04c30c4aaf..49242dd6176e 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -519,7 +519,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
> >       /* Initialise vdev subdevice */
> >       snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
> >       rvdev->dev.parent = &rproc->dev;
> > -     rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
> > +     rvdev->dev.dma_range_map = rproc->dev.parent->dma_range_map;
> >       rvdev->dev.release = rproc_rvdev_release;
> >       dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
> >       dev_set_drvdata(&rvdev->dev, rvdev);
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> > index 1744e6fcc999..720b41eca7a3 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
> > @@ -230,8 +230,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
> >        */
> >
> >  #ifdef PHYS_PFN_OFFSET
> > -     if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
> > -             dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +     if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
> > +             ret = dma_attach_offset_range(dev->dev, PHYS_OFFSET, 0, SZ_4G);
> > +             if (ret)
> > +                     return ret;
> > +     }
> >  #endif
> >
> >       ret = of_reserved_mem_device_init(dev->dev);
> > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > index 6197938dcc2d..376ca258e510 100644
> > --- a/drivers/usb/core/message.c
> > +++ b/drivers/usb/core/message.c
> > @@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
> >               intf->dev.groups = usb_interface_groups;
> >               /*
> >                * Please refer to usb_alloc_dev() to see why we set
> > -              * dma_mask and dma_pfn_offset.
> > +              * dma_mask and dma_range_map.
> >                */
> >               intf->dev.dma_mask = dev->dev.dma_mask;
> > -             intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> > +             intf->dev.dma_range_map = dev->dev.dma_range_map;
> >               INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
> >               intf->minor = -1;
> >               device_initialize(&intf->dev);
> > diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> > index f16c26dc079d..1f167a2c095e 100644
> > --- a/drivers/usb/core/usb.c
> > +++ b/drivers/usb/core/usb.c
> > @@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
> >        * mask for the entire HCD, so don't do that.
> >        */
> >       dev->dev.dma_mask = bus->sysdev->dma_mask;
> > -     dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
> > +     dev->dev.dma_range_map = bus->sysdev->dma_range_map;
> >       set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
> >       dev->state = USB_STATE_ATTACHED;
> >       dev->lpm_disable_count = 1;
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 15460a5ac024..feddefcf3e5c 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -492,7 +492,7 @@ struct dev_links_info {
> >   *           such descriptors.
> >   * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
> >   *           DMA limit than the device itself supports.
> > - * @dma_pfn_offset: offset of DMA memory range relatively of RAM
> > + * @dma_range_map: map for DMA memory ranges relative to that of RAM
> >   * @dma_parms:       A low level driver may set these to teach IOMMU code about
> >   *           segment limitations.
> >   * @dma_pools:       Dma pools (if dma'ble device).
> > @@ -577,7 +577,7 @@ struct device {
> >                                            64 bit addresses for consistent
> >                                            allocations such descriptors. */
> >       u64             bus_dma_limit;  /* upstream dma constraint */
> > -     unsigned long   dma_pfn_offset;
> > +     const struct bus_dma_region *dma_range_map;
> >
> >       struct device_dma_parameters *dma_parms;
> >
> > diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> > index cdfa400f89b3..182784d28cfd 100644
> > --- a/include/linux/dma-direct.h
> > +++ b/include/linux/dma-direct.h
> > @@ -15,14 +15,20 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
> >  {
> >       dma_addr_t dev_addr = (dma_addr_t)paddr;
> >
> > -     return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
> > +     if (dev->dma_range_map)
> > +             dev_addr -= dma_offset_from_phys_addr(dev, paddr);
> > +
> > +     return dev_addr;
> >  }
> >
> >  static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
> >  {
> >       phys_addr_t paddr = (phys_addr_t)dev_addr;
> >
> > -     return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
> > +     if (dev->dma_range_map)
> > +             paddr += dma_offset_from_dma_addr(dev, dev_addr);
> > +
> > +     return paddr;
> >  }
> >  #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
> >
> > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> > index 78f677cf45ab..7c8fcac30e74 100644
> > --- a/include/linux/dma-mapping.h
> > +++ b/include/linux/dma-mapping.h
> > @@ -255,7 +255,37 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
> >
> >  size_t dma_direct_max_mapping_size(struct device *dev);
> >
> > +struct bus_dma_region {
> > +     phys_addr_t     cpu_start;
> > +     dma_addr_t      dma_start;
> > +     u64             size;
> > +     u64             offset;
> > +};
> > +
> >  #ifdef CONFIG_HAS_DMA
> > +int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
> > +             dma_addr_t dma_start, u64 size);
> > +
> > +static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
> > +{
> > +     const struct bus_dma_region *m = dev->dma_range_map;
> > +
> > +     for (; m->size; m++)
> > +             if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
> > +                     return m->offset;
> > +     return 0;
> > +}
> > +
> > +static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
> > +{
> > +     const struct bus_dma_region *m = dev->dma_range_map;
> > +
> > +     for (; m->size; m++)
> > +             if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
> > +                     return m->offset;
> > +     return 0;
> > +}
> > +
> >  #include <asm/dma-mapping.h>
> >
> >  static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
> > @@ -463,6 +493,19 @@ u64 dma_get_required_mask(struct device *dev);
> >  size_t dma_max_mapping_size(struct device *dev);
> >  unsigned long dma_get_merge_boundary(struct device *dev);
> >  #else /* CONFIG_HAS_DMA */
> > +static inline u64 dma_offset_from_dma_addr(struct device *dev, dma_addr_t dma_addr)
> > +{
> > +     return (u64)0;
> > +}
> > +static inline u64 dma_offset_from_phys_addr(struct device *dev, phys_addr_t paddr)
> > +{
> > +     return (u64)0;
> > +}
> > +static inline int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
> > +             dma_addr_t dma_start, u64 size)
> > +{
> > +     return -EIO;
> > +}
> >  static inline dma_addr_t dma_map_page_attrs(struct device *dev,
> >               struct page *page, size_t offset, size_t size,
> >               enum dma_data_direction dir, unsigned long attrs)
> > diff --git a/include/linux/pfn.h b/include/linux/pfn.h
> > index 14bc053c53d8..eddb535075a0 100644
> > --- a/include/linux/pfn.h
> > +++ b/include/linux/pfn.h
> > @@ -20,5 +20,7 @@ typedef struct {
> >  #define PFN_DOWN(x)  ((x) >> PAGE_SHIFT)
> >  #define PFN_PHYS(x)  ((phys_addr_t)(x) << PAGE_SHIFT)
> >  #define PHYS_PFN(x)  ((unsigned long)((x) >> PAGE_SHIFT))
> > +#define PFN_DMA_ADDR(x)      ((dma_addr_t)(x) << PAGE_SHIFT)
> > +#define DMA_ADDR_PFN(x)      ((unsigned long)((x) >> PAGE_SHIFT))
> >
> >  #endif
> > diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
> > index 2a0c4985f38e..66b1ac611c61 100644
> > --- a/kernel/dma/coherent.c
> > +++ b/kernel/dma/coherent.c
> > @@ -31,10 +31,12 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
> >  static inline dma_addr_t dma_get_device_base(struct device *dev,
> >                                            struct dma_coherent_mem * mem)
> >  {
> > -     if (mem->use_dev_dma_pfn_offset)
> > -             return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
> > -     else
> > -             return mem->device_base;
> > +     if (mem->use_dev_dma_pfn_offset && dev->dma_range_map) {
> > +             u64 dma_offset = dma_offset_from_phys_addr(dev, PFN_PHYS(mem->pfn_base));
> > +
> > +             return PFN_DMA_ADDR(mem->pfn_base) - dma_offset;
> > +     }
> > +     return mem->device_base;
> >  }
> >
> >  static int dma_init_coherent_memory(phys_addr_t phys_addr,
> > diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> > index 98e3d873792e..2c08c4991bfa 100644
> > --- a/kernel/dma/mapping.c
> > +++ b/kernel/dma/mapping.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/dma-noncoherent.h>
> >  #include <linux/export.h>
> >  #include <linux/gfp.h>
> > +#include <linux/limits.h>
> >  #include <linux/of_device.h>
> >  #include <linux/slab.h>
> >  #include <linux/vmalloc.h>
> > @@ -407,3 +408,55 @@ unsigned long dma_get_merge_boundary(struct device *dev)
> >       return ops->get_merge_boundary(dev);
> >  }
> >  EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
> > +
> > +/**
> > + * dma_attach_offset_range - Assign scalar offset for a single DMA range.
> > + * @dev:     device pointer; needed to "own" the alloced memory.
> > + * @cpu_start:  beginning of memory region covered by this offset.
> > + * @dma_start:  beginning of DMA/PCI region covered by this offset.
> > + * @size:    size of the region.
> > + *
> > + * This is for the simple case of a uniform offset which cannot
> > + * be discovered by "dma-ranges".
> > + *
> > + * It returns -ENOMEM if out of memory, -ENODEV if dev == NULL, -EINVAL if
> > + * the range conflicts with an existing map, otherwise 0.
> > + */
> > +int dma_attach_offset_range(struct device *dev, phys_addr_t cpu_start,
> > +                         dma_addr_t dma_start, u64 size)
> > +{
> > +     struct bus_dma_region *map;
> > +     u64 offset = (u64)cpu_start - (u64)dma_start;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     /* See if a map already exists and we already encompass the new range */
> > +     if (dev->dma_range_map) {
> > +             const struct bus_dma_region *m = dev->dma_range_map;
> > +
> > +             for (; m->size; m++)
> > +                     if (offset == m->offset && cpu_start >= m->cpu_start
> > +                         && size <= m->size && cpu_start - m->cpu_start <= m->size - size)
> > +                             return 0;
> > +
> > +             dev_err(dev, "attempt to add conflicting DMA range to existing map\n");
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (!offset)
> > +             return 0;
> > +
> > +     /* Don't use devm_kcalloc() since this may be called as a bus notifier */
> > +     map = kcalloc(2, sizeof(*map), GFP_KERNEL);
> > +     if (!map)
> > +             return -ENOMEM;
> > +     dev->dma_range_map = map;
> > +
> > +     map->cpu_start = cpu_start;
> > +     map->dma_start = dma_start;
> > +     map->offset = offset;
> > +     map->size = size;
> > +
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(dma_attach_offset_range);
> > --
> > 2.17.1
> ---end quoted text---
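
For readers following the quoted patch, the range-map lookup can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: the typedefs stand in for the kernel types, the helper names (offset_from_phys, offset_from_dma, phys_to_dma, dma_to_phys, self_check) and the two-region example_map are hypothetical, and none of it is the kernel code itself. It mirrors the zero-size-terminated array walk of dma_offset_from_phys_addr() and dma_offset_from_dma_addr() and the subtract/add done in __phys_to_dma() and __dma_to_phys().

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumption for illustration). */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

struct bus_dma_region {
	phys_addr_t cpu_start;
	dma_addr_t  dma_start;
	uint64_t    size;
	uint64_t    offset;	/* cpu_start - dma_start */
};

/* Hypothetical map: two regions with different offsets, zero-size terminator. */
static const struct bus_dma_region example_map[] = {
	{ .cpu_start = 0x40000000,  .dma_start = 0x00000000,
	  .size = 0x40000000, .offset = 0x40000000 },
	{ .cpu_start = 0x100000000, .dma_start = 0x80000000,
	  .size = 0x40000000, .offset = 0x80000000 },
	{ 0 }
};

/* Walk the map until the terminator; an address in no region gets offset 0. */
static uint64_t offset_from_phys(const struct bus_dma_region *m, phys_addr_t paddr)
{
	for (; m->size; m++)
		if (paddr >= m->cpu_start && paddr - m->cpu_start < m->size)
			return m->offset;
	return 0;
}

static uint64_t offset_from_dma(const struct bus_dma_region *m, dma_addr_t dma_addr)
{
	for (; m->size; m++)
		if (dma_addr >= m->dma_start && dma_addr - m->dma_start < m->size)
			return m->offset;
	return 0;
}

/* CPU physical -> bus address: subtract the offset of the matching region. */
static dma_addr_t phys_to_dma(const struct bus_dma_region *map, phys_addr_t paddr)
{
	return paddr - offset_from_phys(map, paddr);
}

/* Bus address -> CPU physical: add the offset of the matching region. */
static phys_addr_t dma_to_phys(const struct bus_dma_region *map, dma_addr_t dma_addr)
{
	return dma_addr + offset_from_dma(map, dma_addr);
}

/* Round-trip check over the hypothetical map. */
static void self_check(void)
{
	assert(phys_to_dma(example_map, 0x40001000) == 0x1000);
	assert(phys_to_dma(example_map, 0x100002000) == 0x80002000);
	assert(dma_to_phys(example_map, 0x80002000) == 0x100002000);
}
```

The point of the sketch is that a single dma_pfn_offset cannot express the second region, whose offset differs from the first; a per-region lookup can.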


end of thread, other threads:[~2020-07-22 22:37 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-15 14:35 [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips Jim Quinlan via iommu
2020-07-15 14:35 ` [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset Jim Quinlan via iommu
2020-07-21 12:51   ` Christoph Hellwig
2020-07-22 22:37     ` Jim Quinlan via iommu
2020-07-20 23:27 ` [PATCH v8 00/12] PCI: brcmstb: enable PCIe for STB chips Florian Fainelli
