All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips
@ 2020-06-03 19:20 ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Oliver Neukum, open list:SUPERH,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM Serial and Parallel ATA drivers,
	Julien Grall, Srinivas Kandagatla, H. Peter Anvin,
	open list:STAGING SUBSYSTEM, Florian Fainelli, Corey Minyard,
	Saravana Kannan, Rafael J. Wysocki,
	open list:ACPI FOR ARM64 ACPI/arm64, Alan Stern,
	open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Suzuki K Poulose, Hans de Goede, Mark Brown,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Stefano Stabellini,
	Robin Murphy

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
  o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared())
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspbery Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/

Jim Quinlan (13):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 reigister info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST support
  PCI: brcmstb: Add control of rescal reset
  of: Include a dev param in of_dma_get_range()
  device core: Introduce multiple dma pfn offsets
  PCI: brcmstb: Set internal memory viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  58 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |   9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   3 +-
 arch/sh/kernel/dma-coherent.c                 |  17 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  14 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   7 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   5 +-
 drivers/of/address.c                          |  97 ++++-
 drivers/of/device.c                           |  10 +-
 drivers/of/of_private.h                       |   8 +-
 drivers/of/unittest.c                         |   2 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  16 +-
 include/linux/dma-mapping.h                   |  45 ++
 kernel/dma/coherent.c                         |  11 +-
 26 files changed, 631 insertions(+), 129 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips
@ 2020-06-03 19:20 ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Alan Stern, Andy Shevchenko, Arnd Bergmann, Corey Minyard,
	Dan Williams, open list:STAGING SUBSYSTEM,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE,
	open list:DRM DRIVERS FOR ALLWINNER A10, Florian Fainelli,
	Greg Kroah-Hartman, Hans de Goede, H. Peter Anvin,
	open list:IOMMU DRIVERS, Jens Axboe, Joerg Roedel, Julien Grall,
	open list:ACPI FOR ARM64 (ACPI/arm64),
	moderated list:ARM PORT,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	open list, open list:ALLWINNER A10 CSI DRIVER,
	open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list:SUPERH, open list:USB SUBSYSTEM, Mark Brown,
	Oliver Neukum, Rafael J. Wysocki, Rob Herring, Robin Murphy,
	Saravana Kannan, Srinivas Kandagatla, Stefano Stabellini,
	Suzuki K Poulose, Ulf Hansson

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
  o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared())
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspbery Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/

Jim Quinlan (13):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 reigister info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST support
  PCI: brcmstb: Add control of rescal reset
  of: Include a dev param in of_dma_get_range()
  device core: Introduce multiple dma pfn offsets
  PCI: brcmstb: Set internal memory viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  58 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |   9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   3 +-
 arch/sh/kernel/dma-coherent.c                 |  17 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  14 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   7 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   5 +-
 drivers/of/address.c                          |  97 ++++-
 drivers/of/device.c                           |  10 +-
 drivers/of/of_private.h                       |   8 +-
 drivers/of/unittest.c                         |   2 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  16 +-
 include/linux/dma-mapping.h                   |  45 ++
 kernel/dma/coherent.c                         |  11 +-
 26 files changed, 631 insertions(+), 129 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips
@ 2020-06-03 19:20 ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Oliver Neukum, open list:SUPERH,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM Serial and Parallel ATA drivers,
	Julien Grall, Srinivas Kandagatla, H. Peter Anvin,
	open list:STAGING SUBSYSTEM, Rob Herring, Florian Fainelli,
	Corey Minyard, Saravana Kannan, Rafael J. Wysocki,
	open list:ACPI FOR ARM64 ACPI/arm64, Alan Stern,
	open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Suzuki K Poulose, Hans de Goede, Mark Brown,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Stefano Stabellini,
	Robin Murphy

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
  o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared())
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspbery Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/

Jim Quinlan (13):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 reigister info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST support
  PCI: brcmstb: Add control of rescal reset
  of: Include a dev param in of_dma_get_range()
  device core: Introduce multiple dma pfn offsets
  PCI: brcmstb: Set internal memory viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  58 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |   9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   3 +-
 arch/sh/kernel/dma-coherent.c                 |  17 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  14 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   7 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   5 +-
 drivers/of/address.c                          |  97 ++++-
 drivers/of/device.c                           |  10 +-
 drivers/of/of_private.h                       |   8 +-
 drivers/of/unittest.c                         |   2 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  16 +-
 include/linux/dma-mapping.h                   |  45 ++
 kernel/dma/coherent.c                         |  11 +-
 26 files changed, 631 insertions(+), 129 deletions(-)

-- 
2.17.1

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips
@ 2020-06-03 19:20 ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Oliver Neukum, open list:SUPERH,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM Serial and Parallel ATA drivers,
	Julien Grall, Srinivas Kandagatla, H. Peter Anvin,
	open list:STAGING SUBSYSTEM, Rob Herring, Florian Fainelli,
	Corey Minyard, Saravana Kannan, Rafael J. Wysocki,
	open list:ACPI FOR ARM64 ACPI/arm64, Alan Stern,
	open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Suzuki K Poulose, Hans de Goede, Mark Brown,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Stefano Stabellini,
	Robin Murphy

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
  o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared())
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspbery Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/

Jim Quinlan (13):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 reigister info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST support
  PCI: brcmstb: Add control of rescal reset
  of: Include a dev param in of_dma_get_range()
  device core: Introduce multiple dma pfn offsets
  PCI: brcmstb: Set internal memory viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  58 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |   9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   3 +-
 arch/sh/kernel/dma-coherent.c                 |  17 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  14 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   7 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   5 +-
 drivers/of/address.c                          |  97 ++++-
 drivers/of/device.c                           |  10 +-
 drivers/of/of_private.h                       |   8 +-
 drivers/of/unittest.c                         |   2 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  16 +-
 include/linux/dma-mapping.h                   |  45 ++
 kernel/dma/coherent.c                         |  11 +-
 26 files changed, 631 insertions(+), 129 deletions(-)

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips
@ 2020-06-03 19:20 ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Oliver Neukum, open list:SUPERH,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM Serial and Parallel ATA drivers,
	Julien Grall, Srinivas Kandagatla, H. Peter Anvin,
	open list:STAGING SUBSYSTEM, Rob Herring, Florian Fainelli,
	Corey Minyard, Saravana Kannan, Rafael J. Wysocki,
	open list:ACPI FOR ARM64 ACPI/arm64, Alan Stern,
	open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Suzuki K Poulose, Hans de Goede, Mark Brown,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Stefano Stabellini,
	Robin Murphy

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
  o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared())
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspbery Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/

Jim Quinlan (13):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 reigister info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST support
  PCI: brcmstb: Add control of rescal reset
  of: Include a dev param in of_dma_get_range()
  device core: Introduce multiple dma pfn offsets
  PCI: brcmstb: Set internal memory viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  58 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |   9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   3 +-
 arch/sh/kernel/dma-coherent.c                 |  17 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  14 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   7 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   5 +-
 drivers/of/address.c                          |  97 ++++-
 drivers/of/device.c                           |  10 +-
 drivers/of/of_private.h                       |   8 +-
 drivers/of/unittest.c                         |   2 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  16 +-
 include/linux/dma-mapping.h                   |  45 ++
 kernel/dma/coherent.c                         |  11 +-
 26 files changed, 631 insertions(+), 129 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips
@ 2020-06-03 19:20 ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Oliver Neukum, open list:SUPERH,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10,
	open list:LIBATA SUBSYSTEM Serial and Parallel ATA drivers,
	Julien Grall, Srinivas Kandagatla, H. Peter Anvin,
	open list:STAGING SUBSYSTEM, Florian Fainelli, Corey Minyard,
	Saravana Kannan, Rafael J. Wysocki,
	open list:ACPI FOR ARM64 ACPI/arm64, Alan Stern,
	open list:ALLWINNER A10 CSI DRIVER,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Joerg Roedel,
	Arnd Bergmann, Suzuki K Poulose, Hans de Goede, Mark Brown,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Jens Axboe, Greg Kroah-Hartman, open list:USB SUBSYSTEM,
	open list, open list:IOMMU DRIVERS, Stefano Stabellini,
	Robin Murphy

v3:
  Commit "device core: Introduce multiple dma pfn offsets"
  Commit "arm: dma-mapping: Invoke dma offset func if needed"
  -- The above two commits have been squashed.  More importantly,
     the code has been modified so that the functionality for
     multiple pfn offsets subsumes the use of dev->dma_pfn_offset.
     In fact, dma_pfn_offset is removed and supplanted by
     dma_pfn_offset_map, which is a pointer to an array.  The
     more common case of a uniform offset is now handled as
     a map with a single entry, while cases requiring multiple
     pfn offsets use a map with multiple entries.  Code paths
     that used to do this:

         dev->dma_pfn_offset = mydrivers_pfn_offset;

     have been changed to do this:

         attach_uniform_dma_pfn_offset(dev, pfn_offset);

  Commit "dt-bindings: PCI: Add bindings for more Brcmstb chips"
  -- Add if/then clause for required props: resets, reset-names (RobH)
  -- Change compatible list from const to enum (RobH)
  -- Change list of u32-tuples to u64 (RobH)

  Commit "of: Include a dev param in of_dma_get_range()"
  -- modify of/unittests.c to add NULL param in of_dma_get_range() call.

  Commit "device core: Add ability to handle multiple dma offsets"
  -- align comment in device.h (AndyS).
  -- s/cpu_beg/cpu_start/ and s/dma_beg/dma_start/ in struct
     dma_pfn_offset_region (AndyS).

v2:
Commit: "device core: Add ability to handle multiple dma offsets"
  o Added helper func attach_dma_pfn_offset_map() in address.c (Chistoph)
  o Helpers funcs added to __phys_to_dma() & __dma_to_phys() (Christoph)
  o Added warning when multiple offsets are needed and !DMA_PFN_OFFSET_MAP
  o dev->dma_pfn_map => dev->dma_pfn_offset_map
  o s/frm/from/ for dma_pfn_offset_frm_{phys,dma}_addr() (Christoph)
  o In device.h: s/const void */const struct dma_pfn_offset_region */
  o removed 'unlikely' from unlikely(dev->dma_pfn_offset_map) since
    guarded by CONFIG_DMA_PFN_OFFSET_MAP (Christoph)
  o Since dev->dma_pfn_offset is copied in usb/core/{usb,message}.c, now
    dev->dma_pfn_offset_map is copied as well.
  o Merged two of the DMA commits into one (Christoph).

Commit "arm: dma-mapping: Invoke dma offset func if needed":
  o Use helper functions instead of #if CONFIG_DMA_PFN_OFFSET

Other commits' changes:
  o Removed need for carrying of_id var in priv (Nicolas)
  o Commit message rewordings (Bjorn)
  o Commit log messages filled to 75 chars (Bjorn)
  o devm_reset_control_get_shared())
    => devm_reset_control_get_optional_shared (Philipp)
  o Add call to reset_control_assert() in PCIe remove routines (Philipp)

v1:
This patchset expands the usefulness of the Broadcom Settop Box PCIe
controller by building upon the PCIe driver used currently by the
Raspbery Pi.  Other forms of this patchset were submitted by me years
ago and not accepted; the major sticking point was the code required
for the DMA remapping needed for the PCIe driver to work [1].

There have been many changes to the DMA and OF subsystems since that
time, making a cleaner and less intrusive patchset possible.  This
patchset implements a generalization of "dev->dma_pfn_offset", except
that instead of a single scalar offset it provides for multiple
offsets via a function which depends upon the "dma-ranges" property of
the PCIe host controller.  This is required for proper functionality
of the BrcmSTB PCIe controller and possibly some other devices.

[1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101024@gmail.com/

Jim Quinlan (13):
  PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  dt-bindings: PCI: Add bindings for more Brcmstb chips
  PCI: brcmstb: Add bcm7278 reigister info
  PCI: brcmstb: Add suspend and resume pm_ops
  PCI: brcmstb: Add bcm7278 PERST support
  PCI: brcmstb: Add control of rescal reset
  of: Include a dev param in of_dma_get_range()
  device core: Introduce multiple dma pfn offsets
  PCI: brcmstb: Set internal memory viewport sizes
  PCI: brcmstb: Accommodate MSI for older chips
  PCI: brcmstb: Set bus max burst size by chip type
  PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list

 .../bindings/pci/brcm,stb-pcie.yaml           |  58 ++-
 arch/arm/include/asm/dma-mapping.h            |   9 +-
 arch/arm/mach-keystone/keystone.c             |   9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |   3 +-
 arch/sh/kernel/dma-coherent.c                 |  17 +-
 arch/x86/pci/sta2x11-fixup.c                  |   7 +-
 drivers/acpi/arm64/iort.c                     |   5 +-
 drivers/ata/ahci_brcm.c                       |  14 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |   7 +-
 drivers/iommu/io-pgtable-arm.c                |   2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |   5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |   5 +-
 drivers/of/address.c                          |  97 ++++-
 drivers/of/device.c                           |  10 +-
 drivers/of/of_private.h                       |   8 +-
 drivers/of/unittest.c                         |   2 +-
 drivers/pci/controller/Kconfig                |   3 +-
 drivers/pci/controller/pcie-brcmstb.c         | 408 +++++++++++++++---
 drivers/remoteproc/remoteproc_core.c          |   2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |   7 +-
 drivers/usb/core/message.c                    |   4 +-
 drivers/usb/core/usb.c                        |   2 +-
 include/linux/device.h                        |   4 +-
 include/linux/dma-direct.h                    |  16 +-
 include/linux/dma-mapping.h                   |  45 ++
 kernel/dma/coherent.c                         |  11 +-
 26 files changed, 631 insertions(+), 129 deletions(-)

-- 
2.17.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH v3 01/13] PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB
  2020-06-03 19:20 ` Jim Quinlan
                   ` (4 preceding siblings ...)
  (?)
@ 2020-06-03 19:20 ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, open list

From: Jim Quinlan <jquinlan@broadcom.com>

Have PCIE_BRCMSTB depend on ARCH_BRCMSTB.  Also set the default value to
ARCH_BRCMSTB.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/controller/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 91bfdb784829..c0f3d4d10047 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -244,9 +244,10 @@ config VMD
 
 config PCIE_BRCMSTB
 	tristate "Broadcom Brcmstb PCIe host controller"
-	depends on ARCH_BCM2835 || COMPILE_TEST
+	depends on ARCH_BRCMSTB || ARCH_BCM2835 || COMPILE_TEST
 	depends on OF
 	depends on PCI_MSI_IRQ_DOMAIN
+	default ARCH_BRCMSTB
 	help
 	  Say Y here to enable PCIe host controller support for
 	  Broadcom STB based SoCs, like the Raspberry Pi 4.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 02/13] ata: ahci_brcm: Fix use of BCM7216 reset controller
  2020-06-03 19:20 ` Jim Quinlan
                   ` (5 preceding siblings ...)
  (?)
@ 2020-06-03 19:20 ` Jim Quinlan
  2020-06-04  2:57   ` Florian Fainelli
  -1 siblings, 1 reply; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Jens Axboe, Philipp Zabel, Florian Fainelli,
	Hans de Goede,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

A reset controller "rescal" is shared between the AHCI driver and the PCIe
driver for the BrcmSTB 7216 chip.  The code is modified to allow this
sharing and to deassert() properly.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Fixes: 272ecd60a636 ("ata: ahci_brcm: BCM7216 reset is self de-asserting")
Fixes: c345ec6a50e9 ("ata: ahci_brcm: Support BCM7216 reset controller
name")
---
 drivers/ata/ahci_brcm.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/ata/ahci_brcm.c b/drivers/ata/ahci_brcm.c
index 6853dbb4131d..c4ea555573dd 100644
--- a/drivers/ata/ahci_brcm.c
+++ b/drivers/ata/ahci_brcm.c
@@ -428,7 +428,6 @@ static int brcm_ahci_probe(struct platform_device *pdev)
 {
 	const struct of_device_id *of_id;
 	struct device *dev = &pdev->dev;
-	const char *reset_name = NULL;
 	struct brcm_ahci_priv *priv;
 	struct ahci_host_priv *hpriv;
 	struct resource *res;
@@ -452,11 +451,11 @@ static int brcm_ahci_probe(struct platform_device *pdev)
 
 	/* Reset is optional depending on platform and named differently */
 	if (priv->version == BRCM_SATA_BCM7216)
-		reset_name = "rescal";
+		priv->rcdev = devm_reset_control_get_optional_shared(&pdev->dev,
+								     "rescal");
 	else
-		reset_name = "ahci";
-
-	priv->rcdev = devm_reset_control_get_optional(&pdev->dev, reset_name);
+		priv->rcdev = devm_reset_control_get_optional(&pdev->dev,
+							      "ahci");
 	if (IS_ERR(priv->rcdev))
 		return PTR_ERR(priv->rcdev);
 
@@ -479,10 +478,7 @@ static int brcm_ahci_probe(struct platform_device *pdev)
 		break;
 	}
 
-	if (priv->version == BRCM_SATA_BCM7216)
-		ret = reset_control_reset(priv->rcdev);
-	else
-		ret = reset_control_deassert(priv->rcdev);
+	ret = reset_control_deassert(priv->rcdev);
 	if (ret)
 		return ret;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 03/13] dt-bindings: PCI: Add bindings for more Brcmstb chips
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Florian Fainelli, Bjorn Helgaas, Rob Herring,
	moderated list:BROADCOM BCM7XXX ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

- Add compatible strings for three more Broadcom STB chips: 7278, 7216,
  7211 (STB version of RPi4).
- add new property 'brcm,scb-sizes'
- add new property 'resets'
- add new property 'reset-names' for 7216 only
- allow 'ranges' and 'dma-ranges' to have more than one item and update
  the example to show this.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 .../bindings/pci/brcm,stb-pcie.yaml           | 58 ++++++++++++++++---
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml b/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
index 8680a0f86c5a..4a012d77513f 100644
--- a/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
+++ b/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
@@ -9,12 +9,15 @@ title: Brcmstb PCIe Host Controller Device Tree Bindings
 maintainers:
   - Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
 
-allOf:
-  - $ref: /schemas/pci/pci-bus.yaml#
-
 properties:
   compatible:
-    const: brcm,bcm2711-pcie # The Raspberry Pi 4
+    items:
+      - enum:
+          - brcm,bcm2711-pcie # The Raspberry Pi 4
+          - brcm,bcm7211-pcie # Broadcom STB version of RPi4
+          - brcm,bcm7278-pcie # Broadcom 7278 Arm
+          - brcm,bcm7216-pcie # Broadcom 7216 Arm
+          - brcm,bcm7445-pcie # Broadcom 7445 Arm
 
   reg:
     maxItems: 1
@@ -34,10 +37,12 @@ properties:
       - const: msi
 
   ranges:
-    maxItems: 1
+    minItems: 1
+    maxItems: 4
 
   dma-ranges:
-    maxItems: 1
+    minItems: 1
+    maxItems: 6
 
   clocks:
     maxItems: 1
@@ -58,8 +63,33 @@ properties:
 
   aspm-no-l0s: true
 
+  resets:
+    description: for "brcm,bcm7216-pcie", must be a valid reset
+      phandle pointing to the RESCAL reset controller provider node.
+    $ref: "/schemas/types.yaml#/definitions/phandle"
+
+  reset-names:
+    items:
+      - const: rescal
+
+  brcm,scb-sizes:
+    description: u64 giving the 64bit PCIe memory
+      viewport size of a memory controller.  There may be up to
+      three controllers, and each size must be a power of two
+      with a size greater or equal to the amount of memory the
+      controller supports.  Note that each memory controller
+      may have two component regions -- base and extended -- so
+      this information cannot be deduced from the dma-ranges.
+
+    allOf:
+      - $ref: /schemas/types.yaml#/definitions/uint64-array
+      - items:
+          minItems: 1
+          maxItems: 3
+
 required:
   - reg
+  - ranges
   - dma-ranges
   - "#interrupt-cells"
   - interrupts
@@ -68,6 +98,18 @@ required:
   - interrupt-map
   - msi-controller
 
+allOf:
+  - $ref: /schemas/pci/pci-bus.yaml#
+  - if:
+      properties:
+        compatible:
+          contains:
+            const: brcm,bcm7216-pcie
+    then:
+      required:
+        - resets
+        - reset-names
+
 unevaluatedProperties: false
 
 examples:
@@ -93,7 +135,9 @@ examples:
                     msi-parent = <&pcie0>;
                     msi-controller;
                     ranges = <0x02000000 0x0 0xf8000000 0x6 0x00000000 0x0 0x04000000>;
-                    dma-ranges = <0x02000000 0x0 0x00000000 0x0 0x00000000 0x0 0x80000000>;
+                    dma-ranges = <0x42000000 0x1 0x00000000 0x0 0x40000000 0x0 0x80000000>,
+                                 <0x42000000 0x1 0x80000000 0x3 0x00000000 0x0 0x80000000>;
                     brcm,enable-ssc;
+                    brcm,scb-sizes =  <0x0000000080000000 0x0000000080000000>;
             };
     };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 03/13] dt-bindings: PCI: Add bindings for more Brcmstb chips
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Florian Fainelli, open list, Rob Herring,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Jim Quinlan, Bjorn Helgaas,
	moderated list:BROADCOM BCM7XXX ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

- Add compatible strings for three more Broadcom STB chips: 7278, 7216,
  7211 (STB version of RPi4).
- add new property 'brcm,scb-sizes'
- add new property 'resets'
- add new property 'reset-names' for 7216 only
- allow 'ranges' and 'dma-ranges' to have more than one item and update
  the example to show this.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 .../bindings/pci/brcm,stb-pcie.yaml           | 58 ++++++++++++++++---
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml b/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
index 8680a0f86c5a..4a012d77513f 100644
--- a/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
+++ b/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
@@ -9,12 +9,15 @@ title: Brcmstb PCIe Host Controller Device Tree Bindings
 maintainers:
   - Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
 
-allOf:
-  - $ref: /schemas/pci/pci-bus.yaml#
-
 properties:
   compatible:
-    const: brcm,bcm2711-pcie # The Raspberry Pi 4
+    items:
+      - enum:
+          - brcm,bcm2711-pcie # The Raspberry Pi 4
+          - brcm,bcm7211-pcie # Broadcom STB version of RPi4
+          - brcm,bcm7278-pcie # Broadcom 7278 Arm
+          - brcm,bcm7216-pcie # Broadcom 7216 Arm
+          - brcm,bcm7445-pcie # Broadcom 7445 Arm
 
   reg:
     maxItems: 1
@@ -34,10 +37,12 @@ properties:
       - const: msi
 
   ranges:
-    maxItems: 1
+    minItems: 1
+    maxItems: 4
 
   dma-ranges:
-    maxItems: 1
+    minItems: 1
+    maxItems: 6
 
   clocks:
     maxItems: 1
@@ -58,8 +63,33 @@ properties:
 
   aspm-no-l0s: true
 
+  resets:
+    description: for "brcm,bcm7216-pcie", must be a valid reset
+      phandle pointing to the RESCAL reset controller provider node.
+    $ref: "/schemas/types.yaml#/definitions/phandle"
+
+  reset-names:
+    items:
+      - const: rescal
+
+  brcm,scb-sizes:
+    description: u64 giving the 64bit PCIe memory
+      viewport size of a memory controller.  There may be up to
+      three controllers, and each size must be a power of two
+      with a size greater or equal to the amount of memory the
+      controller supports.  Note that each memory controller
+      may have two component regions -- base and extended -- so
+      this information cannot be deduced from the dma-ranges.
+
+    allOf:
+      - $ref: /schemas/types.yaml#/definitions/uint64-array
+      - items:
+          minItems: 1
+          maxItems: 3
+
 required:
   - reg
+  - ranges
   - dma-ranges
   - "#interrupt-cells"
   - interrupts
@@ -68,6 +98,18 @@ required:
   - interrupt-map
   - msi-controller
 
+allOf:
+  - $ref: /schemas/pci/pci-bus.yaml#
+  - if:
+      properties:
+        compatible:
+          contains:
+            const: brcm,bcm7216-pcie
+    then:
+      required:
+        - resets
+        - reset-names
+
 unevaluatedProperties: false
 
 examples:
@@ -93,7 +135,9 @@ examples:
                     msi-parent = <&pcie0>;
                     msi-controller;
                     ranges = <0x02000000 0x0 0xf8000000 0x6 0x00000000 0x0 0x04000000>;
-                    dma-ranges = <0x02000000 0x0 0x00000000 0x0 0x00000000 0x0 0x80000000>;
+                    dma-ranges = <0x42000000 0x1 0x00000000 0x0 0x40000000 0x0 0x80000000>,
+                                 <0x42000000 0x1 0x80000000 0x3 0x00000000 0x0 0x80000000>;
                     brcm,enable-ssc;
+                    brcm,scb-sizes =  <0x0000000080000000 0x0000000080000000>;
             };
     };
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 04/13] PCI: brcmstb: Add bcm7278 reigister info
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

Add in compatibility strings and code for three Broadcom STB chips.  Some
of the register locations, shifts, and masks are different for certain
chips, requiring the use of different constants based on of_id.

We would like to add the following at this time to the match list but we
need to wait until the end of this patchset so that everything works.

    { .compatible = "brcm,bcm7211-pcie", .data = &generic_cfg },
    { .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
    { .compatible = "brcm,bcm7216-pcie", .data = &bcm7278_cfg },
    { .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 108 +++++++++++++++++++++++---
 1 file changed, 96 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 73020b4ff090..7c707e483181 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -120,9 +120,8 @@
 #define  PCIE_EXT_SLOT_SHIFT				15
 #define  PCIE_EXT_FUNC_SHIFT				12
 
-#define PCIE_RGR1_SW_INIT_1				0x9210
 #define  PCIE_RGR1_SW_INIT_1_PERST_MASK			0x1
-#define  PCIE_RGR1_SW_INIT_1_INIT_MASK			0x2
+#define  PCIE_RGR1_SW_INIT_1_PERST_SHIFT		0x0
 
 /* PCIe parameters */
 #define BRCM_NUM_PCIE_OUT_WINS		0x4
@@ -152,6 +151,76 @@
 #define SSC_STATUS_SSC_MASK		0x400
 #define SSC_STATUS_PLL_LOCK_MASK	0x800
 
+#define IDX_ADDR(pcie)	\
+	(pcie->reg_offsets[EXT_CFG_INDEX])
+#define DATA_ADDR(pcie)	\
+	(pcie->reg_offsets[EXT_CFG_DATA])
+#define PCIE_RGR1_SW_INIT_1(pcie) \
+	(pcie->reg_offsets[RGR1_SW_INIT_1])
+
+enum {
+	RGR1_SW_INIT_1,
+	EXT_CFG_INDEX,
+	EXT_CFG_DATA,
+};
+
+enum {
+	RGR1_SW_INIT_1_INIT_MASK,
+	RGR1_SW_INIT_1_INIT_SHIFT,
+};
+
+enum pcie_type {
+	GENERIC,
+	BCM7278,
+	BCM2711,
+};
+
+struct pcie_cfg_data {
+	const int *reg_field_info;
+	const int *offsets;
+	const enum pcie_type type;
+};
+
+static const int pcie_reg_field_info[] = {
+	[RGR1_SW_INIT_1_INIT_MASK] = 0x2,
+	[RGR1_SW_INIT_1_INIT_SHIFT] = 0x1,
+};
+
+static const int pcie_reg_field_info_bcm7278[] = {
+	[RGR1_SW_INIT_1_INIT_MASK] = 0x1,
+	[RGR1_SW_INIT_1_INIT_SHIFT] = 0x0,
+};
+
+static const int pcie_offsets[] = {
+	[RGR1_SW_INIT_1] = 0x9210,
+	[EXT_CFG_INDEX]  = 0x9000,
+	[EXT_CFG_DATA]   = 0x9004,
+};
+
+static const struct pcie_cfg_data generic_cfg = {
+	.reg_field_info	= pcie_reg_field_info,
+	.offsets	= pcie_offsets,
+	.type		= GENERIC,
+};
+
+static const int pcie_offset_bcm7278[] = {
+	[RGR1_SW_INIT_1] = 0xc010,
+	[EXT_CFG_INDEX] = 0x9000,
+	[EXT_CFG_DATA] = 0x9004,
+};
+
+static const struct pcie_cfg_data bcm7278_cfg = {
+	.reg_field_info = pcie_reg_field_info_bcm7278,
+	.offsets	= pcie_offset_bcm7278,
+	.type		= BCM7278,
+};
+
+static const struct pcie_cfg_data bcm2711_cfg = {
+	.reg_field_info	= pcie_reg_field_info,
+	.offsets	= pcie_offsets,
+	.type		= BCM2711,
+};
+
 struct brcm_msi {
 	struct device		*dev;
 	void __iomem		*base;
@@ -176,6 +245,9 @@ struct brcm_pcie {
 	int			gen;
 	u64			msi_target_addr;
 	struct brcm_msi		*msi;
+	const int		*reg_offsets;
+	const int		*reg_field_info;
+	enum pcie_type		type;
 };
 
 /*
@@ -602,20 +674,21 @@ static struct pci_ops brcm_pcie_ops = {
 
 static inline void brcm_pcie_bridge_sw_init_set(struct brcm_pcie *pcie, u32 val)
 {
-	u32 tmp;
+	u32 tmp, mask =  pcie->reg_field_info[RGR1_SW_INIT_1_INIT_MASK];
+	u32 shift = pcie->reg_field_info[RGR1_SW_INIT_1_INIT_SHIFT];
 
-	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1);
-	u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_INIT_MASK);
-	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1);
+	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+	tmp = (tmp & ~mask) | ((val << shift) & mask);
+	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
 }
 
 static inline void brcm_pcie_perst_set(struct brcm_pcie *pcie, u32 val)
 {
 	u32 tmp;
 
-	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1);
+	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
 	u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_PERST_MASK);
-	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1);
+	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
 }
 
 static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
@@ -924,10 +997,16 @@ static int brcm_pcie_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static const struct of_device_id brcm_pcie_match[] = {
+	{ .compatible = "brcm,bcm2711-pcie", .data = &bcm2711_cfg },
+	{},
+};
+
 static int brcm_pcie_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node, *msi_np;
 	struct pci_host_bridge *bridge;
+	const struct pcie_cfg_data *data;
 	struct brcm_pcie *pcie;
 	struct pci_bus *child;
 	struct resource *res;
@@ -937,9 +1016,18 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 	if (!bridge)
 		return -ENOMEM;
 
+	data = of_device_get_match_data(&pdev->dev);
+	if (!data) {
+		pr_err("failed to look up compatible string\n");
+		return -EINVAL;
+	}
+
 	pcie = pci_host_bridge_priv(bridge);
 	pcie->dev = &pdev->dev;
 	pcie->np = np;
+	pcie->reg_offsets = data->offsets;
+	pcie->reg_field_info = data->reg_field_info;
+	pcie->type = data->type;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	pcie->base = devm_ioremap_resource(&pdev->dev, res);
@@ -1005,10 +1093,6 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 	return ret;
 }
 
-static const struct of_device_id brcm_pcie_match[] = {
-	{ .compatible = "brcm,bcm2711-pcie" },
-	{},
-};
 MODULE_DEVICE_TABLE(of, brcm_pcie_match);
 
 static struct platform_driver brcm_pcie_driver = {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 04/13] PCI: brcmstb: Add bcm7278 reigister info
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Jim Quinlan, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

Add in compatibility strings and code for three Broadcom STB chips.  Some
of the register locations, shifts, and masks are different for certain
chips, requiring the use of different constants based on of_id.

We would like to add the following at this time to the match list but we
need to wait until the end of this patchset so that everything works.

    { .compatible = "brcm,bcm7211-pcie", .data = &generic_cfg },
    { .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
    { .compatible = "brcm,bcm7216-pcie", .data = &bcm7278_cfg },
    { .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 108 +++++++++++++++++++++++---
 1 file changed, 96 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 73020b4ff090..7c707e483181 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -120,9 +120,8 @@
 #define  PCIE_EXT_SLOT_SHIFT				15
 #define  PCIE_EXT_FUNC_SHIFT				12
 
-#define PCIE_RGR1_SW_INIT_1				0x9210
 #define  PCIE_RGR1_SW_INIT_1_PERST_MASK			0x1
-#define  PCIE_RGR1_SW_INIT_1_INIT_MASK			0x2
+#define  PCIE_RGR1_SW_INIT_1_PERST_SHIFT		0x0
 
 /* PCIe parameters */
 #define BRCM_NUM_PCIE_OUT_WINS		0x4
@@ -152,6 +151,76 @@
 #define SSC_STATUS_SSC_MASK		0x400
 #define SSC_STATUS_PLL_LOCK_MASK	0x800
 
+#define IDX_ADDR(pcie)	\
+	(pcie->reg_offsets[EXT_CFG_INDEX])
+#define DATA_ADDR(pcie)	\
+	(pcie->reg_offsets[EXT_CFG_DATA])
+#define PCIE_RGR1_SW_INIT_1(pcie) \
+	(pcie->reg_offsets[RGR1_SW_INIT_1])
+
+enum {
+	RGR1_SW_INIT_1,
+	EXT_CFG_INDEX,
+	EXT_CFG_DATA,
+};
+
+enum {
+	RGR1_SW_INIT_1_INIT_MASK,
+	RGR1_SW_INIT_1_INIT_SHIFT,
+};
+
+enum pcie_type {
+	GENERIC,
+	BCM7278,
+	BCM2711,
+};
+
+struct pcie_cfg_data {
+	const int *reg_field_info;
+	const int *offsets;
+	const enum pcie_type type;
+};
+
+static const int pcie_reg_field_info[] = {
+	[RGR1_SW_INIT_1_INIT_MASK] = 0x2,
+	[RGR1_SW_INIT_1_INIT_SHIFT] = 0x1,
+};
+
+static const int pcie_reg_field_info_bcm7278[] = {
+	[RGR1_SW_INIT_1_INIT_MASK] = 0x1,
+	[RGR1_SW_INIT_1_INIT_SHIFT] = 0x0,
+};
+
+static const int pcie_offsets[] = {
+	[RGR1_SW_INIT_1] = 0x9210,
+	[EXT_CFG_INDEX]  = 0x9000,
+	[EXT_CFG_DATA]   = 0x9004,
+};
+
+static const struct pcie_cfg_data generic_cfg = {
+	.reg_field_info	= pcie_reg_field_info,
+	.offsets	= pcie_offsets,
+	.type		= GENERIC,
+};
+
+static const int pcie_offset_bcm7278[] = {
+	[RGR1_SW_INIT_1] = 0xc010,
+	[EXT_CFG_INDEX] = 0x9000,
+	[EXT_CFG_DATA] = 0x9004,
+};
+
+static const struct pcie_cfg_data bcm7278_cfg = {
+	.reg_field_info = pcie_reg_field_info_bcm7278,
+	.offsets	= pcie_offset_bcm7278,
+	.type		= BCM7278,
+};
+
+static const struct pcie_cfg_data bcm2711_cfg = {
+	.reg_field_info	= pcie_reg_field_info,
+	.offsets	= pcie_offsets,
+	.type		= BCM2711,
+};
+
 struct brcm_msi {
 	struct device		*dev;
 	void __iomem		*base;
@@ -176,6 +245,9 @@ struct brcm_pcie {
 	int			gen;
 	u64			msi_target_addr;
 	struct brcm_msi		*msi;
+	const int		*reg_offsets;
+	const int		*reg_field_info;
+	enum pcie_type		type;
 };
 
 /*
@@ -602,20 +674,21 @@ static struct pci_ops brcm_pcie_ops = {
 
 static inline void brcm_pcie_bridge_sw_init_set(struct brcm_pcie *pcie, u32 val)
 {
-	u32 tmp;
+	u32 tmp, mask =  pcie->reg_field_info[RGR1_SW_INIT_1_INIT_MASK];
+	u32 shift = pcie->reg_field_info[RGR1_SW_INIT_1_INIT_SHIFT];
 
-	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1);
-	u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_INIT_MASK);
-	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1);
+	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+	tmp = (tmp & ~mask) | ((val << shift) & mask);
+	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
 }
 
 static inline void brcm_pcie_perst_set(struct brcm_pcie *pcie, u32 val)
 {
 	u32 tmp;
 
-	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1);
+	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
 	u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_PERST_MASK);
-	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1);
+	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
 }
 
 static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
@@ -924,10 +997,16 @@ static int brcm_pcie_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static const struct of_device_id brcm_pcie_match[] = {
+	{ .compatible = "brcm,bcm2711-pcie", .data = &bcm2711_cfg },
+	{},
+};
+
 static int brcm_pcie_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node, *msi_np;
 	struct pci_host_bridge *bridge;
+	const struct pcie_cfg_data *data;
 	struct brcm_pcie *pcie;
 	struct pci_bus *child;
 	struct resource *res;
@@ -937,9 +1016,18 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 	if (!bridge)
 		return -ENOMEM;
 
+	data = of_device_get_match_data(&pdev->dev);
+	if (!data) {
+		pr_err("failed to look up compatible string\n");
+		return -EINVAL;
+	}
+
 	pcie = pci_host_bridge_priv(bridge);
 	pcie->dev = &pdev->dev;
 	pcie->np = np;
+	pcie->reg_offsets = data->offsets;
+	pcie->reg_field_info = data->reg_field_info;
+	pcie->type = data->type;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	pcie->base = devm_ioremap_resource(&pdev->dev, res);
@@ -1005,10 +1093,6 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 	return ret;
 }
 
-static const struct of_device_id brcm_pcie_match[] = {
-	{ .compatible = "brcm,bcm2711-pcie" },
-	{},
-};
 MODULE_DEVICE_TABLE(of, brcm_pcie_match);
 
 static struct platform_driver brcm_pcie_driver = {
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 05/13] PCI: brcmstb: Add suspend and resume pm_ops
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

Broadcom Set-top (BrcmSTB) boards typically support S2, S3, and S5 suspend
and resume.  Now the PCIe driver may do so as well.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 49 +++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 7c707e483181..f444751e247c 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -979,6 +979,49 @@ static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
 	brcm_pcie_bridge_sw_init_set(pcie, 1);
 }
 
+static int brcm_pcie_suspend(struct device *dev)
+{
+	struct brcm_pcie *pcie = dev_get_drvdata(dev);
+	int ret = 0;
+
+	brcm_pcie_turn_off(pcie);
+	clk_disable_unprepare(pcie->clk);
+
+	return ret;
+}
+
+static int brcm_pcie_resume(struct device *dev)
+{
+	struct brcm_pcie *pcie = dev_get_drvdata(dev);
+	void __iomem *base;
+	u32 tmp;
+	int ret;
+
+	base = pcie->base;
+	clk_prepare_enable(pcie->clk);
+
+	/* Take bridge out of reset so we can access the SERDES reg */
+	brcm_pcie_bridge_sw_init_set(pcie, 0);
+
+	/* SERDES_IDDQ = 0 */
+	tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
+	u32p_replace_bits(&tmp, 0,
+			  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK);
+	writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
+
+	/* wait for serdes to be stable */
+	udelay(100);
+
+	ret = brcm_pcie_setup(pcie);
+	if (ret)
+		return ret;
+
+	if (pcie->msi)
+		brcm_msi_set_regs(pcie->msi);
+
+	return 0;
+}
+
 static void __brcm_pcie_remove(struct brcm_pcie *pcie)
 {
 	brcm_msi_remove(pcie);
@@ -1095,12 +1138,18 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 
 MODULE_DEVICE_TABLE(of, brcm_pcie_match);
 
+static const struct dev_pm_ops brcm_pcie_pm_ops = {
+	.suspend_noirq = brcm_pcie_suspend,
+	.resume_noirq = brcm_pcie_resume,
+};
+
 static struct platform_driver brcm_pcie_driver = {
 	.probe = brcm_pcie_probe,
 	.remove = brcm_pcie_remove,
 	.driver = {
 		.name = "brcm-pcie",
 		.of_match_table = brcm_pcie_match,
+		.pm = &brcm_pcie_pm_ops,
 	},
 };
 module_platform_driver(brcm_pcie_driver);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 05/13] PCI: brcmstb: Add suspend and resume pm_ops
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Jim Quinlan, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

Broadcom Set-top (BrcmSTB) boards typically support S2, S3, and S5 suspend
and resume.  Now the PCIe driver may do so as well.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 49 +++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 7c707e483181..f444751e247c 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -979,6 +979,49 @@ static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
 	brcm_pcie_bridge_sw_init_set(pcie, 1);
 }
 
+static int brcm_pcie_suspend(struct device *dev)
+{
+	struct brcm_pcie *pcie = dev_get_drvdata(dev);
+	int ret = 0;
+
+	brcm_pcie_turn_off(pcie);
+	clk_disable_unprepare(pcie->clk);
+
+	return ret;
+}
+
+static int brcm_pcie_resume(struct device *dev)
+{
+	struct brcm_pcie *pcie = dev_get_drvdata(dev);
+	void __iomem *base;
+	u32 tmp;
+	int ret;
+
+	base = pcie->base;
+	clk_prepare_enable(pcie->clk);
+
+	/* Take bridge out of reset so we can access the SERDES reg */
+	brcm_pcie_bridge_sw_init_set(pcie, 0);
+
+	/* SERDES_IDDQ = 0 */
+	tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
+	u32p_replace_bits(&tmp, 0,
+			  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK);
+	writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
+
+	/* wait for serdes to be stable */
+	udelay(100);
+
+	ret = brcm_pcie_setup(pcie);
+	if (ret)
+		return ret;
+
+	if (pcie->msi)
+		brcm_msi_set_regs(pcie->msi);
+
+	return 0;
+}
+
 static void __brcm_pcie_remove(struct brcm_pcie *pcie)
 {
 	brcm_msi_remove(pcie);
@@ -1095,12 +1138,18 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 
 MODULE_DEVICE_TABLE(of, brcm_pcie_match);
 
+static const struct dev_pm_ops brcm_pcie_pm_ops = {
+	.suspend_noirq = brcm_pcie_suspend,
+	.resume_noirq = brcm_pcie_resume,
+};
+
 static struct platform_driver brcm_pcie_driver = {
 	.probe = brcm_pcie_probe,
 	.remove = brcm_pcie_remove,
 	.driver = {
 		.name = "brcm-pcie",
 		.of_match_table = brcm_pcie_match,
+		.pm = &brcm_pcie_pm_ops,
 	},
 };
 module_platform_driver(brcm_pcie_driver);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 06/13] PCI: brcmstb: Add bcm7278 PERST support
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

The PERST bit was moved to a different register in 7278-type STB chips.  In
addition, the polarity of the bit was also changed; for other chips writing
a 1 specified assert; for 7278-type chips, writing a 0 specifies assert.

Signal-wise, PERST is an asserted-low signal.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index f444751e247c..0bcae9eba048 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -81,6 +81,7 @@
 
 #define PCIE_MISC_PCIE_CTRL				0x4064
 #define  PCIE_MISC_PCIE_CTRL_PCIE_L23_REQUEST_MASK	0x1
+#define PCIE_MISC_PCIE_CTRL_PCIE_PERSTB_MASK		0x4
 
 #define PCIE_MISC_PCIE_STATUS				0x4068
 #define  PCIE_MISC_PCIE_STATUS_PCIE_PORT_MASK		0x80
@@ -686,9 +687,17 @@ static inline void brcm_pcie_perst_set(struct brcm_pcie *pcie, u32 val)
 {
 	u32 tmp;
 
-	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
-	u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_PERST_MASK);
-	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+	if (pcie->type == BCM7278) {
+		/* Perst bit has moved and assert value is 0 */
+		tmp = readl(pcie->base + PCIE_MISC_PCIE_CTRL);
+		u32p_replace_bits(&tmp,
+				  !val, PCIE_MISC_PCIE_CTRL_PCIE_PERSTB_MASK);
+		writel(tmp, pcie->base +  PCIE_MISC_PCIE_CTRL);
+	} else {
+		tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+		u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_PERST_MASK);
+		writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+	}
 }
 
 static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 06/13] PCI: brcmstb: Add bcm7278 PERST support
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Jim Quinlan, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

The PERST bit was moved to a different register in 7278-type STB chips.  In
addition, the polarity of the bit was also changed; for other chips writing
a 1 specified assert; for 7278-type chips, writing a 0 specifies assert.

Signal-wise, PERST is an asserted-low signal.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index f444751e247c..0bcae9eba048 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -81,6 +81,7 @@
 
 #define PCIE_MISC_PCIE_CTRL				0x4064
 #define  PCIE_MISC_PCIE_CTRL_PCIE_L23_REQUEST_MASK	0x1
+#define PCIE_MISC_PCIE_CTRL_PCIE_PERSTB_MASK		0x4
 
 #define PCIE_MISC_PCIE_STATUS				0x4068
 #define  PCIE_MISC_PCIE_STATUS_PCIE_PORT_MASK		0x80
@@ -686,9 +687,17 @@ static inline void brcm_pcie_perst_set(struct brcm_pcie *pcie, u32 val)
 {
 	u32 tmp;
 
-	tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
-	u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_PERST_MASK);
-	writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+	if (pcie->type == BCM7278) {
+		/* Perst bit has moved and assert value is 0 */
+		tmp = readl(pcie->base + PCIE_MISC_PCIE_CTRL);
+		u32p_replace_bits(&tmp,
+				  !val, PCIE_MISC_PCIE_CTRL_PCIE_PERSTB_MASK);
+		writel(tmp, pcie->base +  PCIE_MISC_PCIE_CTRL);
+	} else {
+		tmp = readl(pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+		u32p_replace_bits(&tmp, val, PCIE_RGR1_SW_INIT_1_PERST_MASK);
+		writel(tmp, pcie->base + PCIE_RGR1_SW_INIT_1(pcie));
+	}
 }
 
 static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 07/13] PCI: brcmstb: Add control of rescal reset
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	Florian Fainelli, Philipp Zabel,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

Some STB chips have a special purpose reset controller named RESCAL (reset
calibration).  The PCIe HW can now control RESCAL to start and stop its
operation.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 81 ++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 0bcae9eba048..fa356bc149c3 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -23,6 +23,7 @@
 #include <linux/of_platform.h>
 #include <linux/pci.h>
 #include <linux/printk.h>
+#include <linux/reset.h>
 #include <linux/sizes.h>
 #include <linux/slab.h>
 #include <linux/string.h>
@@ -152,7 +153,17 @@
 #define SSC_STATUS_SSC_MASK		0x400
 #define SSC_STATUS_PLL_LOCK_MASK	0x800
 
-#define IDX_ADDR(pcie)	\
+/* Rescal registers */
+#define PCIE_DVT_PMU_PCIE_PHY_CTRL				0xc700
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS			0x3
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_MASK		0x4
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_SHIFT	0x2
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_MASK		0x2
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_SHIFT		0x1
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_MASK		0x1
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_SHIFT		0x0
+
+#define IDX_ADDR(pcie) \
 	(pcie->reg_offsets[EXT_CFG_INDEX])
 #define DATA_ADDR(pcie)	\
 	(pcie->reg_offsets[EXT_CFG_DATA])
@@ -249,6 +260,7 @@ struct brcm_pcie {
 	const int		*reg_offsets;
 	const int		*reg_field_info;
 	enum pcie_type		type;
+	struct reset_control	*rescal;
 };
 
 /*
@@ -964,6 +976,47 @@ static void brcm_pcie_enter_l23(struct brcm_pcie *pcie)
 		dev_err(pcie->dev, "failed to enter low-power link state\n");
 }
 
+static int brcm_phy_cntl(struct brcm_pcie *pcie, const int start)
+{
+	static const u32 shifts[PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS] = {
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_SHIFT,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_SHIFT,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_SHIFT,};
+	static const u32 masks[PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS] = {
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_MASK,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_MASK,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_MASK,};
+	const int beg = start ? 0 : PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS - 1;
+	const int end = start ? PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS : -1;
+	u32 tmp, combined_mask = 0;
+	u32 val = !!start;
+	void __iomem *base = pcie->base;
+	int i;
+
+	for (i = beg; i != end; start ? i++ : i--) {
+		tmp = readl(base + PCIE_DVT_PMU_PCIE_PHY_CTRL);
+		tmp = (tmp & ~masks[i]) | ((val << shifts[i]) & masks[i]);
+		writel(tmp, base + PCIE_DVT_PMU_PCIE_PHY_CTRL);
+		usleep_range(50, 200);
+		combined_mask |= masks[i];
+	}
+
+	tmp = readl(base + PCIE_DVT_PMU_PCIE_PHY_CTRL);
+	val = start ? combined_mask : 0;
+
+	return (tmp & combined_mask) == val ? 0 : -EIO;
+}
+
+static inline int brcm_phy_start(struct brcm_pcie *pcie)
+{
+	return pcie->rescal ? brcm_phy_cntl(pcie, 1) : 0;
+}
+
+static inline int brcm_phy_stop(struct brcm_pcie *pcie)
+{
+	return pcie->rescal ? brcm_phy_cntl(pcie, 0) : 0;
+}
+
 static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
 {
 	void __iomem *base = pcie->base;
@@ -994,6 +1047,9 @@ static int brcm_pcie_suspend(struct device *dev)
 	int ret = 0;
 
 	brcm_pcie_turn_off(pcie);
+	ret = brcm_phy_stop(pcie);
+	if (ret)
+		dev_err(pcie->dev, "failed to stop phy\n");
 	clk_disable_unprepare(pcie->clk);
 
 	return ret;
@@ -1009,6 +1065,12 @@ static int brcm_pcie_resume(struct device *dev)
 	base = pcie->base;
 	clk_prepare_enable(pcie->clk);
 
+	ret = brcm_phy_start(pcie);
+	if (ret) {
+		dev_err(pcie->dev, "failed to start phy\n");
+		return ret;
+	}
+
 	/* Take bridge out of reset so we can access the SERDES reg */
 	brcm_pcie_bridge_sw_init_set(pcie, 0);
 
@@ -1035,6 +1097,9 @@ static void __brcm_pcie_remove(struct brcm_pcie *pcie)
 {
 	brcm_msi_remove(pcie);
 	brcm_pcie_turn_off(pcie);
+	if (brcm_phy_stop(pcie))
+		dev_err(pcie->dev, "failed to stop phy\n");
+	reset_control_assert(pcie->rescal);
 	clk_disable_unprepare(pcie->clk);
 }
 
@@ -1105,6 +1170,20 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "could not enable clock\n");
 		return ret;
 	}
+	pcie->rescal = devm_reset_control_get_optional_shared(&pdev->dev,
+							      "rescal");
+	if (IS_ERR(pcie->rescal))
+		return PTR_ERR(pcie->rescal);
+
+	ret = reset_control_deassert(pcie->rescal);
+	if (ret)
+		dev_err(&pdev->dev, "failed to deassert 'rescal'\n");
+
+	ret = brcm_phy_start(pcie);
+	if (ret) {
+		dev_err(pcie->dev, "failed to start phy\n");
+		return ret;
+	}
 
 	ret = brcm_pcie_setup(pcie);
 	if (ret)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 07/13] PCI: brcmstb: Add control of rescal reset
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, Jim Quinlan, open list,
	Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Philipp Zabel, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

Some STB chips have a special purpose reset controller named RESCAL (reset
calibration).  The PCIe HW can now control RESCAL to start and stop its
operation.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 81 ++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 0bcae9eba048..fa356bc149c3 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -23,6 +23,7 @@
 #include <linux/of_platform.h>
 #include <linux/pci.h>
 #include <linux/printk.h>
+#include <linux/reset.h>
 #include <linux/sizes.h>
 #include <linux/slab.h>
 #include <linux/string.h>
@@ -152,7 +153,17 @@
 #define SSC_STATUS_SSC_MASK		0x400
 #define SSC_STATUS_PLL_LOCK_MASK	0x800
 
-#define IDX_ADDR(pcie)	\
+/* Rescal registers */
+#define PCIE_DVT_PMU_PCIE_PHY_CTRL				0xc700
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS			0x3
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_MASK		0x4
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_SHIFT	0x2
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_MASK		0x2
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_SHIFT		0x1
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_MASK		0x1
+#define  PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_SHIFT		0x0
+
+#define IDX_ADDR(pcie) \
 	(pcie->reg_offsets[EXT_CFG_INDEX])
 #define DATA_ADDR(pcie)	\
 	(pcie->reg_offsets[EXT_CFG_DATA])
@@ -249,6 +260,7 @@ struct brcm_pcie {
 	const int		*reg_offsets;
 	const int		*reg_field_info;
 	enum pcie_type		type;
+	struct reset_control	*rescal;
 };
 
 /*
@@ -964,6 +976,47 @@ static void brcm_pcie_enter_l23(struct brcm_pcie *pcie)
 		dev_err(pcie->dev, "failed to enter low-power link state\n");
 }
 
+static int brcm_phy_cntl(struct brcm_pcie *pcie, const int start)
+{
+	static const u32 shifts[PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS] = {
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_SHIFT,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_SHIFT,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_SHIFT,};
+	static const u32 masks[PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS] = {
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_PWRDN_MASK,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_RESET_MASK,
+		PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_DIG_RESET_MASK,};
+	const int beg = start ? 0 : PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS - 1;
+	const int end = start ? PCIE_DVT_PMU_PCIE_PHY_CTRL_DAST_NFLDS : -1;
+	u32 tmp, combined_mask = 0;
+	u32 val = !!start;
+	void __iomem *base = pcie->base;
+	int i;
+
+	for (i = beg; i != end; start ? i++ : i--) {
+		tmp = readl(base + PCIE_DVT_PMU_PCIE_PHY_CTRL);
+		tmp = (tmp & ~masks[i]) | ((val << shifts[i]) & masks[i]);
+		writel(tmp, base + PCIE_DVT_PMU_PCIE_PHY_CTRL);
+		usleep_range(50, 200);
+		combined_mask |= masks[i];
+	}
+
+	tmp = readl(base + PCIE_DVT_PMU_PCIE_PHY_CTRL);
+	val = start ? combined_mask : 0;
+
+	return (tmp & combined_mask) == val ? 0 : -EIO;
+}
+
+static inline int brcm_phy_start(struct brcm_pcie *pcie)
+{
+	return pcie->rescal ? brcm_phy_cntl(pcie, 1) : 0;
+}
+
+static inline int brcm_phy_stop(struct brcm_pcie *pcie)
+{
+	return pcie->rescal ? brcm_phy_cntl(pcie, 0) : 0;
+}
+
 static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
 {
 	void __iomem *base = pcie->base;
@@ -994,6 +1047,9 @@ static int brcm_pcie_suspend(struct device *dev)
 	int ret = 0;
 
 	brcm_pcie_turn_off(pcie);
+	ret = brcm_phy_stop(pcie);
+	if (ret)
+		dev_err(pcie->dev, "failed to stop phy\n");
 	clk_disable_unprepare(pcie->clk);
 
 	return ret;
@@ -1009,6 +1065,12 @@ static int brcm_pcie_resume(struct device *dev)
 	base = pcie->base;
 	clk_prepare_enable(pcie->clk);
 
+	ret = brcm_phy_start(pcie);
+	if (ret) {
+		dev_err(pcie->dev, "failed to start phy\n");
+		return ret;
+	}
+
 	/* Take bridge out of reset so we can access the SERDES reg */
 	brcm_pcie_bridge_sw_init_set(pcie, 0);
 
@@ -1035,6 +1097,9 @@ static void __brcm_pcie_remove(struct brcm_pcie *pcie)
 {
 	brcm_msi_remove(pcie);
 	brcm_pcie_turn_off(pcie);
+	if (brcm_phy_stop(pcie))
+		dev_err(pcie->dev, "failed to stop phy\n");
+	reset_control_assert(pcie->rescal);
 	clk_disable_unprepare(pcie->clk);
 }
 
@@ -1105,6 +1170,20 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 		dev_err(&pdev->dev, "could not enable clock\n");
 		return ret;
 	}
+	pcie->rescal = devm_reset_control_get_optional_shared(&pdev->dev,
+							      "rescal");
+	if (IS_ERR(pcie->rescal))
+		return PTR_ERR(pcie->rescal);
+
+	ret = reset_control_deassert(pcie->rescal);
+	if (ret)
+		dev_err(&pdev->dev, "failed to deassert 'rescal'\n");
+
+	ret = brcm_phy_start(pcie);
+	if (ret) {
+		dev_err(pcie->dev, "failed to start phy\n");
+		return ret;
+	}
 
 	ret = brcm_pcie_setup(pcie);
 	if (ret)
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 08/13] of: Include a dev param in of_dma_get_range()
  2020-06-03 19:20 ` Jim Quinlan
                   ` (11 preceding siblings ...)
  (?)
@ 2020-06-03 19:20 ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Frank Rowand,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, open list

Currently there is only one caller of of_dma_get_range().  A struct device
*dev param is needed For implementing multiple dma offsets.  This function
will still work if dev == NULL.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 drivers/of/address.c    | 4 +++-
 drivers/of/device.c     | 2 +-
 drivers/of/of_private.h | 8 ++++----
 drivers/of/unittest.c   | 2 +-
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 8eea3f6e29a4..96d8cfb14a60 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -920,6 +920,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
 
 /**
  * of_dma_get_range - Get DMA range info
+ * @dev:	device pointer; only needed for a corner case.
  * @np:		device node to get DMA range info
  * @dma_addr:	pointer to store initial DMA address of DMA range
  * @paddr:	pointer to store initial CPU address of DMA range
@@ -935,7 +936,8 @@ EXPORT_SYMBOL(of_io_request_and_map);
  * It returns -ENODEV if "dma-ranges" property was not found
  * for this device in DT.
  */
-int of_dma_get_range(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
+		     u64 *paddr, u64 *size)
 {
 	struct device_node *node = of_node_get(np);
 	const __be32 *ranges = NULL;
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 27203bfd0b22..ef6a741f9f0b 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -95,7 +95,7 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	const struct iommu_ops *iommu;
 	u64 mask, end;
 
-	ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+	ret = of_dma_get_range(dev, np, &dma_addr, &paddr, &size);
 	if (ret < 0) {
 		/*
 		 * For legacy reasons, we have to assume some devices need
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index edc682249c00..7a83d3a81ab6 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -158,11 +158,11 @@ extern int of_bus_n_addr_cells(struct device_node *np);
 extern int of_bus_n_size_cells(struct device_node *np);
 
 #ifdef CONFIG_OF_ADDRESS
-extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
-			    u64 *paddr, u64 *size);
+extern int of_dma_get_range(struct device *dev, struct device_node *np,
+			    u64 *dma_addr, u64 *paddr, u64 *size);
 #else
-static inline int of_dma_get_range(struct device_node *np, u64 *dma_addr,
-				   u64 *paddr, u64 *size)
+static inline int of_dma_get_range(struct device *dev, struct device_node *np,
+				   u64 *dma_addr, u64 *paddr, u64 *size)
 {
 	return -ENODEV;
 }
diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 398de04fd19c..57a22517c428 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -881,7 +881,7 @@ static void __init of_unittest_dma_ranges_one(const char *path,
 		return;
 	}
 
-	rc = of_dma_get_range(np, &dma_addr, &paddr, &size);
+	rc = of_dma_get_range(NULL, np, &dma_addr, &paddr, &size);
 
 	unittest(!rc, "of_dma_get_range failed on node %pOF rc=%i\n", np, rc);
 	if (!rc) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-03 19:20 ` Jim Quinlan
  (?)
@ 2020-06-03 19:20   ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Dan Williams, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Maxime Ripard, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

The new field in struct device 'dma_pfn_offset_map' is used to facilitate
the use of multiple pfn offsets between cpu addrs and dma addrs.  It
subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
designates the single offset a special case.

of_dma_configure() is the typical manner to set pfn offsets but there
are a number of ad hoc assignments to dev->dma_pfn_offset in the
kernel code.  These cases now invoke the function
attach_uniform_dma_pfn_offset(dev, pfn_offset).

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 arch/arm/include/asm/dma-mapping.h            |  9 +-
 arch/arm/mach-keystone/keystone.c             |  9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
 arch/sh/kernel/dma-coherent.c                 | 17 ++--
 arch/x86/pci/sta2x11-fixup.c                  |  7 +-
 drivers/acpi/arm64/iort.c                     |  5 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
 drivers/iommu/io-pgtable-arm.c                |  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
 drivers/of/address.c                          | 93 +++++++++++++++++--
 drivers/of/device.c                           |  8 +-
 drivers/remoteproc/remoteproc_core.c          |  2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
 drivers/usb/core/message.c                    |  4 +-
 drivers/usb/core/usb.c                        |  2 +-
 include/linux/device.h                        |  4 +-
 include/linux/dma-direct.h                    | 16 +++-
 include/linux/dma-mapping.h                   | 45 +++++++++
 kernel/dma/coherent.c                         | 11 ++-
 20 files changed, 210 insertions(+), 51 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca34..f1e72f99468b 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -35,8 +35,9 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
+	if (dev && dev->dma_pfn_offset_map)
+		pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
+
 	return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
 	unsigned long pfn = __bus_to_pfn(addr);
 
-	if (dev)
-		pfn += dev->dma_pfn_offset;
+	if (dev && dev->dma_pfn_offset_map)
+		pfn += dma_pfn_offset_from_dma_addr(dev, addr);
 
 	return pfn;
 }
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 638808c4e122..e7d3ee6e9cb5 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include <linux/io.h>
 #include <linux/of.h>
+#include <linux/dma-mapping.h>
 #include <linux/init.h>
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
@@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 		return NOTIFY_BAD;
 
 	if (!dev->of_node) {
-		dev->dma_pfn_offset = keystone_dma_pfn_offset;
-		dev_err(dev, "set dma_pfn_offset%08lx\n",
-			dev->dma_pfn_offset);
+		int ret = attach_uniform_dma_pfn_offset
+				(dev, keystone_dma_pfn_offset);
+
+		dev_err(dev, "set dma_pfn_offset%08lx%s\n",
+			dev->dma_pfn_offset, ret ? " failed" : "");
 	}
 	return NOTIFY_OK;
 }
diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
index e0b568aaa701..2e832a5c58c1 100644
--- a/arch/sh/drivers/pci/pcie-sh7786.c
+++ b/arch/sh/drivers/pci/pcie-sh7786.c
@@ -12,6 +12,7 @@
 #include <linux/io.h>
 #include <linux/async.h>
 #include <linux/delay.h>
+#include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
 #include <linux/sh_clk.h>
@@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
 
 void pcibios_bus_add_device(struct pci_dev *pdev)
 {
-	pdev->dev.dma_pfn_offset = dma_pfn_offset;
+	attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
 }
 
 static int __init sh7786_pcie_core_init(void)
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index d4811691b93c..5fc9e358b6c7 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 {
 	void *ret, *ret_nocache;
 	int order = get_order(size);
+	unsigned long pfn;
+	phys_addr_t phys;
 
 	gfp |= __GFP_ZERO;
 
@@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		return NULL;
 	}
 
-	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
+	phys = virt_to_phys(ret);
+	pfn =  phys >> PAGE_SHIFT;
+	split_page(pfn_to_page(pfn), order);
 
-	*dma_handle = virt_to_phys(ret);
-	if (!WARN_ON(!dev))
-		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
+	*dma_handle = (dma_addr_t)phys;
+	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
+		*dma_handle -= PFN_PHYS(
+			dma_pfn_offset_from_phys_addr(dev, phys));
 
 	return ret_nocache;
 }
@@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
 	int k;
 
-	if (!WARN_ON(!dev))
-		pfn += dev->dma_pfn_offset;
+	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
+		pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
 
 	for (k = 0; k < (1 << order); k++)
 		__free_pages(pfn_to_page(pfn + k), 0);
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index c313d784efab..4cdeca9f69b6 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/list.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-mapping.h>
 #include <asm/iommu.h>
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
@@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
 	struct device *dev = &pdev->dev;
 	u32 amba_base, max_amba_addr;
-	int i;
+	int i, ret;
 
 	if (!instance)
 		return;
@@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
 	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
 
-	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
+	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
+	if (ret)
+		dev_err(dev, "sta2x11: could not set PFN offset\n");
 
 	dev->bus_dma_limit = max_amba_addr;
 	pci_set_consistent_dma_mask(pdev, max_amba_addr);
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 28a6b387e80e..153661ddc74b 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	*dma_addr = dmaaddr;
 	*dma_size = size;
 
-	dev->dma_pfn_offset = PFN_DOWN(offset);
-	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
+	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
+	dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
+		offset, ret ? " failed!" : "");
 }
 
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 072ea113e6be..3d41dfc7d178 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/of_graph.h>
+#include <linux/dma-mapping.h>
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 
@@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 	const struct sun4i_backend_quirks *quirks;
 	struct resource *res;
 	void __iomem *regs;
-	int i, ret;
+	int i, ret = 0;
 
 	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
 	if (!backend)
@@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 		 * on our device since the RAM mapping is at 0 for the DMA bus,
 		 * unlike the CPU.
 		 */
-		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
 	}
 
 	backend->engine.node = dev->of_node;
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9..e9cc1c2d47cd 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
 		return NULL;
 
-	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
+	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
 		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
 		return NULL;
 	}
diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
index eff34ded6305..7212da5e1076 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
 			return ret;
 	} else {
 #ifdef PHYS_PFN_OFFSET
-		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
 #endif
 	}
 
diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
index 055eb0b8e396..2d66d415b6c3 100644
--- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
+++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
@@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
 
 	sdev->dev = &pdev->dev;
 	/* The DMA bus has the memory mapped at 0 */
-	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
+	ret = attach_uniform_dma_pfn_offset(sdev->dev,
+					    PHYS_OFFSET >> PAGE_SHIFT);
+	if (ret)
+		return ret;
 
 	ret = sun6i_csi_resource_request(sdev, pdev);
 	if (ret)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 96d8cfb14a60..c89333b0a5fb 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
 }
 EXPORT_SYMBOL(of_io_request_and_map);
 
+static int attach_dma_pfn_offset_map(struct device *dev,
+				     struct device_node *node, int num_ranges)
+{
+	struct of_range_parser parser;
+	struct of_range range;
+	struct dma_pfn_offset_region *r;
+
+	r = devm_kcalloc(dev, num_ranges + 1,
+			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+	dev->dma_pfn_offset_map = r;
+	of_dma_range_parser_init(&parser, node);
+
+	/*
+	 * Record all info for DMA ranges array.  We could
+	 * just use the of_range struct, but if we did that it
+	 * would require more calculations for phys_to_dma and
+	 * dma_to_phys conversions.
+	 */
+	for_each_of_range(&parser, &range) {
+		r->cpu_start = range.cpu_addr;
+		r->cpu_end = r->cpu_start + range.size - 1;
+		r->dma_start = range.bus_addr;
+		r->dma_end = r->dma_start + range.size - 1;
+		r->pfn_offset = PFN_DOWN(range.cpu_addr)
+			- PFN_DOWN(range.bus_addr);
+		r++;
+	}
+	return 0;
+}
+
+
+
+/**
+ * attach_dma_pfn_offset - Assign scalar offset for all addresses.
+ * @dev:	device pointer; only needed for a corner case.
+ * @dma_pfn_offset:	offset to apply when converting from phys addr
+ *			to dma addr and vice versa.
+ *
+ * It returns -ENOMEM if out of memory, otherwise 0.
+ */
+int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
+{
+	struct dma_pfn_offset_region *r;
+
+	if (!dev)
+		return -ENODEV;
+
+	if (!pfn_offset)
+		return 0;
+
+	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
+			 GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+
+	r->uniform_offset = true;
+	r->pfn_offset = pfn_offset;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
+
 /**
  * of_dma_get_range - Get DMA range info
  * @dev:	device pointer; only needed for a corner case.
@@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
  *	CPU addr (phys_addr_t)	: pna cells
  *	size			: nsize cells
  *
- * It returns -ENODEV if "dma-ranges" property was not found
+ * It returns -ENODEV if !dev or "dma-ranges" property was not found
  * for this device in DT.
  */
 int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
@@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 	bool found_dma_ranges = false;
 	struct of_range_parser parser;
 	struct of_range range;
+	phys_addr_t cpu_start = ~(phys_addr_t)0;
 	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
+	bool dma_multi_pfn_offset = false;
+	int num_ranges = 0;
+
+	if (!dev)
+		return -ENODEV;
 
 	while (node) {
 		ranges = of_get_property(node, "dma-ranges", &len);
@@ -977,11 +1047,10 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 			 range.bus_addr, range.cpu_addr, range.size);
 
-		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
-			pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
-			/* Don't error out as we'd break some existing DTs */
-			continue;
-		}
+		num_ranges++;
+		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset)
+			dma_multi_pfn_offset = true;
+
 		dma_offset = range.cpu_addr - range.bus_addr;
 
 		/* Take lower and upper limits */
@@ -989,6 +1058,8 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 			dma_start = range.bus_addr;
 		if (range.bus_addr + range.size > dma_end)
 			dma_end = range.bus_addr + range.size;
+		if (range.cpu_addr < cpu_start)
+			cpu_start = range.cpu_addr;
 	}
 
 	if (dma_start >= dma_end) {
@@ -998,9 +1069,17 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 		goto out;
 	}
 
+	if (dma_multi_pfn_offset)
+		ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
+	else if (dma_offset)
+		ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(dma_offset));
+
+	if (ret)
+		goto out;
+
 	*dma_addr = dma_start;
 	*size = dma_end - dma_start;
-	*paddr = dma_start + dma_offset;
+	*paddr = cpu_start;
 
 	pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 		 *dma_addr, *paddr, *size);
diff --git a/drivers/of/device.c b/drivers/of/device.c
index ef6a741f9f0b..91c50f40a82e 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -91,7 +91,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	u64 dma_addr, paddr, size = 0;
 	int ret;
 	bool coherent;
-	unsigned long offset;
 	const struct iommu_ops *iommu;
 	u64 mask, end;
 
@@ -105,10 +104,8 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		if (!force_dma)
 			return ret == -ENODEV ? 0 : ret;
 
-		dma_addr = offset = 0;
+		dma_addr = 0;
 	} else {
-		offset = PFN_DOWN(paddr - dma_addr);
-
 		/*
 		 * Add a work around to treat the size as mask + 1 in case
 		 * it is defined in DT as a mask.
@@ -123,7 +120,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 			dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
 			return -EINVAL;
 		}
-		dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
 	}
 
 	/*
@@ -142,8 +138,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	else if (!size)
 		size = 1ULL << 32;
 
-	dev->dma_pfn_offset = offset;
-
 	/*
 	 * Limit coherent and dma mask based on size and default mask
 	 * set by the driver.
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index e12a54e67588..e49648f77261 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -518,7 +518,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 	/* Initialise vdev subdevice */
 	snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
 	rvdev->dev.parent = rproc->dev.parent;
-	rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
+	rvdev->dev.dma_pfn_offset_map = rproc->dev.parent->dma_pfn_offset_map;
 	rvdev->dev.release = rproc_rvdev_release;
 	dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
 	dev_set_drvdata(&rvdev->dev, rvdev);
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index daf5f244f93b..29217e6e4153 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -171,8 +171,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
 	 */
 
 #ifdef PHYS_PFN_OFFSET
-	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
-		dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
+	}
 #endif
 
 	ret = of_reserved_mem_device_init(dev->dev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d..071856000428 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
 		intf->dev.groups = usb_interface_groups;
 		/*
 		 * Please refer to usb_alloc_dev() to see why we set
-		 * dma_mask and dma_pfn_offset.
+		 * dma_mask and dma_pfn_offset_map.
 		 */
 		intf->dev.dma_mask = dev->dev.dma_mask;
-		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
+		intf->dev.dma_pfn_offset_map = dev->dev.dma_pfn_offset_map;
 		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
 		intf->minor = -1;
 		device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index f16c26dc079d..3fbc0c06ce9c 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
 	 * mask for the entire HCD, so don't do that.
 	 */
 	dev->dev.dma_mask = bus->sysdev->dma_mask;
-	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
+	dev->dev.dma_pfn_offset_map = bus->sysdev->dma_pfn_offset_map;
 	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
 	dev->state = USB_STATE_ATTACHED;
 	dev->lpm_disable_count = 1;
diff --git a/include/linux/device.h b/include/linux/device.h
index ac8e37cd716a..9c079643ff54 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -492,7 +492,7 @@ struct dev_links_info {
  * 		such descriptors.
  * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
  *		DMA limit than the device itself supports.
- * @dma_pfn_offset: offset of DMA memory range relatively of RAM
+ * @dma_pfn_offset_map: offset map for DMA memory range relatively of RAM
  * @dma_parms:	A low level driver may set these to teach IOMMU code about
  * 		segment limitations.
  * @dma_pools:	Dma pools (if dma'ble device).
@@ -577,7 +577,7 @@ struct device {
 					     64 bit addresses for consistent
 					     allocations such descriptors. */
 	u64		bus_dma_limit;	/* upstream dma constraint */
-	unsigned long	dma_pfn_offset;
+	struct dma_pfn_offset_region *dma_pfn_offset_map;
 
 	struct device_dma_parameters *dma_parms;
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..3c3363a3925e 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -15,14 +15,26 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
 	dma_addr_t dev_addr = (dma_addr_t)paddr;
 
-	return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset
+			= dma_pfn_offset_from_phys_addr(dev, paddr);
+
+		dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
+	}
+	return dev_addr;
 }
 
 static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 {
 	phys_addr_t paddr = (phys_addr_t)dev_addr;
 
-	return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset
+			= dma_pfn_offset_from_dma_addr(dev, dev_addr);
+
+		paddr += ((phys_addr_t)dma_pfn_offset << PAGE_SHIFT);
+	}
+	return paddr;
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 330ad58fbf4d..e0c2ec07c00a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -256,6 +256,46 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 size_t dma_direct_max_mapping_size(struct device *dev);
 
 #ifdef CONFIG_HAS_DMA
+struct dma_pfn_offset_region {
+	bool		uniform_offset;
+	phys_addr_t	cpu_start;
+	phys_addr_t	cpu_end;
+	dma_addr_t	dma_start;
+	dma_addr_t	dma_end;
+	unsigned long	pfn_offset;
+};
+
+int attach_uniform_dma_pfn_offset(struct device *dev,
+				  unsigned long dma_pfn_offset);
+
+static inline unsigned long dma_pfn_offset_from_dma_addr(struct device *dev,
+							 dma_addr_t dma_addr)
+{
+	const struct dma_pfn_offset_region *m = dev->dma_pfn_offset_map;
+
+	if (m->uniform_offset)
+		return m->pfn_offset;
+
+	for (; m->cpu_end; m++)
+		if (dma_addr >= m->dma_start && dma_addr <= m->dma_end)
+			return m->pfn_offset;
+	return 0;
+}
+
+static inline unsigned long dma_pfn_offset_from_phys_addr(struct device *dev,
+							  phys_addr_t paddr)
+{
+	const struct dma_pfn_offset_region *m = dev->dma_pfn_offset_map;
+
+	if (m->uniform_offset)
+		return m->pfn_offset;
+
+	for (; m->cpu_end; m++)
+		if (paddr >= m->cpu_start && paddr <= m->cpu_end)
+			return m->pfn_offset;
+	return 0;
+}
+
 #include <asm/dma-mapping.h>
 
 static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
@@ -463,6 +503,11 @@ u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
 unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
+static inline int attach_uniform_dma_pfn_offset(struct device *dev,
+						unsigned long dma_pfn_offset)
+{
+	return -EIO;
+}
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
 		enum dma_data_direction dir, unsigned long attrs)
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 2a0c4985f38e..20fc48d0b9d7 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -31,10 +31,13 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 static inline dma_addr_t dma_get_device_base(struct device *dev,
 					     struct dma_coherent_mem * mem)
 {
-	if (mem->use_dev_dma_pfn_offset)
-		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
-	else
-		return mem->device_base;
+	if (mem->use_dev_dma_pfn_offset && dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr
+			(dev, PFN_PHYS(mem->pfn_base));
+
+		return (mem->pfn_base - dma_pfn_offset) << PAGE_SHIFT;
+	}
+	return mem->device_base;
 }
 
 static int dma_init_coherent_memory(phys_addr_t phys_addr,
-- 
2.17.1

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-03 19:20   ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Dan Williams, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	Alan Stern, David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Maxime Ripard, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

The new field in struct device 'dma_pfn_offset_map' is used to facilitate
the use of multiple pfn offsets between cpu addrs and dma addrs.  It
subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
designates the single offset a special case.

of_dma_configure() is the typical manner to set pfn offsets but there
are a number of ad hoc assignments to dev->dma_pfn_offset in the
kernel code.  These cases now invoke the function
attach_uniform_dma_pfn_offset(dev, pfn_offset).

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 arch/arm/include/asm/dma-mapping.h            |  9 +-
 arch/arm/mach-keystone/keystone.c             |  9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
 arch/sh/kernel/dma-coherent.c                 | 17 ++--
 arch/x86/pci/sta2x11-fixup.c                  |  7 +-
 drivers/acpi/arm64/iort.c                     |  5 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
 drivers/iommu/io-pgtable-arm.c                |  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
 drivers/of/address.c                          | 93 +++++++++++++++++--
 drivers/of/device.c                           |  8 +-
 drivers/remoteproc/remoteproc_core.c          |  2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
 drivers/usb/core/message.c                    |  4 +-
 drivers/usb/core/usb.c                        |  2 +-
 include/linux/device.h                        |  4 +-
 include/linux/dma-direct.h                    | 16 +++-
 include/linux/dma-mapping.h                   | 45 +++++++++
 kernel/dma/coherent.c                         | 11 ++-
 20 files changed, 210 insertions(+), 51 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca34..f1e72f99468b 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -35,8 +35,9 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
+	if (dev && dev->dma_pfn_offset_map)
+		pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
+
 	return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
 	unsigned long pfn = __bus_to_pfn(addr);
 
-	if (dev)
-		pfn += dev->dma_pfn_offset;
+	if (dev && dev->dma_pfn_offset_map)
+		pfn += dma_pfn_offset_from_dma_addr(dev, addr);
 
 	return pfn;
 }
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 638808c4e122..e7d3ee6e9cb5 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include <linux/io.h>
 #include <linux/of.h>
+#include <linux/dma-mapping.h>
 #include <linux/init.h>
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
@@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 		return NOTIFY_BAD;
 
 	if (!dev->of_node) {
-		dev->dma_pfn_offset = keystone_dma_pfn_offset;
-		dev_err(dev, "set dma_pfn_offset%08lx\n",
-			dev->dma_pfn_offset);
+		int ret = attach_uniform_dma_pfn_offset
+				(dev, keystone_dma_pfn_offset);
+
+		dev_err(dev, "set dma_pfn_offset%08lx%s\n",
+			dev->dma_pfn_offset, ret ? " failed" : "");
 	}
 	return NOTIFY_OK;
 }
diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
index e0b568aaa701..2e832a5c58c1 100644
--- a/arch/sh/drivers/pci/pcie-sh7786.c
+++ b/arch/sh/drivers/pci/pcie-sh7786.c
@@ -12,6 +12,7 @@
 #include <linux/io.h>
 #include <linux/async.h>
 #include <linux/delay.h>
+#include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
 #include <linux/sh_clk.h>
@@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
 
 void pcibios_bus_add_device(struct pci_dev *pdev)
 {
-	pdev->dev.dma_pfn_offset = dma_pfn_offset;
+	attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
 }
 
 static int __init sh7786_pcie_core_init(void)
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index d4811691b93c..5fc9e358b6c7 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 {
 	void *ret, *ret_nocache;
 	int order = get_order(size);
+	unsigned long pfn;
+	phys_addr_t phys;
 
 	gfp |= __GFP_ZERO;
 
@@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		return NULL;
 	}
 
-	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
+	phys = virt_to_phys(ret);
+	pfn =  phys >> PAGE_SHIFT;
+	split_page(pfn_to_page(pfn), order);
 
-	*dma_handle = virt_to_phys(ret);
-	if (!WARN_ON(!dev))
-		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
+	*dma_handle = (dma_addr_t)phys;
+	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
+		*dma_handle -= PFN_PHYS(
+			dma_pfn_offset_from_phys_addr(dev, phys));
 
 	return ret_nocache;
 }
@@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
 	int k;
 
-	if (!WARN_ON(!dev))
-		pfn += dev->dma_pfn_offset;
+	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
+		pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
 
 	for (k = 0; k < (1 << order); k++)
 		__free_pages(pfn_to_page(pfn + k), 0);
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index c313d784efab..4cdeca9f69b6 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/list.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-mapping.h>
 #include <asm/iommu.h>
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
@@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
 	struct device *dev = &pdev->dev;
 	u32 amba_base, max_amba_addr;
-	int i;
+	int i, ret;
 
 	if (!instance)
 		return;
@@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
 	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
 
-	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
+	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
+	if (ret)
+		dev_err(dev, "sta2x11: could not set PFN offset\n");
 
 	dev->bus_dma_limit = max_amba_addr;
 	pci_set_consistent_dma_mask(pdev, max_amba_addr);
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 28a6b387e80e..153661ddc74b 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	*dma_addr = dmaaddr;
 	*dma_size = size;
 
-	dev->dma_pfn_offset = PFN_DOWN(offset);
-	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
+	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
+	dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
+		offset, ret ? " failed!" : "");
 }
 
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 072ea113e6be..3d41dfc7d178 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/of_graph.h>
+#include <linux/dma-mapping.h>
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 
@@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 	const struct sun4i_backend_quirks *quirks;
 	struct resource *res;
 	void __iomem *regs;
-	int i, ret;
+	int i, ret = 0;
 
 	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
 	if (!backend)
@@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 		 * on our device since the RAM mapping is at 0 for the DMA bus,
 		 * unlike the CPU.
 		 */
-		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
 	}
 
 	backend->engine.node = dev->of_node;
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9..e9cc1c2d47cd 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
 		return NULL;
 
-	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
+	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
 		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
 		return NULL;
 	}
diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
index eff34ded6305..7212da5e1076 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
 			return ret;
 	} else {
 #ifdef PHYS_PFN_OFFSET
-		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
 #endif
 	}
 
diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
index 055eb0b8e396..2d66d415b6c3 100644
--- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
+++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
@@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
 
 	sdev->dev = &pdev->dev;
 	/* The DMA bus has the memory mapped at 0 */
-	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
+	ret = attach_uniform_dma_pfn_offset(sdev->dev,
+					    PHYS_OFFSET >> PAGE_SHIFT);
+	if (ret)
+		return ret;
 
 	ret = sun6i_csi_resource_request(sdev, pdev);
 	if (ret)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 96d8cfb14a60..c89333b0a5fb 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
 }
 EXPORT_SYMBOL(of_io_request_and_map);
 
+static int attach_dma_pfn_offset_map(struct device *dev,
+				     struct device_node *node, int num_ranges)
+{
+	struct of_range_parser parser;
+	struct of_range range;
+	struct dma_pfn_offset_region *r;
+
+	r = devm_kcalloc(dev, num_ranges + 1,
+			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+	dev->dma_pfn_offset_map = r;
+	of_dma_range_parser_init(&parser, node);
+
+	/*
+	 * Record all info for DMA ranges array.  We could
+	 * just use the of_range struct, but if we did that it
+	 * would require more calculations for phys_to_dma and
+	 * dma_to_phys conversions.
+	 */
+	for_each_of_range(&parser, &range) {
+		r->cpu_start = range.cpu_addr;
+		r->cpu_end = r->cpu_start + range.size - 1;
+		r->dma_start = range.bus_addr;
+		r->dma_end = r->dma_start + range.size - 1;
+		r->pfn_offset = PFN_DOWN(range.cpu_addr)
+			- PFN_DOWN(range.bus_addr);
+		r++;
+	}
+	return 0;
+}
+
+
+
+/**
+ * attach_dma_pfn_offset - Assign scalar offset for all addresses.
+ * @dev:	device pointer; only needed for a corner case.
+ * @dma_pfn_offset:	offset to apply when converting from phys addr
+ *			to dma addr and vice versa.
+ *
+ * It returns -ENOMEM if out of memory, otherwise 0.
+ */
+int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
+{
+	struct dma_pfn_offset_region *r;
+
+	if (!dev)
+		return -ENODEV;
+
+	if (!pfn_offset)
+		return 0;
+
+	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
+			 GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+
+	r->uniform_offset = true;
+	r->pfn_offset = pfn_offset;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
+
 /**
  * of_dma_get_range - Get DMA range info
  * @dev:	device pointer; only needed for a corner case.
@@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
  *	CPU addr (phys_addr_t)	: pna cells
  *	size			: nsize cells
  *
- * It returns -ENODEV if "dma-ranges" property was not found
+ * It returns -ENODEV if !dev or "dma-ranges" property was not found
  * for this device in DT.
  */
 int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
@@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 	bool found_dma_ranges = false;
 	struct of_range_parser parser;
 	struct of_range range;
+	phys_addr_t cpu_start = ~(phys_addr_t)0;
 	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
+	bool dma_multi_pfn_offset = false;
+	int num_ranges = 0;
+
+	if (!dev)
+		return -ENODEV;
 
 	while (node) {
 		ranges = of_get_property(node, "dma-ranges", &len);
@@ -977,11 +1047,10 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 			 range.bus_addr, range.cpu_addr, range.size);
 
-		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
-			pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
-			/* Don't error out as we'd break some existing DTs */
-			continue;
-		}
+		num_ranges++;
+		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset)
+			dma_multi_pfn_offset = true;
+
 		dma_offset = range.cpu_addr - range.bus_addr;
 
 		/* Take lower and upper limits */
@@ -989,6 +1058,8 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 			dma_start = range.bus_addr;
 		if (range.bus_addr + range.size > dma_end)
 			dma_end = range.bus_addr + range.size;
+		if (range.cpu_addr < cpu_start)
+			cpu_start = range.cpu_addr;
 	}
 
 	if (dma_start >= dma_end) {
@@ -998,9 +1069,17 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 		goto out;
 	}
 
+	if (dma_multi_pfn_offset)
+		ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
+	else if (dma_offset)
+		ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(dma_offset));
+
+	if (ret)
+		goto out;
+
 	*dma_addr = dma_start;
 	*size = dma_end - dma_start;
-	*paddr = dma_start + dma_offset;
+	*paddr = cpu_start;
 
 	pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 		 *dma_addr, *paddr, *size);
diff --git a/drivers/of/device.c b/drivers/of/device.c
index ef6a741f9f0b..91c50f40a82e 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -91,7 +91,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	u64 dma_addr, paddr, size = 0;
 	int ret;
 	bool coherent;
-	unsigned long offset;
 	const struct iommu_ops *iommu;
 	u64 mask, end;
 
@@ -105,10 +104,8 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		if (!force_dma)
 			return ret == -ENODEV ? 0 : ret;
 
-		dma_addr = offset = 0;
+		dma_addr = 0;
 	} else {
-		offset = PFN_DOWN(paddr - dma_addr);
-
 		/*
 		 * Add a work around to treat the size as mask + 1 in case
 		 * it is defined in DT as a mask.
@@ -123,7 +120,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 			dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
 			return -EINVAL;
 		}
-		dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
 	}
 
 	/*
@@ -142,8 +138,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	else if (!size)
 		size = 1ULL << 32;
 
-	dev->dma_pfn_offset = offset;
-
 	/*
 	 * Limit coherent and dma mask based on size and default mask
 	 * set by the driver.
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index e12a54e67588..e49648f77261 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -518,7 +518,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 	/* Initialise vdev subdevice */
 	snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
 	rvdev->dev.parent = rproc->dev.parent;
-	rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
+	rvdev->dev.dma_pfn_offset_map = rproc->dev.parent->dma_pfn_offset_map;
 	rvdev->dev.release = rproc_rvdev_release;
 	dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
 	dev_set_drvdata(&rvdev->dev, rvdev);
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index daf5f244f93b..29217e6e4153 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -171,8 +171,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
 	 */
 
 #ifdef PHYS_PFN_OFFSET
-	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
-		dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
+	}
 #endif
 
 	ret = of_reserved_mem_device_init(dev->dev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d..071856000428 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
 		intf->dev.groups = usb_interface_groups;
 		/*
 		 * Please refer to usb_alloc_dev() to see why we set
-		 * dma_mask and dma_pfn_offset.
+		 * dma_mask and dma_pfn_offset_map.
 		 */
 		intf->dev.dma_mask = dev->dev.dma_mask;
-		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
+		intf->dev.dma_pfn_offset_map = dev->dev.dma_pfn_offset_map;
 		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
 		intf->minor = -1;
 		device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index f16c26dc079d..3fbc0c06ce9c 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
 	 * mask for the entire HCD, so don't do that.
 	 */
 	dev->dev.dma_mask = bus->sysdev->dma_mask;
-	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
+	dev->dev.dma_pfn_offset_map = bus->sysdev->dma_pfn_offset_map;
 	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
 	dev->state = USB_STATE_ATTACHED;
 	dev->lpm_disable_count = 1;
diff --git a/include/linux/device.h b/include/linux/device.h
index ac8e37cd716a..9c079643ff54 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -492,7 +492,7 @@ struct dev_links_info {
  * 		such descriptors.
  * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
  *		DMA limit than the device itself supports.
- * @dma_pfn_offset: offset of DMA memory range relatively of RAM
+ * @dma_pfn_offset_map: offset map for DMA memory range relatively of RAM
  * @dma_parms:	A low level driver may set these to teach IOMMU code about
  * 		segment limitations.
  * @dma_pools:	Dma pools (if dma'ble device).
@@ -577,7 +577,7 @@ struct device {
 					     64 bit addresses for consistent
 					     allocations such descriptors. */
 	u64		bus_dma_limit;	/* upstream dma constraint */
-	unsigned long	dma_pfn_offset;
+	struct dma_pfn_offset_region *dma_pfn_offset_map;
 
 	struct device_dma_parameters *dma_parms;
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..3c3363a3925e 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -15,14 +15,26 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
 	dma_addr_t dev_addr = (dma_addr_t)paddr;
 
-	return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset
+			= dma_pfn_offset_from_phys_addr(dev, paddr);
+
+		dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
+	}
+	return dev_addr;
 }
 
 static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 {
 	phys_addr_t paddr = (phys_addr_t)dev_addr;
 
-	return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset
+			= dma_pfn_offset_from_dma_addr(dev, dev_addr);
+
+		paddr += ((phys_addr_t)dma_pfn_offset << PAGE_SHIFT);
+	}
+	return paddr;
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 330ad58fbf4d..e0c2ec07c00a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -256,6 +256,46 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 size_t dma_direct_max_mapping_size(struct device *dev);
 
 #ifdef CONFIG_HAS_DMA
+struct dma_pfn_offset_region {
+	bool		uniform_offset;
+	phys_addr_t	cpu_start;
+	phys_addr_t	cpu_end;
+	dma_addr_t	dma_start;
+	dma_addr_t	dma_end;
+	unsigned long	pfn_offset;
+};
+
+int attach_uniform_dma_pfn_offset(struct device *dev,
+				  unsigned long dma_pfn_offset);
+
+static inline unsigned long dma_pfn_offset_from_dma_addr(struct device *dev,
+							 dma_addr_t dma_addr)
+{
+	const struct dma_pfn_offset_region *m = dev->dma_pfn_offset_map;
+
+	if (m->uniform_offset)
+		return m->pfn_offset;
+
+	for (; m->cpu_end; m++)
+		if (dma_addr >= m->dma_start && dma_addr <= m->dma_end)
+			return m->pfn_offset;
+	return 0;
+}
+
+static inline unsigned long dma_pfn_offset_from_phys_addr(struct device *dev,
+							  phys_addr_t paddr)
+{
+	const struct dma_pfn_offset_region *m = dev->dma_pfn_offset_map;
+
+	if (m->uniform_offset)
+		return m->pfn_offset;
+
+	for (; m->cpu_end; m++)
+		if (paddr >= m->cpu_start && paddr <= m->cpu_end)
+			return m->pfn_offset;
+	return 0;
+}
+
 #include <asm/dma-mapping.h>
 
 static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
@@ -463,6 +503,11 @@ u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
 unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
+static inline int attach_uniform_dma_pfn_offset(struct device *dev,
+						unsigned long dma_pfn_offset)
+{
+	return -EIO;
+}
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
 		enum dma_data_direction dir, unsigned long attrs)
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 2a0c4985f38e..20fc48d0b9d7 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -31,10 +31,13 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 static inline dma_addr_t dma_get_device_base(struct device *dev,
 					     struct dma_coherent_mem * mem)
 {
-	if (mem->use_dev_dma_pfn_offset)
-		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
-	else
-		return mem->device_base;
+	if (mem->use_dev_dma_pfn_offset && dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr
+			(dev, PFN_PHYS(mem->pfn_base));
+
+		return (mem->pfn_base - dma_pfn_offset) << PAGE_SHIFT;
+	}
+	return mem->device_base;
 }
 
 static int dma_init_coherent_memory(phys_addr_t phys_addr,
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-03 19:20   ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Dan Williams, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	Alan Stern, David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

The new field in struct device 'dma_pfn_offset_map' is used to facilitate
the use of multiple pfn offsets between cpu addrs and dma addrs.  It
subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
designates the single offset a special case.

of_dma_configure() is the typical manner to set pfn offsets but there
are a number of ad hoc assignments to dev->dma_pfn_offset in the
kernel code.  These cases now invoke the function
attach_uniform_dma_pfn_offset(dev, pfn_offset).

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 arch/arm/include/asm/dma-mapping.h            |  9 +-
 arch/arm/mach-keystone/keystone.c             |  9 +-
 arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
 arch/sh/kernel/dma-coherent.c                 | 17 ++--
 arch/x86/pci/sta2x11-fixup.c                  |  7 +-
 drivers/acpi/arm64/iort.c                     |  5 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
 drivers/iommu/io-pgtable-arm.c                |  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
 drivers/of/address.c                          | 93 +++++++++++++++++--
 drivers/of/device.c                           |  8 +-
 drivers/remoteproc/remoteproc_core.c          |  2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
 drivers/usb/core/message.c                    |  4 +-
 drivers/usb/core/usb.c                        |  2 +-
 include/linux/device.h                        |  4 +-
 include/linux/dma-direct.h                    | 16 +++-
 include/linux/dma-mapping.h                   | 45 +++++++++
 kernel/dma/coherent.c                         | 11 ++-
 20 files changed, 210 insertions(+), 51 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca34..f1e72f99468b 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -35,8 +35,9 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-	if (dev)
-		pfn -= dev->dma_pfn_offset;
+	if (dev && dev->dma_pfn_offset_map)
+		pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
+
 	return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
 	unsigned long pfn = __bus_to_pfn(addr);
 
-	if (dev)
-		pfn += dev->dma_pfn_offset;
+	if (dev && dev->dma_pfn_offset_map)
+		pfn += dma_pfn_offset_from_dma_addr(dev, addr);
 
 	return pfn;
 }
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c
index 638808c4e122..e7d3ee6e9cb5 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include <linux/io.h>
 #include <linux/of.h>
+#include <linux/dma-mapping.h>
 #include <linux/init.h>
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
@@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block *nb,
 		return NOTIFY_BAD;
 
 	if (!dev->of_node) {
-		dev->dma_pfn_offset = keystone_dma_pfn_offset;
-		dev_err(dev, "set dma_pfn_offset%08lx\n",
-			dev->dma_pfn_offset);
+		int ret = attach_uniform_dma_pfn_offset
+				(dev, keystone_dma_pfn_offset);
+
+		dev_err(dev, "set dma_pfn_offset%08lx%s\n",
+			dev->dma_pfn_offset, ret ? " failed" : "");
 	}
 	return NOTIFY_OK;
 }
diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
index e0b568aaa701..2e832a5c58c1 100644
--- a/arch/sh/drivers/pci/pcie-sh7786.c
+++ b/arch/sh/drivers/pci/pcie-sh7786.c
@@ -12,6 +12,7 @@
 #include <linux/io.h>
 #include <linux/async.h>
 #include <linux/delay.h>
+#include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/clk.h>
 #include <linux/sh_clk.h>
@@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
 
 void pcibios_bus_add_device(struct pci_dev *pdev)
 {
-	pdev->dev.dma_pfn_offset = dma_pfn_offset;
+	attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
 }
 
 static int __init sh7786_pcie_core_init(void)
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index d4811691b93c..5fc9e358b6c7 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 {
 	void *ret, *ret_nocache;
 	int order = get_order(size);
+	unsigned long pfn;
+	phys_addr_t phys;
 
 	gfp |= __GFP_ZERO;
 
@@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		return NULL;
 	}
 
-	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
+	phys = virt_to_phys(ret);
+	pfn =  phys >> PAGE_SHIFT;
+	split_page(pfn_to_page(pfn), order);
 
-	*dma_handle = virt_to_phys(ret);
-	if (!WARN_ON(!dev))
-		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
+	*dma_handle = (dma_addr_t)phys;
+	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
+		*dma_handle -= PFN_PHYS(
+			dma_pfn_offset_from_phys_addr(dev, phys));
 
 	return ret_nocache;
 }
@@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
 	int k;
 
-	if (!WARN_ON(!dev))
-		pfn += dev->dma_pfn_offset;
+	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
+		pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
 
 	for (k = 0; k < (1 << order); k++)
 		__free_pages(pfn_to_page(pfn + k), 0);
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index c313d784efab..4cdeca9f69b6 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/list.h>
 #include <linux/dma-direct.h>
+#include <linux/dma-mapping.h>
 #include <asm/iommu.h>
 
 #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
@@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
 	struct device *dev = &pdev->dev;
 	u32 amba_base, max_amba_addr;
-	int i;
+	int i, ret;
 
 	if (!instance)
 		return;
@@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
 	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
 	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
 
-	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
+	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
+	if (ret)
+		dev_err(dev, "sta2x11: could not set PFN offset\n");
 
 	dev->bus_dma_limit = max_amba_addr;
 	pci_set_consistent_dma_mask(pdev, max_amba_addr);
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 28a6b387e80e..153661ddc74b 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
 	*dma_addr = dmaaddr;
 	*dma_size = size;
 
-	dev->dma_pfn_offset = PFN_DOWN(offset);
-	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
+	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
+	dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
+		offset, ret ? " failed!" : "");
 }
 
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 072ea113e6be..3d41dfc7d178 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/of_graph.h>
+#include <linux/dma-mapping.h>
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 
@@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 	const struct sun4i_backend_quirks *quirks;
 	struct resource *res;
 	void __iomem *regs;
-	int i, ret;
+	int i, ret = 0;
 
 	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
 	if (!backend)
@@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
 		 * on our device since the RAM mapping is at 0 for the DMA bus,
 		 * unlike the CPU.
 		 */
-		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
 	}
 
 	backend->engine.node = dev->of_node;
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9..e9cc1c2d47cd 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
 		return NULL;
 
-	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
+	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
 		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
 		return NULL;
 	}
diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
index eff34ded6305..7212da5e1076 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
 			return ret;
 	} else {
 #ifdef PHYS_PFN_OFFSET
-		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
 #endif
 	}
 
diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
index 055eb0b8e396..2d66d415b6c3 100644
--- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
+++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
@@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
 
 	sdev->dev = &pdev->dev;
 	/* The DMA bus has the memory mapped at 0 */
-	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
+	ret = attach_uniform_dma_pfn_offset(sdev->dev,
+					    PHYS_OFFSET >> PAGE_SHIFT);
+	if (ret)
+		return ret;
 
 	ret = sun6i_csi_resource_request(sdev, pdev);
 	if (ret)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 96d8cfb14a60..c89333b0a5fb 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
 }
 EXPORT_SYMBOL(of_io_request_and_map);
 
+static int attach_dma_pfn_offset_map(struct device *dev,
+				     struct device_node *node, int num_ranges)
+{
+	struct of_range_parser parser;
+	struct of_range range;
+	struct dma_pfn_offset_region *r;
+
+	r = devm_kcalloc(dev, num_ranges + 1,
+			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+	dev->dma_pfn_offset_map = r;
+	of_dma_range_parser_init(&parser, node);
+
+	/*
+	 * Record all info for DMA ranges array.  We could
+	 * just use the of_range struct, but if we did that it
+	 * would require more calculations for phys_to_dma and
+	 * dma_to_phys conversions.
+	 */
+	for_each_of_range(&parser, &range) {
+		r->cpu_start = range.cpu_addr;
+		r->cpu_end = r->cpu_start + range.size - 1;
+		r->dma_start = range.bus_addr;
+		r->dma_end = r->dma_start + range.size - 1;
+		r->pfn_offset = PFN_DOWN(range.cpu_addr)
+			- PFN_DOWN(range.bus_addr);
+		r++;
+	}
+	return 0;
+}
+
+
+
+/**
+ * attach_dma_pfn_offset - Assign scalar offset for all addresses.
+ * @dev:	device pointer; only needed for a corner case.
+ * @dma_pfn_offset:	offset to apply when converting from phys addr
+ *			to dma addr and vice versa.
+ *
+ * It returns -ENOMEM if out of memory, otherwise 0.
+ */
+int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
+{
+	struct dma_pfn_offset_region *r;
+
+	if (!dev)
+		return -ENODEV;
+
+	if (!pfn_offset)
+		return 0;
+
+	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
+			 GFP_KERNEL);
+	if (!r)
+		return -ENOMEM;
+
+	r->uniform_offset = true;
+	r->pfn_offset = pfn_offset;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
+
 /**
  * of_dma_get_range - Get DMA range info
  * @dev:	device pointer; only needed for a corner case.
@@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
  *	CPU addr (phys_addr_t)	: pna cells
  *	size			: nsize cells
  *
- * It returns -ENODEV if "dma-ranges" property was not found
+ * It returns -ENODEV if !dev or "dma-ranges" property was not found
  * for this device in DT.
  */
 int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
@@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 	bool found_dma_ranges = false;
 	struct of_range_parser parser;
 	struct of_range range;
+	phys_addr_t cpu_start = ~(phys_addr_t)0;
 	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
+	bool dma_multi_pfn_offset = false;
+	int num_ranges = 0;
+
+	if (!dev)
+		return -ENODEV;
 
 	while (node) {
 		ranges = of_get_property(node, "dma-ranges", &len);
@@ -977,11 +1047,10 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 		pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 			 range.bus_addr, range.cpu_addr, range.size);
 
-		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) {
-			pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node);
-			/* Don't error out as we'd break some existing DTs */
-			continue;
-		}
+		num_ranges++;
+		if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset)
+			dma_multi_pfn_offset = true;
+
 		dma_offset = range.cpu_addr - range.bus_addr;
 
 		/* Take lower and upper limits */
@@ -989,6 +1058,8 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 			dma_start = range.bus_addr;
 		if (range.bus_addr + range.size > dma_end)
 			dma_end = range.bus_addr + range.size;
+		if (range.cpu_addr < cpu_start)
+			cpu_start = range.cpu_addr;
 	}
 
 	if (dma_start >= dma_end) {
@@ -998,9 +1069,17 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr,
 		goto out;
 	}
 
+	if (dma_multi_pfn_offset)
+		ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
+	else if (dma_offset)
+		ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(dma_offset));
+
+	if (ret)
+		goto out;
+
 	*dma_addr = dma_start;
 	*size = dma_end - dma_start;
-	*paddr = dma_start + dma_offset;
+	*paddr = cpu_start;
 
 	pr_debug("final: dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
 		 *dma_addr, *paddr, *size);
diff --git a/drivers/of/device.c b/drivers/of/device.c
index ef6a741f9f0b..91c50f40a82e 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -91,7 +91,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	u64 dma_addr, paddr, size = 0;
 	int ret;
 	bool coherent;
-	unsigned long offset;
 	const struct iommu_ops *iommu;
 	u64 mask, end;
 
@@ -105,10 +104,8 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 		if (!force_dma)
 			return ret == -ENODEV ? 0 : ret;
 
-		dma_addr = offset = 0;
+		dma_addr = 0;
 	} else {
-		offset = PFN_DOWN(paddr - dma_addr);
-
 		/*
 		 * Add a work around to treat the size as mask + 1 in case
 		 * it is defined in DT as a mask.
@@ -123,7 +120,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 			dev_err(dev, "Adjusted size 0x%llx invalid\n", size);
 			return -EINVAL;
 		}
-		dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
 	}
 
 	/*
@@ -142,8 +138,6 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
 	else if (!size)
 		size = 1ULL << 32;
 
-	dev->dma_pfn_offset = offset;
-
 	/*
 	 * Limit coherent and dma mask based on size and default mask
 	 * set by the driver.
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index e12a54e67588..e49648f77261 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -518,7 +518,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc,
 	/* Initialise vdev subdevice */
 	snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index);
 	rvdev->dev.parent = rproc->dev.parent;
-	rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset;
+	rvdev->dev.dma_pfn_offset_map = rproc->dev.parent->dma_pfn_offset_map;
 	rvdev->dev.release = rproc_rvdev_release;
 	dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name);
 	dev_set_drvdata(&rvdev->dev, rvdev);
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
index daf5f244f93b..29217e6e4153 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_hw.c
@@ -171,8 +171,11 @@ int cedrus_hw_probe(struct cedrus_dev *dev)
 	 */
 
 #ifdef PHYS_PFN_OFFSET
-	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET))
-		dev->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+	if (!(variant->quirks & CEDRUS_QUIRK_NO_DMA_OFFSET)) {
+		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
+		if (ret)
+			return ret;
+	}
 #endif
 
 	ret = of_reserved_mem_device_init(dev->dev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 6197938dcc2d..071856000428 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1956,10 +1956,10 @@ int usb_set_configuration(struct usb_device *dev, int configuration)
 		intf->dev.groups = usb_interface_groups;
 		/*
 		 * Please refer to usb_alloc_dev() to see why we set
-		 * dma_mask and dma_pfn_offset.
+		 * dma_mask and dma_pfn_offset_map.
 		 */
 		intf->dev.dma_mask = dev->dev.dma_mask;
-		intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
+		intf->dev.dma_pfn_offset_map = dev->dev.dma_pfn_offset_map;
 		INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
 		intf->minor = -1;
 		device_initialize(&intf->dev);
diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
index f16c26dc079d..3fbc0c06ce9c 100644
--- a/drivers/usb/core/usb.c
+++ b/drivers/usb/core/usb.c
@@ -611,7 +611,7 @@ struct usb_device *usb_alloc_dev(struct usb_device *parent,
 	 * mask for the entire HCD, so don't do that.
 	 */
 	dev->dev.dma_mask = bus->sysdev->dma_mask;
-	dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
+	dev->dev.dma_pfn_offset_map = bus->sysdev->dma_pfn_offset_map;
 	set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
 	dev->state = USB_STATE_ATTACHED;
 	dev->lpm_disable_count = 1;
diff --git a/include/linux/device.h b/include/linux/device.h
index ac8e37cd716a..9c079643ff54 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -492,7 +492,7 @@ struct dev_links_info {
  * 		such descriptors.
  * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller
  *		DMA limit than the device itself supports.
- * @dma_pfn_offset: offset of DMA memory range relatively of RAM
+ * @dma_pfn_offset_map: offset map for DMA memory range relatively of RAM
  * @dma_parms:	A low level driver may set these to teach IOMMU code about
  * 		segment limitations.
  * @dma_pools:	Dma pools (if dma'ble device).
@@ -577,7 +577,7 @@ struct device {
 					     64 bit addresses for consistent
 					     allocations such descriptors. */
 	u64		bus_dma_limit;	/* upstream dma constraint */
-	unsigned long	dma_pfn_offset;
+	struct dma_pfn_offset_region *dma_pfn_offset_map;
 
 	struct device_dma_parameters *dma_parms;
 
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 24b8684aa21d..3c3363a3925e 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -15,14 +15,26 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
 {
 	dma_addr_t dev_addr = (dma_addr_t)paddr;
 
-	return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset
+			= dma_pfn_offset_from_phys_addr(dev, paddr);
+
+		dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
+	}
+	return dev_addr;
 }
 
 static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr)
 {
 	phys_addr_t paddr = (phys_addr_t)dev_addr;
 
-	return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);
+	if (dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset
+			= dma_pfn_offset_from_dma_addr(dev, dev_addr);
+
+		paddr += ((phys_addr_t)dma_pfn_offset << PAGE_SHIFT);
+	}
+	return paddr;
 }
 #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 330ad58fbf4d..e0c2ec07c00a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -256,6 +256,46 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 size_t dma_direct_max_mapping_size(struct device *dev);
 
 #ifdef CONFIG_HAS_DMA
+struct dma_pfn_offset_region {
+	bool		uniform_offset;
+	phys_addr_t	cpu_start;
+	phys_addr_t	cpu_end;
+	dma_addr_t	dma_start;
+	dma_addr_t	dma_end;
+	unsigned long	pfn_offset;
+};
+
+int attach_uniform_dma_pfn_offset(struct device *dev,
+				  unsigned long dma_pfn_offset);
+
+static inline unsigned long dma_pfn_offset_from_dma_addr(struct device *dev,
+							 dma_addr_t dma_addr)
+{
+	const struct dma_pfn_offset_region *m = dev->dma_pfn_offset_map;
+
+	if (m->uniform_offset)
+		return m->pfn_offset;
+
+	for (; m->cpu_end; m++)
+		if (dma_addr >= m->dma_start && dma_addr <= m->dma_end)
+			return m->pfn_offset;
+	return 0;
+}
+
+static inline unsigned long dma_pfn_offset_from_phys_addr(struct device *dev,
+							  phys_addr_t paddr)
+{
+	const struct dma_pfn_offset_region *m = dev->dma_pfn_offset_map;
+
+	if (m->uniform_offset)
+		return m->pfn_offset;
+
+	for (; m->cpu_end; m++)
+		if (paddr >= m->cpu_start && paddr <= m->cpu_end)
+			return m->pfn_offset;
+	return 0;
+}
+
 #include <asm/dma-mapping.h>
 
 static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
@@ -463,6 +503,11 @@ u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
 unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
+static inline int attach_uniform_dma_pfn_offset(struct device *dev,
+						unsigned long dma_pfn_offset)
+{
+	return -EIO;
+}
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
 		enum dma_data_direction dir, unsigned long attrs)
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 2a0c4985f38e..20fc48d0b9d7 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -31,10 +31,13 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 static inline dma_addr_t dma_get_device_base(struct device *dev,
 					     struct dma_coherent_mem * mem)
 {
-	if (mem->use_dev_dma_pfn_offset)
-		return (mem->pfn_base - dev->dma_pfn_offset) << PAGE_SHIFT;
-	else
-		return mem->device_base;
+	if (mem->use_dev_dma_pfn_offset && dev->dma_pfn_offset_map) {
+		unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr
+			(dev, PFN_PHYS(mem->pfn_base));
+
+		return (mem->pfn_base - dma_pfn_offset) << PAGE_SHIFT;
+	}
+	return mem->device_base;
 }
 
 static int dma_init_coherent_memory(phys_addr_t phys_addr,
-- 
2.17.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 10/13] PCI: brcmstb: Set internal memory viewport sizes
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

BrcmSTB PCIe controllers are intimately connected to the memory
controller(s) on the SOC.  There is a "viewport" for each memory controller
that allows inbound accesses to CPU memory.  Each viewport's size must be
set to a power of two, and that size must be equal to or larger than the
amount of memory each controller supports.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 67 ++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index fa356bc149c3..338e9ed44230 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -55,6 +55,8 @@
 #define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK	0x300000
 #define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_128		0x0
 #define  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK		0xf8000000
+#define  PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK		0x07c00000
+#define  PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK		0x0000001f
 
 #define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LO		0x400c
 #define PCIE_MEM_WIN0_LO(win)	\
@@ -152,6 +154,7 @@
 #define SSC_STATUS_OFFSET		0x1
 #define SSC_STATUS_SSC_MASK		0x400
 #define SSC_STATUS_PLL_LOCK_MASK	0x800
+#define PCIE_BRCM_MAX_MEMC		3
 
 /* Rescal registers */
 #define PCIE_DVT_PMU_PCIE_PHY_CTRL				0xc700
@@ -261,6 +264,8 @@ struct brcm_pcie {
 	const int		*reg_field_info;
 	enum pcie_type		type;
 	struct reset_control	*rescal;
+	int			num_memc;
+	u64			memc_size[PCIE_BRCM_MAX_MEMC];
 };
 
 /*
@@ -717,22 +722,40 @@ static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
 							u64 *rc_bar2_offset)
 {
 	struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
-	struct device *dev = pcie->dev;
 	struct resource_entry *entry;
+	struct device *dev = pcie->dev;
+	u64 lowest_pcie_addr = ~(u64)0;
+	int ret, i = 0;
+	u64 size = 0;
 
-	entry = resource_list_first_type(&bridge->dma_ranges, IORESOURCE_MEM);
-	if (!entry)
-		return -ENODEV;
+	resource_list_for_each_entry(entry, &bridge->dma_ranges) {
+		u64 pcie_beg = entry->res->start - entry->offset;
 
+		size += entry->res->end - entry->res->start + 1;
+		if (pcie_beg < lowest_pcie_addr)
+			lowest_pcie_addr = pcie_beg;
+	}
 
-	/*
-	 * The controller expects the inbound window offset to be calculated as
-	 * the difference between PCIe's address space and CPU's. The offset
-	 * provided by the firmware is calculated the opposite way, so we
-	 * negate it.
-	 */
-	*rc_bar2_offset = -entry->offset;
-	*rc_bar2_size = 1ULL << fls64(entry->res->end - entry->res->start);
+	ret = of_property_read_variable_u64_array(
+		pcie->np, "brcm,scb-sizes", pcie->memc_size, 1,
+		PCIE_BRCM_MAX_MEMC);
+
+	if (ret <= 0) {
+		/* Make an educated guess */
+		pcie->num_memc = 1;
+		pcie->memc_size[0] = 1 << fls64(size - 1);
+	} else {
+		pcie->num_memc = ret;
+	}
+
+	/* Each memc is viewed through a "port" that is a power of 2 */
+	for (i = 0, size = 0; i < pcie->num_memc; i++)
+		size += pcie->memc_size[i];
+
+	/* System memory starts at this address in PCIe-space */
+	*rc_bar2_offset = lowest_pcie_addr;
+	/* The sum of all memc views must also be a power of 2 */
+	*rc_bar2_size = 1ULL << fls64(size - 1);
 
 	/*
 	 * We validate the inbound memory view even though we should trust
@@ -784,12 +807,11 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	void __iomem *base = pcie->base;
 	struct device *dev = pcie->dev;
 	struct resource_entry *entry;
-	unsigned int scb_size_val;
 	bool ssc_good = false;
 	struct resource *res;
 	int num_out_wins = 0;
 	u16 nlw, cls, lnksta;
-	int i, ret;
+	int i, ret, memc;
 	u32 tmp, aspm_support;
 
 	/* Reset the bridge */
@@ -825,11 +847,20 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	writel(upper_32_bits(rc_bar2_offset),
 	       base + PCIE_MISC_RC_BAR2_CONFIG_HI);
 
-	scb_size_val = rc_bar2_size ?
-		       ilog2(rc_bar2_size) - 15 : 0xf; /* 0xf is 1GB */
 	tmp = readl(base + PCIE_MISC_MISC_CTRL);
-	u32p_replace_bits(&tmp, scb_size_val,
-			  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK);
+	for (memc = 0; memc < pcie->num_memc; memc++) {
+		u32 scb_size_val = ilog2(pcie->memc_size[memc]) - 15;
+
+		if (memc == 0)
+			u32p_replace_bits(&tmp, scb_size_val,
+					  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK);
+		else if (memc == 1)
+			u32p_replace_bits(&tmp, scb_size_val,
+					  PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK);
+		else if (memc == 2)
+			u32p_replace_bits(&tmp, scb_size_val,
+					  PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK);
+	}
 	writel(tmp, base + PCIE_MISC_MISC_CTRL);
 
 	/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 10/13] PCI: brcmstb: Set internal memory viewport sizes
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

BrcmSTB PCIe controllers are intimately connected to the memory
controller(s) on the SOC.  There is a "viewport" for each memory controller
that allows inbound accesses to CPU memory.  Each viewport's size must be
set to a power of two, and that size must be equal to or larger than the
amount of memory each controller supports.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 67 ++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index fa356bc149c3..338e9ed44230 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -55,6 +55,8 @@
 #define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK	0x300000
 #define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_128		0x0
 #define  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK		0xf8000000
+#define  PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK		0x07c00000
+#define  PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK		0x0000001f
 
 #define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_LO		0x400c
 #define PCIE_MEM_WIN0_LO(win)	\
@@ -152,6 +154,7 @@
 #define SSC_STATUS_OFFSET		0x1
 #define SSC_STATUS_SSC_MASK		0x400
 #define SSC_STATUS_PLL_LOCK_MASK	0x800
+#define PCIE_BRCM_MAX_MEMC		3
 
 /* Rescal registers */
 #define PCIE_DVT_PMU_PCIE_PHY_CTRL				0xc700
@@ -261,6 +264,8 @@ struct brcm_pcie {
 	const int		*reg_field_info;
 	enum pcie_type		type;
 	struct reset_control	*rescal;
+	int			num_memc;
+	u64			memc_size[PCIE_BRCM_MAX_MEMC];
 };
 
 /*
@@ -717,22 +722,40 @@ static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
 							u64 *rc_bar2_offset)
 {
 	struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
-	struct device *dev = pcie->dev;
 	struct resource_entry *entry;
+	struct device *dev = pcie->dev;
+	u64 lowest_pcie_addr = ~(u64)0;
+	int ret, i = 0;
+	u64 size = 0;
 
-	entry = resource_list_first_type(&bridge->dma_ranges, IORESOURCE_MEM);
-	if (!entry)
-		return -ENODEV;
+	resource_list_for_each_entry(entry, &bridge->dma_ranges) {
+		u64 pcie_beg = entry->res->start - entry->offset;
 
+		size += entry->res->end - entry->res->start + 1;
+		if (pcie_beg < lowest_pcie_addr)
+			lowest_pcie_addr = pcie_beg;
+	}
 
-	/*
-	 * The controller expects the inbound window offset to be calculated as
-	 * the difference between PCIe's address space and CPU's. The offset
-	 * provided by the firmware is calculated the opposite way, so we
-	 * negate it.
-	 */
-	*rc_bar2_offset = -entry->offset;
-	*rc_bar2_size = 1ULL << fls64(entry->res->end - entry->res->start);
+	ret = of_property_read_variable_u64_array(
+		pcie->np, "brcm,scb-sizes", pcie->memc_size, 1,
+		PCIE_BRCM_MAX_MEMC);
+
+	if (ret <= 0) {
+		/* Make an educated guess */
+		pcie->num_memc = 1;
+		pcie->memc_size[0] = 1 << fls64(size - 1);
+	} else {
+		pcie->num_memc = ret;
+	}
+
+	/* Each memc is viewed through a "port" that is a power of 2 */
+	for (i = 0, size = 0; i < pcie->num_memc; i++)
+		size += pcie->memc_size[i];
+
+	/* System memory starts at this address in PCIe-space */
+	*rc_bar2_offset = lowest_pcie_addr;
+	/* The sum of all memc views must also be a power of 2 */
+	*rc_bar2_size = 1ULL << fls64(size - 1);
 
 	/*
 	 * We validate the inbound memory view even though we should trust
@@ -784,12 +807,11 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	void __iomem *base = pcie->base;
 	struct device *dev = pcie->dev;
 	struct resource_entry *entry;
-	unsigned int scb_size_val;
 	bool ssc_good = false;
 	struct resource *res;
 	int num_out_wins = 0;
 	u16 nlw, cls, lnksta;
-	int i, ret;
+	int i, ret, memc;
 	u32 tmp, aspm_support;
 
 	/* Reset the bridge */
@@ -825,11 +847,20 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	writel(upper_32_bits(rc_bar2_offset),
 	       base + PCIE_MISC_RC_BAR2_CONFIG_HI);
 
-	scb_size_val = rc_bar2_size ?
-		       ilog2(rc_bar2_size) - 15 : 0xf; /* 0xf is 1GB */
 	tmp = readl(base + PCIE_MISC_MISC_CTRL);
-	u32p_replace_bits(&tmp, scb_size_val,
-			  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK);
+	for (memc = 0; memc < pcie->num_memc; memc++) {
+		u32 scb_size_val = ilog2(pcie->memc_size[memc]) - 15;
+
+		if (memc == 0)
+			u32p_replace_bits(&tmp, scb_size_val,
+					  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK);
+		else if (memc == 1)
+			u32p_replace_bits(&tmp, scb_size_val,
+					  PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK);
+		else if (memc == 2)
+			u32p_replace_bits(&tmp, scb_size_val,
+					  PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK);
+	}
 	writel(tmp, base + PCIE_MISC_MISC_CTRL);
 
 	/*
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 11/13] PCI: brcmstb: Accommodate MSI for older chips
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

Older BrcmSTB chips do not have a separate register for MSI interrupts; the
MSIs are in a register that also contains unrelated interrupts.  In
addition, the interrupts lie in bits [31..24] for these legacy chips.  This
commit provides common code for both legacy and non-legacy MSI interrupt
registers.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 72 +++++++++++++++++++--------
 1 file changed, 52 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 338e9ed44230..9930419e3ac2 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -80,7 +80,8 @@
 #define PCIE_MISC_MSI_BAR_CONFIG_HI			0x4048
 
 #define PCIE_MISC_MSI_DATA_CONFIG			0x404c
-#define  PCIE_MISC_MSI_DATA_CONFIG_VAL			0xffe06540
+#define  PCIE_MISC_MSI_DATA_CONFIG_VAL_32		0xffe06540
+#define  PCIE_MISC_MSI_DATA_CONFIG_VAL_8		0xfff86540
 
 #define PCIE_MISC_PCIE_CTRL				0x4064
 #define  PCIE_MISC_PCIE_CTRL_PCIE_L23_REQUEST_MASK	0x1
@@ -92,6 +93,9 @@
 #define  PCIE_MISC_PCIE_STATUS_PCIE_PHYLINKUP_MASK	0x10
 #define  PCIE_MISC_PCIE_STATUS_PCIE_LINK_IN_L23_MASK	0x40
 
+#define PCIE_MISC_REVISION				0x406c
+#define  BRCM_PCIE_HW_REV_33				0x0303
+
 #define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT		0x4070
 #define  PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_LIMIT_MASK	0xfff00000
 #define  PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_BASE_MASK	0xfff0
@@ -112,10 +116,14 @@
 #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK	0x2
 #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK		0x08000000
 
-#define PCIE_MSI_INTR2_STATUS				0x4500
-#define PCIE_MSI_INTR2_CLR				0x4508
-#define PCIE_MSI_INTR2_MASK_SET				0x4510
-#define PCIE_MSI_INTR2_MASK_CLR				0x4514
+
+#define PCIE_INTR2_CPU_BASE		0x4300
+#define PCIE_MSI_INTR2_BASE		0x4500
+/* Offsets from PCIE_INTR2_CPU_BASE and PCIE_MSI_INTR2_BASE */
+#define  MSI_INT_STATUS			0x0
+#define  MSI_INT_CLR			0x8
+#define  MSI_INT_MASK_SET		0x10
+#define  MSI_INT_MASK_CLR		0x14
 
 #define PCIE_EXT_CFG_DATA				0x8000
 
@@ -130,6 +138,8 @@
 /* PCIe parameters */
 #define BRCM_NUM_PCIE_OUT_WINS		0x4
 #define BRCM_INT_PCI_MSI_NR		32
+#define BRCM_INT_PCI_MSI_LEGACY_NR	8
+#define BRCM_INT_PCI_MSI_SHIFT		0
 
 /* MSI target adresses */
 #define BRCM_MSI_TARGET_ADDR_LT_4GB	0x0fffffffcULL
@@ -247,6 +257,12 @@ struct brcm_msi {
 	int			irq;
 	/* used indicates which MSI interrupts have been alloc'd */
 	unsigned long		used;
+	bool			legacy;
+	/* Some chips have MSIs in bits [31..24] of a shared register. */
+	int			legacy_shift;
+	int			nr; /* No. of MSI available, depends on chip */
+	/* This is the base pointer for interrupt status/set/clr regs */
+	void __iomem		*intr_base;
 };
 
 /* Internal PCIe Host Controller Information.*/
@@ -266,6 +282,7 @@ struct brcm_pcie {
 	struct reset_control	*rescal;
 	int			num_memc;
 	u64			memc_size[PCIE_BRCM_MAX_MEMC];
+	u32			hw_rev;
 };
 
 /*
@@ -456,8 +473,10 @@ static void brcm_pcie_msi_isr(struct irq_desc *desc)
 	msi = irq_desc_get_handler_data(desc);
 	dev = msi->dev;
 
-	status = readl(msi->base + PCIE_MSI_INTR2_STATUS);
-	for_each_set_bit(bit, &status, BRCM_INT_PCI_MSI_NR) {
+	status = readl(msi->intr_base + MSI_INT_STATUS);
+	status >>= msi->legacy_shift;
+
+	for_each_set_bit(bit, &status, msi->nr) {
 		virq = irq_find_mapping(msi->inner_domain, bit);
 		if (virq)
 			generic_handle_irq(virq);
@@ -474,7 +493,7 @@ static void brcm_msi_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 
 	msg->address_lo = lower_32_bits(msi->target_addr);
 	msg->address_hi = upper_32_bits(msi->target_addr);
-	msg->data = (0xffff & PCIE_MISC_MSI_DATA_CONFIG_VAL) | data->hwirq;
+	msg->data = (0xffff & PCIE_MISC_MSI_DATA_CONFIG_VAL_32) | data->hwirq;
 }
 
 static int brcm_msi_set_affinity(struct irq_data *irq_data,
@@ -486,8 +505,9 @@ static int brcm_msi_set_affinity(struct irq_data *irq_data,
 static void brcm_msi_ack_irq(struct irq_data *data)
 {
 	struct brcm_msi *msi = irq_data_get_irq_chip_data(data);
+	const int shift_amt = data->hwirq + msi->legacy_shift;
 
-	writel(1 << data->hwirq, msi->base + PCIE_MSI_INTR2_CLR);
+	writel(1 << shift_amt, msi->intr_base + MSI_INT_CLR);
 }
 
 
@@ -503,7 +523,7 @@ static int brcm_msi_alloc(struct brcm_msi *msi)
 	int hwirq;
 
 	mutex_lock(&msi->lock);
-	hwirq = bitmap_find_free_region(&msi->used, BRCM_INT_PCI_MSI_NR, 0);
+	hwirq = bitmap_find_free_region(&msi->used, msi->nr, 0);
 	mutex_unlock(&msi->lock);
 
 	return hwirq;
@@ -552,7 +572,7 @@ static int brcm_allocate_domains(struct brcm_msi *msi)
 	struct fwnode_handle *fwnode = of_node_to_fwnode(msi->np);
 	struct device *dev = msi->dev;
 
-	msi->inner_domain = irq_domain_add_linear(NULL, BRCM_INT_PCI_MSI_NR,
+	msi->inner_domain = irq_domain_add_linear(NULL, msi->nr,
 						  &msi_domain_ops, msi);
 	if (!msi->inner_domain) {
 		dev_err(dev, "failed to create IRQ domain\n");
@@ -590,7 +610,10 @@ static void brcm_msi_remove(struct brcm_pcie *pcie)
 
 static void brcm_msi_set_regs(struct brcm_msi *msi)
 {
-	writel(0xffffffff, msi->base + PCIE_MSI_INTR2_MASK_CLR);
+	u32 val = __GENMASK(31, msi->legacy_shift);
+
+	writel(val, msi->intr_base + MSI_INT_MASK_CLR);
+	writel(val, msi->intr_base + MSI_INT_CLR);
 
 	/*
 	 * The 0 bit of PCIE_MISC_MSI_BAR_CONFIG_LO is repurposed to MSI
@@ -601,8 +624,10 @@ static void brcm_msi_set_regs(struct brcm_msi *msi)
 	writel(upper_32_bits(msi->target_addr),
 	       msi->base + PCIE_MISC_MSI_BAR_CONFIG_HI);
 
-	writel(PCIE_MISC_MSI_DATA_CONFIG_VAL,
-	       msi->base + PCIE_MISC_MSI_DATA_CONFIG);
+	val = msi->legacy ? PCIE_MISC_MSI_DATA_CONFIG_VAL_8 :
+		PCIE_MISC_MSI_DATA_CONFIG_VAL_32;
+
+	writel(val, msi->base + PCIE_MISC_MSI_DATA_CONFIG);
 }
 
 static int brcm_pcie_enable_msi(struct brcm_pcie *pcie)
@@ -627,6 +652,17 @@ static int brcm_pcie_enable_msi(struct brcm_pcie *pcie)
 	msi->np = pcie->np;
 	msi->target_addr = pcie->msi_target_addr;
 	msi->irq = irq;
+	msi->legacy = pcie->hw_rev < BRCM_PCIE_HW_REV_33;
+
+	if (msi->legacy) {
+		msi->intr_base = msi->base + PCIE_INTR2_CPU_BASE;
+		msi->nr = BRCM_INT_PCI_MSI_LEGACY_NR;
+		msi->legacy_shift = 24;
+	} else {
+		msi->intr_base = msi->base + PCIE_MSI_INTR2_BASE;
+		msi->nr = BRCM_INT_PCI_MSI_NR;
+		msi->legacy_shift = 0;
+	}
 
 	ret = brcm_allocate_domains(msi);
 	if (ret)
@@ -885,12 +921,6 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	tmp &= ~PCIE_MISC_RC_BAR3_CONFIG_LO_SIZE_MASK;
 	writel(tmp, base + PCIE_MISC_RC_BAR3_CONFIG_LO);
 
-	/* Mask all interrupts since we are not handling any yet */
-	writel(0xffffffff, pcie->base + PCIE_MSI_INTR2_MASK_SET);
-
-	/* clear any interrupts we find on boot */
-	writel(0xffffffff, pcie->base + PCIE_MSI_INTR2_CLR);
-
 	if (pcie->gen)
 		brcm_pcie_set_gen(pcie, pcie->gen);
 
@@ -1220,6 +1250,8 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 	if (ret)
 		goto fail;
 
+	pcie->hw_rev = readl(pcie->base + PCIE_MISC_REVISION);
+
 	msi_np = of_parse_phandle(pcie->np, "msi-parent", 0);
 	if (pci_msi_enabled() && msi_np == pcie->np) {
 		ret = brcm_pcie_enable_msi(pcie);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 11/13] PCI: brcmstb: Accommodate MSI for older chips
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Jim Quinlan, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

Older BrcmSTB chips do not have a separate register for MSI interrupts; the
MSIs are in a register that also contains unrelated interrupts.  In
addition, the interrupts lie in bits [31..24] for these legacy chips.  This
commit provides common code for both legacy and non-legacy MSI interrupt
registers.

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 72 +++++++++++++++++++--------
 1 file changed, 52 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 338e9ed44230..9930419e3ac2 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -80,7 +80,8 @@
 #define PCIE_MISC_MSI_BAR_CONFIG_HI			0x4048
 
 #define PCIE_MISC_MSI_DATA_CONFIG			0x404c
-#define  PCIE_MISC_MSI_DATA_CONFIG_VAL			0xffe06540
+#define  PCIE_MISC_MSI_DATA_CONFIG_VAL_32		0xffe06540
+#define  PCIE_MISC_MSI_DATA_CONFIG_VAL_8		0xfff86540
 
 #define PCIE_MISC_PCIE_CTRL				0x4064
 #define  PCIE_MISC_PCIE_CTRL_PCIE_L23_REQUEST_MASK	0x1
@@ -92,6 +93,9 @@
 #define  PCIE_MISC_PCIE_STATUS_PCIE_PHYLINKUP_MASK	0x10
 #define  PCIE_MISC_PCIE_STATUS_PCIE_LINK_IN_L23_MASK	0x40
 
+#define PCIE_MISC_REVISION				0x406c
+#define  BRCM_PCIE_HW_REV_33				0x0303
+
 #define PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT		0x4070
 #define  PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_LIMIT_MASK	0xfff00000
 #define  PCIE_MISC_CPU_2_PCIE_MEM_WIN0_BASE_LIMIT_BASE_MASK	0xfff0
@@ -112,10 +116,14 @@
 #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK	0x2
 #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK		0x08000000
 
-#define PCIE_MSI_INTR2_STATUS				0x4500
-#define PCIE_MSI_INTR2_CLR				0x4508
-#define PCIE_MSI_INTR2_MASK_SET				0x4510
-#define PCIE_MSI_INTR2_MASK_CLR				0x4514
+
+#define PCIE_INTR2_CPU_BASE		0x4300
+#define PCIE_MSI_INTR2_BASE		0x4500
+/* Offsets from PCIE_INTR2_CPU_BASE and PCIE_MSI_INTR2_BASE */
+#define  MSI_INT_STATUS			0x0
+#define  MSI_INT_CLR			0x8
+#define  MSI_INT_MASK_SET		0x10
+#define  MSI_INT_MASK_CLR		0x14
 
 #define PCIE_EXT_CFG_DATA				0x8000
 
@@ -130,6 +138,8 @@
 /* PCIe parameters */
 #define BRCM_NUM_PCIE_OUT_WINS		0x4
 #define BRCM_INT_PCI_MSI_NR		32
+#define BRCM_INT_PCI_MSI_LEGACY_NR	8
+#define BRCM_INT_PCI_MSI_SHIFT		0
 
 /* MSI target adresses */
 #define BRCM_MSI_TARGET_ADDR_LT_4GB	0x0fffffffcULL
@@ -247,6 +257,12 @@ struct brcm_msi {
 	int			irq;
 	/* used indicates which MSI interrupts have been alloc'd */
 	unsigned long		used;
+	bool			legacy;
+	/* Some chips have MSIs in bits [31..24] of a shared register. */
+	int			legacy_shift;
+	int			nr; /* No. of MSI available, depends on chip */
+	/* This is the base pointer for interrupt status/set/clr regs */
+	void __iomem		*intr_base;
 };
 
 /* Internal PCIe Host Controller Information.*/
@@ -266,6 +282,7 @@ struct brcm_pcie {
 	struct reset_control	*rescal;
 	int			num_memc;
 	u64			memc_size[PCIE_BRCM_MAX_MEMC];
+	u32			hw_rev;
 };
 
 /*
@@ -456,8 +473,10 @@ static void brcm_pcie_msi_isr(struct irq_desc *desc)
 	msi = irq_desc_get_handler_data(desc);
 	dev = msi->dev;
 
-	status = readl(msi->base + PCIE_MSI_INTR2_STATUS);
-	for_each_set_bit(bit, &status, BRCM_INT_PCI_MSI_NR) {
+	status = readl(msi->intr_base + MSI_INT_STATUS);
+	status >>= msi->legacy_shift;
+
+	for_each_set_bit(bit, &status, msi->nr) {
 		virq = irq_find_mapping(msi->inner_domain, bit);
 		if (virq)
 			generic_handle_irq(virq);
@@ -474,7 +493,7 @@ static void brcm_msi_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 
 	msg->address_lo = lower_32_bits(msi->target_addr);
 	msg->address_hi = upper_32_bits(msi->target_addr);
-	msg->data = (0xffff & PCIE_MISC_MSI_DATA_CONFIG_VAL) | data->hwirq;
+	msg->data = (0xffff & PCIE_MISC_MSI_DATA_CONFIG_VAL_32) | data->hwirq;
 }
 
 static int brcm_msi_set_affinity(struct irq_data *irq_data,
@@ -486,8 +505,9 @@ static int brcm_msi_set_affinity(struct irq_data *irq_data,
 static void brcm_msi_ack_irq(struct irq_data *data)
 {
 	struct brcm_msi *msi = irq_data_get_irq_chip_data(data);
+	const int shift_amt = data->hwirq + msi->legacy_shift;
 
-	writel(1 << data->hwirq, msi->base + PCIE_MSI_INTR2_CLR);
+	writel(1 << shift_amt, msi->intr_base + MSI_INT_CLR);
 }
 
 
@@ -503,7 +523,7 @@ static int brcm_msi_alloc(struct brcm_msi *msi)
 	int hwirq;
 
 	mutex_lock(&msi->lock);
-	hwirq = bitmap_find_free_region(&msi->used, BRCM_INT_PCI_MSI_NR, 0);
+	hwirq = bitmap_find_free_region(&msi->used, msi->nr, 0);
 	mutex_unlock(&msi->lock);
 
 	return hwirq;
@@ -552,7 +572,7 @@ static int brcm_allocate_domains(struct brcm_msi *msi)
 	struct fwnode_handle *fwnode = of_node_to_fwnode(msi->np);
 	struct device *dev = msi->dev;
 
-	msi->inner_domain = irq_domain_add_linear(NULL, BRCM_INT_PCI_MSI_NR,
+	msi->inner_domain = irq_domain_add_linear(NULL, msi->nr,
 						  &msi_domain_ops, msi);
 	if (!msi->inner_domain) {
 		dev_err(dev, "failed to create IRQ domain\n");
@@ -590,7 +610,10 @@ static void brcm_msi_remove(struct brcm_pcie *pcie)
 
 static void brcm_msi_set_regs(struct brcm_msi *msi)
 {
-	writel(0xffffffff, msi->base + PCIE_MSI_INTR2_MASK_CLR);
+	u32 val = __GENMASK(31, msi->legacy_shift);
+
+	writel(val, msi->intr_base + MSI_INT_MASK_CLR);
+	writel(val, msi->intr_base + MSI_INT_CLR);
 
 	/*
 	 * The 0 bit of PCIE_MISC_MSI_BAR_CONFIG_LO is repurposed to MSI
@@ -601,8 +624,10 @@ static void brcm_msi_set_regs(struct brcm_msi *msi)
 	writel(upper_32_bits(msi->target_addr),
 	       msi->base + PCIE_MISC_MSI_BAR_CONFIG_HI);
 
-	writel(PCIE_MISC_MSI_DATA_CONFIG_VAL,
-	       msi->base + PCIE_MISC_MSI_DATA_CONFIG);
+	val = msi->legacy ? PCIE_MISC_MSI_DATA_CONFIG_VAL_8 :
+		PCIE_MISC_MSI_DATA_CONFIG_VAL_32;
+
+	writel(val, msi->base + PCIE_MISC_MSI_DATA_CONFIG);
 }
 
 static int brcm_pcie_enable_msi(struct brcm_pcie *pcie)
@@ -627,6 +652,17 @@ static int brcm_pcie_enable_msi(struct brcm_pcie *pcie)
 	msi->np = pcie->np;
 	msi->target_addr = pcie->msi_target_addr;
 	msi->irq = irq;
+	msi->legacy = pcie->hw_rev < BRCM_PCIE_HW_REV_33;
+
+	if (msi->legacy) {
+		msi->intr_base = msi->base + PCIE_INTR2_CPU_BASE;
+		msi->nr = BRCM_INT_PCI_MSI_LEGACY_NR;
+		msi->legacy_shift = 24;
+	} else {
+		msi->intr_base = msi->base + PCIE_MSI_INTR2_BASE;
+		msi->nr = BRCM_INT_PCI_MSI_NR;
+		msi->legacy_shift = 0;
+	}
 
 	ret = brcm_allocate_domains(msi);
 	if (ret)
@@ -885,12 +921,6 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	tmp &= ~PCIE_MISC_RC_BAR3_CONFIG_LO_SIZE_MASK;
 	writel(tmp, base + PCIE_MISC_RC_BAR3_CONFIG_LO);
 
-	/* Mask all interrupts since we are not handling any yet */
-	writel(0xffffffff, pcie->base + PCIE_MSI_INTR2_MASK_SET);
-
-	/* clear any interrupts we find on boot */
-	writel(0xffffffff, pcie->base + PCIE_MSI_INTR2_CLR);
-
 	if (pcie->gen)
 		brcm_pcie_set_gen(pcie, pcie->gen);
 
@@ -1220,6 +1250,8 @@ static int brcm_pcie_probe(struct platform_device *pdev)
 	if (ret)
 		goto fail;
 
+	pcie->hw_rev = readl(pcie->base + PCIE_MISC_REVISION);
+
 	msi_np = of_parse_phandle(pcie->np, "msi-parent", 0);
 	if (pci_msi_enabled() && msi_np == pcie->np) {
 		ret = brcm_pcie_enable_msi(pcie);
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 12/13] PCI: brcmstb: Set bus max burst size by chip type
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Jim Quinlan, Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

From: Jim Quinlan <jquinlan@broadcom.com>

The proper value of the parameter SCB_MAX_BURST_SIZE varies per chip.  The
2711 family requires 128B whereas other devices can employ 512.  The
assignment is complicated by the fact that the values for this two-bit
field have different meanings;

  Value   Type_Generic    Type_7278

     00       Reserved         128B
     01           128B         256B
     10           256B         512B
     11           512B     Reserved

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 9930419e3ac2..131cf0a51398 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -53,7 +53,7 @@
 #define  PCIE_MISC_MISC_CTRL_SCB_ACCESS_EN_MASK		0x1000
 #define  PCIE_MISC_MISC_CTRL_CFG_READ_UR_MODE_MASK	0x2000
 #define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK	0x300000
-#define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_128		0x0
+
 #define  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK		0xf8000000
 #define  PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK		0x07c00000
 #define  PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK		0x0000001f
@@ -848,7 +848,7 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	int num_out_wins = 0;
 	u16 nlw, cls, lnksta;
 	int i, ret, memc;
-	u32 tmp, aspm_support;
+	u32 tmp, burst, aspm_support;
 
 	/* Reset the bridge */
 	brcm_pcie_bridge_sw_init_set(pcie, 1);
@@ -864,10 +864,22 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	/* Wait for SerDes to be stable */
 	usleep_range(100, 200);
 
+	/*
+	 * SCB_MAX_BURST_SIZE is a two bit field.  For GENERIC chips it
+	 * is encoded as 0=128, 1=256, 2=512, 3=Rsvd, for BCM7278 it
+	 * is encoded as 0=Rsvd, 1=128, 2=256, 3=512.
+	 */
+	if (pcie->type == BCM2711)
+		burst = 0x0; /* 128B */
+	else if (pcie->type == BCM7278)
+		burst = 0x3; /* 512 bytes */
+	else
+		burst = 0x2; /* 512 bytes */
+
 	/* Set SCB_MAX_BURST_SIZE, CFG_READ_UR_MODE, SCB_ACCESS_EN */
 	u32p_replace_bits(&tmp, 1, PCIE_MISC_MISC_CTRL_SCB_ACCESS_EN_MASK);
 	u32p_replace_bits(&tmp, 1, PCIE_MISC_MISC_CTRL_CFG_READ_UR_MODE_MASK);
-	u32p_replace_bits(&tmp, PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_128,
+	u32p_replace_bits(&tmp, burst,
 			  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK);
 	writel(tmp, base + PCIE_MISC_MISC_CTRL);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 12/13] PCI: brcmstb: Set bus max burst size by chip type
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Jim Quinlan, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

From: Jim Quinlan <jquinlan@broadcom.com>

The proper value of the parameter SCB_MAX_BURST_SIZE varies per chip.  The
2711 family requires 128B whereas other devices can employ 512.  The
assignment is complicated by the fact that the values for this two-bit
field have different meanings;

  Value   Type_Generic    Type_7278

     00       Reserved         128B
     01           128B         256B
     10           256B         512B
     11           512B     Reserved

Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 9930419e3ac2..131cf0a51398 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -53,7 +53,7 @@
 #define  PCIE_MISC_MISC_CTRL_SCB_ACCESS_EN_MASK		0x1000
 #define  PCIE_MISC_MISC_CTRL_CFG_READ_UR_MODE_MASK	0x2000
 #define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK	0x300000
-#define  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_128		0x0
+
 #define  PCIE_MISC_MISC_CTRL_SCB0_SIZE_MASK		0xf8000000
 #define  PCIE_MISC_MISC_CTRL_SCB1_SIZE_MASK		0x07c00000
 #define  PCIE_MISC_MISC_CTRL_SCB2_SIZE_MASK		0x0000001f
@@ -848,7 +848,7 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	int num_out_wins = 0;
 	u16 nlw, cls, lnksta;
 	int i, ret, memc;
-	u32 tmp, aspm_support;
+	u32 tmp, burst, aspm_support;
 
 	/* Reset the bridge */
 	brcm_pcie_bridge_sw_init_set(pcie, 1);
@@ -864,10 +864,22 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	/* Wait for SerDes to be stable */
 	usleep_range(100, 200);
 
+	/*
+	 * SCB_MAX_BURST_SIZE is a two bit field.  For GENERIC chips it
+	 * is encoded as 0=128, 1=256, 2=512, 3=Rsvd, for BCM7278 it
+	 * is encoded as 0=Rsvd, 1=128, 2=256, 3=512.
+	 */
+	if (pcie->type == BCM2711)
+		burst = 0x0; /* 128B */
+	else if (pcie->type == BCM7278)
+		burst = 0x3; /* 512 bytes */
+	else
+		burst = 0x2; /* 512 bytes */
+
 	/* Set SCB_MAX_BURST_SIZE, CFG_READ_UR_MODE, SCB_ACCESS_EN */
 	u32p_replace_bits(&tmp, 1, PCIE_MISC_MISC_CTRL_SCB_ACCESS_EN_MASK);
 	u32p_replace_bits(&tmp, 1, PCIE_MISC_MISC_CTRL_CFG_READ_UR_MODE_MASK);
-	u32p_replace_bits(&tmp, PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_128,
+	u32p_replace_bits(&tmp, burst,
 			  PCIE_MISC_MISC_CTRL_MAX_BURST_SIZE_MASK);
 	writel(tmp, base + PCIE_MISC_MISC_CTRL);
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 13/13] PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list
  2020-06-03 19:20 ` Jim Quinlan
@ 2020-06-03 19:20   ` Jim Quinlan
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

Now that the support is in place with previous commits, we add several
chips that use the BrcmSTB driver.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 131cf0a51398..22dbecb5403c 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -1189,6 +1189,10 @@ static int brcm_pcie_remove(struct platform_device *pdev)
 
 static const struct of_device_id brcm_pcie_match[] = {
 	{ .compatible = "brcm,bcm2711-pcie", .data = &bcm2711_cfg },
+	{ .compatible = "brcm,bcm7211-pcie", .data = &generic_cfg },
+	{ .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
+	{ .compatible = "brcm,bcm7216-pcie", .data = &bcm7278_cfg },
+	{ .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },
 	{},
 };
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 83+ messages in thread

* [PATCH v3 13/13] PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list
@ 2020-06-03 19:20   ` Jim Quinlan
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-03 19:20 UTC (permalink / raw)
  To: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, james.quinlan
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE

Now that the support is in place with previous commits, we add several
chips that use the BrcmSTB driver.

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index 131cf0a51398..22dbecb5403c 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -1189,6 +1189,10 @@ static int brcm_pcie_remove(struct platform_device *pdev)
 
 static const struct of_device_id brcm_pcie_match[] = {
 	{ .compatible = "brcm,bcm2711-pcie", .data = &bcm2711_cfg },
+	{ .compatible = "brcm,bcm7211-pcie", .data = &generic_cfg },
+	{ .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
+	{ .compatible = "brcm,bcm7216-pcie", .data = &bcm7278_cfg },
+	{ .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },
 	{},
 };
 
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 13/13] PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:02     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:02 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> Now that the support is in place with previous commits, we add several
> chips that use the BrcmSTB driver.
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 13/13] PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list
@ 2020-06-03 20:02     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:02 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> Now that the support is in place with previous commits, we add several
> chips that use the BrcmSTB driver.
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 12/13] PCI: brcmstb: Set bus max burst size by chip type
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:06     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:06 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> The proper value of the parameter SCB_MAX_BURST_SIZE varies per chip.  The
> 2711 family requires 128B whereas other devices can employ 512.  The
> assignment is complicated by the fact that the values for this two-bit
> field have different meanings;
> 
>   Value   Type_Generic    Type_7278

It looks like Type_Generic and Type_7278 should be swapped in this
description. Other than that:

Acked-by: Florian Fainelli <f.fainelli@gmail.com>

> 
>      00       Reserved         128B
>      01           128B         256B
>      10           256B         512B
>      11           512B     Reserved
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 12/13] PCI: brcmstb: Set bus max burst size by chip type
@ 2020-06-03 20:06     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:06 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> The proper value of the parameter SCB_MAX_BURST_SIZE varies per chip.  The
> 2711 family requires 128B whereas other devices can employ 512.  The
> assignment is complicated by the fact that the values for this two-bit
> field have different meanings;
> 
>   Value   Type_Generic    Type_7278

It looks like Type_Generic and Type_7278 should be swapped in this
description. Other than that:

Acked-by: Florian Fainelli <f.fainelli@gmail.com>

> 
>      00       Reserved         128B
>      01           128B         256B
>      10           256B         512B
>      11           512B     Reserved
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 05/13] PCI: brcmstb: Add suspend and resume pm_ops
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:24     ` Bjorn Helgaas
  -1 siblings, 0 replies; 83+ messages in thread
From: Bjorn Helgaas @ 2020-06-03 20:24 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: linux-pci, Christoph Hellwig, Nicolas Saenz Julienne,
	bcm-kernel-feedback-list, Lorenzo Pieralisi, Rob Herring,
	Bjorn Helgaas, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list

On Wed, Jun 03, 2020 at 03:20:37PM -0400, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Broadcom Set-top (BrcmSTB) boards typically support S2, S3, and S5 suspend
> and resume.  Now the PCIe driver may do so as well.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 49 +++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> index 7c707e483181..f444751e247c 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -979,6 +979,49 @@ static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
>  	brcm_pcie_bridge_sw_init_set(pcie, 1);
>  }
>  
> +static int brcm_pcie_suspend(struct device *dev)
> +{
> +	struct brcm_pcie *pcie = dev_get_drvdata(dev);
> +	int ret = 0;
> +
> +	brcm_pcie_turn_off(pcie);
> +	clk_disable_unprepare(pcie->clk);
> +
> +	return ret;

No need for "ret".

> +}

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 05/13] PCI: brcmstb: Add suspend and resume pm_ops
@ 2020-06-03 20:24     ` Bjorn Helgaas
  0 siblings, 0 replies; 83+ messages in thread
From: Bjorn Helgaas @ 2020-06-03 20:24 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Rob Herring, Lorenzo Pieralisi, linux-pci, open list,
	Florian Fainelli, bcm-kernel-feedback-list,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas, Christoph Hellwig, Nicolas Saenz Julienne

On Wed, Jun 03, 2020 at 03:20:37PM -0400, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Broadcom Set-top (BrcmSTB) boards typically support S2, S3, and S5 suspend
> and resume.  Now the PCIe driver may do so as well.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 49 +++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> index 7c707e483181..f444751e247c 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -979,6 +979,49 @@ static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
>  	brcm_pcie_bridge_sw_init_set(pcie, 1);
>  }
>  
> +static int brcm_pcie_suspend(struct device *dev)
> +{
> +	struct brcm_pcie *pcie = dev_get_drvdata(dev);
> +	int ret = 0;
> +
> +	brcm_pcie_turn_off(pcie);
> +	clk_disable_unprepare(pcie->clk);
> +
> +	return ret;

No need for "ret".

> +}

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 04/13] PCI: brcmstb: Add bcm7278 reigister info
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:28     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:28 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Add in compatibility strings and code for three Broadcom STB chips.  Some
> of the register locations, shifts, and masks are different for certain
> chips, requiring the use of different constants based on of_id.
> 
> We would like to add the following at this time to the match list but we
> need to wait until the end of this patchset so that everything works.
> 
>     { .compatible = "brcm,bcm7211-pcie", .data = &generic_cfg },
>     { .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
>     { .compatible = "brcm,bcm7216-pcie", .data = &bcm7278_cfg },
>     { .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },

If you need to resubmit, there is a typo in the subject: reigister vs.
register. Other than that:

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 04/13] PCI: brcmstb: Add bcm7278 reigister info
@ 2020-06-03 20:28     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:28 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Add in compatibility strings and code for three Broadcom STB chips.  Some
> of the register locations, shifts, and masks are different for certain
> chips, requiring the use of different constants based on of_id.
> 
> We would like to add the following at this time to the match list but we
> need to wait until the end of this patchset so that everything works.
> 
>     { .compatible = "brcm,bcm7211-pcie", .data = &generic_cfg },
>     { .compatible = "brcm,bcm7278-pcie", .data = &bcm7278_cfg },
>     { .compatible = "brcm,bcm7216-pcie", .data = &bcm7278_cfg },
>     { .compatible = "brcm,bcm7445-pcie", .data = &generic_cfg },

If you need to resubmit, there is a typo in the subject: reigister vs.
register. Other than that:

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 06/13] PCI: brcmstb: Add bcm7278 PERST support
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:28     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:28 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> The PERST bit was moved to a different register in 7278-type STB chips.  In
> addition, the polarity of the bit was also changed; for other chips writing
> a 1 specified assert; for 7278-type chips, writing a 0 specifies assert.
> 
> Signal-wise, PERST is an asserted-low signal.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 06/13] PCI: brcmstb: Add bcm7278 PERST support
@ 2020-06-03 20:28     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:28 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> The PERST bit was moved to a different register in 7278-type STB chips.  In
> addition, the polarity of the bit was also changed; for other chips writing
> a 1 specified assert; for 7278-type chips, writing a 0 specifies assert.
> 
> Signal-wise, PERST is an asserted-low signal.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 07/13] PCI: brcmstb: Add control of rescal reset
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:30     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:30 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Florian Fainelli,
	Philipp Zabel,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Some STB chips have a special purpose reset controller named RESCAL (reset
> calibration).  The PCIe HW can now control RESCAL to start and stop its
> operation.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 07/13] PCI: brcmstb: Add control of rescal reset
@ 2020-06-03 20:30     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:30 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Philipp Zabel, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Some STB chips have a special purpose reset controller named RESCAL (reset
> calibration).  The PCIe HW can now control RESCAL to start and stop its
> operation.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 10/13] PCI: brcmstb: Set internal memory viewport sizes
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-03 20:33     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:33 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> BrcmSTB PCIe controllers are intimately connected to the memory
> controller(s) on the SOC.  There is a "viewport" for each memory controller
> that allows inbound accesses to CPU memory.  Each viewport's size must be
> set to a power of two, and that size must be equal to or larger than the
> amount of memory each controller supports.
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 10/13] PCI: brcmstb: Set internal memory viewport sizes
@ 2020-06-03 20:33     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-03 20:33 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> BrcmSTB PCIe controllers are intimately connected to the memory
> controller(s) on the SOC.  There is a "viewport" for each memory controller
> that allows inbound accesses to CPU memory.  Each viewport's size must be
> set to a power of two, and that size must be equal to or larger than the
> amount of memory each controller supports.
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 11/13] PCI: brcmstb: Accommodate MSI for older chips
  2020-06-03 19:20   ` Jim Quinlan
@ 2020-06-04  2:56     ` Florian Fainelli
  -1 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-04  2:56 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Older BrcmSTB chips do not have a separate register for MSI interrupts; the
> MSIs are in a register that also contains unrelated interrupts.  In
> addition, the interrupts lie in bits [31..24] for these legacy chips.  This
> commit provides common code for both legacy and non-legacy MSI interrupt
> registers.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 11/13] PCI: brcmstb: Accommodate MSI for older chips
@ 2020-06-04  2:56     ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-04  2:56 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Rob Herring, Lorenzo Pieralisi, open list, Florian Fainelli,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	Bjorn Helgaas,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> Older BrcmSTB chips do not have a separate register for MSI interrupts; the
> MSIs are in a register that also contains unrelated interrupts.  In
> addition, the interrupts lie in bits [31..24] for these legacy chips.  This
> commit provides common code for both legacy and non-legacy MSI interrupt
> registers.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 02/13] ata: ahci_brcm: Fix use of BCM7216 reset controller
  2020-06-03 19:20 ` [PATCH v3 02/13] ata: ahci_brcm: Fix use of BCM7216 reset controller Jim Quinlan
@ 2020-06-04  2:57   ` Florian Fainelli
  0 siblings, 0 replies; 83+ messages in thread
From: Florian Fainelli @ 2020-06-04  2:57 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig,
	Nicolas Saenz Julienne, bcm-kernel-feedback-list
  Cc: Jens Axboe, Philipp Zabel, Florian Fainelli, Hans de Goede,
	open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers),
	open list



On 6/3/2020 12:20 PM, Jim Quinlan wrote:
> From: Jim Quinlan <jquinlan@broadcom.com>
> 
> A reset controller "rescal" is shared between the AHCI driver and the PCIe
> driver for the BrcmSTB 7216 chip.  The code is modified to allow this
> sharing and to deassert() properly.
> 
> Signed-off-by: Jim Quinlan <jquinlan@broadcom.com>
> 
> Fixes: 272ecd60a636 ("ata: ahci_brcm: BCM7216 reset is self de-asserting")
> Fixes: c345ec6a50e9 ("ata: ahci_brcm: Support BCM7216 reset controller
> name")
> ---

[snip]

> @@ -479,10 +478,7 @@ static int brcm_ahci_probe(struct platform_device *pdev)
>  		break;
>  	}
>  
> -	if (priv->version == BRCM_SATA_BCM7216)
> -		ret = reset_control_reset(priv->rcdev);
> -	else
> -		ret = reset_control_deassert(priv->rcdev);
> +	ret = reset_control_deassert(priv->rcdev);
>  	if (ret)
>  		return ret;

Do we need a similar change for brcm_ahci_resume()?
-- 
Florian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-03 19:20   ` Jim Quinlan via iommu
  (?)
@ 2020-06-04 11:04     ` Dan Carpenter
  -1 siblings, 0 replies; 83+ messages in thread
From: Dan Carpenter @ 2020-06-04 11:04 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	linux-pci, Hanjun Guo,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	bcm-kernel-feedback-list, Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  	const struct sun4i_backend_quirks *quirks;
>  	struct resource *res;
>  	void __iomem *regs;
> -	int i, ret;
> +	int i, ret = 0;

No need for this.

>  
>  	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>  	if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = attach_uniform_dma_pfn_offset(sdev->dev,
> +					    PHYS_OFFSET >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +				     struct device_node *node, int num_ranges)
> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct dma_pfn_offset_region *r;
> +
> +	r = devm_kcalloc(dev, num_ranges + 1,
> +			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;
> +	dev->dma_pfn_offset_map = r;
> +	of_dma_range_parser_init(&parser, node);
> +
> +	/*
> +	 * Record all info for DMA ranges array.  We could
> +	 * just use the of_range struct, but if we did that it
> +	 * would require more calculations for phys_to_dma and
> +	 * dma_to_phys conversions.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		r->cpu_start = range.cpu_addr;
> +		r->cpu_end = r->cpu_start + range.size - 1;
> +		r->dma_start = range.bus_addr;
> +		r->dma_end = r->dma_start + range.size - 1;
> +		r->pfn_offset = PFN_DOWN(range.cpu_addr)
> +			- PFN_DOWN(range.bus_addr);
> +		r++;
> +	}
> +	return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev:	device pointer; only needed for a corner case.
> + * @dma_pfn_offset:	offset to apply when converting from phys addr
      ^^^^^^^^^^^^^^^
This parameter name does not match.

> + *			to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.

It can also return -ENODEV.  Why are we passing NULL dev pointers to
all these functions anyway?

> + */
> +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
> +{
> +	struct dma_pfn_offset_region *r;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (!pfn_offset)
> +		return 0;
> +
> +	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> +			 GFP_KERNEL);

Use:	r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);


> +	if (!r)
> +		return -ENOMEM;
> +
> +	r->uniform_offset = true;
> +	r->pfn_offset = pfn_offset;
> +
> +	return 0;
> +}

This function doesn't seem to do anything useful.  Is part of it
missing?

> +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> +

regards,
dan carpenter

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 11:04     ` Dan Carpenter
  0 siblings, 0 replies; 83+ messages in thread
From: Dan Carpenter @ 2020-06-04 11:04 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	linux-pci, Hanjun Guo,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	open list:STAGING SUBSYSTEM, Paul Kocialkowski, Yoshinori Sato,
	Will Deacon, maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT,
	Russell King, open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai,
	Ingo Molnar, bcm-kernel-feedback-list, Alan Stern, Len Brown,
	Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  	const struct sun4i_backend_quirks *quirks;
>  	struct resource *res;
>  	void __iomem *regs;
> -	int i, ret;
> +	int i, ret = 0;

No need for this.

>  
>  	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>  	if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = attach_uniform_dma_pfn_offset(sdev->dev,
> +					    PHYS_OFFSET >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +				     struct device_node *node, int num_ranges)
> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct dma_pfn_offset_region *r;
> +
> +	r = devm_kcalloc(dev, num_ranges + 1,
> +			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;
> +	dev->dma_pfn_offset_map = r;
> +	of_dma_range_parser_init(&parser, node);
> +
> +	/*
> +	 * Record all info for DMA ranges array.  We could
> +	 * just use the of_range struct, but if we did that it
> +	 * would require more calculations for phys_to_dma and
> +	 * dma_to_phys conversions.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		r->cpu_start = range.cpu_addr;
> +		r->cpu_end = r->cpu_start + range.size - 1;
> +		r->dma_start = range.bus_addr;
> +		r->dma_end = r->dma_start + range.size - 1;
> +		r->pfn_offset = PFN_DOWN(range.cpu_addr)
> +			- PFN_DOWN(range.bus_addr);
> +		r++;
> +	}
> +	return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev:	device pointer; only needed for a corner case.
> + * @dma_pfn_offset:	offset to apply when converting from phys addr
      ^^^^^^^^^^^^^^^
This parameter name does not match.

> + *			to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.

It can also return -ENODEV.  Why are we passing NULL dev pointers to
all these functions anyway?

> + */
> +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
> +{
> +	struct dma_pfn_offset_region *r;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (!pfn_offset)
> +		return 0;
> +
> +	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> +			 GFP_KERNEL);

Use:	r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);


> +	if (!r)
> +		return -ENOMEM;
> +
> +	r->uniform_offset = true;
> +	r->pfn_offset = pfn_offset;
> +
> +	return 0;
> +}

This function doesn't seem to do anything useful.  Is part of it
missing?

> +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> +

regards,
dan carpenter

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 11:04     ` Dan Carpenter
  0 siblings, 0 replies; 83+ messages in thread
From: Dan Carpenter @ 2020-06-04 11:04 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	linux-pci, Hanjun Guo,
	open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	bcm-kernel-feedback-list, Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  	const struct sun4i_backend_quirks *quirks;
>  	struct resource *res;
>  	void __iomem *regs;
> -	int i, ret;
> +	int i, ret = 0;

No need for this.

>  
>  	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>  	if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = attach_uniform_dma_pfn_offset(sdev->dev,
> +					    PHYS_OFFSET >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +				     struct device_node *node, int num_ranges)
> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct dma_pfn_offset_region *r;
> +
> +	r = devm_kcalloc(dev, num_ranges + 1,
> +			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;
> +	dev->dma_pfn_offset_map = r;
> +	of_dma_range_parser_init(&parser, node);
> +
> +	/*
> +	 * Record all info for DMA ranges array.  We could
> +	 * just use the of_range struct, but if we did that it
> +	 * would require more calculations for phys_to_dma and
> +	 * dma_to_phys conversions.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		r->cpu_start = range.cpu_addr;
> +		r->cpu_end = r->cpu_start + range.size - 1;
> +		r->dma_start = range.bus_addr;
> +		r->dma_end = r->dma_start + range.size - 1;
> +		r->pfn_offset = PFN_DOWN(range.cpu_addr)
> +			- PFN_DOWN(range.bus_addr);
> +		r++;
> +	}
> +	return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev:	device pointer; only needed for a corner case.
> + * @dma_pfn_offset:	offset to apply when converting from phys addr
      ^^^^^^^^^^^^^^^
This parameter name does not match.

> + *			to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.

It can also return -ENODEV.  Why are we passing NULL dev pointers to
all these functions anyway?

> + */
> +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
> +{
> +	struct dma_pfn_offset_region *r;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (!pfn_offset)
> +		return 0;
> +
> +	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> +			 GFP_KERNEL);

Use:	r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);


> +	if (!r)
> +		return -ENOMEM;
> +
> +	r->uniform_offset = true;
> +	r->pfn_offset = pfn_offset;
> +
> +	return 0;
> +}

This function doesn't seem to do anything useful.  Is part of it
missing?

> +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> +

regards,
dan carpenter

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 11:04     ` Dan Carpenter
  (?)
@ 2020-06-04 13:48       ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 13:48 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

Hi Dan,

On Thu, Jun 4, 2020 at 7:06 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >       const struct sun4i_backend_quirks *quirks;
> >       struct resource *res;
> >       void __iomem *regs;
> > -     int i, ret;
> > +     int i, ret = 0;
>
> No need for this.
Will fix.

>
> >
> >       backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >       if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +                                  struct device_node *node, int num_ranges)
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     r = devm_kcalloc(dev, num_ranges + 1,
> > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
> > +     dev->dma_pfn_offset_map = r;
> > +     of_dma_range_parser_init(&parser, node);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We could
> > +      * just use the of_range struct, but if we did that it
> > +      * would require more calculations for phys_to_dma and
> > +      * dma_to_phys conversions.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             r->cpu_start = range.cpu_addr;
> > +             r->cpu_end = r->cpu_start + range.size - 1;
> > +             r->dma_start = range.bus_addr;
> > +             r->dma_end = r->dma_start + range.size - 1;
> > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > +                     - PFN_DOWN(range.bus_addr);
> > +             r++;
> > +     }
> > +     return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev:     device pointer; only needed for a corner case.
> > + * @dma_pfn_offset:  offset to apply when converting from phys addr
>       ^^^^^^^^^^^^^^^
> This parameter name does not match.
Will fix.

>
> > + *                   to dma addr and vice versa.
> > + *
> > + * It returns -ENOMEM if out of memory, otherwise 0.
>
> It can also return -ENODEV.  Why are we passing NULL dev pointers to
> all these functions anyway?
No one should be passing dev==NULL to these functions -- that is why I
have the error out for this case.  Actually, there is a case of
dev==NULL being passed -- drivers/of/unittest.c makes a call with  a
NULL device because it has no dev  in its context.  I wasn't sure how
to fix this.

>
> > + */
> > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
> > +{
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     if (!pfn_offset)
> > +             return 0;
> > +
> > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > +                      GFP_KERNEL);
>
> Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
Will fix.

>
>
> > +     if (!r)
> > +             return -ENOMEM;
> > +
> > +     r->uniform_offset = true;
> > +     r->pfn_offset = pfn_offset;
> > +
> > +     return 0;
> > +}
>
> This function doesn't seem to do anything useful.  Is part of it
> missing?
No, the uniform pfn offset is a special case.  With it, there is only
one entry  in the "map" and 'uniform_offset' is set to true and the
pfn_offset is also set.  When the offset is to be computed, there are
no region bounds calculations needed -- the offset of the first entry
is returned immediately.  This special case was intentional to (a)
preserve backwards compatibility to  those using dev->dma_pfn_offset
and (b) it is faster than treating it as the general case -- I want to
minimize the execution cost of this case.  If this is deemed
unnecessary, I can remove the boolean and threat all cases in the same
way.

The other case is where multiple pfn offsets are needed for multiple
regions.  In this case the code checks the bounds for each successive
region, returning an  offset if there is a match.
>
> > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > +
>
> regards,
> dan carpenter
>

Thanks!
Jim Quinlan
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 13:48       ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-04 13:48 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	open list:STAGING SUBSYSTEM, Paul Kocialkowski, Yoshinori Sato,
	Will Deacon, maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT,
	Russell King, open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai,
	Ingo Molnar, maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE,
	Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

Hi Dan,

On Thu, Jun 4, 2020 at 7:06 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >       const struct sun4i_backend_quirks *quirks;
> >       struct resource *res;
> >       void __iomem *regs;
> > -     int i, ret;
> > +     int i, ret = 0;
>
> No need for this.
Will fix.

>
> >
> >       backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >       if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +                                  struct device_node *node, int num_ranges)
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     r = devm_kcalloc(dev, num_ranges + 1,
> > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
> > +     dev->dma_pfn_offset_map = r;
> > +     of_dma_range_parser_init(&parser, node);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We could
> > +      * just use the of_range struct, but if we did that it
> > +      * would require more calculations for phys_to_dma and
> > +      * dma_to_phys conversions.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             r->cpu_start = range.cpu_addr;
> > +             r->cpu_end = r->cpu_start + range.size - 1;
> > +             r->dma_start = range.bus_addr;
> > +             r->dma_end = r->dma_start + range.size - 1;
> > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > +                     - PFN_DOWN(range.bus_addr);
> > +             r++;
> > +     }
> > +     return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev:     device pointer; only needed for a corner case.
> > + * @dma_pfn_offset:  offset to apply when converting from phys addr
>       ^^^^^^^^^^^^^^^
> This parameter name does not match.
Will fix.

>
> > + *                   to dma addr and vice versa.
> > + *
> > + * It returns -ENOMEM if out of memory, otherwise 0.
>
> It can also return -ENODEV.  Why are we passing NULL dev pointers to
> all these functions anyway?
No one should be passing dev==NULL to these functions -- that is why I
have the error out for this case.  Actually, there is a case of
dev==NULL being passed -- drivers/of/unittest.c makes a call with  a
NULL device because it has no dev  in its context.  I wasn't sure how
to fix this.

>
> > + */
> > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
> > +{
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     if (!pfn_offset)
> > +             return 0;
> > +
> > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > +                      GFP_KERNEL);
>
> Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
Will fix.

>
>
> > +     if (!r)
> > +             return -ENOMEM;
> > +
> > +     r->uniform_offset = true;
> > +     r->pfn_offset = pfn_offset;
> > +
> > +     return 0;
> > +}
>
> This function doesn't seem to do anything useful.  Is part of it
> missing?
No, the uniform pfn offset is a special case.  With it, there is only
one entry  in the "map" and 'uniform_offset' is set to true and the
pfn_offset is also set.  When the offset is to be computed, there are
no region bounds calculations needed -- the offset of the first entry
is returned immediately.  This special case was intentional to (a)
preserve backwards compatibility to  those using dev->dma_pfn_offset
and (b) it is faster than treating it as the general case -- I want to
minimize the execution cost of this case.  If this is deemed
unnecessary, I can remove the boolean and threat all cases in the same
way.

The other case is where multiple pfn offsets are needed for multiple
regions.  In this case the code checks the bounds for each successive
region, returning an  offset if there is a match.
>
> > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > +
>
> regards,
> dan carpenter
>

Thanks!
Jim Quinlan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 13:48       ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 13:48 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

Hi Dan,

On Thu, Jun 4, 2020 at 7:06 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >       const struct sun4i_backend_quirks *quirks;
> >       struct resource *res;
> >       void __iomem *regs;
> > -     int i, ret;
> > +     int i, ret = 0;
>
> No need for this.
Will fix.

>
> >
> >       backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >       if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +                                  struct device_node *node, int num_ranges)
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     r = devm_kcalloc(dev, num_ranges + 1,
> > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
> > +     dev->dma_pfn_offset_map = r;
> > +     of_dma_range_parser_init(&parser, node);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We could
> > +      * just use the of_range struct, but if we did that it
> > +      * would require more calculations for phys_to_dma and
> > +      * dma_to_phys conversions.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             r->cpu_start = range.cpu_addr;
> > +             r->cpu_end = r->cpu_start + range.size - 1;
> > +             r->dma_start = range.bus_addr;
> > +             r->dma_end = r->dma_start + range.size - 1;
> > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > +                     - PFN_DOWN(range.bus_addr);
> > +             r++;
> > +     }
> > +     return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev:     device pointer; only needed for a corner case.
> > + * @dma_pfn_offset:  offset to apply when converting from phys addr
>       ^^^^^^^^^^^^^^^
> This parameter name does not match.
Will fix.

>
> > + *                   to dma addr and vice versa.
> > + *
> > + * It returns -ENOMEM if out of memory, otherwise 0.
>
> It can also return -ENODEV.  Why are we passing NULL dev pointers to
> all these functions anyway?
No one should be passing dev==NULL to these functions -- that is why I
have the error out for this case.  Actually, there is a case of
dev==NULL being passed -- drivers/of/unittest.c makes a call with  a
NULL device because it has no dev  in its context.  I wasn't sure how
to fix this.

>
> > + */
> > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long pfn_offset)
> > +{
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     if (!pfn_offset)
> > +             return 0;
> > +
> > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > +                      GFP_KERNEL);
>
> Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
Will fix.

>
>
> > +     if (!r)
> > +             return -ENOMEM;
> > +
> > +     r->uniform_offset = true;
> > +     r->pfn_offset = pfn_offset;
> > +
> > +     return 0;
> > +}
>
> This function doesn't seem to do anything useful.  Is part of it
> missing?
No, the uniform pfn offset is a special case.  With it, there is only
one entry  in the "map" and 'uniform_offset' is set to true and the
pfn_offset is also set.  When the offset is to be computed, there are
no region bounds calculations needed -- the offset of the first entry
is returned immediately.  This special case was intentional to (a)
preserve backwards compatibility to  those using dev->dma_pfn_offset
and (b) it is faster than treating it as the general case -- I want to
minimize the execution cost of this case.  If this is deemed
unnecessary, I can remove the boolean and threat all cases in the same
way.

The other case is where multiple pfn offsets are needed for multiple
regions.  In this case the code checks the bounds for each successive
region, returning an  offset if there is a match.
>
> > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > +
>
> regards,
> dan carpenter
>

Thanks!
Jim Quinlan
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-03 19:20   ` Jim Quinlan via iommu
  (?)
@ 2020-06-04 13:53     ` Nicolas Saenz Julienne
  -1 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-04 13:53 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig, bcm-kernel-feedback-list
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Dan Williams, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar, Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Maxime Ripard, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 15878 bytes --]

Hi Jim,

On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> designates the single offset a special case.
> 
> of_dma_configure() is the typical manner to set pfn offsets but there
> are a number of ad hoc assignments to dev->dma_pfn_offset in the
> kernel code.  These cases now invoke the function
> attach_uniform_dma_pfn_offset(dev, pfn_offset).
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> ---
>  arch/arm/include/asm/dma-mapping.h            |  9 +-
>  arch/arm/mach-keystone/keystone.c             |  9 +-
>  arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
>  arch/sh/kernel/dma-coherent.c                 | 17 ++--
>  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
>  drivers/acpi/arm64/iort.c                     |  5 +-
>  drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
>  drivers/iommu/io-pgtable-arm.c                |  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
>  drivers/of/address.c                          | 93 +++++++++++++++++--
>  drivers/of/device.c                           |  8 +-
>  drivers/remoteproc/remoteproc_core.c          |  2 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
>  drivers/usb/core/message.c                    |  4 +-
>  drivers/usb/core/usb.c                        |  2 +-
>  include/linux/device.h                        |  4 +-
>  include/linux/dma-direct.h                    | 16 +++-
>  include/linux/dma-mapping.h                   | 45 +++++++++
>  kernel/dma/coherent.c                         | 11 ++-
>  20 files changed, 210 insertions(+), 51 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> mapping.h
> index bdd80ddbca34..f1e72f99468b 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> *get_arch_dma_ops(struct bus_type *bus)
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> -	if (dev)
> -		pfn -= dev->dma_pfn_offset;
> +	if (dev && dev->dma_pfn_offset_map)

Would it make sense to move the dev->dma_pfn_offset_map check into
dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
opposite variant of the function. I think it'd make the code a little simpler on
some of the use cases, and overall less error prone if anyone starts using the
function elsewhere.

> +		pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> +
>  	return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>  
> @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> dma_addr_t addr)
>  {
>  	unsigned long pfn = __bus_to_pfn(addr);
>  
> -	if (dev)
> -		pfn += dev->dma_pfn_offset;
> +	if (dev && dev->dma_pfn_offset_map)
> +		pfn += dma_pfn_offset_from_dma_addr(dev, addr);
>  
>  	return pfn;
>  }
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> keystone/keystone.c
> index 638808c4e122..e7d3ee6e9cb5 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/io.h>
>  #include <linux/of.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/init.h>
>  #include <linux/of_platform.h>
>  #include <linux/of_address.h>
> @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> *nb,
>  		return NOTIFY_BAD;
>  
>  	if (!dev->of_node) {
> -		dev->dma_pfn_offset = keystone_dma_pfn_offset;
> -		dev_err(dev, "set dma_pfn_offset%08lx\n",
> -			dev->dma_pfn_offset);
> +		int ret = attach_uniform_dma_pfn_offset
> +				(dev, keystone_dma_pfn_offset);
> +
> +		dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> +			dev->dma_pfn_offset, ret ? " failed" : "");
>  	}
>  	return NOTIFY_OK;
>  }
> diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> sh7786.c
> index e0b568aaa701..2e832a5c58c1 100644
> --- a/arch/sh/drivers/pci/pcie-sh7786.c
> +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> @@ -12,6 +12,7 @@
>  #include <linux/io.h>
>  #include <linux/async.h>
>  #include <linux/delay.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/slab.h>
>  #include <linux/clk.h>
>  #include <linux/sh_clk.h>
> @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> u8 slot, u8 pin)
>  
>  void pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> -	pdev->dev.dma_pfn_offset = dma_pfn_offset;
> +	attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
>  }
>  
>  static int __init sh7786_pcie_core_init(void)
> diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> index d4811691b93c..5fc9e358b6c7 100644
> --- a/arch/sh/kernel/dma-coherent.c
> +++ b/arch/sh/kernel/dma-coherent.c
> @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
>  {
>  	void *ret, *ret_nocache;
>  	int order = get_order(size);
> +	unsigned long pfn;
> +	phys_addr_t phys;
>  
>  	gfp |= __GFP_ZERO;
>  
> @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
>  		return NULL;
>  	}
>  
> -	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> +	phys = virt_to_phys(ret);
> +	pfn =  phys >> PAGE_SHIFT;

nit: not sure it really pays off to have a pfn variable here.

> +	split_page(pfn_to_page(pfn), order);
>  
> -	*dma_handle = virt_to_phys(ret);
> -	if (!WARN_ON(!dev))
> -		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> +	*dma_handle = (dma_addr_t)phys;
> +	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> +		*dma_handle -= PFN_PHYS(
> +			dma_pfn_offset_from_phys_addr(dev, phys));

>  
>  	return ret_nocache;
>  }
> @@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void
> *vaddr,
>  	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
>  	int k;
>  
> -	if (!WARN_ON(!dev))
> -		pfn += dev->dma_pfn_offset;
> +	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> +		pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
>  
>  	for (k = 0; k < (1 << order); k++)
>  		__free_pages(pfn_to_page(pfn + k), 0);
> diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> index c313d784efab..4cdeca9f69b6 100644
> --- a/arch/x86/pci/sta2x11-fixup.c
> +++ b/arch/x86/pci/sta2x11-fixup.c
> @@ -12,6 +12,7 @@
>  #include <linux/export.h>
>  #include <linux/list.h>
>  #include <linux/dma-direct.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/iommu.h>
>  
>  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
>  	struct device *dev = &pdev->dev;
>  	u32 amba_base, max_amba_addr;
> -	int i;
> +	int i, ret;
>  
>  	if (!instance)
>  		return;
> @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
>  	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> +	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
> +	if (ret)
> +		dev_err(dev, "sta2x11: could not set PFN offset\n");
>  
>  	dev->bus_dma_limit = max_amba_addr;
>  	pci_set_consistent_dma_mask(pdev, max_amba_addr);
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 28a6b387e80e..153661ddc74b 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> u64 *dma_size)
>  	*dma_addr = dmaaddr;
>  	*dma_size = size;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(offset);
> -	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> +	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
> +	dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
> +		offset, ret ? " failed!" : "");
>  }
>  
>  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c
> b/drivers/gpu/drm/sun4i/sun4i_backend.c
> index 072ea113e6be..3d41dfc7d178 100644
> --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/of_device.h>
>  #include <linux/of_graph.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>  
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct
> device *master,
>  	const struct sun4i_backend_quirks *quirks;
>  	struct resource *res;
>  	void __iomem *regs;
> -	int i, ret;
> +	int i, ret = 0;
>  
>  	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>  	if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct
> device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU
> page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = attach_uniform_dma_pfn_offset(sdev->dev,
> +					    PHYS_OFFSET >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node
> *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +				     struct device_node *node, int num_ranges)

As with the previous review, please take this comment with a grain of salt.

I think there should be a clear split between what belongs to OF and what
belongs to the core device infrastructure.

OF's job should be to parse DT and provide a list/array of ranges, whereas the
core device infrastructure should provide an API to assign a list of
ranges/offset to a device.

As a concrete example, you're forcing devices like the sta2x11 to build with OF
support, which, being an Intel device, it's pretty odd. But I'm also thinking
of how will all this fit once an ACPI device wants to use it.

Expanding on this idea, once you have a split between the OF's and device core
roles, it transpires that of_dma_get_range()'s job should only be to provide
the ranges in a device understandable structure and of_dma_configre()'s to
actually assign the device's parameters. This would obsolete patch #7.

> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct dma_pfn_offset_region *r;
> +
> +	r = devm_kcalloc(dev, num_ranges + 1,
> +			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;
> +	dev->dma_pfn_offset_map = r;
> +	of_dma_range_parser_init(&parser, node);
> +
> +	/*
> +	 * Record all info for DMA ranges array.  We could
> +	 * just use the of_range struct, but if we did that it
> +	 * would require more calculations for phys_to_dma and
> +	 * dma_to_phys conversions.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		r->cpu_start = range.cpu_addr;
> +		r->cpu_end = r->cpu_start + range.size - 1;
> +		r->dma_start = range.bus_addr;
> +		r->dma_end = r->dma_start + range.size - 1;
> +		r->pfn_offset = PFN_DOWN(range.cpu_addr)
> +			- PFN_DOWN(range.bus_addr);
> +		r++;
> +	}
> +	return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev:	device pointer; only needed for a corner case.

That's a huge corner :P

> + * @dma_pfn_offset:	offset to apply when converting from phys addr
> + *			to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.
> + */
> +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> pfn_offset)

As I say above, does this really belong to of/address.c?

> +{
> +	struct dma_pfn_offset_region *r;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (!pfn_offset)
> +		return 0;
> +
> +	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> +			 GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;

I think you're missing this:

	dev->dma_pfn_offset_map = r;

> +
> +	r->uniform_offset = true;
> +	r->pfn_offset = pfn_offset;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> +
>  /**
>   * of_dma_get_range - Get DMA range info
>   * @dev:	device pointer; only needed for a corner case.
> @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
>   *	CPU addr (phys_addr_t)	: pna cells
>   *	size			: nsize cells
>   *
> - * It returns -ENODEV if "dma-ranges" property was not found
> + * It returns -ENODEV if !dev or "dma-ranges" property was not found
>   * for this device in DT.
>   */
>  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> *dma_addr,
> @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> device_node *np, u64 *dma_addr,
>  	bool found_dma_ranges = false;
>  	struct of_range_parser parser;
>  	struct of_range range;
> +	phys_addr_t cpu_start = ~(phys_addr_t)0;
>  	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> +	bool dma_multi_pfn_offset = false;
> +	int num_ranges = 0;
> +
> +	if (!dev)
> +		return -ENODEV;

Shouldn't this be part of patch #7?

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 169 bytes --]

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 13:53     ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-04 13:53 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig, bcm-kernel-feedback-list
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Dan Williams, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar, Alan Stern, David Kershner, Len Brown,
	Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Maxime Ripard, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 15878 bytes --]

Hi Jim,

On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> designates the single offset a special case.
> 
> of_dma_configure() is the typical manner to set pfn offsets but there
> are a number of ad hoc assignments to dev->dma_pfn_offset in the
> kernel code.  These cases now invoke the function
> attach_uniform_dma_pfn_offset(dev, pfn_offset).
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> ---
>  arch/arm/include/asm/dma-mapping.h            |  9 +-
>  arch/arm/mach-keystone/keystone.c             |  9 +-
>  arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
>  arch/sh/kernel/dma-coherent.c                 | 17 ++--
>  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
>  drivers/acpi/arm64/iort.c                     |  5 +-
>  drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
>  drivers/iommu/io-pgtable-arm.c                |  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
>  drivers/of/address.c                          | 93 +++++++++++++++++--
>  drivers/of/device.c                           |  8 +-
>  drivers/remoteproc/remoteproc_core.c          |  2 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
>  drivers/usb/core/message.c                    |  4 +-
>  drivers/usb/core/usb.c                        |  2 +-
>  include/linux/device.h                        |  4 +-
>  include/linux/dma-direct.h                    | 16 +++-
>  include/linux/dma-mapping.h                   | 45 +++++++++
>  kernel/dma/coherent.c                         | 11 ++-
>  20 files changed, 210 insertions(+), 51 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> mapping.h
> index bdd80ddbca34..f1e72f99468b 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> *get_arch_dma_ops(struct bus_type *bus)
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> -	if (dev)
> -		pfn -= dev->dma_pfn_offset;
> +	if (dev && dev->dma_pfn_offset_map)

Would it make sense to move the dev->dma_pfn_offset_map check into
dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
opposite variant of the function. I think it'd make the code a little simpler on
some of the use cases, and overall less error prone if anyone starts using the
function elsewhere.

> +		pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> +
>  	return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>  
> @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> dma_addr_t addr)
>  {
>  	unsigned long pfn = __bus_to_pfn(addr);
>  
> -	if (dev)
> -		pfn += dev->dma_pfn_offset;
> +	if (dev && dev->dma_pfn_offset_map)
> +		pfn += dma_pfn_offset_from_dma_addr(dev, addr);
>  
>  	return pfn;
>  }
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> keystone/keystone.c
> index 638808c4e122..e7d3ee6e9cb5 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/io.h>
>  #include <linux/of.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/init.h>
>  #include <linux/of_platform.h>
>  #include <linux/of_address.h>
> @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> *nb,
>  		return NOTIFY_BAD;
>  
>  	if (!dev->of_node) {
> -		dev->dma_pfn_offset = keystone_dma_pfn_offset;
> -		dev_err(dev, "set dma_pfn_offset%08lx\n",
> -			dev->dma_pfn_offset);
> +		int ret = attach_uniform_dma_pfn_offset
> +				(dev, keystone_dma_pfn_offset);
> +
> +		dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> +			dev->dma_pfn_offset, ret ? " failed" : "");
>  	}
>  	return NOTIFY_OK;
>  }
> diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> sh7786.c
> index e0b568aaa701..2e832a5c58c1 100644
> --- a/arch/sh/drivers/pci/pcie-sh7786.c
> +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> @@ -12,6 +12,7 @@
>  #include <linux/io.h>
>  #include <linux/async.h>
>  #include <linux/delay.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/slab.h>
>  #include <linux/clk.h>
>  #include <linux/sh_clk.h>
> @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> u8 slot, u8 pin)
>  
>  void pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> -	pdev->dev.dma_pfn_offset = dma_pfn_offset;
> +	attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
>  }
>  
>  static int __init sh7786_pcie_core_init(void)
> diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> index d4811691b93c..5fc9e358b6c7 100644
> --- a/arch/sh/kernel/dma-coherent.c
> +++ b/arch/sh/kernel/dma-coherent.c
> @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
>  {
>  	void *ret, *ret_nocache;
>  	int order = get_order(size);
> +	unsigned long pfn;
> +	phys_addr_t phys;
>  
>  	gfp |= __GFP_ZERO;
>  
> @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
>  		return NULL;
>  	}
>  
> -	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> +	phys = virt_to_phys(ret);
> +	pfn =  phys >> PAGE_SHIFT;

nit: not sure it really pays off to have a pfn variable here.

> +	split_page(pfn_to_page(pfn), order);
>  
> -	*dma_handle = virt_to_phys(ret);
> -	if (!WARN_ON(!dev))
> -		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> +	*dma_handle = (dma_addr_t)phys;
> +	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> +		*dma_handle -= PFN_PHYS(
> +			dma_pfn_offset_from_phys_addr(dev, phys));

>  
>  	return ret_nocache;
>  }
> @@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void
> *vaddr,
>  	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
>  	int k;
>  
> -	if (!WARN_ON(!dev))
> -		pfn += dev->dma_pfn_offset;
> +	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> +		pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
>  
>  	for (k = 0; k < (1 << order); k++)
>  		__free_pages(pfn_to_page(pfn + k), 0);
> diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> index c313d784efab..4cdeca9f69b6 100644
> --- a/arch/x86/pci/sta2x11-fixup.c
> +++ b/arch/x86/pci/sta2x11-fixup.c
> @@ -12,6 +12,7 @@
>  #include <linux/export.h>
>  #include <linux/list.h>
>  #include <linux/dma-direct.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/iommu.h>
>  
>  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
>  	struct device *dev = &pdev->dev;
>  	u32 amba_base, max_amba_addr;
> -	int i;
> +	int i, ret;
>  
>  	if (!instance)
>  		return;
> @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
>  	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> +	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
> +	if (ret)
> +		dev_err(dev, "sta2x11: could not set PFN offset\n");
>  
>  	dev->bus_dma_limit = max_amba_addr;
>  	pci_set_consistent_dma_mask(pdev, max_amba_addr);
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 28a6b387e80e..153661ddc74b 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> u64 *dma_size)
>  	*dma_addr = dmaaddr;
>  	*dma_size = size;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(offset);
> -	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> +	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
> +	dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
> +		offset, ret ? " failed!" : "");
>  }
>  
>  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c
> b/drivers/gpu/drm/sun4i/sun4i_backend.c
> index 072ea113e6be..3d41dfc7d178 100644
> --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/of_device.h>
>  #include <linux/of_graph.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>  
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct
> device *master,
>  	const struct sun4i_backend_quirks *quirks;
>  	struct resource *res;
>  	void __iomem *regs;
> -	int i, ret;
> +	int i, ret = 0;
>  
>  	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>  	if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct
> device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU
> page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = attach_uniform_dma_pfn_offset(sdev->dev,
> +					    PHYS_OFFSET >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node
> *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +				     struct device_node *node, int num_ranges)

As with the previous review, please take this comment with a grain of salt.

I think there should be a clear split between what belongs to OF and what
belongs to the core device infrastructure.

OF's job should be to parse DT and provide a list/array of ranges, whereas the
core device infrastructure should provide an API to assign a list of
ranges/offset to a device.

As a concrete example, you're forcing devices like the sta2x11 to build with OF
support, which, being an Intel device, it's pretty odd. But I'm also thinking
of how will all this fit once an ACPI device wants to use it.

Expanding on this idea, once you have a split between the OF's and device core
roles, it transpires that of_dma_get_range()'s job should only be to provide
the ranges in a device understandable structure and of_dma_configre()'s to
actually assign the device's parameters. This would obsolete patch #7.

> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct dma_pfn_offset_region *r;
> +
> +	r = devm_kcalloc(dev, num_ranges + 1,
> +			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;
> +	dev->dma_pfn_offset_map = r;
> +	of_dma_range_parser_init(&parser, node);
> +
> +	/*
> +	 * Record all info for DMA ranges array.  We could
> +	 * just use the of_range struct, but if we did that it
> +	 * would require more calculations for phys_to_dma and
> +	 * dma_to_phys conversions.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		r->cpu_start = range.cpu_addr;
> +		r->cpu_end = r->cpu_start + range.size - 1;
> +		r->dma_start = range.bus_addr;
> +		r->dma_end = r->dma_start + range.size - 1;
> +		r->pfn_offset = PFN_DOWN(range.cpu_addr)
> +			- PFN_DOWN(range.bus_addr);
> +		r++;
> +	}
> +	return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev:	device pointer; only needed for a corner case.

That's a huge corner :P

> + * @dma_pfn_offset:	offset to apply when converting from phys addr
> + *			to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.
> + */
> +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> pfn_offset)

As I say above, does this really belong to of/address.c?

> +{
> +	struct dma_pfn_offset_region *r;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (!pfn_offset)
> +		return 0;
> +
> +	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> +			 GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;

I think you're missing this:

	dev->dma_pfn_offset_map = r;

> +
> +	r->uniform_offset = true;
> +	r->pfn_offset = pfn_offset;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> +
>  /**
>   * of_dma_get_range - Get DMA range info
>   * @dev:	device pointer; only needed for a corner case.
> @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
>   *	CPU addr (phys_addr_t)	: pna cells
>   *	size			: nsize cells
>   *
> - * It returns -ENODEV if "dma-ranges" property was not found
> + * It returns -ENODEV if !dev or "dma-ranges" property was not found
>   * for this device in DT.
>   */
>  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> *dma_addr,
> @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> device_node *np, u64 *dma_addr,
>  	bool found_dma_ranges = false;
>  	struct of_range_parser parser;
>  	struct of_range range;
> +	phys_addr_t cpu_start = ~(phys_addr_t)0;
>  	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> +	bool dma_multi_pfn_offset = false;
> +	int num_ranges = 0;
> +
> +	if (!dev)
> +		return -ENODEV;

Shouldn't this be part of patch #7?

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 13:53     ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-04 13:53 UTC (permalink / raw)
  To: Jim Quinlan, linux-pci, Christoph Hellwig, bcm-kernel-feedback-list
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Dan Williams, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar, Alan Stern, David Kershner, Len Brown,
	Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 15878 bytes --]

Hi Jim,

On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> designates the single offset a special case.
> 
> of_dma_configure() is the typical manner to set pfn offsets but there
> are a number of ad hoc assignments to dev->dma_pfn_offset in the
> kernel code.  These cases now invoke the function
> attach_uniform_dma_pfn_offset(dev, pfn_offset).
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> ---
>  arch/arm/include/asm/dma-mapping.h            |  9 +-
>  arch/arm/mach-keystone/keystone.c             |  9 +-
>  arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
>  arch/sh/kernel/dma-coherent.c                 | 17 ++--
>  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
>  drivers/acpi/arm64/iort.c                     |  5 +-
>  drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
>  drivers/iommu/io-pgtable-arm.c                |  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
>  drivers/of/address.c                          | 93 +++++++++++++++++--
>  drivers/of/device.c                           |  8 +-
>  drivers/remoteproc/remoteproc_core.c          |  2 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
>  drivers/usb/core/message.c                    |  4 +-
>  drivers/usb/core/usb.c                        |  2 +-
>  include/linux/device.h                        |  4 +-
>  include/linux/dma-direct.h                    | 16 +++-
>  include/linux/dma-mapping.h                   | 45 +++++++++
>  kernel/dma/coherent.c                         | 11 ++-
>  20 files changed, 210 insertions(+), 51 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> mapping.h
> index bdd80ddbca34..f1e72f99468b 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> *get_arch_dma_ops(struct bus_type *bus)
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> -	if (dev)
> -		pfn -= dev->dma_pfn_offset;
> +	if (dev && dev->dma_pfn_offset_map)

Would it make sense to move the dev->dma_pfn_offset_map check into
dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
opposite variant of the function. I think it'd make the code a little simpler on
some of the use cases, and overall less error prone if anyone starts using the
function elsewhere.

> +		pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> +
>  	return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>  
> @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> dma_addr_t addr)
>  {
>  	unsigned long pfn = __bus_to_pfn(addr);
>  
> -	if (dev)
> -		pfn += dev->dma_pfn_offset;
> +	if (dev && dev->dma_pfn_offset_map)
> +		pfn += dma_pfn_offset_from_dma_addr(dev, addr);
>  
>  	return pfn;
>  }
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> keystone/keystone.c
> index 638808c4e122..e7d3ee6e9cb5 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/io.h>
>  #include <linux/of.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/init.h>
>  #include <linux/of_platform.h>
>  #include <linux/of_address.h>
> @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> *nb,
>  		return NOTIFY_BAD;
>  
>  	if (!dev->of_node) {
> -		dev->dma_pfn_offset = keystone_dma_pfn_offset;
> -		dev_err(dev, "set dma_pfn_offset%08lx\n",
> -			dev->dma_pfn_offset);
> +		int ret = attach_uniform_dma_pfn_offset
> +				(dev, keystone_dma_pfn_offset);
> +
> +		dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> +			dev->dma_pfn_offset, ret ? " failed" : "");
>  	}
>  	return NOTIFY_OK;
>  }
> diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> sh7786.c
> index e0b568aaa701..2e832a5c58c1 100644
> --- a/arch/sh/drivers/pci/pcie-sh7786.c
> +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> @@ -12,6 +12,7 @@
>  #include <linux/io.h>
>  #include <linux/async.h>
>  #include <linux/delay.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/slab.h>
>  #include <linux/clk.h>
>  #include <linux/sh_clk.h>
> @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> u8 slot, u8 pin)
>  
>  void pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> -	pdev->dev.dma_pfn_offset = dma_pfn_offset;
> +	attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
>  }
>  
>  static int __init sh7786_pcie_core_init(void)
> diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> index d4811691b93c..5fc9e358b6c7 100644
> --- a/arch/sh/kernel/dma-coherent.c
> +++ b/arch/sh/kernel/dma-coherent.c
> @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
>  {
>  	void *ret, *ret_nocache;
>  	int order = get_order(size);
> +	unsigned long pfn;
> +	phys_addr_t phys;
>  
>  	gfp |= __GFP_ZERO;
>  
> @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> dma_addr_t *dma_handle,
>  		return NULL;
>  	}
>  
> -	split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> +	phys = virt_to_phys(ret);
> +	pfn =  phys >> PAGE_SHIFT;

nit: not sure it really pays off to have a pfn variable here.

> +	split_page(pfn_to_page(pfn), order);
>  
> -	*dma_handle = virt_to_phys(ret);
> -	if (!WARN_ON(!dev))
> -		*dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> +	*dma_handle = (dma_addr_t)phys;
> +	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> +		*dma_handle -= PFN_PHYS(
> +			dma_pfn_offset_from_phys_addr(dev, phys));

>  
>  	return ret_nocache;
>  }
> @@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void
> *vaddr,
>  	unsigned long pfn = (dma_handle >> PAGE_SHIFT);
>  	int k;
>  
> -	if (!WARN_ON(!dev))
> -		pfn += dev->dma_pfn_offset;
> +	if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> +		pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
>  
>  	for (k = 0; k < (1 << order); k++)
>  		__free_pages(pfn_to_page(pfn + k), 0);
> diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> index c313d784efab..4cdeca9f69b6 100644
> --- a/arch/x86/pci/sta2x11-fixup.c
> +++ b/arch/x86/pci/sta2x11-fixup.c
> @@ -12,6 +12,7 @@
>  #include <linux/export.h>
>  #include <linux/list.h>
>  #include <linux/dma-direct.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/iommu.h>
>  
>  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
>  	struct device *dev = &pdev->dev;
>  	u32 amba_base, max_amba_addr;
> -	int i;
> +	int i, ret;
>  
>  	if (!instance)
>  		return;
> @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
>  	pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
>  	max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> +	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
> +	if (ret)
> +		dev_err(dev, "sta2x11: could not set PFN offset\n");
>  
>  	dev->bus_dma_limit = max_amba_addr;
>  	pci_set_consistent_dma_mask(pdev, max_amba_addr);
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 28a6b387e80e..153661ddc74b 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> u64 *dma_size)
>  	*dma_addr = dmaaddr;
>  	*dma_size = size;
>  
> -	dev->dma_pfn_offset = PFN_DOWN(offset);
> -	dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> +	ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
> +	dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
> +		offset, ret ? " failed!" : "");
>  }
>  
>  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c
> b/drivers/gpu/drm/sun4i/sun4i_backend.c
> index 072ea113e6be..3d41dfc7d178 100644
> --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/of_device.h>
>  #include <linux/of_graph.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/platform_device.h>
>  #include <linux/reset.h>
>  
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct
> device *master,
>  	const struct sun4i_backend_quirks *quirks;
>  	struct resource *res;
>  	void __iomem *regs;
> -	int i, ret;
> +	int i, ret = 0;
>  
>  	backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>  	if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct
> device *master,
>  		 * on our device since the RAM mapping is at 0 for the DMA bus,
>  		 * unlike the CPU.
>  		 */
> -		drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  	}
>  
>  	backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>  
> -	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> +	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>  		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU
> page tables\n");
>  		return NULL;
>  	}
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/interrupt.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>  			return ret;
>  	} else {
>  #ifdef PHYS_PFN_OFFSET
> -		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> +		ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> +		if (ret)
> +			return ret;
>  #endif
>  	}
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>  	sdev->dev = &pdev->dev;
>  	/* The DMA bus has the memory mapped at 0 */
> -	sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> +	ret = attach_uniform_dma_pfn_offset(sdev->dev,
> +					    PHYS_OFFSET >> PAGE_SHIFT);
> +	if (ret)
> +		return ret;
>  
>  	ret = sun6i_csi_resource_request(sdev, pdev);
>  	if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node
> *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +				     struct device_node *node, int num_ranges)

As with the previous review, please take this comment with a grain of salt.

I think there should be a clear split between what belongs to OF and what
belongs to the core device infrastructure.

OF's job should be to parse DT and provide a list/array of ranges, whereas the
core device infrastructure should provide an API to assign a list of
ranges/offset to a device.

As a concrete example, you're forcing devices like the sta2x11 to build with OF
support, which, being an Intel device, it's pretty odd. But I'm also thinking
of how will all this fit once an ACPI device wants to use it.

Expanding on this idea, once you have a split between the OF's and device core
roles, it transpires that of_dma_get_range()'s job should only be to provide
the ranges in a device understandable structure and of_dma_configre()'s to
actually assign the device's parameters. This would obsolete patch #7.

> +{
> +	struct of_range_parser parser;
> +	struct of_range range;
> +	struct dma_pfn_offset_region *r;
> +
> +	r = devm_kcalloc(dev, num_ranges + 1,
> +			 sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;
> +	dev->dma_pfn_offset_map = r;
> +	of_dma_range_parser_init(&parser, node);
> +
> +	/*
> +	 * Record all info for DMA ranges array.  We could
> +	 * just use the of_range struct, but if we did that it
> +	 * would require more calculations for phys_to_dma and
> +	 * dma_to_phys conversions.
> +	 */
> +	for_each_of_range(&parser, &range) {
> +		r->cpu_start = range.cpu_addr;
> +		r->cpu_end = r->cpu_start + range.size - 1;
> +		r->dma_start = range.bus_addr;
> +		r->dma_end = r->dma_start + range.size - 1;
> +		r->pfn_offset = PFN_DOWN(range.cpu_addr)
> +			- PFN_DOWN(range.bus_addr);
> +		r++;
> +	}
> +	return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev:	device pointer; only needed for a corner case.

That's a huge corner :P

> + * @dma_pfn_offset:	offset to apply when converting from phys addr
> + *			to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.
> + */
> +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> pfn_offset)

As I say above, does this really belong to of/address.c?

> +{
> +	struct dma_pfn_offset_region *r;
> +
> +	if (!dev)
> +		return -ENODEV;
> +
> +	if (!pfn_offset)
> +		return 0;
> +
> +	r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> +			 GFP_KERNEL);
> +	if (!r)
> +		return -ENOMEM;

I think you're missing this:

	dev->dma_pfn_offset_map = r;

> +
> +	r->uniform_offset = true;
> +	r->pfn_offset = pfn_offset;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> +
>  /**
>   * of_dma_get_range - Get DMA range info
>   * @dev:	device pointer; only needed for a corner case.
> @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
>   *	CPU addr (phys_addr_t)	: pna cells
>   *	size			: nsize cells
>   *
> - * It returns -ENODEV if "dma-ranges" property was not found
> + * It returns -ENODEV if !dev or "dma-ranges" property was not found
>   * for this device in DT.
>   */
>  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> *dma_addr,
> @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> device_node *np, u64 *dma_addr,
>  	bool found_dma_ranges = false;
>  	struct of_range_parser parser;
>  	struct of_range range;
> +	phys_addr_t cpu_start = ~(phys_addr_t)0;
>  	u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> +	bool dma_multi_pfn_offset = false;
> +	int num_ranges = 0;
> +
> +	if (!dev)
> +		return -ENODEV;

Shouldn't this be part of patch #7?

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 13:48       ` Jim Quinlan via iommu
  (?)
@ 2020-06-04 14:18         ` Dan Carpenter
  -1 siblings, 0 replies; 83+ messages in thread
From: Dan Carpenter @ 2020-06-04 14:18 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +                      GFP_KERNEL);
> >
> > Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> Will fix.
> 
> >
> >
> > > +     if (!r)
> > > +             return -ENOMEM;
> > > +
> > > +     r->uniform_offset = true;
> > > +     r->pfn_offset = pfn_offset;
> > > +
> > > +     return 0;
> > > +}
> >
> > This function doesn't seem to do anything useful.  Is part of it
> > missing?
> No, the uniform pfn offset is a special case.

Sorry, I wasn't clear.  We're talking about different things.  The code
does:

	r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
	if (!r)
		return -ENOMEM;

	r->uniform_offset = true;
	r->pfn_offset = pfn_offset;

	return 0;

The code allocates "r" and then doesn't save it anywhere so there is
no point.

regards,
dan carpenter

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 14:18         ` Dan Carpenter
  0 siblings, 0 replies; 83+ messages in thread
From: Dan Carpenter @ 2020-06-04 14:18 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	open list:STAGING SUBSYSTEM, Paul Kocialkowski, Yoshinori Sato,
	Will Deacon, maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT,
	Russell King, open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai,
	Ingo Molnar, maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE,
	Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +                      GFP_KERNEL);
> >
> > Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> Will fix.
> 
> >
> >
> > > +     if (!r)
> > > +             return -ENOMEM;
> > > +
> > > +     r->uniform_offset = true;
> > > +     r->pfn_offset = pfn_offset;
> > > +
> > > +     return 0;
> > > +}
> >
> > This function doesn't seem to do anything useful.  Is part of it
> > missing?
> No, the uniform pfn offset is a special case.

Sorry, I wasn't clear.  We're talking about different things.  The code
does:

	r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
	if (!r)
		return -ENOMEM;

	r->uniform_offset = true;
	r->pfn_offset = pfn_offset;

	return 0;

The code allocates "r" and then doesn't save it anywhere so there is
no point.

regards,
dan carpenter

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 14:18         ` Dan Carpenter
  0 siblings, 0 replies; 83+ messages in thread
From: Dan Carpenter @ 2020-06-04 14:18 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +                      GFP_KERNEL);
> >
> > Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> Will fix.
> 
> >
> >
> > > +     if (!r)
> > > +             return -ENOMEM;
> > > +
> > > +     r->uniform_offset = true;
> > > +     r->pfn_offset = pfn_offset;
> > > +
> > > +     return 0;
> > > +}
> >
> > This function doesn't seem to do anything useful.  Is part of it
> > missing?
> No, the uniform pfn offset is a special case.

Sorry, I wasn't clear.  We're talking about different things.  The code
does:

	r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
	if (!r)
		return -ENOMEM;

	r->uniform_offset = true;
	r->pfn_offset = pfn_offset;

	return 0;

The code allocates "r" and then doesn't save it anywhere so there is
no point.

regards,
dan carpenter

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 13:53     ` Nicolas Saenz Julienne
  (?)
@ 2020-06-04 14:35       ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 14:35 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Jim,
>
> On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> > subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> > designates the single offset a special case.
> >
> > of_dma_configure() is the typical manner to set pfn offsets but there
> > are a number of ad hoc assignments to dev->dma_pfn_offset in the
> > kernel code.  These cases now invoke the function
> > attach_uniform_dma_pfn_offset(dev, pfn_offset).
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> > ---
> >  arch/arm/include/asm/dma-mapping.h            |  9 +-
> >  arch/arm/mach-keystone/keystone.c             |  9 +-
> >  arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
> >  arch/sh/kernel/dma-coherent.c                 | 17 ++--
> >  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
> >  drivers/acpi/arm64/iort.c                     |  5 +-
> >  drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
> >  drivers/iommu/io-pgtable-arm.c                |  2 +-
> >  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
> >  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
> >  drivers/of/address.c                          | 93 +++++++++++++++++--
> >  drivers/of/device.c                           |  8 +-
> >  drivers/remoteproc/remoteproc_core.c          |  2 +-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
> >  drivers/usb/core/message.c                    |  4 +-
> >  drivers/usb/core/usb.c                        |  2 +-
> >  include/linux/device.h                        |  4 +-
> >  include/linux/dma-direct.h                    | 16 +++-
> >  include/linux/dma-mapping.h                   | 45 +++++++++
> >  kernel/dma/coherent.c                         | 11 ++-
> >  20 files changed, 210 insertions(+), 51 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> > mapping.h
> > index bdd80ddbca34..f1e72f99468b 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> > *get_arch_dma_ops(struct bus_type *bus)
> >  #ifndef __arch_pfn_to_dma
> >  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> >  {
> > -     if (dev)
> > -             pfn -= dev->dma_pfn_offset;
> > +     if (dev && dev->dma_pfn_offset_map)
>
> Would it make sense to move the dev->dma_pfn_offset_map check into
> dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
> opposite variant of the function. I think it'd make the code a little simpler on
> some of the use cases, and overall less error prone if anyone starts using the
> function elsewhere.

Yes it makes sense and I was debating doing it but I just wanted to
make it explicit that there was not much cost for this change for the
fastpath -- no dma_pfn_offset whatsoever -- as the cost goes from a
"pfn += dev->dma_pfn_offset"  to a "if (dev->dma_pfn_offset_map)".  I
will do what you suggest.
>
> > +             pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> > +
> >       return (dma_addr_t)__pfn_to_bus(pfn);
> >  }
> >
> > @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> > dma_addr_t addr)
> >  {
> >       unsigned long pfn = __bus_to_pfn(addr);
> >
> > -     if (dev)
> > -             pfn += dev->dma_pfn_offset;
> > +     if (dev && dev->dma_pfn_offset_map)
> > +             pfn += dma_pfn_offset_from_dma_addr(dev, addr);
> >
> >       return pfn;
> >  }
> > diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> > keystone/keystone.c
> > index 638808c4e122..e7d3ee6e9cb5 100644
> > --- a/arch/arm/mach-keystone/keystone.c
> > +++ b/arch/arm/mach-keystone/keystone.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include <linux/io.h>
> >  #include <linux/of.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/init.h>
> >  #include <linux/of_platform.h>
> >  #include <linux/of_address.h>
> > @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> > *nb,
> >               return NOTIFY_BAD;
> >
> >       if (!dev->of_node) {
> > -             dev->dma_pfn_offset = keystone_dma_pfn_offset;
> > -             dev_err(dev, "set dma_pfn_offset%08lx\n",
> > -                     dev->dma_pfn_offset);
> > +             int ret = attach_uniform_dma_pfn_offset
> > +                             (dev, keystone_dma_pfn_offset);
> > +
> > +             dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> > +                     dev->dma_pfn_offset, ret ? " failed" : "");
> >       }
> >       return NOTIFY_OK;
> >  }
> > diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> > sh7786.c
> > index e0b568aaa701..2e832a5c58c1 100644
> > --- a/arch/sh/drivers/pci/pcie-sh7786.c
> > +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/io.h>
> >  #include <linux/async.h>
> >  #include <linux/delay.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/slab.h>
> >  #include <linux/clk.h>
> >  #include <linux/sh_clk.h>
> > @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> > u8 slot, u8 pin)
> >
> >  void pcibios_bus_add_device(struct pci_dev *pdev)
> >  {
> > -     pdev->dev.dma_pfn_offset = dma_pfn_offset;
> > +     attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
> >  }
> >
> >  static int __init sh7786_pcie_core_init(void)
> > diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> > index d4811691b93c..5fc9e358b6c7 100644
> > --- a/arch/sh/kernel/dma-coherent.c
> > +++ b/arch/sh/kernel/dma-coherent.c
> > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > dma_addr_t *dma_handle,
> >  {
> >       void *ret, *ret_nocache;
> >       int order = get_order(size);
> > +     unsigned long pfn;
> > +     phys_addr_t phys;
> >
> >       gfp |= __GFP_ZERO;
> >
> > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > dma_addr_t *dma_handle,
> >               return NULL;
> >       }
> >
> > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > +     phys = virt_to_phys(ret);
> > +     pfn =  phys >> PAGE_SHIFT;
>
> nit: not sure it really pays off to have a pfn variable here.
Did it for readability; the compiler's optimization should take care
of any extra variables.  But I can switch if you insist.
>
> > +     split_page(pfn_to_page(pfn), order);
> >
> > -     *dma_handle = virt_to_phys(ret);
> > -     if (!WARN_ON(!dev))
> > -             *dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> > +     *dma_handle = (dma_addr_t)phys;
> > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> > +             *dma_handle -= PFN_PHYS(
> > +                     dma_pfn_offset_from_phys_addr(dev, phys));
>
> >
> >       return ret_nocache;
> >  }
> > @@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void
> > *vaddr,
> >       unsigned long pfn = (dma_handle >> PAGE_SHIFT);
> >       int k;
> >
> > -     if (!WARN_ON(!dev))
> > -             pfn += dev->dma_pfn_offset;
> > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> > +             pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
> >
> >       for (k = 0; k < (1 << order); k++)
> >               __free_pages(pfn_to_page(pfn + k), 0);
> > diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> > index c313d784efab..4cdeca9f69b6 100644
> > --- a/arch/x86/pci/sta2x11-fixup.c
> > +++ b/arch/x86/pci/sta2x11-fixup.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/export.h>
> >  #include <linux/list.h>
> >  #include <linux/dma-direct.h>
> > +#include <linux/dma-mapping.h>
> >  #include <asm/iommu.h>
> >
> >  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> > @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
> >       struct device *dev = &pdev->dev;
> >       u32 amba_base, max_amba_addr;
> > -     int i;
> > +     int i, ret;
> >
> >       if (!instance)
> >               return;
> > @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
> >       max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> > +     ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
> > +     if (ret)
> > +             dev_err(dev, "sta2x11: could not set PFN offset\n");
> >
> >       dev->bus_dma_limit = max_amba_addr;
> >       pci_set_consistent_dma_mask(pdev, max_amba_addr);
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 28a6b387e80e..153661ddc74b 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> > u64 *dma_size)
> >       *dma_addr = dmaaddr;
> >       *dma_size = size;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(offset);
> > -     dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> > +     ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
> > +     dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
> > +             offset, ret ? " failed!" : "");
> >  }
> >
> >  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> > diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > index 072ea113e6be..3d41dfc7d178 100644
> > --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/module.h>
> >  #include <linux/of_device.h>
> >  #include <linux/of_graph.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/platform_device.h>
> >  #include <linux/reset.h>
> >
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct
> > device *master,
> >       const struct sun4i_backend_quirks *quirks;
> >       struct resource *res;
> >       void __iomem *regs;
> > -     int i, ret;
> > +     int i, ret = 0;
> >
> >       backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >       if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct
> > device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU
> > page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node
> > *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +                                  struct device_node *node, int num_ranges)
>
> As with the previous review, please take this comment with a grain of salt.
>
> I think there should be a clear split between what belongs to OF and what
> belongs to the core device infrastructure.
>
> OF's job should be to parse DT and provide a list/array of ranges, whereas the
> core device infrastructure should provide an API to assign a list of
> ranges/offset to a device.
>
> As a concrete example, you're forcing devices like the sta2x11 to build with OF
> support, which, being an Intel device, it's pretty odd. But I'm also thinking
> of how will all this fit once an ACPI device wants to use it.
To fix this I only have to move attach_uniform_dma_pfn_offset() from
of/address.c to say include/linux/dma-mapping.h.  It has no
dependencies on OF.  Do you agree?

>
> Expanding on this idea, once you have a split between the OF's and device core
> roles, it transpires that of_dma_get_range()'s job should only be to provide
> the ranges in a device understandable structure and of_dma_configre()'s to
> actually assign the device's parameters. This would obsolete patch #7.

I think you mean patch #8.  I agree with you.  The reason I needed a
"struct device *"  in the call is because I wanted to make sure the
memory that is alloc'd belongs to the device that needs it.  If I do a
regular kzalloc(), this memory  will become a leak once someone starts
unbinding/binding their device.  Also, in  all uses of of_dma_rtange()
-- there is only one --  a dev is required as one can't attach an
offset map to NULL.

I do see that there are a number of functions in drivers/of/*.c that
take 'struct device *dev' as an argument so there is precedent for
something like this.  Regardless, I need an owner to the memory I
alloc().


>
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     r = devm_kcalloc(dev, num_ranges + 1,
> > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
> > +     dev->dma_pfn_offset_map = r;
> > +     of_dma_range_parser_init(&parser, node);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We could
> > +      * just use the of_range struct, but if we did that it
> > +      * would require more calculations for phys_to_dma and
> > +      * dma_to_phys conversions.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             r->cpu_start = range.cpu_addr;
> > +             r->cpu_end = r->cpu_start + range.size - 1;
> > +             r->dma_start = range.bus_addr;
> > +             r->dma_end = r->dma_start + range.size - 1;
> > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > +                     - PFN_DOWN(range.bus_addr);
> > +             r++;
> > +     }
> > +     return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev:     device pointer; only needed for a corner case.
>
> That's a huge corner :P
Good point; I'm not really sure what percent of Linux configurations
require a dma_pfn_offset.  I'll drop the "corner".

>
> > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > + *                   to dma addr and vice versa.
> > + *
> > + * It returns -ENOMEM if out of memory, otherwise 0.
> > + */
> > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > pfn_offset)
>
> As I say above, does this really belong to of/address.c?
No it does not.  Will fix.

>
> > +{
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     if (!pfn_offset)
> > +             return 0;
> > +
> > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > +                      GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
>
> I think you're missing this:
>
>         dev->dma_pfn_offset_map = r;
>
That's a showstopper!  DanC also pointed it out but I still didn't see
it.  Thanks!

> > +
> > +     r->uniform_offset = true;
> > +     r->pfn_offset = pfn_offset;
> > +
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > +
> >  /**
> >   * of_dma_get_range - Get DMA range info
> >   * @dev:     device pointer; only needed for a corner case.
> > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> >   *   CPU addr (phys_addr_t)  : pna cells
> >   *   size                    : nsize cells
> >   *
> > - * It returns -ENODEV if "dma-ranges" property was not found
> > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> >   * for this device in DT.
> >   */
> >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > *dma_addr,
> > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > device_node *np, u64 *dma_addr,
> >       bool found_dma_ranges = false;
> >       struct of_range_parser parser;
> >       struct of_range range;
> > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > +     bool dma_multi_pfn_offset = false;
> > +     int num_ranges = 0;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
>
> Shouldn't this be part of patch #7?
Do you mean #8?  Do you mean the test for !dev?   It is not required
for #8 so I thought I'd keep these two changes separate.  I could
squash them.
>
> Regards,
> Nicolas
>
Many thanks!

Jim Quinlan
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 14:35       ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-04 14:35 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Jim,
>
> On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> > subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> > designates the single offset a special case.
> >
> > of_dma_configure() is the typical manner to set pfn offsets but there
> > are a number of ad hoc assignments to dev->dma_pfn_offset in the
> > kernel code.  These cases now invoke the function
> > attach_uniform_dma_pfn_offset(dev, pfn_offset).
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> > ---
> >  arch/arm/include/asm/dma-mapping.h            |  9 +-
> >  arch/arm/mach-keystone/keystone.c             |  9 +-
> >  arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
> >  arch/sh/kernel/dma-coherent.c                 | 17 ++--
> >  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
> >  drivers/acpi/arm64/iort.c                     |  5 +-
> >  drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
> >  drivers/iommu/io-pgtable-arm.c                |  2 +-
> >  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
> >  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
> >  drivers/of/address.c                          | 93 +++++++++++++++++--
> >  drivers/of/device.c                           |  8 +-
> >  drivers/remoteproc/remoteproc_core.c          |  2 +-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
> >  drivers/usb/core/message.c                    |  4 +-
> >  drivers/usb/core/usb.c                        |  2 +-
> >  include/linux/device.h                        |  4 +-
> >  include/linux/dma-direct.h                    | 16 +++-
> >  include/linux/dma-mapping.h                   | 45 +++++++++
> >  kernel/dma/coherent.c                         | 11 ++-
> >  20 files changed, 210 insertions(+), 51 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> > mapping.h
> > index bdd80ddbca34..f1e72f99468b 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> > *get_arch_dma_ops(struct bus_type *bus)
> >  #ifndef __arch_pfn_to_dma
> >  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> >  {
> > -     if (dev)
> > -             pfn -= dev->dma_pfn_offset;
> > +     if (dev && dev->dma_pfn_offset_map)
>
> Would it make sense to move the dev->dma_pfn_offset_map check into
> dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
> opposite variant of the function. I think it'd make the code a little simpler on
> some of the use cases, and overall less error prone if anyone starts using the
> function elsewhere.

Yes it makes sense and I was debating doing it but I just wanted to
make it explicit that there was not much cost for this change for the
fastpath -- no dma_pfn_offset whatsoever -- as the cost goes from a
"pfn += dev->dma_pfn_offset"  to a "if (dev->dma_pfn_offset_map)".  I
will do what you suggest.
>
> > +             pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> > +
> >       return (dma_addr_t)__pfn_to_bus(pfn);
> >  }
> >
> > @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> > dma_addr_t addr)
> >  {
> >       unsigned long pfn = __bus_to_pfn(addr);
> >
> > -     if (dev)
> > -             pfn += dev->dma_pfn_offset;
> > +     if (dev && dev->dma_pfn_offset_map)
> > +             pfn += dma_pfn_offset_from_dma_addr(dev, addr);
> >
> >       return pfn;
> >  }
> > diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> > keystone/keystone.c
> > index 638808c4e122..e7d3ee6e9cb5 100644
> > --- a/arch/arm/mach-keystone/keystone.c
> > +++ b/arch/arm/mach-keystone/keystone.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include <linux/io.h>
> >  #include <linux/of.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/init.h>
> >  #include <linux/of_platform.h>
> >  #include <linux/of_address.h>
> > @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> > *nb,
> >               return NOTIFY_BAD;
> >
> >       if (!dev->of_node) {
> > -             dev->dma_pfn_offset = keystone_dma_pfn_offset;
> > -             dev_err(dev, "set dma_pfn_offset%08lx\n",
> > -                     dev->dma_pfn_offset);
> > +             int ret = attach_uniform_dma_pfn_offset
> > +                             (dev, keystone_dma_pfn_offset);
> > +
> > +             dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> > +                     dev->dma_pfn_offset, ret ? " failed" : "");
> >       }
> >       return NOTIFY_OK;
> >  }
> > diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> > sh7786.c
> > index e0b568aaa701..2e832a5c58c1 100644
> > --- a/arch/sh/drivers/pci/pcie-sh7786.c
> > +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/io.h>
> >  #include <linux/async.h>
> >  #include <linux/delay.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/slab.h>
> >  #include <linux/clk.h>
> >  #include <linux/sh_clk.h>
> > @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> > u8 slot, u8 pin)
> >
> >  void pcibios_bus_add_device(struct pci_dev *pdev)
> >  {
> > -     pdev->dev.dma_pfn_offset = dma_pfn_offset;
> > +     attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
> >  }
> >
> >  static int __init sh7786_pcie_core_init(void)
> > diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> > index d4811691b93c..5fc9e358b6c7 100644
> > --- a/arch/sh/kernel/dma-coherent.c
> > +++ b/arch/sh/kernel/dma-coherent.c
> > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > dma_addr_t *dma_handle,
> >  {
> >       void *ret, *ret_nocache;
> >       int order = get_order(size);
> > +     unsigned long pfn;
> > +     phys_addr_t phys;
> >
> >       gfp |= __GFP_ZERO;
> >
> > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > dma_addr_t *dma_handle,
> >               return NULL;
> >       }
> >
> > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > +     phys = virt_to_phys(ret);
> > +     pfn =  phys >> PAGE_SHIFT;
>
> nit: not sure it really pays off to have a pfn variable here.
Did it for readability; the compiler's optimization should take care
of any extra variables.  But I can switch if you insist.
>
> > +     split_page(pfn_to_page(pfn), order);
> >
> > -     *dma_handle = virt_to_phys(ret);
> > -     if (!WARN_ON(!dev))
> > -             *dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> > +     *dma_handle = (dma_addr_t)phys;
> > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> > +             *dma_handle -= PFN_PHYS(
> > +                     dma_pfn_offset_from_phys_addr(dev, phys));
>
> >
> >       return ret_nocache;
> >  }
> > @@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void
> > *vaddr,
> >       unsigned long pfn = (dma_handle >> PAGE_SHIFT);
> >       int k;
> >
> > -     if (!WARN_ON(!dev))
> > -             pfn += dev->dma_pfn_offset;
> > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> > +             pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
> >
> >       for (k = 0; k < (1 << order); k++)
> >               __free_pages(pfn_to_page(pfn + k), 0);
> > diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> > index c313d784efab..4cdeca9f69b6 100644
> > --- a/arch/x86/pci/sta2x11-fixup.c
> > +++ b/arch/x86/pci/sta2x11-fixup.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/export.h>
> >  #include <linux/list.h>
> >  #include <linux/dma-direct.h>
> > +#include <linux/dma-mapping.h>
> >  #include <asm/iommu.h>
> >
> >  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> > @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
> >       struct device *dev = &pdev->dev;
> >       u32 amba_base, max_amba_addr;
> > -     int i;
> > +     int i, ret;
> >
> >       if (!instance)
> >               return;
> > @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
> >       max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> > +     ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
> > +     if (ret)
> > +             dev_err(dev, "sta2x11: could not set PFN offset\n");
> >
> >       dev->bus_dma_limit = max_amba_addr;
> >       pci_set_consistent_dma_mask(pdev, max_amba_addr);
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 28a6b387e80e..153661ddc74b 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> > u64 *dma_size)
> >       *dma_addr = dmaaddr;
> >       *dma_size = size;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(offset);
> > -     dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> > +     ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
> > +     dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
> > +             offset, ret ? " failed!" : "");
> >  }
> >
> >  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> > diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > index 072ea113e6be..3d41dfc7d178 100644
> > --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/module.h>
> >  #include <linux/of_device.h>
> >  #include <linux/of_graph.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/platform_device.h>
> >  #include <linux/reset.h>
> >
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct
> > device *master,
> >       const struct sun4i_backend_quirks *quirks;
> >       struct resource *res;
> >       void __iomem *regs;
> > -     int i, ret;
> > +     int i, ret = 0;
> >
> >       backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >       if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct
> > device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU
> > page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node
> > *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +                                  struct device_node *node, int num_ranges)
>
> As with the previous review, please take this comment with a grain of salt.
>
> I think there should be a clear split between what belongs to OF and what
> belongs to the core device infrastructure.
>
> OF's job should be to parse DT and provide a list/array of ranges, whereas the
> core device infrastructure should provide an API to assign a list of
> ranges/offset to a device.
>
> As a concrete example, you're forcing devices like the sta2x11 to build with OF
> support, which, being an Intel device, it's pretty odd. But I'm also thinking
> of how will all this fit once an ACPI device wants to use it.
To fix this I only have to move attach_uniform_dma_pfn_offset() from
of/address.c to say include/linux/dma-mapping.h.  It has no
dependencies on OF.  Do you agree?

>
> Expanding on this idea, once you have a split between the OF's and device core
> roles, it transpires that of_dma_get_range()'s job should only be to provide
> the ranges in a device understandable structure and of_dma_configre()'s to
> actually assign the device's parameters. This would obsolete patch #7.

I think you mean patch #8.  I agree with you.  The reason I needed a
"struct device *"  in the call is because I wanted to make sure the
memory that is alloc'd belongs to the device that needs it.  If I do a
regular kzalloc(), this memory  will become a leak once someone starts
unbinding/binding their device.  Also, in  all uses of of_dma_rtange()
-- there is only one --  a dev is required as one can't attach an
offset map to NULL.

I do see that there are a number of functions in drivers/of/*.c that
take 'struct device *dev' as an argument so there is precedent for
something like this.  Regardless, I need an owner to the memory I
alloc().


>
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     r = devm_kcalloc(dev, num_ranges + 1,
> > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
> > +     dev->dma_pfn_offset_map = r;
> > +     of_dma_range_parser_init(&parser, node);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We could
> > +      * just use the of_range struct, but if we did that it
> > +      * would require more calculations for phys_to_dma and
> > +      * dma_to_phys conversions.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             r->cpu_start = range.cpu_addr;
> > +             r->cpu_end = r->cpu_start + range.size - 1;
> > +             r->dma_start = range.bus_addr;
> > +             r->dma_end = r->dma_start + range.size - 1;
> > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > +                     - PFN_DOWN(range.bus_addr);
> > +             r++;
> > +     }
> > +     return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev:     device pointer; only needed for a corner case.
>
> That's a huge corner :P
Good point; I'm not really sure what percent of Linux configurations
require a dma_pfn_offset.  I'll drop the "corner".

>
> > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > + *                   to dma addr and vice versa.
> > + *
> > + * It returns -ENOMEM if out of memory, otherwise 0.
> > + */
> > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > pfn_offset)
>
> As I say above, does this really belong to of/address.c?
No it does not.  Will fix.

>
> > +{
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     if (!pfn_offset)
> > +             return 0;
> > +
> > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > +                      GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
>
> I think you're missing this:
>
>         dev->dma_pfn_offset_map = r;
>
That's a showstopper!  DanC also pointed it out but I still didn't see
it.  Thanks!

> > +
> > +     r->uniform_offset = true;
> > +     r->pfn_offset = pfn_offset;
> > +
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > +
> >  /**
> >   * of_dma_get_range - Get DMA range info
> >   * @dev:     device pointer; only needed for a corner case.
> > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> >   *   CPU addr (phys_addr_t)  : pna cells
> >   *   size                    : nsize cells
> >   *
> > - * It returns -ENODEV if "dma-ranges" property was not found
> > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> >   * for this device in DT.
> >   */
> >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > *dma_addr,
> > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > device_node *np, u64 *dma_addr,
> >       bool found_dma_ranges = false;
> >       struct of_range_parser parser;
> >       struct of_range range;
> > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > +     bool dma_multi_pfn_offset = false;
> > +     int num_ranges = 0;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
>
> Shouldn't this be part of patch #7?
Do you mean #8?  Do you mean the test for !dev?   It is not required
for #8 so I thought I'd keep these two changes separate.  I could
squash them.
>
> Regards,
> Nicolas
>
Many thanks!

Jim Quinlan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 14:35       ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 14:35 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Jim,
>
> On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> > subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> > designates the single offset a special case.
> >
> > of_dma_configure() is the typical manner to set pfn offsets but there
> > are a number of ad hoc assignments to dev->dma_pfn_offset in the
> > kernel code.  These cases now invoke the function
> > attach_uniform_dma_pfn_offset(dev, pfn_offset).
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> > ---
> >  arch/arm/include/asm/dma-mapping.h            |  9 +-
> >  arch/arm/mach-keystone/keystone.c             |  9 +-
> >  arch/sh/drivers/pci/pcie-sh7786.c             |  3 +-
> >  arch/sh/kernel/dma-coherent.c                 | 17 ++--
> >  arch/x86/pci/sta2x11-fixup.c                  |  7 +-
> >  drivers/acpi/arm64/iort.c                     |  5 +-
> >  drivers/gpu/drm/sun4i/sun4i_backend.c         |  7 +-
> >  drivers/iommu/io-pgtable-arm.c                |  2 +-
> >  .../platform/sunxi/sun4i-csi/sun4i_csi.c      |  5 +-
> >  .../platform/sunxi/sun6i-csi/sun6i_csi.c      |  5 +-
> >  drivers/of/address.c                          | 93 +++++++++++++++++--
> >  drivers/of/device.c                           |  8 +-
> >  drivers/remoteproc/remoteproc_core.c          |  2 +-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c    |  7 +-
> >  drivers/usb/core/message.c                    |  4 +-
> >  drivers/usb/core/usb.c                        |  2 +-
> >  include/linux/device.h                        |  4 +-
> >  include/linux/dma-direct.h                    | 16 +++-
> >  include/linux/dma-mapping.h                   | 45 +++++++++
> >  kernel/dma/coherent.c                         | 11 ++-
> >  20 files changed, 210 insertions(+), 51 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> > mapping.h
> > index bdd80ddbca34..f1e72f99468b 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> > *get_arch_dma_ops(struct bus_type *bus)
> >  #ifndef __arch_pfn_to_dma
> >  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> >  {
> > -     if (dev)
> > -             pfn -= dev->dma_pfn_offset;
> > +     if (dev && dev->dma_pfn_offset_map)
>
> Would it make sense to move the dev->dma_pfn_offset_map check into
> dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
> opposite variant of the function. I think it'd make the code a little simpler on
> some of the use cases, and overall less error prone if anyone starts using the
> function elsewhere.

Yes it makes sense and I was debating doing it but I just wanted to
make it explicit that there was not much cost for this change for the
fastpath -- no dma_pfn_offset whatsoever -- as the cost goes from a
"pfn += dev->dma_pfn_offset"  to a "if (dev->dma_pfn_offset_map)".  I
will do what you suggest.
>
> > +             pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> > +
> >       return (dma_addr_t)__pfn_to_bus(pfn);
> >  }
> >
> > @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> > dma_addr_t addr)
> >  {
> >       unsigned long pfn = __bus_to_pfn(addr);
> >
> > -     if (dev)
> > -             pfn += dev->dma_pfn_offset;
> > +     if (dev && dev->dma_pfn_offset_map)
> > +             pfn += dma_pfn_offset_from_dma_addr(dev, addr);
> >
> >       return pfn;
> >  }
> > diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> > keystone/keystone.c
> > index 638808c4e122..e7d3ee6e9cb5 100644
> > --- a/arch/arm/mach-keystone/keystone.c
> > +++ b/arch/arm/mach-keystone/keystone.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include <linux/io.h>
> >  #include <linux/of.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/init.h>
> >  #include <linux/of_platform.h>
> >  #include <linux/of_address.h>
> > @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> > *nb,
> >               return NOTIFY_BAD;
> >
> >       if (!dev->of_node) {
> > -             dev->dma_pfn_offset = keystone_dma_pfn_offset;
> > -             dev_err(dev, "set dma_pfn_offset%08lx\n",
> > -                     dev->dma_pfn_offset);
> > +             int ret = attach_uniform_dma_pfn_offset
> > +                             (dev, keystone_dma_pfn_offset);
> > +
> > +             dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> > +                     dev->dma_pfn_offset, ret ? " failed" : "");
> >       }
> >       return NOTIFY_OK;
> >  }
> > diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> > sh7786.c
> > index e0b568aaa701..2e832a5c58c1 100644
> > --- a/arch/sh/drivers/pci/pcie-sh7786.c
> > +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/io.h>
> >  #include <linux/async.h>
> >  #include <linux/delay.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/slab.h>
> >  #include <linux/clk.h>
> >  #include <linux/sh_clk.h>
> > @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> > u8 slot, u8 pin)
> >
> >  void pcibios_bus_add_device(struct pci_dev *pdev)
> >  {
> > -     pdev->dev.dma_pfn_offset = dma_pfn_offset;
> > +     attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
> >  }
> >
> >  static int __init sh7786_pcie_core_init(void)
> > diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> > index d4811691b93c..5fc9e358b6c7 100644
> > --- a/arch/sh/kernel/dma-coherent.c
> > +++ b/arch/sh/kernel/dma-coherent.c
> > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > dma_addr_t *dma_handle,
> >  {
> >       void *ret, *ret_nocache;
> >       int order = get_order(size);
> > +     unsigned long pfn;
> > +     phys_addr_t phys;
> >
> >       gfp |= __GFP_ZERO;
> >
> > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > dma_addr_t *dma_handle,
> >               return NULL;
> >       }
> >
> > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > +     phys = virt_to_phys(ret);
> > +     pfn =  phys >> PAGE_SHIFT;
>
> nit: not sure it really pays off to have a pfn variable here.
Did it for readability; the compiler's optimization should take care
of any extra variables.  But I can switch if you insist.
>
> > +     split_page(pfn_to_page(pfn), order);
> >
> > -     *dma_handle = virt_to_phys(ret);
> > -     if (!WARN_ON(!dev))
> > -             *dma_handle -= PFN_PHYS(dev->dma_pfn_offset);
> > +     *dma_handle = (dma_addr_t)phys;
> > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> > +             *dma_handle -= PFN_PHYS(
> > +                     dma_pfn_offset_from_phys_addr(dev, phys));
>
> >
> >       return ret_nocache;
> >  }
> > @@ -50,8 +55,8 @@ void arch_dma_free(struct device *dev, size_t size, void
> > *vaddr,
> >       unsigned long pfn = (dma_handle >> PAGE_SHIFT);
> >       int k;
> >
> > -     if (!WARN_ON(!dev))
> > -             pfn += dev->dma_pfn_offset;
> > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
> > +             pfn += dma_pfn_offset_from_dma_addr(dev, dma_handle);
> >
> >       for (k = 0; k < (1 << order); k++)
> >               __free_pages(pfn_to_page(pfn + k), 0);
> > diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
> > index c313d784efab..4cdeca9f69b6 100644
> > --- a/arch/x86/pci/sta2x11-fixup.c
> > +++ b/arch/x86/pci/sta2x11-fixup.c
> > @@ -12,6 +12,7 @@
> >  #include <linux/export.h>
> >  #include <linux/list.h>
> >  #include <linux/dma-direct.h>
> > +#include <linux/dma-mapping.h>
> >  #include <asm/iommu.h>
> >
> >  #define STA2X11_SWIOTLB_SIZE (4*1024*1024)
> > @@ -133,7 +134,7 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       struct sta2x11_instance *instance = sta2x11_pdev_to_instance(pdev);
> >       struct device *dev = &pdev->dev;
> >       u32 amba_base, max_amba_addr;
> > -     int i;
> > +     int i, ret;
> >
> >       if (!instance)
> >               return;
> > @@ -141,7 +142,9 @@ static void sta2x11_map_ep(struct pci_dev *pdev)
> >       pci_read_config_dword(pdev, AHB_BASE(0), &amba_base);
> >       max_amba_addr = amba_base + STA2X11_AMBA_SIZE - 1;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(-amba_base);
> > +     ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(-amba_base));
> > +     if (ret)
> > +             dev_err(dev, "sta2x11: could not set PFN offset\n");
> >
> >       dev->bus_dma_limit = max_amba_addr;
> >       pci_set_consistent_dma_mask(pdev, max_amba_addr);
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 28a6b387e80e..153661ddc74b 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -1142,8 +1142,9 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> > u64 *dma_size)
> >       *dma_addr = dmaaddr;
> >       *dma_size = size;
> >
> > -     dev->dma_pfn_offset = PFN_DOWN(offset);
> > -     dev_dbg(dev, "dma_pfn_offset(%#08llx)\n", offset);
> > +     ret = attach_uniform_dma_pfn_offset(dev, PFN_DOWN(offset));
> > +     dev_dbg(dev, "dma_pfn_offset(%#08llx)%s\n",
> > +             offset, ret ? " failed!" : "");
> >  }
> >
> >  static void __init acpi_iort_register_irq(int hwirq, const char *name,
> > diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > index 072ea113e6be..3d41dfc7d178 100644
> > --- a/drivers/gpu/drm/sun4i/sun4i_backend.c
> > +++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/module.h>
> >  #include <linux/of_device.h>
> >  #include <linux/of_graph.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/platform_device.h>
> >  #include <linux/reset.h>
> >
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct
> > device *master,
> >       const struct sun4i_backend_quirks *quirks;
> >       struct resource *res;
> >       void __iomem *regs;
> > -     int i, ret;
> > +     int i, ret = 0;
> >
> >       backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >       if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct
> > device *master,
> >                * on our device since the RAM mapping is at 0 for the DMA bus,
> >                * unlike the CPU.
> >                */
> > -             drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >       }
> >
> >       backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >       if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >               return NULL;
> >
> > -     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > +     if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >               dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU
> > page tables\n");
> >               return NULL;
> >       }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include <linux/clk.h>
> > +#include <linux/dma-mapping.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/module.h>
> >  #include <linux/mutex.h>
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >                       return ret;
> >       } else {
> >  #ifdef PHYS_PFN_OFFSET
> > -             csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > +             ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > +             if (ret)
> > +                     return ret;
> >  #endif
> >       }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
> >
> >       sdev->dev = &pdev->dev;
> >       /* The DMA bus has the memory mapped at 0 */
> > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > +     if (ret)
> > +             return ret;
> >
> >       ret = sun6i_csi_resource_request(sdev, pdev);
> >       if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node
> > *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +                                  struct device_node *node, int num_ranges)
>
> As with the previous review, please take this comment with a grain of salt.
>
> I think there should be a clear split between what belongs to OF and what
> belongs to the core device infrastructure.
>
> OF's job should be to parse DT and provide a list/array of ranges, whereas the
> core device infrastructure should provide an API to assign a list of
> ranges/offset to a device.
>
> As a concrete example, you're forcing devices like the sta2x11 to build with OF
> support, which, being an Intel device, it's pretty odd. But I'm also thinking
> of how will all this fit once an ACPI device wants to use it.
To fix this I only have to move attach_uniform_dma_pfn_offset() from
of/address.c to say include/linux/dma-mapping.h.  It has no
dependencies on OF.  Do you agree?

>
> Expanding on this idea, once you have a split between the OF's and device core
> roles, it transpires that of_dma_get_range()'s job should only be to provide
> the ranges in a device understandable structure and of_dma_configre()'s to
> actually assign the device's parameters. This would obsolete patch #7.

I think you mean patch #8.  I agree with you.  The reason I needed a
"struct device *"  in the call is because I wanted to make sure the
memory that is alloc'd belongs to the device that needs it.  If I do a
regular kzalloc(), this memory  will become a leak once someone starts
unbinding/binding their device.  Also, in  all uses of of_dma_rtange()
-- there is only one --  a dev is required as one can't attach an
offset map to NULL.

I do see that there are a number of functions in drivers/of/*.c that
take 'struct device *dev' as an argument so there is precedent for
something like this.  Regardless, I need an owner to the memory I
alloc().


>
> > +{
> > +     struct of_range_parser parser;
> > +     struct of_range range;
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     r = devm_kcalloc(dev, num_ranges + 1,
> > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
> > +     dev->dma_pfn_offset_map = r;
> > +     of_dma_range_parser_init(&parser, node);
> > +
> > +     /*
> > +      * Record all info for DMA ranges array.  We could
> > +      * just use the of_range struct, but if we did that it
> > +      * would require more calculations for phys_to_dma and
> > +      * dma_to_phys conversions.
> > +      */
> > +     for_each_of_range(&parser, &range) {
> > +             r->cpu_start = range.cpu_addr;
> > +             r->cpu_end = r->cpu_start + range.size - 1;
> > +             r->dma_start = range.bus_addr;
> > +             r->dma_end = r->dma_start + range.size - 1;
> > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > +                     - PFN_DOWN(range.bus_addr);
> > +             r++;
> > +     }
> > +     return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev:     device pointer; only needed for a corner case.
>
> That's a huge corner :P
Good point; I'm not really sure what percent of Linux configurations
require a dma_pfn_offset.  I'll drop the "corner".

>
> > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > + *                   to dma addr and vice versa.
> > + *
> > + * It returns -ENOMEM if out of memory, otherwise 0.
> > + */
> > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > pfn_offset)
>
> As I say above, does this really belong to of/address.c?
No it does not.  Will fix.

>
> > +{
> > +     struct dma_pfn_offset_region *r;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
> > +
> > +     if (!pfn_offset)
> > +             return 0;
> > +
> > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > +                      GFP_KERNEL);
> > +     if (!r)
> > +             return -ENOMEM;
>
> I think you're missing this:
>
>         dev->dma_pfn_offset_map = r;
>
That's a showstopper!  DanC also pointed it out but I still didn't see
it.  Thanks!

> > +
> > +     r->uniform_offset = true;
> > +     r->pfn_offset = pfn_offset;
> > +
> > +     return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > +
> >  /**
> >   * of_dma_get_range - Get DMA range info
> >   * @dev:     device pointer; only needed for a corner case.
> > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> >   *   CPU addr (phys_addr_t)  : pna cells
> >   *   size                    : nsize cells
> >   *
> > - * It returns -ENODEV if "dma-ranges" property was not found
> > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> >   * for this device in DT.
> >   */
> >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > *dma_addr,
> > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > device_node *np, u64 *dma_addr,
> >       bool found_dma_ranges = false;
> >       struct of_range_parser parser;
> >       struct of_range range;
> > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > +     bool dma_multi_pfn_offset = false;
> > +     int num_ranges = 0;
> > +
> > +     if (!dev)
> > +             return -ENODEV;
>
> Shouldn't this be part of patch #7?
Do you mean #8?  Do you mean the test for !dev?   It is not required
for #8 so I thought I'd keep these two changes separate.  I could
squash them.
>
> Regards,
> Nicolas
>
Many thanks!

Jim Quinlan
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 14:18         ` Dan Carpenter
  (?)
@ 2020-06-04 14:43           ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 14:43 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 4, 2020 at 10:20 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +                      GFP_KERNEL);
> > >
> > > Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> > Will fix.
> >
> > >
> > >
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     r->uniform_offset = true;
> > > > +     r->pfn_offset = pfn_offset;
> > > > +
> > > > +     return 0;
> > > > +}
> > >
> > > This function doesn't seem to do anything useful.  Is part of it
> > > missing?
> > No, the uniform pfn offset is a special case.
>
> Sorry, I wasn't clear.  We're talking about different things.  The code
> does:
>
>         r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
>         if (!r)
>                 return -ENOMEM;
>
>         r->uniform_offset = true;
>         r->pfn_offset = pfn_offset;
>
>         return 0;
>
> The code allocates "r" and then doesn't save it anywhere so there is
> no point.
You are absolutely right, sorry I missed your point.  Will fix.

Thanks,
Jim Quinlan

>
> regards,
> dan carpenter
>
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 14:43           ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-04 14:43 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	open list:STAGING SUBSYSTEM, Paul Kocialkowski, Yoshinori Sato,
	Will Deacon, maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT,
	Russell King, open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai,
	Ingo Molnar, maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE,
	Alan Stern, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Maxime Ripard,
	Rob Herring, Borislav Petkov, Yong Deng, Santosh Shilimkar,
	Bjorn Helgaas, Dan Williams, Andy Shevchenko,
	moderated list:ARM PORT, Saravana Kannan, Greg Kroah-Hartman,
	Oliver Neukum, Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 4, 2020 at 10:20 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +                      GFP_KERNEL);
> > >
> > > Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> > Will fix.
> >
> > >
> > >
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     r->uniform_offset = true;
> > > > +     r->pfn_offset = pfn_offset;
> > > > +
> > > > +     return 0;
> > > > +}
> > >
> > > This function doesn't seem to do anything useful.  Is part of it
> > > missing?
> > No, the uniform pfn offset is a special case.
>
> Sorry, I wasn't clear.  We're talking about different things.  The code
> does:
>
>         r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
>         if (!r)
>                 return -ENOMEM;
>
>         r->uniform_offset = true;
>         r->pfn_offset = pfn_offset;
>
>         return 0;
>
> The code allocates "r" and then doesn't save it anywhere so there is
> no point.
You are absolutely right, sorry I missed your point.  Will fix.

Thanks,
Jim Quinlan

>
> regards,
> dan carpenter
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 14:43           ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 14:43 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR REMOTEPROC SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Frank Rowand, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Paul Kocialkowski,
	Lorenzo Pieralisi, Yoshinori Sato, Will Deacon, Joerg Roedel,
	maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT, Russell King,
	open list:ACPI FOR ARM64 ACPI/arm64, Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Mauro Carvalho Chehab, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Dan Williams, Andy Shevchenko, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Wolfram Sang,
	open list:IOMMU DRIVERS, Mark Brown, Thomas Gleixner,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 4, 2020 at 10:20 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +                      GFP_KERNEL);
> > >
> > > Use:    r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> > Will fix.
> >
> > >
> > >
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     r->uniform_offset = true;
> > > > +     r->pfn_offset = pfn_offset;
> > > > +
> > > > +     return 0;
> > > > +}
> > >
> > > This function doesn't seem to do anything useful.  Is part of it
> > > missing?
> > No, the uniform pfn offset is a special case.
>
> Sorry, I wasn't clear.  We're talking about different things.  The code
> does:
>
>         r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
>         if (!r)
>                 return -ENOMEM;
>
>         r->uniform_offset = true;
>         r->pfn_offset = pfn_offset;
>
>         return 0;
>
> The code allocates "r" and then doesn't save it anywhere so there is
> no point.
You are absolutely right, sorry I missed your point.  Will fix.

Thanks,
Jim Quinlan

>
> regards,
> dan carpenter
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 14:35       ` Jim Quinlan via iommu
  (?)
@ 2020-06-04 15:05         ` Andy Shevchenko
  -1 siblings, 0 replies; 83+ messages in thread
From: Andy Shevchenko @ 2020-06-04 15:05 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Will Deacon, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Wolfram Sang,
	Lorenzo Pieralisi, Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Thomas Gleixner, Mauro Carvalho Chehab, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Paul Kocialkowski,
	open list:IOMMU DRIVERS, Mark Brown, Stefano Stabellini,
	Daniel Vetter, Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER,
	Robin Murphy, open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:

...

> > > +     phys = virt_to_phys(ret);
> > > +     pfn =  phys >> PAGE_SHIFT;
> >
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

One side note: please, try to get familiar with existing helpers in the kernel.
For example, above line is like

	pfn = PFN_DOWN(phys);

...

> > > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)

> > > +             *dma_handle -= PFN_PHYS(
> > > +                     dma_pfn_offset_from_phys_addr(dev, phys));

Don't do such indentation, esp. we have now 100! :-)

-- 
With Best Regards,
Andy Shevchenko


_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 15:05         ` Andy Shevchenko
  0 siblings, 0 replies; 83+ messages in thread
From: Andy Shevchenko @ 2020-06-04 15:05 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Will Deacon, Christoph Hellwig,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Yoshinori Sato,
	Frank Rowand, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Thomas Gleixner, Mauro Carvalho Chehab, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Paul Kocialkowski,
	open list:IOMMU DRIVERS, Mark Brown, Stefano Stabellini,
	Daniel Vetter, Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER,
	Robin Murphy, open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:

...

> > > +     phys = virt_to_phys(ret);
> > > +     pfn =  phys >> PAGE_SHIFT;
> >
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

One side note: please, try to get familiar with existing helpers in the kernel.
For example, above line is like

	pfn = PFN_DOWN(phys);

...

> > > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)

> > > +             *dma_handle -= PFN_PHYS(
> > > +                     dma_pfn_offset_from_phys_addr(dev, phys));

Don't do such indentation, esp. we have now 100! :-)

-- 
With Best Regards,
Andy Shevchenko


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 15:05         ` Andy Shevchenko
  0 siblings, 0 replies; 83+ messages in thread
From: Andy Shevchenko @ 2020-06-04 15:05 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Will Deacon, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Wolfram Sang,
	Lorenzo Pieralisi, Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:

...

> > > +     phys = virt_to_phys(ret);
> > > +     pfn =  phys >> PAGE_SHIFT;
> >
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

One side note: please, try to get familiar with existing helpers in the kernel.
For example, above line is like

	pfn = PFN_DOWN(phys);

...

> > > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)

> > > +             *dma_handle -= PFN_PHYS(
> > > +                     dma_pfn_offset_from_phys_addr(dev, phys));

Don't do such indentation, esp. we have now 100! :-)

-- 
With Best Regards,
Andy Shevchenko


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 15:05         ` Andy Shevchenko
  (?)
@ 2020-06-04 16:28           ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 16:28 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Will Deacon, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Wolfram Sang,
	Lorenzo Pieralisi, Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Thomas Gleixner, Mauro Carvalho Chehab, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Paul Kocialkowski,
	open list:IOMMU DRIVERS, Mark Brown, Stefano Stabellini,
	Daniel Vetter, Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER,
	Robin Murphy, open list:USB SUBSYSTEM, Nicolas Saenz Julienne

Hi Andy,

On Thu, Jun 4, 2020 at 11:05 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> > On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> > <nsaenzjulienne@suse.de> wrote:
> > > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
>
> ...
>
> > > > +     phys = virt_to_phys(ret);
> > > > +     pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> One side note: please, try to get familiar with existing helpers in the kernel.
> For example, above line is like
>
>         pfn = PFN_DOWN(phys);
I just used the term in the original code; will change to PFN_DOWN().

>
> ...
>
> > > > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
>
> > > > +             *dma_handle -= PFN_PHYS(
> > > > +                     dma_pfn_offset_from_phys_addr(dev, phys));
>
> Don't do such indentation, esp. we have now 100! :-)

Got it.  Thanks,
Jim Quinlan
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 16:28           ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-04 16:28 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Will Deacon, Christoph Hellwig,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Yoshinori Sato,
	Frank Rowand, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, Yong Deng, Santosh Shilimkar, Bjorn Helgaas,
	Thomas Gleixner, Mauro Carvalho Chehab, moderated list:ARM PORT,
	Saravana Kannan, Greg Kroah-Hartman, Oliver Neukum,
	Rafael J. Wysocki, open list, Paul Kocialkowski,
	open list:IOMMU DRIVERS, Mark Brown, Stefano Stabellini,
	Daniel Vetter, Sudeep Holla, open list:ALLWINNER A10 CSI DRIVER,
	Robin Murphy, open list:USB SUBSYSTEM, Nicolas Saenz Julienne

Hi Andy,

On Thu, Jun 4, 2020 at 11:05 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> > On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> > <nsaenzjulienne@suse.de> wrote:
> > > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
>
> ...
>
> > > > +     phys = virt_to_phys(ret);
> > > > +     pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> One side note: please, try to get familiar with existing helpers in the kernel.
> For example, above line is like
>
>         pfn = PFN_DOWN(phys);
I just used the term in the original code; will change to PFN_DOWN().

>
> ...
>
> > > > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
>
> > > > +             *dma_handle -= PFN_PHYS(
> > > > +                     dma_pfn_offset_from_phys_addr(dev, phys));
>
> Don't do such indentation, esp. we have now 100! :-)

Got it.  Thanks,
Jim Quinlan
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 16:28           ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 16:28 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	open list:DRM DRIVERS FOR ALLWINNER A10, Bjorn Andersson,
	Julien Grall, H. Peter Anvin, Will Deacon, Christoph Hellwig,
	Marek Szyprowski, open list:STAGING SUBSYSTEM, Wolfram Sang,
	Lorenzo Pieralisi, Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM, Nicolas Saenz Julienne

Hi Andy,

On Thu, Jun 4, 2020 at 11:05 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> > On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> > <nsaenzjulienne@suse.de> wrote:
> > > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
>
> ...
>
> > > > +     phys = virt_to_phys(ret);
> > > > +     pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> One side note: please, try to get familiar with existing helpers in the kernel.
> For example, above line is like
>
>         pfn = PFN_DOWN(phys);
I just used the term in the original code; will change to PFN_DOWN().

>
> ...
>
> > > > +     if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
>
> > > > +             *dma_handle -= PFN_PHYS(
> > > > +                     dma_pfn_offset_from_phys_addr(dev, phys));
>
> Don't do such indentation, esp. we have now 100! :-)

Got it.  Thanks,
Jim Quinlan
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 14:35       ` Jim Quinlan via iommu
  (?)
@ 2020-06-04 16:52         ` Nicolas Saenz Julienne
  -1 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-04 16:52 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 9715 bytes --]

Hi Jim,

On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:

[...]

> > > --- a/arch/sh/kernel/dma-coherent.c
> > > +++ b/arch/sh/kernel/dma-coherent.c
> > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >  {
> > >       void *ret, *ret_nocache;
> > >       int order = get_order(size);
> > > +     unsigned long pfn;
> > > +     phys_addr_t phys;
> > > 
> > >       gfp |= __GFP_ZERO;
> > > 
> > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >               return NULL;
> > >       }
> > > 
> > > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > +     phys = virt_to_phys(ret);
> > > +     pfn =  phys >> PAGE_SHIFT;
> > 
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

No need.

[...]

> > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > index 055eb0b8e396..2d66d415b6c3 100644
> > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > *pdev)
> > > 
> > >       sdev->dev = &pdev->dev;
> > >       /* The DMA bus has the memory mapped at 0 */
> > > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > > +     if (ret)
> > > +             return ret;
> > > 
> > >       ret = sun6i_csi_resource_request(sdev, pdev);
> > >       if (ret)
> > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > --- a/drivers/of/address.c
> > > +++ b/drivers/of/address.c
> > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > device_node
> > > *np, int index,
> > >  }
> > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > 
> > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > +                                  struct device_node *node, int
> > > num_ranges)
> > 
> > As with the previous review, please take this comment with a grain of salt.
> > 
> > I think there should be a clear split between what belongs to OF and what
> > belongs to the core device infrastructure.
> > 
> > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > the
> > core device infrastructure should provide an API to assign a list of
> > ranges/offset to a device.
> > 
> > As a concrete example, you're forcing devices like the sta2x11 to build with
> > OF
> > support, which, being an Intel device, it's pretty odd. But I'm also
> > thinking
> > of how will all this fit once an ACPI device wants to use it.
> To fix this I only have to move attach_uniform_dma_pfn_offset() from
> of/address.c to say include/linux/dma-mapping.h.  It has no
> dependencies on OF.  Do you agree?

Yes that seems nicer. In case you didn't had it in mind already, I'd change the
function name to match the naming scheme they use there.

On the other hand, I'd also move the non OF parts of the non unifom dma_offset
version of the function there.

> > Expanding on this idea, once you have a split between the OF's and device
> > core
> > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > the ranges in a device understandable structure and of_dma_configre()'s to
> > actually assign the device's parameters. This would obsolete patch #7.
> 
> I think you mean patch #8.

Yes, my bad.

> I agree with you.  The reason I needed a "struct device *"  in the call is
> because I wanted to make sure the memory that is alloc'd belongs to the
> device that needs it.  If I do a regular kzalloc(), this memory  will become
> a leak once someone starts unbinding/binding their device.  Also, in  all
> uses of of_dma_rtange() -- there is only one --  a dev is required as one
> can't attach an offset map to NULL.
> 
> I do see that there are a number of functions in drivers/of/*.c that
> take 'struct device *dev' as an argument so there is precedent for
> something like this.  Regardless, I need an owner to the memory I
> alloc().

I understand the need for dev to be around, devm_*() is key. But also it's
important to keep the functions on purpose. And if of_dma_get_range() starts
setting ranges it calls, for the very least, for a function rename. Although
I'd rather split the parsing and setting of ranges as mentioned earlier. That
said, I get that's a more drastic move.

Talking about drastic moves. How about getting rid of the concept of
dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
(even when there is only one). I feel it's conceptually nicer, as you'd be
dealing only in one currency, so to speak, and you'd centralize the bus DMA
ranges setter function which is always easier to maintain.

I'd go as far as not creating a special case for uniform offsets. Let just set
cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
heavy, but I don't think it's worth the optimization.

Just my two cents :)

> > > +{
> > > +     struct of_range_parser parser;
> > > +     struct of_range range;
> > > +     struct dma_pfn_offset_region *r;
> > > +
> > > +     r = devm_kcalloc(dev, num_ranges + 1,
> > > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > > +     if (!r)
> > > +             return -ENOMEM;
> > > +     dev->dma_pfn_offset_map = r;
> > > +     of_dma_range_parser_init(&parser, node);
> > > +
> > > +     /*
> > > +      * Record all info for DMA ranges array.  We could
> > > +      * just use the of_range struct, but if we did that it
> > > +      * would require more calculations for phys_to_dma and
> > > +      * dma_to_phys conversions.
> > > +      */
> > > +     for_each_of_range(&parser, &range) {
> > > +             r->cpu_start = range.cpu_addr;
> > > +             r->cpu_end = r->cpu_start + range.size - 1;
> > > +             r->dma_start = range.bus_addr;
> > > +             r->dma_end = r->dma_start + range.size - 1;
> > > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > > +                     - PFN_DOWN(range.bus_addr);
> > > +             r++;
> > > +     }
> > > +     return 0;
> > > +}
> > > +
> > > +
> > > +
> > > +/**
> > > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > > + * @dev:     device pointer; only needed for a corner case.
> > 
> > That's a huge corner :P
> Good point; I'm not really sure what percent of Linux configurations
> require a dma_pfn_offset.  I'll drop the "corner".
> 
> > > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > > + *                   to dma addr and vice versa.
> > > + *
> > > + * It returns -ENOMEM if out of memory, otherwise 0.
> > > + */
> > > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > > pfn_offset)
> > 
> > As I say above, does this really belong to of/address.c?
> No it does not.  Will fix.
> 
> > > +{
> > > +     struct dma_pfn_offset_region *r;
> > > +
> > > +     if (!dev)
> > > +             return -ENODEV;
> > > +
> > > +     if (!pfn_offset)
> > > +             return 0;
> > > +
> > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +                      GFP_KERNEL);
> > > +     if (!r)
> > > +             return -ENOMEM;
> > 
> > I think you're missing this:
> > 
> >         dev->dma_pfn_offset_map = r;
> > 
> That's a showstopper!  DanC also pointed it out but I still didn't see
> it.  Thanks!
> 
> > > +
> > > +     r->uniform_offset = true;
> > > +     r->pfn_offset = pfn_offset;
> > > +
> > > +     return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > > +
> > >  /**
> > >   * of_dma_get_range - Get DMA range info
> > >   * @dev:     device pointer; only needed for a corner case.
> > > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> > >   *   CPU addr (phys_addr_t)  : pna cells
> > >   *   size                    : nsize cells
> > >   *
> > > - * It returns -ENODEV if "dma-ranges" property was not found
> > > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> > >   * for this device in DT.
> > >   */
> > >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > > *dma_addr,
> > > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > > device_node *np, u64 *dma_addr,
> > >       bool found_dma_ranges = false;
> > >       struct of_range_parser parser;
> > >       struct of_range range;
> > > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> > >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > > +     bool dma_multi_pfn_offset = false;
> > > +     int num_ranges = 0;
> > > +
> > > +     if (!dev)
> > > +             return -ENODEV;
> > 
> > Shouldn't this be part of patch #7?
> Do you mean #8?  Do you mean the test for !dev?   It is not required
> for #8 so I thought I'd keep these two changes separate.  I could
> squash them.

#8, of course.

It's more of a subjective matter, but to me it fits better #8's description and
keeps this one more focused. That said, it's just a comment, do as you please.

Regads,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 169 bytes --]

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 16:52         ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-04 16:52 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 9715 bytes --]

Hi Jim,

On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:

[...]

> > > --- a/arch/sh/kernel/dma-coherent.c
> > > +++ b/arch/sh/kernel/dma-coherent.c
> > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >  {
> > >       void *ret, *ret_nocache;
> > >       int order = get_order(size);
> > > +     unsigned long pfn;
> > > +     phys_addr_t phys;
> > > 
> > >       gfp |= __GFP_ZERO;
> > > 
> > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >               return NULL;
> > >       }
> > > 
> > > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > +     phys = virt_to_phys(ret);
> > > +     pfn =  phys >> PAGE_SHIFT;
> > 
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

No need.

[...]

> > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > index 055eb0b8e396..2d66d415b6c3 100644
> > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > *pdev)
> > > 
> > >       sdev->dev = &pdev->dev;
> > >       /* The DMA bus has the memory mapped at 0 */
> > > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > > +     if (ret)
> > > +             return ret;
> > > 
> > >       ret = sun6i_csi_resource_request(sdev, pdev);
> > >       if (ret)
> > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > --- a/drivers/of/address.c
> > > +++ b/drivers/of/address.c
> > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > device_node
> > > *np, int index,
> > >  }
> > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > 
> > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > +                                  struct device_node *node, int
> > > num_ranges)
> > 
> > As with the previous review, please take this comment with a grain of salt.
> > 
> > I think there should be a clear split between what belongs to OF and what
> > belongs to the core device infrastructure.
> > 
> > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > the
> > core device infrastructure should provide an API to assign a list of
> > ranges/offset to a device.
> > 
> > As a concrete example, you're forcing devices like the sta2x11 to build with
> > OF
> > support, which, being an Intel device, it's pretty odd. But I'm also
> > thinking
> > of how will all this fit once an ACPI device wants to use it.
> To fix this I only have to move attach_uniform_dma_pfn_offset() from
> of/address.c to say include/linux/dma-mapping.h.  It has no
> dependencies on OF.  Do you agree?

Yes that seems nicer. In case you didn't had it in mind already, I'd change the
function name to match the naming scheme they use there.

On the other hand, I'd also move the non OF parts of the non unifom dma_offset
version of the function there.

> > Expanding on this idea, once you have a split between the OF's and device
> > core
> > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > the ranges in a device understandable structure and of_dma_configre()'s to
> > actually assign the device's parameters. This would obsolete patch #7.
> 
> I think you mean patch #8.

Yes, my bad.

> I agree with you.  The reason I needed a "struct device *"  in the call is
> because I wanted to make sure the memory that is alloc'd belongs to the
> device that needs it.  If I do a regular kzalloc(), this memory  will become
> a leak once someone starts unbinding/binding their device.  Also, in  all
> uses of of_dma_rtange() -- there is only one --  a dev is required as one
> can't attach an offset map to NULL.
> 
> I do see that there are a number of functions in drivers/of/*.c that
> take 'struct device *dev' as an argument so there is precedent for
> something like this.  Regardless, I need an owner to the memory I
> alloc().

I understand the need for dev to be around, devm_*() is key. But also it's
important to keep the functions on purpose. And if of_dma_get_range() starts
setting ranges it calls, for the very least, for a function rename. Although
I'd rather split the parsing and setting of ranges as mentioned earlier. That
said, I get that's a more drastic move.

Talking about drastic moves. How about getting rid of the concept of
dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
(even when there is only one). I feel it's conceptually nicer, as you'd be
dealing only in one currency, so to speak, and you'd centralize the bus DMA
ranges setter function which is always easier to maintain.

I'd go as far as not creating a special case for uniform offsets. Let just set
cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
heavy, but I don't think it's worth the optimization.

Just my two cents :)

> > > +{
> > > +     struct of_range_parser parser;
> > > +     struct of_range range;
> > > +     struct dma_pfn_offset_region *r;
> > > +
> > > +     r = devm_kcalloc(dev, num_ranges + 1,
> > > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > > +     if (!r)
> > > +             return -ENOMEM;
> > > +     dev->dma_pfn_offset_map = r;
> > > +     of_dma_range_parser_init(&parser, node);
> > > +
> > > +     /*
> > > +      * Record all info for DMA ranges array.  We could
> > > +      * just use the of_range struct, but if we did that it
> > > +      * would require more calculations for phys_to_dma and
> > > +      * dma_to_phys conversions.
> > > +      */
> > > +     for_each_of_range(&parser, &range) {
> > > +             r->cpu_start = range.cpu_addr;
> > > +             r->cpu_end = r->cpu_start + range.size - 1;
> > > +             r->dma_start = range.bus_addr;
> > > +             r->dma_end = r->dma_start + range.size - 1;
> > > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > > +                     - PFN_DOWN(range.bus_addr);
> > > +             r++;
> > > +     }
> > > +     return 0;
> > > +}
> > > +
> > > +
> > > +
> > > +/**
> > > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > > + * @dev:     device pointer; only needed for a corner case.
> > 
> > That's a huge corner :P
> Good point; I'm not really sure what percent of Linux configurations
> require a dma_pfn_offset.  I'll drop the "corner".
> 
> > > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > > + *                   to dma addr and vice versa.
> > > + *
> > > + * It returns -ENOMEM if out of memory, otherwise 0.
> > > + */
> > > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > > pfn_offset)
> > 
> > As I say above, does this really belong to of/address.c?
> No it does not.  Will fix.
> 
> > > +{
> > > +     struct dma_pfn_offset_region *r;
> > > +
> > > +     if (!dev)
> > > +             return -ENODEV;
> > > +
> > > +     if (!pfn_offset)
> > > +             return 0;
> > > +
> > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +                      GFP_KERNEL);
> > > +     if (!r)
> > > +             return -ENOMEM;
> > 
> > I think you're missing this:
> > 
> >         dev->dma_pfn_offset_map = r;
> > 
> That's a showstopper!  DanC also pointed it out but I still didn't see
> it.  Thanks!
> 
> > > +
> > > +     r->uniform_offset = true;
> > > +     r->pfn_offset = pfn_offset;
> > > +
> > > +     return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > > +
> > >  /**
> > >   * of_dma_get_range - Get DMA range info
> > >   * @dev:     device pointer; only needed for a corner case.
> > > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> > >   *   CPU addr (phys_addr_t)  : pna cells
> > >   *   size                    : nsize cells
> > >   *
> > > - * It returns -ENODEV if "dma-ranges" property was not found
> > > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> > >   * for this device in DT.
> > >   */
> > >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > > *dma_addr,
> > > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > > device_node *np, u64 *dma_addr,
> > >       bool found_dma_ranges = false;
> > >       struct of_range_parser parser;
> > >       struct of_range range;
> > > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> > >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > > +     bool dma_multi_pfn_offset = false;
> > > +     int num_ranges = 0;
> > > +
> > > +     if (!dev)
> > > +             return -ENODEV;
> > 
> > Shouldn't this be part of patch #7?
> Do you mean #8?  Do you mean the test for !dev?   It is not required
> for #8 so I thought I'd keep these two changes separate.  I could
> squash them.

#8, of course.

It's more of a subjective matter, but to me it fits better #8's description and
keeps this one more focused. That said, it's just a comment, do as you please.

Regads,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 16:52         ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-04 16:52 UTC (permalink / raw)
  To: Jim Quinlan
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 9715 bytes --]

Hi Jim,

On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:

[...]

> > > --- a/arch/sh/kernel/dma-coherent.c
> > > +++ b/arch/sh/kernel/dma-coherent.c
> > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >  {
> > >       void *ret, *ret_nocache;
> > >       int order = get_order(size);
> > > +     unsigned long pfn;
> > > +     phys_addr_t phys;
> > > 
> > >       gfp |= __GFP_ZERO;
> > > 
> > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >               return NULL;
> > >       }
> > > 
> > > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > +     phys = virt_to_phys(ret);
> > > +     pfn =  phys >> PAGE_SHIFT;
> > 
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

No need.

[...]

> > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > index 055eb0b8e396..2d66d415b6c3 100644
> > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > *pdev)
> > > 
> > >       sdev->dev = &pdev->dev;
> > >       /* The DMA bus has the memory mapped at 0 */
> > > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > > +     if (ret)
> > > +             return ret;
> > > 
> > >       ret = sun6i_csi_resource_request(sdev, pdev);
> > >       if (ret)
> > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > --- a/drivers/of/address.c
> > > +++ b/drivers/of/address.c
> > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > device_node
> > > *np, int index,
> > >  }
> > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > 
> > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > +                                  struct device_node *node, int
> > > num_ranges)
> > 
> > As with the previous review, please take this comment with a grain of salt.
> > 
> > I think there should be a clear split between what belongs to OF and what
> > belongs to the core device infrastructure.
> > 
> > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > the
> > core device infrastructure should provide an API to assign a list of
> > ranges/offset to a device.
> > 
> > As a concrete example, you're forcing devices like the sta2x11 to build with
> > OF
> > support, which, being an Intel device, it's pretty odd. But I'm also
> > thinking
> > of how will all this fit once an ACPI device wants to use it.
> To fix this I only have to move attach_uniform_dma_pfn_offset() from
> of/address.c to say include/linux/dma-mapping.h.  It has no
> dependencies on OF.  Do you agree?

Yes that seems nicer. In case you didn't had it in mind already, I'd change the
function name to match the naming scheme they use there.

On the other hand, I'd also move the non OF parts of the non unifom dma_offset
version of the function there.

> > Expanding on this idea, once you have a split between the OF's and device
> > core
> > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > the ranges in a device understandable structure and of_dma_configre()'s to
> > actually assign the device's parameters. This would obsolete patch #7.
> 
> I think you mean patch #8.

Yes, my bad.

> I agree with you.  The reason I needed a "struct device *"  in the call is
> because I wanted to make sure the memory that is alloc'd belongs to the
> device that needs it.  If I do a regular kzalloc(), this memory  will become
> a leak once someone starts unbinding/binding their device.  Also, in  all
> uses of of_dma_rtange() -- there is only one --  a dev is required as one
> can't attach an offset map to NULL.
> 
> I do see that there are a number of functions in drivers/of/*.c that
> take 'struct device *dev' as an argument so there is precedent for
> something like this.  Regardless, I need an owner to the memory I
> alloc().

I understand the need for dev to be around, devm_*() is key. But also it's
important to keep the functions on purpose. And if of_dma_get_range() starts
setting ranges it calls, for the very least, for a function rename. Although
I'd rather split the parsing and setting of ranges as mentioned earlier. That
said, I get that's a more drastic move.

Talking about drastic moves. How about getting rid of the concept of
dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
(even when there is only one). I feel it's conceptually nicer, as you'd be
dealing only in one currency, so to speak, and you'd centralize the bus DMA
ranges setter function which is always easier to maintain.

I'd go as far as not creating a special case for uniform offsets. Let just set
cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
heavy, but I don't think it's worth the optimization.

Just my two cents :)

> > > +{
> > > +     struct of_range_parser parser;
> > > +     struct of_range range;
> > > +     struct dma_pfn_offset_region *r;
> > > +
> > > +     r = devm_kcalloc(dev, num_ranges + 1,
> > > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > > +     if (!r)
> > > +             return -ENOMEM;
> > > +     dev->dma_pfn_offset_map = r;
> > > +     of_dma_range_parser_init(&parser, node);
> > > +
> > > +     /*
> > > +      * Record all info for DMA ranges array.  We could
> > > +      * just use the of_range struct, but if we did that it
> > > +      * would require more calculations for phys_to_dma and
> > > +      * dma_to_phys conversions.
> > > +      */
> > > +     for_each_of_range(&parser, &range) {
> > > +             r->cpu_start = range.cpu_addr;
> > > +             r->cpu_end = r->cpu_start + range.size - 1;
> > > +             r->dma_start = range.bus_addr;
> > > +             r->dma_end = r->dma_start + range.size - 1;
> > > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > > +                     - PFN_DOWN(range.bus_addr);
> > > +             r++;
> > > +     }
> > > +     return 0;
> > > +}
> > > +
> > > +
> > > +
> > > +/**
> > > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > > + * @dev:     device pointer; only needed for a corner case.
> > 
> > That's a huge corner :P
> Good point; I'm not really sure what percent of Linux configurations
> require a dma_pfn_offset.  I'll drop the "corner".
> 
> > > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > > + *                   to dma addr and vice versa.
> > > + *
> > > + * It returns -ENOMEM if out of memory, otherwise 0.
> > > + */
> > > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > > pfn_offset)
> > 
> > As I say above, does this really belong to of/address.c?
> No it does not.  Will fix.
> 
> > > +{
> > > +     struct dma_pfn_offset_region *r;
> > > +
> > > +     if (!dev)
> > > +             return -ENODEV;
> > > +
> > > +     if (!pfn_offset)
> > > +             return 0;
> > > +
> > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +                      GFP_KERNEL);
> > > +     if (!r)
> > > +             return -ENOMEM;
> > 
> > I think you're missing this:
> > 
> >         dev->dma_pfn_offset_map = r;
> > 
> That's a showstopper!  DanC also pointed it out but I still didn't see
> it.  Thanks!
> 
> > > +
> > > +     r->uniform_offset = true;
> > > +     r->pfn_offset = pfn_offset;
> > > +
> > > +     return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > > +
> > >  /**
> > >   * of_dma_get_range - Get DMA range info
> > >   * @dev:     device pointer; only needed for a corner case.
> > > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> > >   *   CPU addr (phys_addr_t)  : pna cells
> > >   *   size                    : nsize cells
> > >   *
> > > - * It returns -ENODEV if "dma-ranges" property was not found
> > > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> > >   * for this device in DT.
> > >   */
> > >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > > *dma_addr,
> > > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > > device_node *np, u64 *dma_addr,
> > >       bool found_dma_ranges = false;
> > >       struct of_range_parser parser;
> > >       struct of_range range;
> > > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> > >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > > +     bool dma_multi_pfn_offset = false;
> > > +     int num_ranges = 0;
> > > +
> > > +     if (!dev)
> > > +             return -ENODEV;
> > 
> > Shouldn't this be part of patch #7?
> Do you mean #8?  Do you mean the test for !dev?   It is not required
> for #8 so I thought I'd keep these two changes separate.  I could
> squash them.

#8, of course.

It's more of a subjective matter, but to me it fits better #8's description and
keeps this one more focused. That said, it's just a comment, do as you please.

Regads,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 16:52         ` Nicolas Saenz Julienne
  (?)
@ 2020-06-04 18:01           ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 18:01 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

Hi Nicolas,

On Thu, Jun 4, 2020 at 12:52 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Jim,
>
> On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:
>
> [...]
>
> > > > --- a/arch/sh/kernel/dma-coherent.c
> > > > +++ b/arch/sh/kernel/dma-coherent.c
> > > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >  {
> > > >       void *ret, *ret_nocache;
> > > >       int order = get_order(size);
> > > > +     unsigned long pfn;
> > > > +     phys_addr_t phys;
> > > >
> > > >       gfp |= __GFP_ZERO;
> > > >
> > > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >               return NULL;
> > > >       }
> > > >
> > > > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > > +     phys = virt_to_phys(ret);
> > > > +     pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> No need.
>
> [...]
>
> > > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > index 055eb0b8e396..2d66d415b6c3 100644
> > > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > > *pdev)
> > > >
> > > >       sdev->dev = &pdev->dev;
> > > >       /* The DMA bus has the memory mapped at 0 */
> > > > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > > > +     if (ret)
> > > > +             return ret;
> > > >
> > > >       ret = sun6i_csi_resource_request(sdev, pdev);
> > > >       if (ret)
> > > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > > --- a/drivers/of/address.c
> > > > +++ b/drivers/of/address.c
> > > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > > device_node
> > > > *np, int index,
> > > >  }
> > > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > >
> > > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > > +                                  struct device_node *node, int
> > > > num_ranges)
> > >
> > > As with the previous review, please take this comment with a grain of salt.
> > >
> > > I think there should be a clear split between what belongs to OF and what
> > > belongs to the core device infrastructure.
> > >
> > > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > > the
> > > core device infrastructure should provide an API to assign a list of
> > > ranges/offset to a device.
> > >
> > > As a concrete example, you're forcing devices like the sta2x11 to build with
> > > OF
> > > support, which, being an Intel device, it's pretty odd. But I'm also
> > > thinking
> > > of how will all this fit once an ACPI device wants to use it.
> > To fix this I only have to move attach_uniform_dma_pfn_offset() from
> > of/address.c to say include/linux/dma-mapping.h.  It has no
> > dependencies on OF.  Do you agree?
>
> Yes that seems nicer. In case you didn't had it in mind already, I'd change the
> function name to match the naming scheme they use there.
>
> On the other hand, I'd also move the non OF parts of the non unifom dma_offset
> version of the function there.
>
> > > Expanding on this idea, once you have a split between the OF's and device
> > > core
> > > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > > the ranges in a device understandable structure and of_dma_configre()'s to
> > > actually assign the device's parameters. This would obsolete patch #7.
> >
> > I think you mean patch #8.
>
> Yes, my bad.
>
> > I agree with you.  The reason I needed a "struct device *"  in the call is
> > because I wanted to make sure the memory that is alloc'd belongs to the
> > device that needs it.  If I do a regular kzalloc(), this memory  will become
> > a leak once someone starts unbinding/binding their device.  Also, in  all
> > uses of of_dma_rtange() -- there is only one --  a dev is required as one
> > can't attach an offset map to NULL.
> >
> > I do see that there are a number of functions in drivers/of/*.c that
> > take 'struct device *dev' as an argument so there is precedent for
> > something like this.  Regardless, I need an owner to the memory I
> > alloc().
>
> I understand the need for dev to be around, devm_*() is key. But also it's
> important to keep the functions on purpose. And if of_dma_get_range() starts
> setting ranges it calls, for the very least, for a function rename. Although
> I'd rather split the parsing and setting of ranges as mentioned earlier. That
> said, I get that's a more drastic move.

I agree with you.  I could do this from device.c:

        of_dma_get_num_ranges(..., &num_ranges); /* new function */
        r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
        of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);

The problem here is that there could be four ranges, all with
offset=0.  My current code would optimize this case out but the above
would have us holding useless memory and looping through the four
ranges on every dma <=> phys conversion only to add 0.

>
> Talking about drastic moves. How about getting rid of the concept of
> dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
> (even when there is only one). I feel it's conceptually nicer, as you'd be
> dealing only in one currency, so to speak, and you'd centralize the bus DMA
> ranges setter function which is always easier to maintain.
Do you agree that we have to somehow hang this info on the struct
device structure?  Because in the dma2phys() and phys2dma() all you
have is the dev parameter.  I don't see how this  can be done w/o
involving dev.
>
> I'd go as far as not creating a special case for uniform offsets. Let just set
> cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
> heavy, but I don't think it's worth the optimization.
Well, there are two subcases here.  One where we do know the bounds
and one where we do not.  I suppose for the latter I could have the
drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
this some thought...

>
> Just my two cents :)

Worth much more than $0.02 IMO :-)

BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
the inline functions but the problem is that it slows the fastpath;
consider the following code from dma-direct.h

        if (dev->dma_pfn_offset_map) {
                unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);

                dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
        }
        return dev_addr;

becomes

        unsigned long dma_pfn_offset  =
dma_pfn_offset_from_phys_addr(dev, paddr);

        dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
        return dev_addr;

So those configurations that  have no dma_pfn_offsets are doing an
unnecessary shift and add.

Thanks much,
Jim Quinlan

>
> > > > +{
> > > > +     struct of_range_parser parser;
> > > > +     struct of_range range;
> > > > +     struct dma_pfn_offset_region *r;
> > > > +
> > > > +     r = devm_kcalloc(dev, num_ranges + 1,
> > > > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > > > +     dev->dma_pfn_offset_map = r;
> > > > +     of_dma_range_parser_init(&parser, node);
> > > > +
> > > > +     /*
> > > > +      * Record all info for DMA ranges array.  We could
> > > > +      * just use the of_range struct, but if we did that it
> > > > +      * would require more calculations for phys_to_dma and
> > > > +      * dma_to_phys conversions.
> > > > +      */
> > > > +     for_each_of_range(&parser, &range) {
> > > > +             r->cpu_start = range.cpu_addr;
> > > > +             r->cpu_end = r->cpu_start + range.size - 1;
> > > > +             r->dma_start = range.bus_addr;
> > > > +             r->dma_end = r->dma_start + range.size - 1;
> > > > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > > > +                     - PFN_DOWN(range.bus_addr);
> > > > +             r++;
> > > > +     }
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +
> > > > +
> > > > +/**
> > > > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > > > + * @dev:     device pointer; only needed for a corner case.
> > >
> > > That's a huge corner :P
> > Good point; I'm not really sure what percent of Linux configurations
> > require a dma_pfn_offset.  I'll drop the "corner".
> >
> > > > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > > > + *                   to dma addr and vice versa.
> > > > + *
> > > > + * It returns -ENOMEM if out of memory, otherwise 0.
> > > > + */
> > > > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > > > pfn_offset)
> > >
> > > As I say above, does this really belong to of/address.c?
> > No it does not.  Will fix.
> >
> > > > +{
> > > > +     struct dma_pfn_offset_region *r;
> > > > +
> > > > +     if (!dev)
> > > > +             return -ENODEV;
> > > > +
> > > > +     if (!pfn_offset)
> > > > +             return 0;
> > > > +
> > > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +                      GFP_KERNEL);
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > >
> > > I think you're missing this:
> > >
> > >         dev->dma_pfn_offset_map = r;
> > >
> > That's a showstopper!  DanC also pointed it out but I still didn't see
> > it.  Thanks!
> >
> > > > +
> > > > +     r->uniform_offset = true;
> > > > +     r->pfn_offset = pfn_offset;
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > > > +
> > > >  /**
> > > >   * of_dma_get_range - Get DMA range info
> > > >   * @dev:     device pointer; only needed for a corner case.
> > > > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> > > >   *   CPU addr (phys_addr_t)  : pna cells
> > > >   *   size                    : nsize cells
> > > >   *
> > > > - * It returns -ENODEV if "dma-ranges" property was not found
> > > > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> > > >   * for this device in DT.
> > > >   */
> > > >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > > > *dma_addr,
> > > > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > > > device_node *np, u64 *dma_addr,
> > > >       bool found_dma_ranges = false;
> > > >       struct of_range_parser parser;
> > > >       struct of_range range;
> > > > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> > > >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > > > +     bool dma_multi_pfn_offset = false;
> > > > +     int num_ranges = 0;
> > > > +
> > > > +     if (!dev)
> > > > +             return -ENODEV;
> > >
> > > Shouldn't this be part of patch #7?
> > Do you mean #8?  Do you mean the test for !dev?   It is not required
> > for #8 so I thought I'd keep these two changes separate.  I could
> > squash them.
>
> #8, of course.
>
> It's more of a subjective matter, but to me it fits better #8's description and
> keeps this one more focused. That said, it's just a comment, do as you please.
>
> Regads,
> Nicolas
>
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 18:01           ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-04 18:01 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

Hi Nicolas,

On Thu, Jun 4, 2020 at 12:52 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Jim,
>
> On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:
>
> [...]
>
> > > > --- a/arch/sh/kernel/dma-coherent.c
> > > > +++ b/arch/sh/kernel/dma-coherent.c
> > > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >  {
> > > >       void *ret, *ret_nocache;
> > > >       int order = get_order(size);
> > > > +     unsigned long pfn;
> > > > +     phys_addr_t phys;
> > > >
> > > >       gfp |= __GFP_ZERO;
> > > >
> > > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >               return NULL;
> > > >       }
> > > >
> > > > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > > +     phys = virt_to_phys(ret);
> > > > +     pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> No need.
>
> [...]
>
> > > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > index 055eb0b8e396..2d66d415b6c3 100644
> > > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > > *pdev)
> > > >
> > > >       sdev->dev = &pdev->dev;
> > > >       /* The DMA bus has the memory mapped at 0 */
> > > > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > > > +     if (ret)
> > > > +             return ret;
> > > >
> > > >       ret = sun6i_csi_resource_request(sdev, pdev);
> > > >       if (ret)
> > > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > > --- a/drivers/of/address.c
> > > > +++ b/drivers/of/address.c
> > > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > > device_node
> > > > *np, int index,
> > > >  }
> > > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > >
> > > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > > +                                  struct device_node *node, int
> > > > num_ranges)
> > >
> > > As with the previous review, please take this comment with a grain of salt.
> > >
> > > I think there should be a clear split between what belongs to OF and what
> > > belongs to the core device infrastructure.
> > >
> > > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > > the
> > > core device infrastructure should provide an API to assign a list of
> > > ranges/offset to a device.
> > >
> > > As a concrete example, you're forcing devices like the sta2x11 to build with
> > > OF
> > > support, which, being an Intel device, it's pretty odd. But I'm also
> > > thinking
> > > of how will all this fit once an ACPI device wants to use it.
> > To fix this I only have to move attach_uniform_dma_pfn_offset() from
> > of/address.c to say include/linux/dma-mapping.h.  It has no
> > dependencies on OF.  Do you agree?
>
> Yes that seems nicer. In case you didn't had it in mind already, I'd change the
> function name to match the naming scheme they use there.
>
> On the other hand, I'd also move the non OF parts of the non unifom dma_offset
> version of the function there.
>
> > > Expanding on this idea, once you have a split between the OF's and device
> > > core
> > > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > > the ranges in a device understandable structure and of_dma_configre()'s to
> > > actually assign the device's parameters. This would obsolete patch #7.
> >
> > I think you mean patch #8.
>
> Yes, my bad.
>
> > I agree with you.  The reason I needed a "struct device *"  in the call is
> > because I wanted to make sure the memory that is alloc'd belongs to the
> > device that needs it.  If I do a regular kzalloc(), this memory  will become
> > a leak once someone starts unbinding/binding their device.  Also, in  all
> > uses of of_dma_rtange() -- there is only one --  a dev is required as one
> > can't attach an offset map to NULL.
> >
> > I do see that there are a number of functions in drivers/of/*.c that
> > take 'struct device *dev' as an argument so there is precedent for
> > something like this.  Regardless, I need an owner to the memory I
> > alloc().
>
> I understand the need for dev to be around, devm_*() is key. But also it's
> important to keep the functions on purpose. And if of_dma_get_range() starts
> setting ranges it calls, for the very least, for a function rename. Although
> I'd rather split the parsing and setting of ranges as mentioned earlier. That
> said, I get that's a more drastic move.

I agree with you.  I could do this from device.c:

        of_dma_get_num_ranges(..., &num_ranges); /* new function */
        r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
        of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);

The problem here is that there could be four ranges, all with
offset=0.  My current code would optimize this case out but the above
would have us holding useless memory and looping through the four
ranges on every dma <=> phys conversion only to add 0.

>
> Talking about drastic moves. How about getting rid of the concept of
> dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
> (even when there is only one). I feel it's conceptually nicer, as you'd be
> dealing only in one currency, so to speak, and you'd centralize the bus DMA
> ranges setter function which is always easier to maintain.
Do you agree that we have to somehow hang this info on the struct
device structure?  Because in the dma2phys() and phys2dma() all you
have is the dev parameter.  I don't see how this  can be done w/o
involving dev.
>
> I'd go as far as not creating a special case for uniform offsets. Let just set
> cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
> heavy, but I don't think it's worth the optimization.
Well, there are two subcases here.  One where we do know the bounds
and one where we do not.  I suppose for the latter I could have the
drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
this some thought...

>
> Just my two cents :)

Worth much more than $0.02 IMO :-)

BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
the inline functions but the problem is that it slows the fastpath;
consider the following code from dma-direct.h

        if (dev->dma_pfn_offset_map) {
                unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);

                dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
        }
        return dev_addr;

becomes

        unsigned long dma_pfn_offset  =
dma_pfn_offset_from_phys_addr(dev, paddr);

        dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
        return dev_addr;

So those configurations that  have no dma_pfn_offsets are doing an
unnecessary shift and add.

Thanks much,
Jim Quinlan

>
> > > > +{
> > > > +     struct of_range_parser parser;
> > > > +     struct of_range range;
> > > > +     struct dma_pfn_offset_region *r;
> > > > +
> > > > +     r = devm_kcalloc(dev, num_ranges + 1,
> > > > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > > > +     dev->dma_pfn_offset_map = r;
> > > > +     of_dma_range_parser_init(&parser, node);
> > > > +
> > > > +     /*
> > > > +      * Record all info for DMA ranges array.  We could
> > > > +      * just use the of_range struct, but if we did that it
> > > > +      * would require more calculations for phys_to_dma and
> > > > +      * dma_to_phys conversions.
> > > > +      */
> > > > +     for_each_of_range(&parser, &range) {
> > > > +             r->cpu_start = range.cpu_addr;
> > > > +             r->cpu_end = r->cpu_start + range.size - 1;
> > > > +             r->dma_start = range.bus_addr;
> > > > +             r->dma_end = r->dma_start + range.size - 1;
> > > > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > > > +                     - PFN_DOWN(range.bus_addr);
> > > > +             r++;
> > > > +     }
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +
> > > > +
> > > > +/**
> > > > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > > > + * @dev:     device pointer; only needed for a corner case.
> > >
> > > That's a huge corner :P
> > Good point; I'm not really sure what percent of Linux configurations
> > require a dma_pfn_offset.  I'll drop the "corner".
> >
> > > > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > > > + *                   to dma addr and vice versa.
> > > > + *
> > > > + * It returns -ENOMEM if out of memory, otherwise 0.
> > > > + */
> > > > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > > > pfn_offset)
> > >
> > > As I say above, does this really belong to of/address.c?
> > No it does not.  Will fix.
> >
> > > > +{
> > > > +     struct dma_pfn_offset_region *r;
> > > > +
> > > > +     if (!dev)
> > > > +             return -ENODEV;
> > > > +
> > > > +     if (!pfn_offset)
> > > > +             return 0;
> > > > +
> > > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +                      GFP_KERNEL);
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > >
> > > I think you're missing this:
> > >
> > >         dev->dma_pfn_offset_map = r;
> > >
> > That's a showstopper!  DanC also pointed it out but I still didn't see
> > it.  Thanks!
> >
> > > > +
> > > > +     r->uniform_offset = true;
> > > > +     r->pfn_offset = pfn_offset;
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > > > +
> > > >  /**
> > > >   * of_dma_get_range - Get DMA range info
> > > >   * @dev:     device pointer; only needed for a corner case.
> > > > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> > > >   *   CPU addr (phys_addr_t)  : pna cells
> > > >   *   size                    : nsize cells
> > > >   *
> > > > - * It returns -ENODEV if "dma-ranges" property was not found
> > > > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> > > >   * for this device in DT.
> > > >   */
> > > >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > > > *dma_addr,
> > > > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > > > device_node *np, u64 *dma_addr,
> > > >       bool found_dma_ranges = false;
> > > >       struct of_range_parser parser;
> > > >       struct of_range range;
> > > > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> > > >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > > > +     bool dma_multi_pfn_offset = false;
> > > > +     int num_ranges = 0;
> > > > +
> > > > +     if (!dev)
> > > > +             return -ENODEV;
> > >
> > > Shouldn't this be part of patch #7?
> > Do you mean #8?  Do you mean the test for !dev?   It is not required
> > for #8 so I thought I'd keep these two changes separate.  I could
> > squash them.
>
> #8, of course.
>
> It's more of a subjective matter, but to me it fits better #8's description and
> keeps this one more focused. That said, it's just a comment, do as you please.
>
> Regads,
> Nicolas
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-04 18:01           ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-04 18:01 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

Hi Nicolas,

On Thu, Jun 4, 2020 at 12:52 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Jim,
>
> On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:
>
> [...]
>
> > > > --- a/arch/sh/kernel/dma-coherent.c
> > > > +++ b/arch/sh/kernel/dma-coherent.c
> > > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >  {
> > > >       void *ret, *ret_nocache;
> > > >       int order = get_order(size);
> > > > +     unsigned long pfn;
> > > > +     phys_addr_t phys;
> > > >
> > > >       gfp |= __GFP_ZERO;
> > > >
> > > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >               return NULL;
> > > >       }
> > > >
> > > > -     split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > > +     phys = virt_to_phys(ret);
> > > > +     pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> No need.
>
> [...]
>
> > > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > index 055eb0b8e396..2d66d415b6c3 100644
> > > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > > *pdev)
> > > >
> > > >       sdev->dev = &pdev->dev;
> > > >       /* The DMA bus has the memory mapped at 0 */
> > > > -     sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > > +     ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > > +                                         PHYS_OFFSET >> PAGE_SHIFT);
> > > > +     if (ret)
> > > > +             return ret;
> > > >
> > > >       ret = sun6i_csi_resource_request(sdev, pdev);
> > > >       if (ret)
> > > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > > --- a/drivers/of/address.c
> > > > +++ b/drivers/of/address.c
> > > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > > device_node
> > > > *np, int index,
> > > >  }
> > > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > >
> > > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > > +                                  struct device_node *node, int
> > > > num_ranges)
> > >
> > > As with the previous review, please take this comment with a grain of salt.
> > >
> > > I think there should be a clear split between what belongs to OF and what
> > > belongs to the core device infrastructure.
> > >
> > > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > > the
> > > core device infrastructure should provide an API to assign a list of
> > > ranges/offset to a device.
> > >
> > > As a concrete example, you're forcing devices like the sta2x11 to build with
> > > OF
> > > support, which, being an Intel device, it's pretty odd. But I'm also
> > > thinking
> > > of how will all this fit once an ACPI device wants to use it.
> > To fix this I only have to move attach_uniform_dma_pfn_offset() from
> > of/address.c to say include/linux/dma-mapping.h.  It has no
> > dependencies on OF.  Do you agree?
>
> Yes that seems nicer. In case you didn't had it in mind already, I'd change the
> function name to match the naming scheme they use there.
>
> On the other hand, I'd also move the non OF parts of the non unifom dma_offset
> version of the function there.
>
> > > Expanding on this idea, once you have a split between the OF's and device
> > > core
> > > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > > the ranges in a device understandable structure and of_dma_configre()'s to
> > > actually assign the device's parameters. This would obsolete patch #7.
> >
> > I think you mean patch #8.
>
> Yes, my bad.
>
> > I agree with you.  The reason I needed a "struct device *"  in the call is
> > because I wanted to make sure the memory that is alloc'd belongs to the
> > device that needs it.  If I do a regular kzalloc(), this memory  will become
> > a leak once someone starts unbinding/binding their device.  Also, in  all
> > uses of of_dma_rtange() -- there is only one --  a dev is required as one
> > can't attach an offset map to NULL.
> >
> > I do see that there are a number of functions in drivers/of/*.c that
> > take 'struct device *dev' as an argument so there is precedent for
> > something like this.  Regardless, I need an owner to the memory I
> > alloc().
>
> I understand the need for dev to be around, devm_*() is key. But also it's
> important to keep the functions on purpose. And if of_dma_get_range() starts
> setting ranges it calls, for the very least, for a function rename. Although
> I'd rather split the parsing and setting of ranges as mentioned earlier. That
> said, I get that's a more drastic move.

I agree with you.  I could do this from device.c:

        of_dma_get_num_ranges(..., &num_ranges); /* new function */
        r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
        of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);

The problem here is that there could be four ranges, all with
offset=0.  My current code would optimize this case out but the above
would have us holding useless memory and looping through the four
ranges on every dma <=> phys conversion only to add 0.

>
> Talking about drastic moves. How about getting rid of the concept of
> dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
> (even when there is only one). I feel it's conceptually nicer, as you'd be
> dealing only in one currency, so to speak, and you'd centralize the bus DMA
> ranges setter function which is always easier to maintain.
Do you agree that we have to somehow hang this info on the struct
device structure?  Because in the dma2phys() and phys2dma() all you
have is the dev parameter.  I don't see how this  can be done w/o
involving dev.
>
> I'd go as far as not creating a special case for uniform offsets. Let just set
> cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
> heavy, but I don't think it's worth the optimization.
Well, there are two subcases here.  One where we do know the bounds
and one where we do not.  I suppose for the latter I could have the
drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
this some thought...

>
> Just my two cents :)

Worth much more than $0.02 IMO :-)

BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
the inline functions but the problem is that it slows the fastpath;
consider the following code from dma-direct.h

        if (dev->dma_pfn_offset_map) {
                unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);

                dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
        }
        return dev_addr;

becomes

        unsigned long dma_pfn_offset  =
dma_pfn_offset_from_phys_addr(dev, paddr);

        dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
        return dev_addr;

So those configurations that  have no dma_pfn_offsets are doing an
unnecessary shift and add.

Thanks much,
Jim Quinlan

>
> > > > +{
> > > > +     struct of_range_parser parser;
> > > > +     struct of_range range;
> > > > +     struct dma_pfn_offset_region *r;
> > > > +
> > > > +     r = devm_kcalloc(dev, num_ranges + 1,
> > > > +                      sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > > > +     dev->dma_pfn_offset_map = r;
> > > > +     of_dma_range_parser_init(&parser, node);
> > > > +
> > > > +     /*
> > > > +      * Record all info for DMA ranges array.  We could
> > > > +      * just use the of_range struct, but if we did that it
> > > > +      * would require more calculations for phys_to_dma and
> > > > +      * dma_to_phys conversions.
> > > > +      */
> > > > +     for_each_of_range(&parser, &range) {
> > > > +             r->cpu_start = range.cpu_addr;
> > > > +             r->cpu_end = r->cpu_start + range.size - 1;
> > > > +             r->dma_start = range.bus_addr;
> > > > +             r->dma_end = r->dma_start + range.size - 1;
> > > > +             r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > > > +                     - PFN_DOWN(range.bus_addr);
> > > > +             r++;
> > > > +     }
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +
> > > > +
> > > > +/**
> > > > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > > > + * @dev:     device pointer; only needed for a corner case.
> > >
> > > That's a huge corner :P
> > Good point; I'm not really sure what percent of Linux configurations
> > require a dma_pfn_offset.  I'll drop the "corner".
> >
> > > > + * @dma_pfn_offset:  offset to apply when converting from phys addr
> > > > + *                   to dma addr and vice versa.
> > > > + *
> > > > + * It returns -ENOMEM if out of memory, otherwise 0.
> > > > + */
> > > > +int attach_uniform_dma_pfn_offset(struct device *dev, unsigned long
> > > > pfn_offset)
> > >
> > > As I say above, does this really belong to of/address.c?
> > No it does not.  Will fix.
> >
> > > > +{
> > > > +     struct dma_pfn_offset_region *r;
> > > > +
> > > > +     if (!dev)
> > > > +             return -ENODEV;
> > > > +
> > > > +     if (!pfn_offset)
> > > > +             return 0;
> > > > +
> > > > +     r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +                      GFP_KERNEL);
> > > > +     if (!r)
> > > > +             return -ENOMEM;
> > >
> > > I think you're missing this:
> > >
> > >         dev->dma_pfn_offset_map = r;
> > >
> > That's a showstopper!  DanC also pointed it out but I still didn't see
> > it.  Thanks!
> >
> > > > +
> > > > +     r->uniform_offset = true;
> > > > +     r->pfn_offset = pfn_offset;
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(attach_uniform_dma_pfn_offset);
> > > > +
> > > >  /**
> > > >   * of_dma_get_range - Get DMA range info
> > > >   * @dev:     device pointer; only needed for a corner case.
> > > > @@ -933,7 +997,7 @@ EXPORT_SYMBOL(of_io_request_and_map);
> > > >   *   CPU addr (phys_addr_t)  : pna cells
> > > >   *   size                    : nsize cells
> > > >   *
> > > > - * It returns -ENODEV if "dma-ranges" property was not found
> > > > + * It returns -ENODEV if !dev or "dma-ranges" property was not found
> > > >   * for this device in DT.
> > > >   */
> > > >  int of_dma_get_range(struct device *dev, struct device_node *np, u64
> > > > *dma_addr,
> > > > @@ -946,7 +1010,13 @@ int of_dma_get_range(struct device *dev, struct
> > > > device_node *np, u64 *dma_addr,
> > > >       bool found_dma_ranges = false;
> > > >       struct of_range_parser parser;
> > > >       struct of_range range;
> > > > +     phys_addr_t cpu_start = ~(phys_addr_t)0;
> > > >       u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0;
> > > > +     bool dma_multi_pfn_offset = false;
> > > > +     int num_ranges = 0;
> > > > +
> > > > +     if (!dev)
> > > > +             return -ENODEV;
> > >
> > > Shouldn't this be part of patch #7?
> > Do you mean #8?  Do you mean the test for !dev?   It is not required
> > for #8 so I thought I'd keep these two changes separate.  I could
> > squash them.
>
> #8, of course.
>
> It's more of a subjective matter, but to me it fits better #8's description and
> keeps this one more focused. That said, it's just a comment, do as you please.
>
> Regads,
> Nicolas
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-04 18:01           ` Jim Quinlan via iommu
  (?)
@ 2020-06-05 17:27             ` Nicolas Saenz Julienne
  -1 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-05 17:27 UTC (permalink / raw)
  To: Jim Quinlan, Christoph Hellwig
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 4410 bytes --]

Hi Christoph,
a question arouse, is there a real value to dealing with PFNs (as opposed to
real addresses) in the core DMA code/structures? I see that in some cases it
eases interacting with mm, but the overwhelming usage of say,
dev->dma_pfn_offset, involves shifting it.

Hi Jim,
On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> Hi Nicolas,

[...]

> > I understand the need for dev to be around, devm_*() is key. But also it's
> > important to keep the functions on purpose. And if of_dma_get_range() starts
> > setting ranges it calls, for the very least, for a function rename. Although
> > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > That
> > said, I get that's a more drastic move.
> 
> I agree with you.  I could do this from device.c:
> 
>         of_dma_get_num_ranges(..., &num_ranges); /* new function */
>         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
>         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> 
> The problem here is that there could be four ranges, all with
> offset=0.  My current code would optimize this case out but the above
> would have us holding useless memory and looping through the four
> ranges on every dma <=> phys conversion only to add 0.

Point taken. Ultimately it's setting the device's dma ranges in
of_dma_get_range() that was really bothering me, so if we have to pass the
device pointer for allocations, be it.

> > Talking about drastic moves. How about getting rid of the concept of
> > dma_pfn_offset for drivers altogether. Let them provide
> > dma_pfn_offset_regions
> > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > ranges setter function which is always easier to maintain.
> Do you agree that we have to somehow hang this info on the struct
> device structure?  Because in the dma2phys() and phys2dma() all you
> have is the dev parameter.  I don't see how this  can be done w/o
> involving dev.

Sorry I didn't make myself clear here. What bothers me is having two functions
setting the same device parameter trough different means, I'd be happy to get
rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
a device's bus dma regions. Something the likes of this comes to mind:

dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)

We could maybe use some helper macros for the linear case. But that's the gist
of it.

Also, it goes hand in hand with the comment below. Why having a special case
for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
in this case, code simplicity is more interesting than a small optimization.

> > I'd go as far as not creating a special case for uniform offsets. Let just
> > set
> > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > compute
> > heavy, but I don't think it's worth the optimization.
> Well, there are two subcases here.  One where we do know the bounds
> and one where we do not.  I suppose for the latter I could have the
> drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> this some thought...
> 
> > Just my two cents :)
> 
> Worth much more than $0.02 IMO :-)

BTW, would you consider renaming the DMA offset struct to something simpler
like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.

> BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> the inline functions but the problem is that it slows the fastpath;
> consider the following code from dma-direct.h
> 
>         if (dev->dma_pfn_offset_map) {
>                 unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);
> 
>                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         }
>         return dev_addr;
> 
> becomes
> 
>         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
paddr);
> 
>         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         return dev_addr;
> 
> So those configurations that  have no dma_pfn_offsets are doing an
> unnecessary shift and add.

Fair enough. Still not a huge difference, but I see the value being the most
common case.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 169 bytes --]

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-05 17:27             ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-05 17:27 UTC (permalink / raw)
  To: Jim Quinlan, Christoph Hellwig
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 4410 bytes --]

Hi Christoph,
a question arouse, is there a real value to dealing with PFNs (as opposed to
real addresses) in the core DMA code/structures? I see that in some cases it
eases interacting with mm, but the overwhelming usage of say,
dev->dma_pfn_offset, involves shifting it.

Hi Jim,
On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> Hi Nicolas,

[...]

> > I understand the need for dev to be around, devm_*() is key. But also it's
> > important to keep the functions on purpose. And if of_dma_get_range() starts
> > setting ranges it calls, for the very least, for a function rename. Although
> > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > That
> > said, I get that's a more drastic move.
> 
> I agree with you.  I could do this from device.c:
> 
>         of_dma_get_num_ranges(..., &num_ranges); /* new function */
>         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
>         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> 
> The problem here is that there could be four ranges, all with
> offset=0.  My current code would optimize this case out but the above
> would have us holding useless memory and looping through the four
> ranges on every dma <=> phys conversion only to add 0.

Point taken. Ultimately it's setting the device's dma ranges in
of_dma_get_range() that was really bothering me, so if we have to pass the
device pointer for allocations, be it.

> > Talking about drastic moves. How about getting rid of the concept of
> > dma_pfn_offset for drivers altogether. Let them provide
> > dma_pfn_offset_regions
> > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > ranges setter function which is always easier to maintain.
> Do you agree that we have to somehow hang this info on the struct
> device structure?  Because in the dma2phys() and phys2dma() all you
> have is the dev parameter.  I don't see how this  can be done w/o
> involving dev.

Sorry I didn't make myself clear here. What bothers me is having two functions
setting the same device parameter trough different means, I'd be happy to get
rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
a device's bus dma regions. Something the likes of this comes to mind:

dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)

We could maybe use some helper macros for the linear case. But that's the gist
of it.

Also, it goes hand in hand with the comment below. Why having a special case
for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
in this case, code simplicity is more interesting than a small optimization.

> > I'd go as far as not creating a special case for uniform offsets. Let just
> > set
> > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > compute
> > heavy, but I don't think it's worth the optimization.
> Well, there are two subcases here.  One where we do know the bounds
> and one where we do not.  I suppose for the latter I could have the
> drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> this some thought...
> 
> > Just my two cents :)
> 
> Worth much more than $0.02 IMO :-)

BTW, would you consider renaming the DMA offset struct to something simpler
like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.

> BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> the inline functions but the problem is that it slows the fastpath;
> consider the following code from dma-direct.h
> 
>         if (dev->dma_pfn_offset_map) {
>                 unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);
> 
>                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         }
>         return dev_addr;
> 
> becomes
> 
>         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
paddr);
> 
>         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         return dev_addr;
> 
> So those configurations that  have no dma_pfn_offsets are doing an
> unnecessary shift and add.

Fair enough. Still not a huge difference, but I see the value being the most
common case.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-05 17:27             ` Nicolas Saenz Julienne
  0 siblings, 0 replies; 83+ messages in thread
From: Nicolas Saenz Julienne @ 2020-06-05 17:27 UTC (permalink / raw)
  To: Jim Quinlan, Christoph Hellwig
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM


[-- Attachment #1.1: Type: text/plain, Size: 4410 bytes --]

Hi Christoph,
a question arouse, is there a real value to dealing with PFNs (as opposed to
real addresses) in the core DMA code/structures? I see that in some cases it
eases interacting with mm, but the overwhelming usage of say,
dev->dma_pfn_offset, involves shifting it.

Hi Jim,
On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> Hi Nicolas,

[...]

> > I understand the need for dev to be around, devm_*() is key. But also it's
> > important to keep the functions on purpose. And if of_dma_get_range() starts
> > setting ranges it calls, for the very least, for a function rename. Although
> > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > That
> > said, I get that's a more drastic move.
> 
> I agree with you.  I could do this from device.c:
> 
>         of_dma_get_num_ranges(..., &num_ranges); /* new function */
>         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
>         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> 
> The problem here is that there could be four ranges, all with
> offset=0.  My current code would optimize this case out but the above
> would have us holding useless memory and looping through the four
> ranges on every dma <=> phys conversion only to add 0.

Point taken. Ultimately it's setting the device's dma ranges in
of_dma_get_range() that was really bothering me, so if we have to pass the
device pointer for allocations, be it.

> > Talking about drastic moves. How about getting rid of the concept of
> > dma_pfn_offset for drivers altogether. Let them provide
> > dma_pfn_offset_regions
> > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > ranges setter function which is always easier to maintain.
> Do you agree that we have to somehow hang this info on the struct
> device structure?  Because in the dma2phys() and phys2dma() all you
> have is the dev parameter.  I don't see how this  can be done w/o
> involving dev.

Sorry I didn't make myself clear here. What bothers me is having two functions
setting the same device parameter trough different means, I'd be happy to get
rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
a device's bus dma regions. Something the likes of this comes to mind:

dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)

We could maybe use some helper macros for the linear case. But that's the gist
of it.

Also, it goes hand in hand with the comment below. Why having a special case
for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
in this case, code simplicity is more interesting than a small optimization.

> > I'd go as far as not creating a special case for uniform offsets. Let just
> > set
> > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > compute
> > heavy, but I don't think it's worth the optimization.
> Well, there are two subcases here.  One where we do know the bounds
> and one where we do not.  I suppose for the latter I could have the
> drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> this some thought...
> 
> > Just my two cents :)
> 
> Worth much more than $0.02 IMO :-)

BTW, would you consider renaming the DMA offset struct to something simpler
like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.

> BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> the inline functions but the problem is that it slows the fastpath;
> consider the following code from dma-direct.h
> 
>         if (dev->dma_pfn_offset_map) {
>                 unsigned long dma_pfn_offset =
dma_pfn_offset_from_phys_addr(dev, paddr);
> 
>                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         }
>         return dev_addr;
> 
> becomes
> 
>         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
paddr);
> 
>         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
>         return dev_addr;
> 
> So those configurations that  have no dma_pfn_offsets are doing an
> unnecessary shift and add.

Fair enough. Still not a huge difference, but I see the value being the most
common case.

Regards,
Nicolas


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
  2020-06-05 17:27             ` Nicolas Saenz Julienne
  (?)
@ 2020-06-05 20:41               ` Jim Quinlan via iommu
  -1 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-05 20:41 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

On Fri, Jun 5, 2020 at 1:27 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Christoph,
> a question arouse, is there a real value to dealing with PFNs (as opposed to
> real addresses) in the core DMA code/structures? I see that in some cases it
> eases interacting with mm, but the overwhelming usage of say,
> dev->dma_pfn_offset, involves shifting it.
>
> Hi Jim,
> On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> > Hi Nicolas,
>
> [...]
>
> > > I understand the need for dev to be around, devm_*() is key. But also it's
> > > important to keep the functions on purpose. And if of_dma_get_range() starts
> > > setting ranges it calls, for the very least, for a function rename. Although
> > > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > > That
> > > said, I get that's a more drastic move.
> >
> > I agree with you.  I could do this from device.c:
> >
> >         of_dma_get_num_ranges(..., &num_ranges); /* new function */
> >         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
> >         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> >
> > The problem here is that there could be four ranges, all with
> > offset=0.  My current code would optimize this case out but the above
> > would have us holding useless memory and looping through the four
> > ranges on every dma <=> phys conversion only to add 0.
>
> Point taken. Ultimately it's setting the device's dma ranges in
> of_dma_get_range() that was really bothering me, so if we have to pass the
> device pointer for allocations, be it.
>
> > > Talking about drastic moves. How about getting rid of the concept of
> > > dma_pfn_offset for drivers altogether. Let them provide
> > > dma_pfn_offset_regions
> > > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > > ranges setter function which is always easier to maintain.
> > Do you agree that we have to somehow hang this info on the struct
> > device structure?  Because in the dma2phys() and phys2dma() all you
> > have is the dev parameter.  I don't see how this  can be done w/o
> > involving dev.
>
> Sorry I didn't make myself clear here. What bothers me is having two functions
> setting the same device parameter trough different means, I'd be happy to get
> rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
> a device's bus dma regions. Something the likes of this comes to mind:
>
> dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)
>
> We could maybe use some helper macros for the linear case. But that's the gist
> of it.
>
> Also, it goes hand in hand with the comment below. Why having a special case
> for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
> in this case, code simplicity is more interesting than a small optimization.
I've removed the special case and also need for 'dev' in
of_dma_get_range().  v4 is comming...
>
> > > I'd go as far as not creating a special case for uniform offsets. Let just
> > > set
> > > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > > compute
> > > heavy, but I don't think it's worth the optimization.
> > Well, there are two subcases here.  One where we do know the bounds
> > and one where we do not.  I suppose for the latter I could have the
> > drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> > this some thought...
> >
> > > Just my two cents :)
> >
> > Worth much more than $0.02 IMO :-)
>
> BTW, would you consider renaming the DMA offset struct to something simpler
> like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.
Will do

Thanks,
Jim
>
> > BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> > the inline functions but the problem is that it slows the fastpath;
> > consider the following code from dma-direct.h
> >
> >         if (dev->dma_pfn_offset_map) {
> >                 unsigned long dma_pfn_offset =
> dma_pfn_offset_from_phys_addr(dev, paddr);
> >
> >                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> >         }
> >         return dev_addr;
> >
> > becomes
> >
> >         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
> paddr);
> >
> >         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> >         return dev_addr;
> >
> > So those configurations that  have no dma_pfn_offsets are doing an
> > unnecessary shift and add.
>
> Fair enough. Still not a huge difference, but I see the value being the most
> common case.
>
> Regards,
> Nicolas
>
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-05 20:41               ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan via iommu @ 2020-06-05 20:41 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, open list:STAGING SUBSYSTEM,
	Wolfram Sang, Yoshinori Sato, Frank Rowand,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Maxime Ripard, Rob Herring,
	Borislav Petkov, open list:DRM DRIVERS FOR ALLWINNER A10,
	Yong Deng, Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Daniel Vetter, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

On Fri, Jun 5, 2020 at 1:27 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Christoph,
> a question arouse, is there a real value to dealing with PFNs (as opposed to
> real addresses) in the core DMA code/structures? I see that in some cases it
> eases interacting with mm, but the overwhelming usage of say,
> dev->dma_pfn_offset, involves shifting it.
>
> Hi Jim,
> On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> > Hi Nicolas,
>
> [...]
>
> > > I understand the need for dev to be around, devm_*() is key. But also it's
> > > important to keep the functions on purpose. And if of_dma_get_range() starts
> > > setting ranges it calls, for the very least, for a function rename. Although
> > > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > > That
> > > said, I get that's a more drastic move.
> >
> > I agree with you.  I could do this from device.c:
> >
> >         of_dma_get_num_ranges(..., &num_ranges); /* new function */
> >         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
> >         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> >
> > The problem here is that there could be four ranges, all with
> > offset=0.  My current code would optimize this case out but the above
> > would have us holding useless memory and looping through the four
> > ranges on every dma <=> phys conversion only to add 0.
>
> Point taken. Ultimately it's setting the device's dma ranges in
> of_dma_get_range() that was really bothering me, so if we have to pass the
> device pointer for allocations, be it.
>
> > > Talking about drastic moves. How about getting rid of the concept of
> > > dma_pfn_offset for drivers altogether. Let them provide
> > > dma_pfn_offset_regions
> > > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > > ranges setter function which is always easier to maintain.
> > Do you agree that we have to somehow hang this info on the struct
> > device structure?  Because in the dma2phys() and phys2dma() all you
> > have is the dev parameter.  I don't see how this  can be done w/o
> > involving dev.
>
> Sorry I didn't make myself clear here. What bothers me is having two functions
> setting the same device parameter trough different means, I'd be happy to get
> rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
> a device's bus dma regions. Something the likes of this comes to mind:
>
> dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)
>
> We could maybe use some helper macros for the linear case. But that's the gist
> of it.
>
> Also, it goes hand in hand with the comment below. Why having a special case
> for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
> in this case, code simplicity is more interesting than a small optimization.
I've removed the special case and also need for 'dev' in
of_dma_get_range().  v4 is comming...
>
> > > I'd go as far as not creating a special case for uniform offsets. Let just
> > > set
> > > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > > compute
> > > heavy, but I don't think it's worth the optimization.
> > Well, there are two subcases here.  One where we do know the bounds
> > and one where we do not.  I suppose for the latter I could have the
> > drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> > this some thought...
> >
> > > Just my two cents :)
> >
> > Worth much more than $0.02 IMO :-)
>
> BTW, would you consider renaming the DMA offset struct to something simpler
> like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.
Will do

Thanks,
Jim
>
> > BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> > the inline functions but the problem is that it slows the fastpath;
> > consider the following code from dma-direct.h
> >
> >         if (dev->dma_pfn_offset_map) {
> >                 unsigned long dma_pfn_offset =
> dma_pfn_offset_from_phys_addr(dev, paddr);
> >
> >                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> >         }
> >         return dev_addr;
> >
> > becomes
> >
> >         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
> paddr);
> >
> >         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> >         return dev_addr;
> >
> > So those configurations that  have no dma_pfn_offsets are doing an
> > unnecessary shift and add.
>
> Fair enough. Still not a huge difference, but I see the value being the most
> common case.
>
> Regards,
> Nicolas
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets
@ 2020-06-05 20:41               ` Jim Quinlan via iommu
  0 siblings, 0 replies; 83+ messages in thread
From: Jim Quinlan @ 2020-06-05 20:41 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: Ulf Hansson, Rich Felker, open list:SUPERH, David Airlie,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	Hanjun Guo, open list:REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM,
	Andy Shevchenko, Bjorn Andersson, Julien Grall, H. Peter Anvin,
	Will Deacon, Christoph Hellwig, Marek Szyprowski,
	open list:STAGING SUBSYSTEM, Wolfram Sang, Lorenzo Pieralisi,
	Yoshinori Sato, Frank Rowand, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Russell King, open list:ACPI FOR ARM64 (ACPI/arm64),
	Chen-Yu Tsai, Ingo Molnar,
	maintainer:BROADCOM BCM7XXX ARM ARCHITECTURE, Alan Stern,
	David Kershner, Len Brown, Ohad Ben-Cohen,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE, Arnd Bergmann,
	Suzuki K Poulose, Dan Williams, Rob Herring, Borislav Petkov,
	open list:DRM DRIVERS FOR ALLWINNER A10, Yong Deng,
	Santosh Shilimkar, Bjorn Helgaas, Thomas Gleixner,
	Mauro Carvalho Chehab, moderated list:ARM PORT, Saravana Kannan,
	Greg Kroah-Hartman, Oliver Neukum, Rafael J. Wysocki, open list,
	Paul Kocialkowski, open list:IOMMU DRIVERS, Mark Brown,
	Stefano Stabellini, Sudeep Holla,
	open list:ALLWINNER A10 CSI DRIVER, Robin Murphy,
	open list:USB SUBSYSTEM

On Fri, Jun 5, 2020 at 1:27 PM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> Hi Christoph,
> a question arouse, is there a real value to dealing with PFNs (as opposed to
> real addresses) in the core DMA code/structures? I see that in some cases it
> eases interacting with mm, but the overwhelming usage of say,
> dev->dma_pfn_offset, involves shifting it.
>
> Hi Jim,
> On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> > Hi Nicolas,
>
> [...]
>
> > > I understand the need for dev to be around, devm_*() is key. But also it's
> > > important to keep the functions on purpose. And if of_dma_get_range() starts
> > > setting ranges it calls, for the very least, for a function rename. Although
> > > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > > That
> > > said, I get that's a more drastic move.
> >
> > I agree with you.  I could do this from device.c:
> >
> >         of_dma_get_num_ranges(..., &num_ranges); /* new function */
> >         r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
> >         of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> >
> > The problem here is that there could be four ranges, all with
> > offset=0.  My current code would optimize this case out but the above
> > would have us holding useless memory and looping through the four
> > ranges on every dma <=> phys conversion only to add 0.
>
> Point taken. Ultimately it's setting the device's dma ranges in
> of_dma_get_range() that was really bothering me, so if we have to pass the
> device pointer for allocations, be it.
>
> > > Talking about drastic moves. How about getting rid of the concept of
> > > dma_pfn_offset for drivers altogether. Let them provide
> > > dma_pfn_offset_regions
> > > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > > ranges setter function which is always easier to maintain.
> > Do you agree that we have to somehow hang this info on the struct
> > device structure?  Because in the dma2phys() and phys2dma() all you
> > have is the dev parameter.  I don't see how this  can be done w/o
> > involving dev.
>
> Sorry I didn't make myself clear here. What bothers me is having two functions
> setting the same device parameter trough different means, I'd be happy to get
> rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
> a device's bus dma regions. Something the likes of this comes to mind:
>
> dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)
>
> We could maybe use some helper macros for the linear case. But that's the gist
> of it.
>
> Also, it goes hand in hand with the comment below. Why having a special case
> for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
> in this case, code simplicity is more interesting than a small optimization.
I've removed the special case and also need for 'dev' in
of_dma_get_range().  v4 is comming...
>
> > > I'd go as far as not creating a special case for uniform offsets. Let just
> > > set
> > > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > > compute
> > > heavy, but I don't think it's worth the optimization.
> > Well, there are two subcases here.  One where we do know the bounds
> > and one where we do not.  I suppose for the latter I could have the
> > drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> > this some thought...
> >
> > > Just my two cents :)
> >
> > Worth much more than $0.02 IMO :-)
>
> BTW, would you consider renaming the DMA offset struct to something simpler
> like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.
Will do

Thanks,
Jim
>
> > BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> > the inline functions but the problem is that it slows the fastpath;
> > consider the following code from dma-direct.h
> >
> >         if (dev->dma_pfn_offset_map) {
> >                 unsigned long dma_pfn_offset =
> dma_pfn_offset_from_phys_addr(dev, paddr);
> >
> >                 dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> >         }
> >         return dev_addr;
> >
> > becomes
> >
> >         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev,
> paddr);
> >
> >         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> >         return dev_addr;
> >
> > So those configurations that  have no dma_pfn_offsets are doing an
> > unnecessary shift and add.
>
> Fair enough. Still not a huge difference, but I see the value being the most
> common case.
>
> Regards,
> Nicolas
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

end of thread, other threads:[~2020-06-07 19:13 UTC | newest]

Thread overview: 83+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-03 19:20 [PATCH v3 00/13] PCI: brcmstb: enable PCIe for STB chips Jim Quinlan
2020-06-03 19:20 ` Jim Quinlan
2020-06-03 19:20 ` Jim Quinlan
2020-06-03 19:20 ` Jim Quinlan via iommu
2020-06-03 19:20 ` Jim Quinlan
2020-06-03 19:20 ` Jim Quinlan
2020-06-03 19:20 ` [PATCH v3 01/13] PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB Jim Quinlan
2020-06-03 19:20 ` [PATCH v3 02/13] ata: ahci_brcm: Fix use of BCM7216 reset controller Jim Quinlan
2020-06-04  2:57   ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 03/13] dt-bindings: PCI: Add bindings for more Brcmstb chips Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 19:20 ` [PATCH v3 04/13] PCI: brcmstb: Add bcm7278 reigister info Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:28   ` Florian Fainelli
2020-06-03 20:28     ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 05/13] PCI: brcmstb: Add suspend and resume pm_ops Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:24   ` Bjorn Helgaas
2020-06-03 20:24     ` Bjorn Helgaas
2020-06-03 19:20 ` [PATCH v3 06/13] PCI: brcmstb: Add bcm7278 PERST support Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:28   ` Florian Fainelli
2020-06-03 20:28     ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 07/13] PCI: brcmstb: Add control of rescal reset Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:30   ` Florian Fainelli
2020-06-03 20:30     ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 08/13] of: Include a dev param in of_dma_get_range() Jim Quinlan
2020-06-03 19:20 ` [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan via iommu
2020-06-04 11:04   ` Dan Carpenter
2020-06-04 11:04     ` Dan Carpenter
2020-06-04 11:04     ` Dan Carpenter
2020-06-04 13:48     ` Jim Quinlan
2020-06-04 13:48       ` Jim Quinlan
2020-06-04 13:48       ` Jim Quinlan via iommu
2020-06-04 14:18       ` Dan Carpenter
2020-06-04 14:18         ` Dan Carpenter
2020-06-04 14:18         ` Dan Carpenter
2020-06-04 14:43         ` Jim Quinlan
2020-06-04 14:43           ` Jim Quinlan
2020-06-04 14:43           ` Jim Quinlan via iommu
2020-06-04 13:53   ` Nicolas Saenz Julienne
2020-06-04 13:53     ` Nicolas Saenz Julienne
2020-06-04 13:53     ` Nicolas Saenz Julienne
2020-06-04 14:35     ` Jim Quinlan
2020-06-04 14:35       ` Jim Quinlan
2020-06-04 14:35       ` Jim Quinlan via iommu
2020-06-04 15:05       ` Andy Shevchenko
2020-06-04 15:05         ` Andy Shevchenko
2020-06-04 15:05         ` Andy Shevchenko
2020-06-04 16:28         ` Jim Quinlan
2020-06-04 16:28           ` Jim Quinlan
2020-06-04 16:28           ` Jim Quinlan via iommu
2020-06-04 16:52       ` Nicolas Saenz Julienne
2020-06-04 16:52         ` Nicolas Saenz Julienne
2020-06-04 16:52         ` Nicolas Saenz Julienne
2020-06-04 18:01         ` Jim Quinlan
2020-06-04 18:01           ` Jim Quinlan
2020-06-04 18:01           ` Jim Quinlan via iommu
2020-06-05 17:27           ` Nicolas Saenz Julienne
2020-06-05 17:27             ` Nicolas Saenz Julienne
2020-06-05 17:27             ` Nicolas Saenz Julienne
2020-06-05 20:41             ` Jim Quinlan
2020-06-05 20:41               ` Jim Quinlan
2020-06-05 20:41               ` Jim Quinlan via iommu
2020-06-03 19:20 ` [PATCH v3 10/13] PCI: brcmstb: Set internal memory viewport sizes Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:33   ` Florian Fainelli
2020-06-03 20:33     ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 11/13] PCI: brcmstb: Accommodate MSI for older chips Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-04  2:56   ` Florian Fainelli
2020-06-04  2:56     ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 12/13] PCI: brcmstb: Set bus max burst size by chip type Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:06   ` Florian Fainelli
2020-06-03 20:06     ` Florian Fainelli
2020-06-03 19:20 ` [PATCH v3 13/13] PCI: brcmstb: Add bcm7211, bcm7216, bcm7445, bcm7278 to match list Jim Quinlan
2020-06-03 19:20   ` Jim Quinlan
2020-06-03 20:02   ` Florian Fainelli
2020-06-03 20:02     ` Florian Fainelli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.