linux-kernel.vger.kernel.org archive mirror
* [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms
@ 2017-08-24 20:37 Roy Pledge
  2017-08-24 20:37 ` [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations Roy Pledge
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Roy Pledge

This patch series enables DPAA1 QBMan devices on the ARM and
ARM64 architectures. This allows the LS1043A and LS1046A to use
QBMan functionality, which provides access to devices such as
Ethernet and cryptographic accelerators.

Changes since v3:
- Use memremap() instead of ioremap() for non iomem QBMan portal regions
- Ensured the __iomem attribute is respected when accessing iomem mapped regions
- Removed calls to flush/invalidate/prefetch for ARM/ARM64 since mapping is done as write combine

Changes since v2:
- Fixed some misspellings
- Added 'no-map' constraint to device tree bindings
- Described ordering constraint on regions in the device tree
- Removed confusing comment regarding non-shareable mappings
- Added warning if old reserved-memory technique is used on ARM

Changes since v1:
- Reworked private memory allocations to use shared-dma-pool on ARM platforms


Claudiu Manoil (2):
  soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check
  soc/fsl/qbman: Add missing headers on ARM

Madalin Bucur (4):
  soc/fsl/qbman: Drop set/clear_bits usage
  soc/fsl/qbman: add QMAN_REV32
  soc/fsl/qbman: different register offsets on ARM
  fsl/soc/qbman: Enable FSL_LAYERSCAPE config on ARM

Roy Pledge (4):
  soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations
  soc/fsl/qbman: Use shared-dma-pool for QMan private memory allocations
  dt-bindings: soc/fsl: Update reserved memory binding for QBMan
  soc/fsl/qbman: Rework portal mapping calls for ARM/PPC

Valentin Rothberg (1):
  soc/fsl/qbman: Fix ARM32 typo

 Documentation/devicetree/bindings/soc/fsl/bman.txt |  12 +-
 Documentation/devicetree/bindings/soc/fsl/qman.txt |  26 ++--
 drivers/soc/fsl/qbman/Kconfig                      |   2 +-
 drivers/soc/fsl/qbman/bman.c                       |  30 ++++-
 drivers/soc/fsl/qbman/bman_ccsr.c                  |  35 +++++-
 drivers/soc/fsl/qbman/bman_portal.c                |  36 ++++--
 drivers/soc/fsl/qbman/bman_priv.h                  |  11 +-
 drivers/soc/fsl/qbman/dpaa_sys.h                   |  14 +--
 drivers/soc/fsl/qbman/qman.c                       |  52 ++++++--
 drivers/soc/fsl/qbman/qman_ccsr.c                  | 140 ++++++++++++++++-----
 drivers/soc/fsl/qbman/qman_portal.c                |  36 ++++--
 drivers/soc/fsl/qbman/qman_priv.h                  |  13 +-
 drivers/soc/fsl/qbman/qman_test.h                  |   2 -
 13 files changed, 305 insertions(+), 104 deletions(-)

--
2.7.4


* [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-09-14 13:46   ` Catalin Marinas
  2017-08-24 20:37 ` [v4 02/11] soc/fsl/qbman: Use shared-dma-pool for QMan " Roy Pledge
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Roy Pledge

Use the shared-dma-pool mechanism to allocate the Free Buffer Proxy
Record (FBPR) area.
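
The fallback order used in the probe path can be modelled in plain C
outside the kernel. The sketch below is illustrative only: struct
fbpr_model, fbpr_resolve() and all field names are hypothetical and are
not driver symbols, and the DMA allocation is replaced by an address
supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of FBPR region selection; not driver code. */
struct fbpr_model {
	uint64_t legacy_addr;	/* set by the legacy "fsl,bman-fbpr" hook, else 0 */
	uint64_t legacy_size;
	int has_mem_region;	/* non-zero if a memory-region phandle exists */
	uint64_t pool_size;	/* "size" property of the referenced node */
	uint64_t pool_addr;	/* address an allocation would return, 0 on failure */
};

/* Returns 0 and fills addr/size on success, -1 if no region is usable. */
static int fbpr_resolve(const struct fbpr_model *m, uint64_t *addr,
			uint64_t *size)
{
	/* A region declared through the legacy compatible string wins. */
	if (m->legacy_addr) {
		*addr = m->legacy_addr;
		*size = m->legacy_size;
		return 0;
	}

	/* Otherwise fall back to the pool referenced by memory-region. */
	if (!m->has_mem_region || !m->pool_addr)
		return -1;

	*addr = m->pool_addr;
	*size = m->pool_size;
	return 0;
}
```

In this model a PPC device tree using the old compatible string keeps
its reserved address, while any other platform must provide a
memory-region phandle or the lookup fails.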

Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/bman_ccsr.c | 35 ++++++++++++++++++++++++++++++++++-
 drivers/soc/fsl/qbman/bman_priv.h |  3 +++
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/bman_ccsr.c b/drivers/soc/fsl/qbman/bman_ccsr.c
index eaa9585..2182236 100644
--- a/drivers/soc/fsl/qbman/bman_ccsr.c
+++ b/drivers/soc/fsl/qbman/bman_ccsr.c
@@ -170,10 +170,11 @@ static int fsl_bman_probe(struct platform_device *pdev)
 {
 	int ret, err_irq;
 	struct device *dev = &pdev->dev;
-	struct device_node *node = dev->of_node;
+	struct device_node *mem_node, *node = dev->of_node;
 	struct resource *res;
 	u16 id, bm_pool_cnt;
 	u8 major, minor;
+	u64 size;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	if (!res) {
@@ -201,6 +202,38 @@ static int fsl_bman_probe(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
+	/*
+	 * If FBPR memory wasn't defined using the qbman compatible string
+	 * try using the of_reserved_mem_device method
+	 */
+	if (!fbpr_a) {
+		ret = of_reserved_mem_device_init(dev);
+		if (ret) {
+			dev_err(dev, "of_reserved_mem_device_init() failed 0x%x\n",
+				ret);
+			return -ENODEV;
+		}
+		mem_node = of_parse_phandle(dev->of_node, "memory-region", 0);
+		if (mem_node) {
+			ret = of_property_read_u64(mem_node, "size", &size);
+			if (ret) {
+				dev_err(dev, "FBPR: of_property_read_u64 fails 0x%x\n",
+					ret);
+				return -ENODEV;
+			}
+			fbpr_sz = size;
+		} else {
+			dev_err(dev, "No memory-region found for FBPR\n");
+			return -ENODEV;
+		}
+		if (!dma_zalloc_coherent(dev, fbpr_sz, &fbpr_a, 0)) {
+			dev_err(dev, "Alloc FBPR memory failed\n");
+			return -ENODEV;
+		}
+	}
+
+	dev_dbg(dev, "Allocated FBPR 0x%llx 0x%zx\n", fbpr_a, fbpr_sz);
+
 	bm_set_memory(fbpr_a, fbpr_sz);
 
 	err_irq = platform_get_irq(pdev, 0);
diff --git a/drivers/soc/fsl/qbman/bman_priv.h b/drivers/soc/fsl/qbman/bman_priv.h
index f6896a2..765a4bf 100644
--- a/drivers/soc/fsl/qbman/bman_priv.h
+++ b/drivers/soc/fsl/qbman/bman_priv.h
@@ -33,6 +33,9 @@
 #include "dpaa_sys.h"
 
 #include <soc/fsl/bman.h>
+#include <linux/dma-contiguous.h>
+#include <linux/of_address.h>
+#include <linux/dma-mapping.h>
 
 /* Portal processing (interrupt) sources */
 #define BM_PIRQ_RCRI	0x00000002	/* RCR Ring (below threshold) */
-- 
2.7.4


* [v4 02/11] soc/fsl/qbman: Use shared-dma-pool for QMan private memory allocations
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
  2017-08-24 20:37 ` [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-08-24 20:37 ` [v4 03/11] dt-bindings: soc/fsl: Update reserved memory binding for QBMan Roy Pledge
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Roy Pledge

Use the shared-dma-pool mechanism to allocate the Frame Queue
Descriptor (FQD) and Packed Frame Descriptor Record (PFDR) areas.

Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/qman_ccsr.c | 138 +++++++++++++++++++++++++++++---------
 drivers/soc/fsl/qbman/qman_priv.h |   4 +-
 drivers/soc/fsl/qbman/qman_test.h |   2 -
 3 files changed, 109 insertions(+), 35 deletions(-)

diff --git a/drivers/soc/fsl/qbman/qman_ccsr.c b/drivers/soc/fsl/qbman/qman_ccsr.c
index 835ce94..20a1ebd 100644
--- a/drivers/soc/fsl/qbman/qman_ccsr.c
+++ b/drivers/soc/fsl/qbman/qman_ccsr.c
@@ -401,21 +401,42 @@ static int qm_init_pfdr(struct device *dev, u32 pfdr_start, u32 num)
 }
 
 /*
- * Ideally we would use the DMA API to turn rmem->base into a DMA address
- * (especially if iommu translations ever get involved).  Unfortunately, the
- * DMA API currently does not allow mapping anything that is not backed with
- * a struct page.
+ * QMan needs two global memory areas initialized at boot time:
+ *  1) FQD: Frame Queue Descriptors used to manage frame queues
+ *  2) PFDR: Packed Frame Descriptor Records used to store frames
+ * Both areas are reserved using the device tree reserved memory framework
+ * and the addresses and sizes are initialized when the QMan device is probed
  */
 static dma_addr_t fqd_a, pfdr_a;
 static size_t fqd_sz, pfdr_sz;
 
+#ifdef CONFIG_PPC
+/*
+ * Support for PPC Device Tree backward compatibility when compatible
+ * string is set to fsl,qman-fqd and fsl,qman-pfdr
+ */
+static int zero_priv_mem(phys_addr_t addr, size_t sz)
+{
+	/* map as cacheable, non-guarded */
+	void __iomem *tmpp = ioremap_prot(addr, sz, 0);
+
+	if (!tmpp)
+		return -ENOMEM;
+
+	memset_io(tmpp, 0, sz);
+	flush_dcache_range((unsigned long)tmpp,
+			   (unsigned long)tmpp + sz);
+	iounmap(tmpp);
+
+	return 0;
+}
+
 static int qman_fqd(struct reserved_mem *rmem)
 {
 	fqd_a = rmem->base;
 	fqd_sz = rmem->size;
 
 	WARN_ON(!(fqd_a && fqd_sz));
-
 	return 0;
 }
 RESERVEDMEM_OF_DECLARE(qman_fqd, "fsl,qman-fqd", qman_fqd);
@@ -431,32 +452,13 @@ static int qman_pfdr(struct reserved_mem *rmem)
 }
 RESERVEDMEM_OF_DECLARE(qman_pfdr, "fsl,qman-pfdr", qman_pfdr);
 
+#endif
+
 static unsigned int qm_get_fqid_maxcnt(void)
 {
 	return fqd_sz / 64;
 }
 
-/*
- * Flush this memory range from data cache so that QMAN originated
- * transactions for this memory region could be marked non-coherent.
- */
-static int zero_priv_mem(struct device *dev, struct device_node *node,
-			 phys_addr_t addr, size_t sz)
-{
-	/* map as cacheable, non-guarded */
-	void __iomem *tmpp = ioremap_prot(addr, sz, 0);
-
-	if (!tmpp)
-		return -ENOMEM;
-
-	memset_io(tmpp, 0, sz);
-	flush_dcache_range((unsigned long)tmpp,
-			   (unsigned long)tmpp + sz);
-	iounmap(tmpp);
-
-	return 0;
-}
-
 static void log_edata_bits(struct device *dev, u32 bit_count)
 {
 	u32 i, j, mask = 0xffffffff;
@@ -687,11 +689,12 @@ static int qman_resource_init(struct device *dev)
 static int fsl_qman_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
-	struct device_node *node = dev->of_node;
+	struct device_node *mem_node, *node = dev->of_node;
 	struct resource *res;
 	int ret, err_irq;
 	u16 id;
 	u8 major, minor;
+	u64 size;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	if (!res) {
@@ -727,10 +730,83 @@ static int fsl_qman_probe(struct platform_device *pdev)
 		qm_channel_caam = QMAN_CHANNEL_CAAM_REV3;
 	}
 
-	ret = zero_priv_mem(dev, node, fqd_a, fqd_sz);
-	WARN_ON(ret);
-	if (ret)
-		return -ENODEV;
+	if (fqd_a) {
+#ifdef CONFIG_PPC
+		/*
+		 * For PPC backward DT compatibility
+		 * FQD memory MUST be zero'd by software
+		 */
+		zero_priv_mem(fqd_a, fqd_sz);
+#else
+		WARN(1, "Unexpected architecture using non shared-dma-mem reservations");
+#endif
+	} else {
+		/*
+		 * Order of memory regions is assumed as FQD followed by PFDR
+		 * in order to ensure allocations from the correct regions the
+		 * driver initializes then allocates each piece in order
+		 */
+		ret = of_reserved_mem_device_init_by_idx(dev, dev->of_node, 0);
+		if (ret) {
+			dev_err(dev, "of_reserved_mem_device_init_by_idx(0) failed 0x%x\n",
+				ret);
+			return -ENODEV;
+		}
+		mem_node = of_parse_phandle(dev->of_node, "memory-region", 0);
+		if (mem_node) {
+			ret = of_property_read_u64(mem_node, "size", &size);
+			if (ret) {
+				dev_err(dev, "FQD: of_property_read_u64 fails 0x%x\n",
+					ret);
+				return -ENODEV;
+			}
+			fqd_sz = size;
+		} else {
+			dev_err(dev, "No memory-region found for FQD\n");
+			return -ENODEV;
+		}
+		if (!dma_zalloc_coherent(dev, fqd_sz, &fqd_a, 0)) {
+			dev_err(dev, "Alloc FQD memory failed\n");
+			return -ENODEV;
+		}
+
+		/*
+		 * Disassociate the FQD reserved memory area from the device
+		 * because a device can only have one DMA memory area. This
+		 * should be fine since the memory is allocated and initialized
+		 * and only ever accessed by the QMan device from now on
+		 */
+		of_reserved_mem_device_release(dev);
+	}
+	dev_dbg(dev, "Allocated FQD 0x%llx 0x%zx\n", fqd_a, fqd_sz);
+
+	if (!pfdr_a) {
+		/* Setup PFDR memory */
+		ret = of_reserved_mem_device_init_by_idx(dev, dev->of_node, 1);
+		if (ret) {
+			dev_err(dev, "of_reserved_mem_device_init_by_idx(1) failed 0x%x\n",
+				ret);
+			return -ENODEV;
+		}
+		mem_node = of_parse_phandle(dev->of_node, "memory-region", 1);
+		if (mem_node) {
+			ret = of_property_read_u64(mem_node, "size", &size);
+			if (ret) {
+				dev_err(dev, "PFDR: of_property_read_u64 fails 0x%x\n",
+					ret);
+				return -ENODEV;
+			}
+			pfdr_sz = size;
+		} else {
+			dev_err(dev, "No memory-region found for PFDR\n");
+			return -ENODEV;
+		}
+		if (!dma_zalloc_coherent(dev, pfdr_sz, &pfdr_a, 0)) {
+			dev_err(dev, "Alloc PFDR Failed size 0x%zx\n", pfdr_sz);
+			return -ENODEV;
+		}
+	}
+	dev_info(dev, "Allocated PFDR 0x%llx 0x%zx\n", pfdr_a, pfdr_sz);
 
 	ret = qman_init_ccsr(dev);
 	if (ret) {
diff --git a/drivers/soc/fsl/qbman/qman_priv.h b/drivers/soc/fsl/qbman/qman_priv.h
index 5fe9faf..957ef54 100644
--- a/drivers/soc/fsl/qbman/qman_priv.h
+++ b/drivers/soc/fsl/qbman/qman_priv.h
@@ -28,13 +28,13 @@
  * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
 #include "dpaa_sys.h"
 
 #include <soc/fsl/qman.h>
 #include <linux/dma-mapping.h>
 #include <linux/iommu.h>
+#include <linux/dma-contiguous.h>
+#include <linux/of_address.h>
 
 #if defined(CONFIG_FSL_PAMU)
 #include <asm/fsl_pamu_stash.h>
diff --git a/drivers/soc/fsl/qbman/qman_test.h b/drivers/soc/fsl/qbman/qman_test.h
index d5f8cb2..41bdbc48 100644
--- a/drivers/soc/fsl/qbman/qman_test.h
+++ b/drivers/soc/fsl/qbman/qman_test.h
@@ -30,7 +30,5 @@
 
 #include "qman_priv.h"
 
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
 int qman_test_stash(void);
 int qman_test_api(void);
-- 
2.7.4


* [v4 03/11] dt-bindings: soc/fsl: Update reserved memory binding for QBMan
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
  2017-08-24 20:37 ` [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations Roy Pledge
  2017-08-24 20:37 ` [v4 02/11] soc/fsl/qbman: Use shared-dma-pool for QMan " Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-09-14 13:47   ` Catalin Marinas
  2017-08-24 20:37 ` [v4 04/11] soc/fsl/qbman: Drop set/clear_bits usage Roy Pledge
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Roy Pledge

Update the QMan and BMan device tree bindings for reserved memory
nodes. This makes the reserved memory allocation compatible with
shared-dma-pool usage.
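
The bindings keep the rule that each private memory size is
2^(size + 1), with the exponent range depending on the block (11..33 for
BMan FBPR, 11..29 for QMan FQD and PFDR). A minimal userspace sketch of
that check follows; qbman_priv_mem_size_ok() is a hypothetical helper
written for illustration and does not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check the "2^(size + 1), size = min_exp..max_exp" rule quoted in the
 * bindings. Hypothetical helper for illustration; not part of the driver.
 */
static bool qbman_priv_mem_size_ok(uint64_t bytes, unsigned int min_exp,
				   unsigned int max_exp)
{
	unsigned int log2_bytes;

	/* The size must be a non-zero power of two. */
	if (bytes == 0 || (bytes & (bytes - 1)) != 0)
		return false;

	/* Count shifts until only the top bit remains: log2(bytes). */
	for (log2_bytes = 0; (bytes >> log2_bytes) != 1; log2_bytes++)
		;

	/* bytes == 2^(size + 1), so compare against the shifted range. */
	return log2_bytes >= min_exp + 1 && log2_bytes <= max_exp + 1;
}
```

Under this reading, the example sizes in the bindings (0x1000000 for
FBPR, 0x400000 for FQD, 0x2000000 for PFDR) are all powers of two inside
their respective ranges.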

Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 Documentation/devicetree/bindings/soc/fsl/bman.txt | 12 +++++-----
 Documentation/devicetree/bindings/soc/fsl/qman.txt | 26 ++++++++++++++++------
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/soc/fsl/bman.txt b/Documentation/devicetree/bindings/soc/fsl/bman.txt
index 47ac834..48eed14 100644
--- a/Documentation/devicetree/bindings/soc/fsl/bman.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/bman.txt
@@ -65,8 +65,8 @@ to the respective BMan instance
 BMan Private Memory Node
 
 BMan requires a contiguous range of physical memory used for the backing store
-for BMan Free Buffer Proxy Records (FBPR). This memory is reserved/allocated as a
-node under the /reserved-memory node
+for BMan Free Buffer Proxy Records (FBPR). This memory is reserved/allocated as
+a node under the /reserved-memory node.
 
 The BMan FBPR memory node must be named "bman-fbpr"
 
@@ -75,7 +75,9 @@ PROPERTIES
 - compatible
 	Usage:		required
 	Value type:	<stringlist>
-	Definition:	Must inclide "fsl,bman-fbpr"
+	Definition:	PPC platforms: Must include "fsl,bman-fbpr"
+			ARM platforms: Must include "shared-dma-pool"
+				       as well as the "no-map" property
 
 The following constraints are relevant to the FBPR private memory:
 	- The size must be 2^(size + 1), with size = 11..33. That is 4 KiB to
@@ -100,10 +102,10 @@ The example below shows a BMan FBPR dynamic allocation memory node
 		ranges;
 
 		bman_fbpr: bman-fbpr {
-			compatible = "fsl,bman-fbpr";
-			alloc-ranges = <0 0 0x10 0>;
+			compatible = "shared-dma-pool";
 			size = <0 0x1000000>;
 			alignment = <0 0x1000000>;
+			no-map;
 		};
 	};
 
diff --git a/Documentation/devicetree/bindings/soc/fsl/qman.txt b/Documentation/devicetree/bindings/soc/fsl/qman.txt
index 556ebb8..ee96afd 100644
--- a/Documentation/devicetree/bindings/soc/fsl/qman.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/qman.txt
@@ -60,6 +60,12 @@ are located at offsets 0xbf8 and 0xbfc
 	Value type:	<prop-encoded-array>
 	Definition:	Reference input clock. Its frequency is half of the
 			platform clock
+- memory-region
+	Usage:		Required for ARM
+	Value type:	<phandle array>
+	Definition:	List of phandles referencing the QMan private memory
+			nodes (described below). The qman-fqd node must be
+			first followed by qman-pfdr node. Only used on ARM
 
 Devices connected to a QMan instance via Direct Connect Portals (DCP) must link
 to the respective QMan instance
@@ -74,7 +80,9 @@ QMan Private Memory Nodes
 
 QMan requires two contiguous range of physical memory used for the backing store
 for QMan Frame Queue Descriptor (FQD) and Packed Frame Descriptor Record (PFDR).
-This memory is reserved/allocated as a nodes under the /reserved-memory node
+This memory is reserved/allocated as a node under the /reserved-memory node.
+
+For additional details about reserved memory regions see reserved-memory.txt
 
 The QMan FQD memory node must be named "qman-fqd"
 
@@ -83,7 +91,9 @@ PROPERTIES
 - compatible
 	Usage:		required
 	Value type:	<stringlist>
-	Definition:	Must inclide "fsl,qman-fqd"
+	Definition:	PPC platforms: Must include "fsl,qman-fqd"
+			ARM platforms: Must include "shared-dma-pool"
+				       as well as the "no-map" property
 
 The QMan PFDR memory node must be named "qman-pfdr"
 
@@ -92,7 +102,9 @@ PROPERTIES
 - compatible
 	Usage:		required
 	Value type:	<stringlist>
-	Definition:	Must inclide "fsl,qman-pfdr"
+	Definition:	PPC platforms: Must include "fsl,qman-pfdr"
+			ARM platforms: Must include "shared-dma-pool"
+				       as well as the "no-map" property
 
 The following constraints are relevant to the FQD and PFDR private memory:
 	- The size must be 2^(size + 1), with size = 11..29. That is 4 KiB to
@@ -117,16 +129,16 @@ The example below shows a QMan FQD and a PFDR dynamic allocation memory nodes
 		ranges;
 
 		qman_fqd: qman-fqd {
-			compatible = "fsl,qman-fqd";
-			alloc-ranges = <0 0 0x10 0>;
+			compatible = "shared-dma-pool";
 			size = <0 0x400000>;
 			alignment = <0 0x400000>;
+			no-map;
 		};
 		qman_pfdr: qman-pfdr {
-			compatible = "fsl,qman-pfdr";
-			alloc-ranges = <0 0 0x10 0>;
+			compatible = "shared-dma-pool";
 			size = <0 0x2000000>;
 			alignment = <0 0x2000000>;
+			no-map;
 		};
 	};
 
-- 
2.7.4


* [v4 04/11] soc/fsl/qbman: Drop set/clear_bits usage
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (2 preceding siblings ...)
  2017-08-24 20:37 ` [v4 03/11] dt-bindings: soc/fsl: Update reserved memory binding for QBMan Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-08-24 20:37 ` [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check Roy Pledge
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Claudiu Manoil, Roy Pledge

From: Madalin Bucur <madalin.bucur@nxp.com>

Replace the PPC-specific set/clear_bits API with standard
bit twiddling so the driver is portable outside PPC.
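
As a standalone illustration of the replacement pattern, the sketch
below uses a hypothetical struct flag_holder in place of the driver's
structures; only the |= and &= ~ idioms are taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a driver object carrying a flag word. */
struct flag_holder {
	uint32_t flags;
};

/* Set every bit present in mask. */
static inline void holder_set(struct flag_holder *h, uint32_t mask)
{
	h->flags |= mask;
}

/* Clear every bit present in mask, leaving the others untouched. */
static inline void holder_clear(struct flag_holder *h, uint32_t mask)
{
	h->flags &= ~mask;
}

/* Report whether any bit in mask is currently set. */
static inline int holder_isset(const struct flag_holder *h, uint32_t mask)
{
	return (h->flags & mask) != 0;
}
```

A plain read-modify-write like this is not atomic, so callers that can
race must serialize the update themselves; the irq_sources call sites in
the diff below run with local interrupts disabled.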

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/bman.c | 2 +-
 drivers/soc/fsl/qbman/qman.c | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/soc/fsl/qbman/bman.c b/drivers/soc/fsl/qbman/bman.c
index 604e45c..ff8998f 100644
--- a/drivers/soc/fsl/qbman/bman.c
+++ b/drivers/soc/fsl/qbman/bman.c
@@ -616,7 +616,7 @@ int bman_p_irqsource_add(struct bman_portal *p, u32 bits)
 	unsigned long irqflags;
 
 	local_irq_save(irqflags);
-	set_bits(bits & BM_PIRQ_VISIBLE, &p->irq_sources);
+	p->irq_sources |= bits & BM_PIRQ_VISIBLE;
 	bm_out(&p->p, BM_REG_IER, p->irq_sources);
 	local_irq_restore(irqflags);
 	return 0;
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 1bcfc51..25419e1 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -908,12 +908,12 @@ static inline int qm_mc_result_timeout(struct qm_portal *portal,
 
 static inline void fq_set(struct qman_fq *fq, u32 mask)
 {
-	set_bits(mask, &fq->flags);
+	fq->flags |= mask;
 }
 
 static inline void fq_clear(struct qman_fq *fq, u32 mask)
 {
-	clear_bits(mask, &fq->flags);
+	fq->flags &= ~mask;
 }
 
 static inline int fq_isset(struct qman_fq *fq, u32 mask)
@@ -1574,7 +1574,7 @@ void qman_p_irqsource_add(struct qman_portal *p, u32 bits)
 	unsigned long irqflags;
 
 	local_irq_save(irqflags);
-	set_bits(bits & QM_PIRQ_VISIBLE, &p->irq_sources);
+	p->irq_sources |= bits & QM_PIRQ_VISIBLE;
 	qm_out(&p->p, QM_REG_IER, p->irq_sources);
 	local_irq_restore(irqflags);
 }
@@ -1597,7 +1597,7 @@ void qman_p_irqsource_remove(struct qman_portal *p, u32 bits)
 	 */
 	local_irq_save(irqflags);
 	bits &= QM_PIRQ_VISIBLE;
-	clear_bits(bits, &p->irq_sources);
+	p->irq_sources &= ~bits;
 	qm_out(&p->p, QM_REG_IER, p->irq_sources);
 	ier = qm_in(&p->p, QM_REG_IER);
 	/*
-- 
2.7.4


* [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (3 preceding siblings ...)
  2017-08-24 20:37 ` [v4 04/11] soc/fsl/qbman: Drop set/clear_bits usage Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-09-14 13:49   ` Catalin Marinas
  2017-08-24 20:37 ` [v4 06/11] soc/fsl/qbman: Fix ARM32 typo Roy Pledge
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Claudiu Manoil, Roy Pledge

From: Claudiu Manoil <claudiu.manoil@nxp.com>

The compile-time check on L1_CACHE_BYTES is not relevant and is
architecture dependent. It is overkill for PPC.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/dpaa_sys.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
index 2ce394a..f85c319 100644
--- a/drivers/soc/fsl/qbman/dpaa_sys.h
+++ b/drivers/soc/fsl/qbman/dpaa_sys.h
@@ -49,10 +49,6 @@
 #define DPAA_PORTAL_CE 0
 #define DPAA_PORTAL_CI 1
 
-#if (L1_CACHE_BYTES != 32) && (L1_CACHE_BYTES != 64)
-#error "Unsupported Cacheline Size"
-#endif
-
 static inline void dpaa_flush(void *p)
 {
 #ifdef CONFIG_PPC
-- 
2.7.4


* [v4 06/11] soc/fsl/qbman: Fix ARM32 typo
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (4 preceding siblings ...)
  2017-08-24 20:37 ` [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-08-24 20:37 ` [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC Roy Pledge
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Valentin Rothberg, Claudiu Manoil, Roy Pledge

From: Valentin Rothberg <valentinrothberg@gmail.com>

The Kconfig symbol for 32bit ARM is 'ARM', not 'ARM32'.

Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/dpaa_sys.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
index f85c319..81a9a5e 100644
--- a/drivers/soc/fsl/qbman/dpaa_sys.h
+++ b/drivers/soc/fsl/qbman/dpaa_sys.h
@@ -53,7 +53,7 @@ static inline void dpaa_flush(void *p)
 {
 #ifdef CONFIG_PPC
 	flush_dcache_range((unsigned long)p, (unsigned long)p+64);
-#elif defined(CONFIG_ARM32)
+#elif defined(CONFIG_ARM)
 	__cpuc_flush_dcache_area(p, 64);
 #elif defined(CONFIG_ARM64)
 	__flush_dcache_area(p, 64);
-- 
2.7.4


* [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (5 preceding siblings ...)
  2017-08-24 20:37 ` [v4 06/11] soc/fsl/qbman: Fix ARM32 typo Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-09-14 14:00   ` Catalin Marinas
  2017-08-24 20:37 ` [v4 08/11] soc/fsl/qbman: add QMAN_REV32 Roy Pledge
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Roy Pledge

Rework portal mapping for PPC and ARM. The PPC devices require a
cacheable coherent mapping while ARM will work with a
non-cacheable/write-combine mapping. This also eliminates the need for
manual cache flushes on ARM.
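
The relationship between the chosen mapping and the need for manual
flushes can be summarized with a small model. Everything in the sketch
below (the enums, struct portal_policy and portal_policy_for()) is
hypothetical and is not kernel API; it only restates the policy
described above.

```c
#include <stdbool.h>

/* Hypothetical model of the per-architecture policy; not kernel API. */
enum model_arch { MODEL_PPC, MODEL_ARM, MODEL_ARM64 };
enum model_ce_attr { MODEL_WRITE_BACK, MODEL_WRITE_COMBINE };

struct portal_policy {
	enum model_ce_attr ce_attr;	/* attribute of the cache-enabled region */
	bool manual_flush;		/* whether software flushes cache lines */
};

static struct portal_policy portal_policy_for(enum model_arch arch)
{
	struct portal_policy p;

	if (arch == MODEL_PPC) {
		/* Cacheable mapping, so writes must be flushed explicitly. */
		p.ce_attr = MODEL_WRITE_BACK;
		p.manual_flush = true;
	} else {
		/* Write-combine mapping is not cached; no flush is needed. */
		p.ce_attr = MODEL_WRITE_COMBINE;
		p.manual_flush = false;
	}
	return p;
}
```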

Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/bman.c        |  6 +++---
 drivers/soc/fsl/qbman/bman_portal.c | 36 +++++++++++++++++++++++-------------
 drivers/soc/fsl/qbman/bman_priv.h   |  8 +++-----
 drivers/soc/fsl/qbman/dpaa_sys.h    |  8 ++++----
 drivers/soc/fsl/qbman/qman.c        |  6 +++---
 drivers/soc/fsl/qbman/qman_portal.c | 36 +++++++++++++++++++++++-------------
 drivers/soc/fsl/qbman/qman_priv.h   |  8 +++-----
 7 files changed, 62 insertions(+), 46 deletions(-)

diff --git a/drivers/soc/fsl/qbman/bman.c b/drivers/soc/fsl/qbman/bman.c
index ff8998f..e31c843 100644
--- a/drivers/soc/fsl/qbman/bman.c
+++ b/drivers/soc/fsl/qbman/bman.c
@@ -154,7 +154,7 @@ struct bm_mc {
 };
 
 struct bm_addr {
-	void __iomem *ce;	/* cache-enabled */
+	void *ce;		/* cache-enabled */
 	void __iomem *ci;	/* cache-inhibited */
 };
 
@@ -512,8 +512,8 @@ static int bman_create_portal(struct bman_portal *portal,
 	 * config, everything that follows depends on it and "config" is more
 	 * for (de)reference...
 	 */
-	p->addr.ce = c->addr_virt[DPAA_PORTAL_CE];
-	p->addr.ci = c->addr_virt[DPAA_PORTAL_CI];
+	p->addr.ce = c->addr_virt_ce;
+	p->addr.ci = c->addr_virt_ci;
 	if (bm_rcr_init(p, bm_rcr_pvb, bm_rcr_cce)) {
 		dev_err(c->dev, "RCR initialisation failed\n");
 		goto fail_rcr;
diff --git a/drivers/soc/fsl/qbman/bman_portal.c b/drivers/soc/fsl/qbman/bman_portal.c
index 39b39c8..bb03503 100644
--- a/drivers/soc/fsl/qbman/bman_portal.c
+++ b/drivers/soc/fsl/qbman/bman_portal.c
@@ -91,7 +91,6 @@ static int bman_portal_probe(struct platform_device *pdev)
 	struct device_node *node = dev->of_node;
 	struct bm_portal_config *pcfg;
 	struct resource *addr_phys[2];
-	void __iomem *va;
 	int irq, cpu;
 
 	pcfg = devm_kmalloc(dev, sizeof(*pcfg), GFP_KERNEL);
@@ -123,23 +122,34 @@ static int bman_portal_probe(struct platform_device *pdev)
 	}
 	pcfg->irq = irq;
 
-	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
-	if (!va) {
-		dev_err(dev, "ioremap::CE failed\n");
+	/*
+	 * TODO: Ultimately we would like to use a cacheable/non-shareable
+	 * (coherent) mapping for the portal on both architectures but that
+	 * isn't currently available in the kernel.  Because of HW differences
+	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
+	 * cacheable mappings
+	 */
+#ifdef CONFIG_PPC
+	/* PPC requires a cacheable/non-coherent mapping of the portal */
+	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
+				resource_size(addr_phys[0]), MEMREMAP_WB);
+#else
+	/* ARM can use a write combine mapping. */
+	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
+				resource_size(addr_phys[0]), MEMREMAP_WC);
+#endif
+	if (!pcfg->addr_virt_ce) {
+		dev_err(dev, "memremap::CE failed\n");
 		goto err_ioremap1;
 	}
 
-	pcfg->addr_virt[DPAA_PORTAL_CE] = va;
-
-	va = ioremap_prot(addr_phys[1]->start, resource_size(addr_phys[1]),
-			  _PAGE_GUARDED | _PAGE_NO_CACHE);
-	if (!va) {
+	pcfg->addr_virt_ci = ioremap(addr_phys[1]->start,
+					resource_size(addr_phys[1]));
+	if (!pcfg->addr_virt_ci) {
 		dev_err(dev, "ioremap::CI failed\n");
 		goto err_ioremap2;
 	}
 
-	pcfg->addr_virt[DPAA_PORTAL_CI] = va;
-
 	spin_lock(&bman_lock);
 	cpu = cpumask_next_zero(-1, &portal_cpus);
 	if (cpu >= nr_cpu_ids) {
@@ -164,9 +174,9 @@ static int bman_portal_probe(struct platform_device *pdev)
 	return 0;
 
 err_portal_init:
-	iounmap(pcfg->addr_virt[DPAA_PORTAL_CI]);
+	iounmap(pcfg->addr_virt_ci);
 err_ioremap2:
-	iounmap(pcfg->addr_virt[DPAA_PORTAL_CE]);
+	memunmap(pcfg->addr_virt_ce);
 err_ioremap1:
 	return -ENXIO;
 }
diff --git a/drivers/soc/fsl/qbman/bman_priv.h b/drivers/soc/fsl/qbman/bman_priv.h
index 765a4bf..c48e6eb 100644
--- a/drivers/soc/fsl/qbman/bman_priv.h
+++ b/drivers/soc/fsl/qbman/bman_priv.h
@@ -49,11 +49,9 @@ extern u16 bman_ip_rev;	/* 0 if uninitialised, otherwise BMAN_REVx */
 extern struct gen_pool *bm_bpalloc;
 
 struct bm_portal_config {
-	/*
-	 * Corenet portal addresses;
-	 * [0]==cache-enabled, [1]==cache-inhibited.
-	 */
-	void __iomem *addr_virt[2];
+	/* Portal addresses */
+	void *addr_virt_ce;
+	void __iomem *addr_virt_ci;
 	/* Allow these to be joined in lists */
 	struct list_head list;
 	struct device *dev;
diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
index 81a9a5e..0a1d573 100644
--- a/drivers/soc/fsl/qbman/dpaa_sys.h
+++ b/drivers/soc/fsl/qbman/dpaa_sys.h
@@ -51,12 +51,12 @@
 
 static inline void dpaa_flush(void *p)
 {
+	/*
+	 * Only PPC needs to flush the cache currently - on ARM the mapping
+	 * is non cacheable
+	 */
 #ifdef CONFIG_PPC
 	flush_dcache_range((unsigned long)p, (unsigned long)p+64);
-#elif defined(CONFIG_ARM)
-	__cpuc_flush_dcache_area(p, 64);
-#elif defined(CONFIG_ARM64)
-	__flush_dcache_area(p, 64);
 #endif
 }
 
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 25419e1..668fab1 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -300,7 +300,7 @@ struct qm_mc {
 };
 
 struct qm_addr {
-	void __iomem *ce;	/* cache-enabled */
+	void *ce;		/* cache-enabled */
 	void __iomem *ci;	/* cache-inhibited */
 };
 
@@ -1123,8 +1123,8 @@ static int qman_create_portal(struct qman_portal *portal,
 	 * config, everything that follows depends on it and "config" is more
 	 * for (de)reference
 	 */
-	p->addr.ce = c->addr_virt[DPAA_PORTAL_CE];
-	p->addr.ci = c->addr_virt[DPAA_PORTAL_CI];
+	p->addr.ce = c->addr_virt_ce;
+	p->addr.ci = c->addr_virt_ci;
 	/*
 	 * If CI-stashing is used, the current defaults use a threshold of 3,
 	 * and stash with high-than-DQRR priority.
diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
index cbacdf4..41fe33a 100644
--- a/drivers/soc/fsl/qbman/qman_portal.c
+++ b/drivers/soc/fsl/qbman/qman_portal.c
@@ -224,7 +224,6 @@ static int qman_portal_probe(struct platform_device *pdev)
 	struct device_node *node = dev->of_node;
 	struct qm_portal_config *pcfg;
 	struct resource *addr_phys[2];
-	void __iomem *va;
 	int irq, cpu, err;
 	u32 val;
 
@@ -262,23 +261,34 @@ static int qman_portal_probe(struct platform_device *pdev)
 	}
 	pcfg->irq = irq;
 
-	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
-	if (!va) {
-		dev_err(dev, "ioremap::CE failed\n");
+	/*
+	 * TODO: Ultimately we would like to use a cacheable/non-shareable
+	 * (coherent) mapping for the portal on both architectures but that
+	 * isn't currently available in the kernel.  Because of HW differences
+	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
+	 * cacheable mappings
+	 */
+#ifdef CONFIG_PPC
+	/* PPC requires a cacheable mapping of the portal */
+	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
+				resource_size(addr_phys[0]), MEMREMAP_WB);
+#else
+	/* ARM can use write combine mapping for the cacheable area */
+	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
+				resource_size(addr_phys[0]), MEMREMAP_WT);
+#endif
+	if (!pcfg->addr_virt_ce) {
+		dev_err(dev, "memremap::CE failed\n");
 		goto err_ioremap1;
 	}
 
-	pcfg->addr_virt[DPAA_PORTAL_CE] = va;
-
-	va = ioremap_prot(addr_phys[1]->start, resource_size(addr_phys[1]),
-			  _PAGE_GUARDED | _PAGE_NO_CACHE);
-	if (!va) {
+	pcfg->addr_virt_ci = ioremap(addr_phys[1]->start,
+				resource_size(addr_phys[1]));
+	if (!pcfg->addr_virt_ci) {
 		dev_err(dev, "ioremap::CI failed\n");
 		goto err_ioremap2;
 	}
 
-	pcfg->addr_virt[DPAA_PORTAL_CI] = va;
-
 	pcfg->pools = qm_get_pools_sdqcr();
 
 	spin_lock(&qman_lock);
@@ -310,9 +320,9 @@ static int qman_portal_probe(struct platform_device *pdev)
 	return 0;
 
 err_portal_init:
-	iounmap(pcfg->addr_virt[DPAA_PORTAL_CI]);
+	iounmap(pcfg->addr_virt_ci);
 err_ioremap2:
-	iounmap(pcfg->addr_virt[DPAA_PORTAL_CE]);
+	memunmap(pcfg->addr_virt_ce);
 err_ioremap1:
 	return -ENXIO;
 }
diff --git a/drivers/soc/fsl/qbman/qman_priv.h b/drivers/soc/fsl/qbman/qman_priv.h
index 957ef54..bab7f15 100644
--- a/drivers/soc/fsl/qbman/qman_priv.h
+++ b/drivers/soc/fsl/qbman/qman_priv.h
@@ -155,11 +155,9 @@ static inline void qman_cgrs_xor(struct qman_cgrs *dest,
 void qman_init_cgr_all(void);
 
 struct qm_portal_config {
-	/*
-	 * Corenet portal addresses;
-	 * [0]==cache-enabled, [1]==cache-inhibited.
-	 */
-	void __iomem *addr_virt[2];
+	/* Portal addresses */
+	void *addr_virt_ce;
+	void __iomem *addr_virt_ci;
 	struct device *dev;
 	struct iommu_domain *iommu_domain;
 	/* Allow these to be joined in lists */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [v4 08/11] soc/fsl/qbman: add QMAN_REV32
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (6 preceding siblings ...)
  2017-08-24 20:37 ` [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-08-24 20:37 ` [v4 09/11] soc/fsl/qbman: different register offsets on ARM Roy Pledge
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Roy Pledge

From: Madalin Bucur <madalin.bucur@nxp.com>

Add revision 3.2 of the QBMan block.  This is the version
for LS1043A and LS1046A SoCs.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/qman_ccsr.c | 2 ++
 drivers/soc/fsl/qbman/qman_priv.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/soc/fsl/qbman/qman_ccsr.c b/drivers/soc/fsl/qbman/qman_ccsr.c
index 20a1ebd..bbe3975 100644
--- a/drivers/soc/fsl/qbman/qman_ccsr.c
+++ b/drivers/soc/fsl/qbman/qman_ccsr.c
@@ -720,6 +720,8 @@ static int fsl_qman_probe(struct platform_device *pdev)
 		qman_ip_rev = QMAN_REV30;
 	else if (major == 3 && minor == 1)
 		qman_ip_rev = QMAN_REV31;
+	else if (major == 3 && minor == 2)
+		qman_ip_rev = QMAN_REV32;
 	else {
 		dev_err(dev, "Unknown QMan version\n");
 		return -ENODEV;
diff --git a/drivers/soc/fsl/qbman/qman_priv.h b/drivers/soc/fsl/qbman/qman_priv.h
index bab7f15..8f715fa 100644
--- a/drivers/soc/fsl/qbman/qman_priv.h
+++ b/drivers/soc/fsl/qbman/qman_priv.h
@@ -185,6 +185,7 @@ struct qm_portal_config {
 #define QMAN_REV20 0x0200
 #define QMAN_REV30 0x0300
 #define QMAN_REV31 0x0301
+#define QMAN_REV32 0x0302
 extern u16 qman_ip_rev; /* 0 if uninitialised, otherwise QMAN_REVx */
 
 #define QM_FQID_RANGE_START 1 /* FQID 0 reserved for internal use */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [v4 09/11] soc/fsl/qbman: different register offsets on ARM
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (7 preceding siblings ...)
  2017-08-24 20:37 ` [v4 08/11] soc/fsl/qbman: add QMAN_REV32 Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-08-24 20:37 ` [v4 10/11] soc/fsl/qbman: Add missing headers " Roy Pledge
  2017-08-24 20:37 ` [v4 11/11] fsl/soc/qbman: Enable FSL_LAYERSCAPE config " Roy Pledge
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Claudiu Manoil, Roy Pledge

From: Madalin Bucur <madalin.bucur@nxp.com>

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/bman.c | 22 ++++++++++++++++++++++
 drivers/soc/fsl/qbman/qman.c | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/drivers/soc/fsl/qbman/bman.c b/drivers/soc/fsl/qbman/bman.c
index e31c843..265048d 100644
--- a/drivers/soc/fsl/qbman/bman.c
+++ b/drivers/soc/fsl/qbman/bman.c
@@ -35,6 +35,27 @@
 
 /* Portal register assists */
 
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
+/* Cache-inhibited register offsets */
+#define BM_REG_RCR_PI_CINH	0x3000
+#define BM_REG_RCR_CI_CINH	0x3100
+#define BM_REG_RCR_ITR		0x3200
+#define BM_REG_CFG		0x3300
+#define BM_REG_SCN(n)		(0x3400 + ((n) << 6))
+#define BM_REG_ISR		0x3e00
+#define BM_REG_IER		0x3e40
+#define BM_REG_ISDR		0x3e80
+#define BM_REG_IIR		0x3ec0
+
+/* Cache-enabled register offsets */
+#define BM_CL_CR		0x0000
+#define BM_CL_RR0		0x0100
+#define BM_CL_RR1		0x0140
+#define BM_CL_RCR		0x1000
+#define BM_CL_RCR_PI_CENA	0x3000
+#define BM_CL_RCR_CI_CENA	0x3100
+
+#else
 /* Cache-inhibited register offsets */
 #define BM_REG_RCR_PI_CINH	0x0000
 #define BM_REG_RCR_CI_CINH	0x0004
@@ -53,6 +74,7 @@
 #define BM_CL_RCR		0x1000
 #define BM_CL_RCR_PI_CENA	0x3000
 #define BM_CL_RCR_CI_CENA	0x3100
+#endif
 
 /*
  * Portal modes.
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 668fab1..fdd4c65 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -41,6 +41,43 @@
 
 /* Portal register assists */
 
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
+/* Cache-inhibited register offsets */
+#define QM_REG_EQCR_PI_CINH	0x3000
+#define QM_REG_EQCR_CI_CINH	0x3040
+#define QM_REG_EQCR_ITR		0x3080
+#define QM_REG_DQRR_PI_CINH	0x3100
+#define QM_REG_DQRR_CI_CINH	0x3140
+#define QM_REG_DQRR_ITR		0x3180
+#define QM_REG_DQRR_DCAP	0x31C0
+#define QM_REG_DQRR_SDQCR	0x3200
+#define QM_REG_DQRR_VDQCR	0x3240
+#define QM_REG_DQRR_PDQCR	0x3280
+#define QM_REG_MR_PI_CINH	0x3300
+#define QM_REG_MR_CI_CINH	0x3340
+#define QM_REG_MR_ITR		0x3380
+#define QM_REG_CFG		0x3500
+#define QM_REG_ISR		0x3600
+#define QM_REG_IER		0x3640
+#define QM_REG_ISDR		0x3680
+#define QM_REG_IIR		0x36C0
+#define QM_REG_ITPR		0x3740
+
+/* Cache-enabled register offsets */
+#define QM_CL_EQCR		0x0000
+#define QM_CL_DQRR		0x1000
+#define QM_CL_MR		0x2000
+#define QM_CL_EQCR_PI_CENA	0x3000
+#define QM_CL_EQCR_CI_CENA	0x3040
+#define QM_CL_DQRR_PI_CENA	0x3100
+#define QM_CL_DQRR_CI_CENA	0x3140
+#define QM_CL_MR_PI_CENA	0x3300
+#define QM_CL_MR_CI_CENA	0x3340
+#define QM_CL_CR		0x3800
+#define QM_CL_RR0		0x3900
+#define QM_CL_RR1		0x3940
+
+#else
 /* Cache-inhibited register offsets */
 #define QM_REG_EQCR_PI_CINH	0x0000
 #define QM_REG_EQCR_CI_CINH	0x0004
@@ -75,6 +112,7 @@
 #define QM_CL_CR		0x3800
 #define QM_CL_RR0		0x3900
 #define QM_CL_RR1		0x3940
+#endif
 
 /*
  * BTW, the drivers (and h/w programming model) already obtain the required
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [v4 10/11] soc/fsl/qbman: Add missing headers on ARM
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (8 preceding siblings ...)
  2017-08-24 20:37 ` [v4 09/11] soc/fsl/qbman: different register offsets on ARM Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  2017-08-24 20:37 ` [v4 11/11] fsl/soc/qbman: Enable FSL_LAYERSCAPE config " Roy Pledge
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Claudiu Manoil, Roy Pledge

From: Claudiu Manoil <claudiu.manoil@nxp.com>

Unlike PPC builds, ARM builds need the following headers
explicitly:
+#include <linux/io.h>		for ioread32be()
+#include <linux/delay.h>		for udelay()

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/dpaa_sys.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
index 0a1d573..8ec6a78 100644
--- a/drivers/soc/fsl/qbman/dpaa_sys.h
+++ b/drivers/soc/fsl/qbman/dpaa_sys.h
@@ -44,6 +44,8 @@
 #include <linux/prefetch.h>
 #include <linux/genalloc.h>
 #include <asm/cacheflush.h>
+#include <linux/io.h>
+#include <linux/delay.h>
 
 /* For 2-element tables related to cache-inhibited and cache-enabled mappings */
 #define DPAA_PORTAL_CE 0
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [v4 11/11] fsl/soc/qbman: Enable FSL_LAYERSCAPE config on ARM
  2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
                   ` (9 preceding siblings ...)
  2017-08-24 20:37 ` [v4 10/11] soc/fsl/qbman: Add missing headers " Roy Pledge
@ 2017-08-24 20:37 ` Roy Pledge
  10 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-08-24 20:37 UTC (permalink / raw)
  To: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel
  Cc: oss, madalin.bucur, catalin.marinas, linux, arnd, mark.rutland,
	Claudiu Manoil, Stuart Yoder, Roy Pledge

From: Madalin Bucur <madalin.bucur@nxp.com>

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
[Stuart: changed to use ARCH_LAYERSCAPE]
Signed-off-by: Stuart Yoder <stuart.yoder@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
---
 drivers/soc/fsl/qbman/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/Kconfig b/drivers/soc/fsl/qbman/Kconfig
index 757033c..fb4e6bf 100644
--- a/drivers/soc/fsl/qbman/Kconfig
+++ b/drivers/soc/fsl/qbman/Kconfig
@@ -1,6 +1,6 @@
 menuconfig FSL_DPAA
 	bool "Freescale DPAA 1.x support"
-	depends on FSL_SOC_BOOKE
+	depends on (FSL_SOC_BOOKE || ARCH_LAYERSCAPE)
 	select GENERIC_ALLOCATOR
 	help
 	  The Freescale Data Path Acceleration Architecture (DPAA) is a set of
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations
  2017-08-24 20:37 ` [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations Roy Pledge
@ 2017-09-14 13:46   ` Catalin Marinas
  0 siblings, 0 replies; 20+ messages in thread
From: Catalin Marinas @ 2017-09-14 13:46 UTC (permalink / raw)
  To: Roy Pledge
  Cc: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel,
	mark.rutland, arnd, madalin.bucur, linux, oss

On Thu, Aug 24, 2017 at 04:37:45PM -0400, Roy Pledge wrote:
> --- a/drivers/soc/fsl/qbman/bman_ccsr.c
> +++ b/drivers/soc/fsl/qbman/bman_ccsr.c
[...]
> @@ -201,6 +202,38 @@ static int fsl_bman_probe(struct platform_device *pdev)
>  		return -ENODEV;
>  	}
>  
> +	/*
> +	 * If FBPR memory wasn't defined using the qbman compatible string
> +	 * try using the of_reserved_mem_device method
> +	 */
> +	if (!fbpr_a) {
> +		ret = of_reserved_mem_device_init(dev);
> +		if (ret) {
> +			dev_err(dev, "of_reserved_mem_device_init() failed 0x%x\n",
> +				ret);
> +			return -ENODEV;
> +		}
> +		mem_node = of_parse_phandle(dev->of_node, "memory-region", 0);
> +		if (mem_node) {
> +			ret = of_property_read_u64(mem_node, "size", &size);
> +			if (ret) {
> +				dev_err(dev, "FBPR: of_address_to_resource fails 0x%x\n",
> +					ret);
> +				return -ENODEV;
> +			}
> +			fbpr_sz = size;
> +		} else {
> +			dev_err(dev, "No memory-region found for FBPR\n");
> +			return -ENODEV;
> +		}
> +		if (!dma_zalloc_coherent(dev, fbpr_sz, &fbpr_a, 0)) {
> +			dev_err(dev, "Alloc FBPR memory failed\n");
> +			return -ENODEV;
> +		}
> +	}

At a quick look, I think I spotted this pattern a couple more times
in the subsequent patch. Could it be moved to a common function?

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 03/11] dt-bindings: soc/fsl: Update reserved memory binding for QBMan
  2017-08-24 20:37 ` [v4 03/11] dt-bindings: soc/fsl: Update reserved memory binding for QBMan Roy Pledge
@ 2017-09-14 13:47   ` Catalin Marinas
  0 siblings, 0 replies; 20+ messages in thread
From: Catalin Marinas @ 2017-09-14 13:47 UTC (permalink / raw)
  To: Roy Pledge
  Cc: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel,
	mark.rutland, arnd, madalin.bucur, linux, oss

On Thu, Aug 24, 2017 at 04:37:47PM -0400, Roy Pledge wrote:
> Updates the QMan and BMan device tree bindings for reserved memory
> nodes. This makes the reserved memory allocation compatible with
> the shared-dma-pool usage.
> 
> Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
> ---
>  Documentation/devicetree/bindings/soc/fsl/bman.txt | 12 +++++-----
>  Documentation/devicetree/bindings/soc/fsl/qman.txt | 26 ++++++++++++++++------

This needs to be reviewed by the DT maintainers.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check
  2017-08-24 20:37 ` [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check Roy Pledge
@ 2017-09-14 13:49   ` Catalin Marinas
  2017-09-14 18:30     ` Roy Pledge
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2017-09-14 13:49 UTC (permalink / raw)
  To: Roy Pledge
  Cc: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel,
	mark.rutland, arnd, madalin.bucur, linux, oss, Claudiu Manoil

On Thu, Aug 24, 2017 at 04:37:49PM -0400, Roy Pledge wrote:
> From: Claudiu Manoil <claudiu.manoil@nxp.com>
> 
> Not relevant and arch dependent. Overkill for PPC.
> 
> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
> ---
>  drivers/soc/fsl/qbman/dpaa_sys.h | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
> index 2ce394a..f85c319 100644
> --- a/drivers/soc/fsl/qbman/dpaa_sys.h
> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h
> @@ -49,10 +49,6 @@
>  #define DPAA_PORTAL_CE 0
>  #define DPAA_PORTAL_CI 1
>  
> -#if (L1_CACHE_BYTES != 32) && (L1_CACHE_BYTES != 64)
> -#error "Unsupported Cacheline Size"
> -#endif

Maybe this check was for a reason on PPC as it uses WB memory mappings
for some of the qbman descriptors (which IIUC fit within a cacheline).
You could add a check for CONFIG_PPC if you think there is any chance of
this constant going higher.
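
A minimal sketch of the suggestion above, assuming the existing check is
simply gated on CONFIG_PPC. The stand-in definitions of CONFIG_PPC and
L1_CACHE_BYTES are there only so the fragment builds outside the kernel
tree; dpaa_cacheline_ok() is a hypothetical helper that mirrors the
preprocessor condition so it can be exercised at run time.

```c
/* Stand-ins so the sketch builds outside the kernel tree (assumptions). */
#ifndef CONFIG_PPC
#define CONFIG_PPC 1
#endif
#ifndef L1_CACHE_BYTES
#define L1_CACHE_BYTES 64
#endif

/* Keep the sanity check, but only where the portal is mapped cacheable. */
#if defined(CONFIG_PPC) && (L1_CACHE_BYTES != 32) && (L1_CACHE_BYTES != 64)
#error "Unsupported Cacheline Size"
#endif

/* Hypothetical run-time mirror of the preprocessor condition above. */
static int dpaa_cacheline_ok(int is_ppc, unsigned int l1_cache_bytes)
{
	if (!is_ppc)
		return 1;	/* non-PPC: the check does not apply */
	return l1_cache_bytes == 32 || l1_cache_bytes == 64;
}
```

With this shape an ARM64 build that reports L1_CACHE_BYTES as 128 still
compiles, while a PPC build with an unexpected value would still fail early.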

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC
  2017-08-24 20:37 ` [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC Roy Pledge
@ 2017-09-14 14:00   ` Catalin Marinas
  2017-09-14 19:07     ` Roy Pledge
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2017-09-14 14:00 UTC (permalink / raw)
  To: Roy Pledge
  Cc: leoyang.li, linuxppc-dev, linux-arm-kernel, linux-kernel,
	mark.rutland, arnd, madalin.bucur, linux, oss

On Thu, Aug 24, 2017 at 04:37:51PM -0400, Roy Pledge wrote:
> diff --git a/drivers/soc/fsl/qbman/bman.c b/drivers/soc/fsl/qbman/bman.c
> index ff8998f..e31c843 100644
> --- a/drivers/soc/fsl/qbman/bman.c
> +++ b/drivers/soc/fsl/qbman/bman.c
> @@ -154,7 +154,7 @@ struct bm_mc {
>  };
>  
>  struct bm_addr {
> -	void __iomem *ce;	/* cache-enabled */
> +	void *ce;		/* cache-enabled */
>  	void __iomem *ci;	/* cache-inhibited */
>  };

You dropped __iomem from ce, which is fine since it is now set via
memremap. However, I haven't seen (at least not in this patch), a change
to bm_ce_in() which still uses __raw_readl().

(it may be worth checking this code with sparse, it may warn about this)
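
One possible shape for such an accessor, as a user-space sketch only. It
assumes the CE registers are big-endian (the driver's use of ioread32be()
suggests this, but it is an assumption here), and ce_in_be32() is a
hypothetical name, not the driver's function. The point is that a
memremap()ed region is ordinary memory, so it is read with a plain load
rather than an __iomem accessor.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an __raw_readl()-based bm_ce_in(): read a
 * 32-bit big-endian value from plain (non-__iomem) memory at the given
 * byte offset and return it in host byte order.
 */
static inline uint32_t ce_in_be32(const void *ce_base, uint32_t offset)
{
	uint8_t b[4];

	/* memcpy avoids alignment and aliasing problems in this model */
	memcpy(b, (const uint8_t *)ce_base + offset, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

Because the pointer type carries no __iomem annotation, sparse would have
nothing to complain about for an accessor written this way.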

> diff --git a/drivers/soc/fsl/qbman/bman_portal.c b/drivers/soc/fsl/qbman/bman_portal.c
> index 39b39c8..bb03503 100644
> --- a/drivers/soc/fsl/qbman/bman_portal.c
> +++ b/drivers/soc/fsl/qbman/bman_portal.c
> @@ -91,7 +91,6 @@ static int bman_portal_probe(struct platform_device *pdev)
>  	struct device_node *node = dev->of_node;
>  	struct bm_portal_config *pcfg;
>  	struct resource *addr_phys[2];
> -	void __iomem *va;
>  	int irq, cpu;
>  
>  	pcfg = devm_kmalloc(dev, sizeof(*pcfg), GFP_KERNEL);
> @@ -123,23 +122,34 @@ static int bman_portal_probe(struct platform_device *pdev)
>  	}
>  	pcfg->irq = irq;
>  
> -	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
> -	if (!va) {
> -		dev_err(dev, "ioremap::CE failed\n");
> +	/*
> +	 * TODO: Ultimately we would like to use a cacheable/non-shareable
> +	 * (coherent) mapping for the portal on both architectures but that
> +	 * isn't currently available in the kernel.  Because of HW differences
> +	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
> +	 * cacheable mappings
> +	 */

This comment mentions "cacheable/non-shareable (coherent)". Was this
meant for ARM platforms? Because non-shareable is not coherent, nor is
this combination guaranteed to work with different CPUs and
interconnects.

> +#ifdef CONFIG_PPC
> +	/* PPC requires a cacheable/non-coherent mapping of the portal */
> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
> +				resource_size(addr_phys[0]), MEMREMAP_WB);
> +#else
> +	/* ARM can use a write combine mapping. */
> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
> +				resource_size(addr_phys[0]), MEMREMAP_WC);
> +#endif

Nitpick: you could define something like QBMAN_MAP_ATTR to be different
between PPC and the rest and just keep a single memremap() call.
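
A rough sketch of that idea, with the caveat that QBMAN_MEMREMAP_ATTR and
qbman_ce_map_attr() are hypothetical names and the MEMREMAP_* values below
are stand-ins defined only so the fragment builds outside the kernel:

```c
/* Stand-ins for the kernel's memremap() flags (values are assumptions). */
#ifndef MEMREMAP_WB
#define MEMREMAP_WB (1UL << 0)
#define MEMREMAP_WC (1UL << 2)
#endif

/* Select the mapping attribute once per architecture. */
#ifdef CONFIG_PPC
#define QBMAN_MEMREMAP_ATTR MEMREMAP_WB	/* PPC: cacheable portal */
#else
#define QBMAN_MEMREMAP_ATTR MEMREMAP_WC	/* ARM: write-combine portal */
#endif

/*
 * The probe functions would then need a single call, along the lines of:
 *
 *	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
 *				      resource_size(addr_phys[0]),
 *				      QBMAN_MEMREMAP_ATTR);
 */
static unsigned long qbman_ce_map_attr(void)
{
	return QBMAN_MEMREMAP_ATTR;
}
```

Keeping the #ifdef in one header also means the BMan and QMan probe paths
cannot drift apart in the attribute they request.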

One may complain that "ce" is no longer "cache enabled" but I'm
personally fine to keep the same name for historical reasons.

> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
> index 81a9a5e..0a1d573 100644
> --- a/drivers/soc/fsl/qbman/dpaa_sys.h
> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h
> @@ -51,12 +51,12 @@
>  
>  static inline void dpaa_flush(void *p)
>  {
> +	/*
> +	 * Only PPC needs to flush the cache currently - on ARM the mapping
> +	 * is non cacheable
> +	 */
>  #ifdef CONFIG_PPC
>  	flush_dcache_range((unsigned long)p, (unsigned long)p+64);
> -#elif defined(CONFIG_ARM)
> -	__cpuc_flush_dcache_area(p, 64);
> -#elif defined(CONFIG_ARM64)
> -	__flush_dcache_area(p, 64);
>  #endif
>  }

Dropping the private API cache maintenance is fine and the memory is WC
now for ARM (mapping to Normal NonCacheable). However, do you require
any barriers here? Normal NC doesn't guarantee any ordering.

> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> index cbacdf4..41fe33a 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -224,7 +224,6 @@ static int qman_portal_probe(struct platform_device *pdev)
>  	struct device_node *node = dev->of_node;
>  	struct qm_portal_config *pcfg;
>  	struct resource *addr_phys[2];
> -	void __iomem *va;
>  	int irq, cpu, err;
>  	u32 val;
>  
> @@ -262,23 +261,34 @@ static int qman_portal_probe(struct platform_device *pdev)
>  	}
>  	pcfg->irq = irq;
>  
> -	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
> -	if (!va) {
> -		dev_err(dev, "ioremap::CE failed\n");
> +	/*
> +	 * TODO: Ultimately we would like to use a cacheable/non-shareable
> +	 * (coherent) mapping for the portal on both architectures but that
> +	 * isn't currently available in the kernel.  Because of HW differences
> +	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
> +	 * cacheable mappings
> +	 */

Same comment as above on non-shareable.

> +#ifdef CONFIG_PPC
> +	/* PPC requires a cacheable mapping of the portal */
> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
> +				resource_size(addr_phys[0]), MEMREMAP_WB);
> +#else
> +	/* ARM can use write combine mapping for the cacheable area */
> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
> +				resource_size(addr_phys[0]), MEMREMAP_WT);
> +#endif

Same nitpick: a single memremap() call.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check
  2017-09-14 13:49   ` Catalin Marinas
@ 2017-09-14 18:30     ` Roy Pledge
  0 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-09-14 18:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leo Li, linuxppc-dev, linux-arm-kernel, linux-kernel,
	mark.rutland, arnd, Madalin-cristian Bucur, linux, oss,
	Claudiu Manoil

On 9/14/2017 9:49 AM, Catalin Marinas wrote:
> On Thu, Aug 24, 2017 at 04:37:49PM -0400, Roy Pledge wrote:
>> From: Claudiu Manoil <claudiu.manoil@nxp.com>
>>
>> Not relevant and arch dependent. Overkill for PPC.
>>
>> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
>> Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
>> ---
>>   drivers/soc/fsl/qbman/dpaa_sys.h | 4 ----
>>   1 file changed, 4 deletions(-)
>>
>> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
>> index 2ce394a..f85c319 100644
>> --- a/drivers/soc/fsl/qbman/dpaa_sys.h
>> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h
>> @@ -49,10 +49,6 @@
>>   #define DPAA_PORTAL_CE 0
>>   #define DPAA_PORTAL_CI 1
>>   
>> -#if (L1_CACHE_BYTES != 32) && (L1_CACHE_BYTES != 64)
>> -#error "Unsupported Cacheline Size"
>> -#endif
> 
> Maybe this check was for a reason on PPC as it uses WB memory mappings
> for some of the qbman descriptors (which IIUC fit within a cacheline).
> You could add a check for CONFIG_PPC if you think there is any chance of
> this constant going higher.
> 

No, the reason PPC needs WB (technically any cacheable mapping) is that 
the QBMan block on those parts will raise an error IRQ if it sees any 
transaction less than cacheline size.  We know that this cannot happen 
on PPC parts with QBMan when there is a cacheable mapping because we 
also developed the interconnect for everything that has a QBMan block.

We dropped the check for L1_CACHE_BYTES due to the value being set to 
128 on ARM64 even on parts that have smaller caches. I don't think there
is much to worry about here as cacheline size isn't something SW 
controls in any case. If we produce a part with QBMan that has a larger 
cache granularity we will need to address that in other parts of the 
code as well. The check was in the code for PPC as a sanity check but 
since the value isn't (in my opinion) meaningful on ARM we can remove it 
to avoid problems.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC
  2017-09-14 14:00   ` Catalin Marinas
@ 2017-09-14 19:07     ` Roy Pledge
  2017-09-15 21:49       ` Catalin Marinas
  0 siblings, 1 reply; 20+ messages in thread
From: Roy Pledge @ 2017-09-14 19:07 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leo Li, linuxppc-dev, linux-arm-kernel, linux-kernel,
	mark.rutland, arnd, Madalin-cristian Bucur, linux, oss

On 9/14/2017 10:00 AM, Catalin Marinas wrote:
> On Thu, Aug 24, 2017 at 04:37:51PM -0400, Roy Pledge wrote:
>> diff --git a/drivers/soc/fsl/qbman/bman.c b/drivers/soc/fsl/qbman/bman.c
>> index ff8998f..e31c843 100644
>> --- a/drivers/soc/fsl/qbman/bman.c
>> +++ b/drivers/soc/fsl/qbman/bman.c
>> @@ -154,7 +154,7 @@ struct bm_mc {
>>   };
>>   
>>   struct bm_addr {
>> -	void __iomem *ce;	/* cache-enabled */
>> +	void *ce;		/* cache-enabled */
>>   	void __iomem *ci;	/* cache-inhibited */
>>   };
> 
> You dropped __iomem from ce, which is fine since it is now set via
> memremap. However, I haven't seen (at least not in this patch), a change
> to bm_ce_in() which still uses __raw_readl().
> 
> (it may be worth checking this code with sparse, it may warn about this)
Thanks, you're correct, I missed this. I will fix this (and the qman
version) and run sparse.
> 
>> diff --git a/drivers/soc/fsl/qbman/bman_portal.c b/drivers/soc/fsl/qbman/bman_portal.c
>> index 39b39c8..bb03503 100644
>> --- a/drivers/soc/fsl/qbman/bman_portal.c
>> +++ b/drivers/soc/fsl/qbman/bman_portal.c
>> @@ -91,7 +91,6 @@ static int bman_portal_probe(struct platform_device *pdev)
>>   	struct device_node *node = dev->of_node;
>>   	struct bm_portal_config *pcfg;
>>   	struct resource *addr_phys[2];
>> -	void __iomem *va;
>>   	int irq, cpu;
>>   
>>   	pcfg = devm_kmalloc(dev, sizeof(*pcfg), GFP_KERNEL);
>> @@ -123,23 +122,34 @@ static int bman_portal_probe(struct platform_device *pdev)
>>   	}
>>   	pcfg->irq = irq;
>>   
>> -	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
>> -	if (!va) {
>> -		dev_err(dev, "ioremap::CE failed\n");
>> +	/*
>> +	 * TODO: Ultimately we would like to use a cacheable/non-shareable
>> +	 * (coherent) mapping for the portal on both architectures but that
>> +	 * isn't currently available in the kernel.  Because of HW differences
>> +	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
>> +	 * cacheable mappings
>> +	 */
> 
> This comment mentions "cacheable/non-shareable (coherent)". Was this
> meant for ARM platforms? Because non-shareable is not coherent, nor is
> this combination guaranteed to work with different CPUs and
> interconnects.
My wording is poor; I should have been clearer that non-shareable ==
non-coherent.  I will fix this.

We do understand that cacheable/non shareable isn't supported on all 
CPU/interconnect combinations but we have verified with ARM that for the 
CPU/interconnects we have integrated QBMan on, our use is OK. The note is
here to try to explain why the mapping is different right now. Once we 
get the basic QBMan support integrated for ARM we do plan to try to have 
patches integrated that enable the cacheable mapping as it gives a 
significant performance boost.  This is a step 2 as we understand the 
topic is complex and a little controversial so I think treating it as an 
independent change will be easier than mixing it with the less 
interesting changes in this patchset.

> 
>> +#ifdef CONFIG_PPC
>> +	/* PPC requires a cacheable/non-coherent mapping of the portal */
>> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
>> +				resource_size(addr_phys[0]), MEMREMAP_WB);
>> +#else
>> +	/* ARM can use a write combine mapping. */
>> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
>> +				resource_size(addr_phys[0]), MEMREMAP_WC);
>> +#endif
> 
> Nitpick: you could define something like QBMAN_MAP_ATTR to be different
> between PPC and the rest and just keep a single memremap() call.
I will change this - it will be a little more compact.
> 
> One may complain that "ce" is no longer "cache enabled" but I'm
> personally fine to keep the same name for historical reasons.
Cache Enabled is also how the 'data sheet' for the processor describes 
the region and I think it is useful to keep it aligned so that anyone 
looking at the manual and the code can easily correlate the terms.
>
>> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
>> index 81a9a5e..0a1d573 100644
>> --- a/drivers/soc/fsl/qbman/dpaa_sys.h
>> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h
>> @@ -51,12 +51,12 @@
>>   
>>   static inline void dpaa_flush(void *p)
>>   {
>> +	/*
>> +	 * Only PPC needs to flush the cache currently - on ARM the mapping
>> +	 * is non cacheable
>> +	 */
>>   #ifdef CONFIG_PPC
>>   	flush_dcache_range((unsigned long)p, (unsigned long)p+64);
>> -#elif defined(CONFIG_ARM)
>> -	__cpuc_flush_dcache_area(p, 64);
>> -#elif defined(CONFIG_ARM64)
>> -	__flush_dcache_area(p, 64);
>>   #endif
>>   }
> 
> Dropping the private API cache maintenance is fine and the memory is WC
> now for ARM (mapping to Normal NonCacheable). However, do you require
> any barriers here? Normal NC doesn't guarantee any ordering.
The barrier is done in the code where the command is formed. We follow 
this pattern:
a) Zero the command cache line (the device never reacts to a 0 command 
verb so a cast out of this will have no effect)
b) Fill in everything in the command except the command verb (byte 0)
c) Execute a memory barrier
d) Set the command verb (byte 0)
e) Flush the command
A castout between d) and e) doesn't matter, since the line was about
to be flushed anyway.  Any castout before d) will not cause HW to
process the command because the verb is still 0. The barrier at c) prevents
reordering so the HW cannot see the verb set before the command is formed.
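
The steps above can be modelled in user space roughly as follows. This is
only a sketch under stated assumptions: struct cmd_line, model_flush() and
submit_cmd() are hypothetical names, the C11 fence stands in for whatever
kernel barrier the driver uses at step c) (for example dma_wmb()), and the
flush is a no-op because the model has no device behind it.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CMD_LINE_BYTES 64	/* one portal command cache line */

struct cmd_line {
	uint8_t verb;				/* byte 0: 0 means "no command" */
	uint8_t body[CMD_LINE_BYTES - 1];	/* remainder of the command */
};

/* Hypothetical stand-in for dpaa_flush(); does nothing in this model. */
static void model_flush(void *p)
{
	(void)p;
}

static void submit_cmd(struct cmd_line *cl, uint8_t verb,
		       const uint8_t *body, size_t len)
{
	if (len > sizeof(cl->body))
		len = sizeof(cl->body);

	memset(cl, 0, sizeof(*cl));			/* a) zero the line */
	memcpy(cl->body, body, len);			/* b) all but the verb */
	atomic_thread_fence(memory_order_release);	/* c) barrier (model) */
	*(volatile uint8_t *)&cl->verb = verb;		/* d) set the verb */
	model_flush(cl);				/* e) flush the command */
}
```

The design relies on the verb byte acting as a valid flag: as long as it is
written last and after the barrier, a line that leaves the CPU early carries
a zero verb and is ignored.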

> 
>> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
>> index cbacdf4..41fe33a 100644
>> --- a/drivers/soc/fsl/qbman/qman_portal.c
>> +++ b/drivers/soc/fsl/qbman/qman_portal.c
>> @@ -224,7 +224,6 @@ static int qman_portal_probe(struct platform_device *pdev)
>>   	struct device_node *node = dev->of_node;
>>   	struct qm_portal_config *pcfg;
>>   	struct resource *addr_phys[2];
>> -	void __iomem *va;
>>   	int irq, cpu, err;
>>   	u32 val;
>>   
>> @@ -262,23 +261,34 @@ static int qman_portal_probe(struct platform_device *pdev)
>>   	}
>>   	pcfg->irq = irq;
>>   
>> -	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
>> -	if (!va) {
>> -		dev_err(dev, "ioremap::CE failed\n");
>> +	/*
>> +	 * TODO: Ultimately we would like to use a cacheable/non-shareable
>> +	 * (coherent) mapping for the portal on both architectures but that
>> +	 * isn't currently available in the kernel.  Because of HW differences
>> +	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
>> +	 * cacheable mappings
>> +	 */
> 
> > Same comment as above on non-shareable.
> 
>> +#ifdef CONFIG_PPC
>> +	/* PPC requires a cacheable mapping of the portal */
>> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
>> +				resource_size(addr_phys[0]), MEMREMAP_WB);
>> +#else
>> +	/* ARM can use write combine mapping for the cacheable area */
>> +	pcfg->addr_virt_ce = memremap(addr_phys[0]->start,
>> +				resource_size(addr_phys[0]), MEMREMAP_WT);
>> +#endif
> 
> Same nitpick: a single memremap() call.
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC
  2017-09-14 19:07     ` Roy Pledge
@ 2017-09-15 21:49       ` Catalin Marinas
  2017-09-18 18:48         ` Roy Pledge
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2017-09-15 21:49 UTC (permalink / raw)
  To: Roy Pledge
  Cc: mark.rutland, arnd, Madalin-cristian Bucur, linux-kernel, Leo Li,
	oss, linux, linuxppc-dev, linux-arm-kernel

On Thu, Sep 14, 2017 at 07:07:50PM +0000, Roy Pledge wrote:
> On 9/14/2017 10:00 AM, Catalin Marinas wrote:
> > On Thu, Aug 24, 2017 at 04:37:51PM -0400, Roy Pledge wrote:
> >> @@ -123,23 +122,34 @@ static int bman_portal_probe(struct platform_device *pdev)
> >>   	}
> >>   	pcfg->irq = irq;
> >>   
> >> -	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
> >> -	if (!va) {
> >> -		dev_err(dev, "ioremap::CE failed\n");
> >> +	/*
> >> +	 * TODO: Ultimately we would like to use a cacheable/non-shareable
> >> +	 * (coherent) mapping for the portal on both architectures but that
> >> +	 * isn't currently available in the kernel.  Because of HW differences
> >> +	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
> >> +	 * cacheable mappings
> >> +	 */
> > 
> > This comment mentions "cacheable/non-shareable (coherent)". Was this
> > meant for ARM platforms? Because non-shareable is not coherent, nor is
> > this combination guaranteed to work with different CPUs and
> > interconnects.
> 
> My wording is poor I should have been clearer that non-shareable == 
> non-coherent.  I will fix this.
> 
> We do understand that cacheable/non shareable isn't supported on all 
> CPU/interconnect combinations but we have verified with ARM that for the 
> CPU/interconnects we have integrated QBMan on our use is OK. The note is 
> here to try to explain why the mapping is different right now. Once we 
> get the basic QBMan support integrated for ARM we do plan to try to have 
> patches integrated that enable the cacheable mapping as it gives a 
> significant performance boost.

I will definitely not ack those patches (at least not in the form I've
seen, assuming certain eviction order of the bytes in a cacheline). The
reason is that it is incredibly fragile, highly dependent on the CPU
microarchitecture and interconnects. Assuming that you ever only have a
single SoC with this device, you may get away with #ifdefs in the
driver. But if you support two or more SoCs with different behaviours,
you'd have to make run-time decisions in the driver or run-time code
patching. We are very keen on single kernel binary image/drivers and
architecturally compliant code (the cacheable mapping hacks are well
outside the architecture behaviour).

> >> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
> >> index 81a9a5e..0a1d573 100644
> >> --- a/drivers/soc/fsl/qbman/dpaa_sys.h
> >> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h
> >> @@ -51,12 +51,12 @@
> >>   
> >>   static inline void dpaa_flush(void *p)
> >>   {
> >> +	/*
> >> +	 * Only PPC needs to flush the cache currently - on ARM the mapping
> >> +	 * is non cacheable
> >> +	 */
> >>   #ifdef CONFIG_PPC
> >>   	flush_dcache_range((unsigned long)p, (unsigned long)p+64);
> >> -#elif defined(CONFIG_ARM)
> >> -	__cpuc_flush_dcache_area(p, 64);
> >> -#elif defined(CONFIG_ARM64)
> >> -	__flush_dcache_area(p, 64);
> >>   #endif
> >>   }
> > 
> > Dropping the private API cache maintenance is fine and the memory is WC
> > now for ARM (mapping to Normal NonCacheable). However, do you require
> > any barriers here? Normal NC doesn't guarantee any ordering.
> 
> The barrier is done in the code where the command is formed. We follow 
> this pattern
> a) Zero the command cache line (the device never reacts to a 0 command 
> verb so a cast out of this will have no effect)
> b) Fill in everything in the command except the command verb (byte 0)
> c) Execute a memory barrier
> d) Set the command verb (byte 0)
> e) Flush the command
> If a castout happens between d) and e) doesn't matter since it was about 
> to be flushed anyway .  Any castout before d) will not cause HW to 
> process the command because verb is still 0. The barrier at c) prevents 
> reordering so the HW cannot see the verb set before the command is formed.

I think that's fine, the dpaa_flush() can be a no-op with non-cacheable
memory (I had forgotten the details).
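
To make the a) to e) sequence quoted above concrete, here is a minimal
user-space sketch of it. This is not the driver code: struct demo_cmd,
demo_flush() and demo_commit() are hypothetical names, the 63-byte payload
layout is assumed only to fill a 64-byte line, and the C11
atomic_thread_fence() merely stands in for whatever kernel write barrier the
driver uses at step c) (for example dma_wmb()).

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte command line; the device ignores it while verb == 0. */
struct demo_cmd {
	uint8_t verb;		/* byte 0: command verb, written last */
	uint8_t payload[63];	/* rest of the command */
};

/* With a non-cacheable mapping the flush has nothing to do. */
static void demo_flush(void *p)
{
	(void)p;
}

static void demo_commit(struct demo_cmd *cmd, const uint8_t *payload,
			size_t len, uint8_t verb)
{
	if (len > sizeof(cmd->payload))
		len = sizeof(cmd->payload);

	memset(cmd, 0, sizeof(*cmd));			/* a) zero the line */
	memcpy(cmd->payload, payload, len);		/* b) all but the verb */
	atomic_thread_fence(memory_order_release);	/* c) barrier stand-in */
	*(volatile uint8_t *)&cmd->verb = verb;		/* d) set the verb */
	demo_flush(cmd);				/* e) flush (no-op here) */
}
```

The ordering is the whole point: anything the device observes before step d)
still carries a zero verb, so a partially written command is never acted on.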

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC
  2017-09-15 21:49       ` Catalin Marinas
@ 2017-09-18 18:48         ` Roy Pledge
  0 siblings, 0 replies; 20+ messages in thread
From: Roy Pledge @ 2017-09-18 18:48 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: mark.rutland, arnd, Madalin-cristian Bucur, linux-kernel, Leo Li,
	oss, linux, linuxppc-dev, linux-arm-kernel

On 9/15/2017 5:49 PM, Catalin Marinas wrote:
> On Thu, Sep 14, 2017 at 07:07:50PM +0000, Roy Pledge wrote:
>> On 9/14/2017 10:00 AM, Catalin Marinas wrote:
>>> On Thu, Aug 24, 2017 at 04:37:51PM -0400, Roy Pledge wrote:
>>>> @@ -123,23 +122,34 @@ static int bman_portal_probe(struct platform_device *pdev)
>>>>    	}
>>>>    	pcfg->irq = irq;
>>>>    
>>>> -	va = ioremap_prot(addr_phys[0]->start, resource_size(addr_phys[0]), 0);
>>>> -	if (!va) {
>>>> -		dev_err(dev, "ioremap::CE failed\n");
>>>> +	/*
>>>> +	 * TODO: Ultimately we would like to use a cacheable/non-shareable
>>>> +	 * (coherent) mapping for the portal on both architectures but that
>>>> +	 * isn't currently available in the kernel.  Because of HW differences
>>>> +	 * PPC needs to be mapped cacheable while ARM SoCs will work with non
>>>> +	 * cacheable mappings
>>>> +	 */
>>>
>>> This comment mentions "cacheable/non-shareable (coherent)". Was this
>>> meant for ARM platforms? Because non-shareable is not coherent, nor is
>>> this combination guaranteed to work with different CPUs and
>>> interconnects.
>>
>> My wording is poor I should have been clearer that non-shareable ==
>> non-coherent.  I will fix this.
>>
>> We do understand that cacheable/non shareable isn't supported on all
>> CPU/interconnect combinations but we have verified with ARM that for the
>> CPU/interconnects we have integrated QBMan on our use is OK. The note is
>> here to try to explain why the mapping is different right now. Once we
>> get the basic QBMan support integrated for ARM we do plan to try to have
>> patches integrated that enable the cacheable mapping as it gives a
>> significant performance boost.
> 
> I will definitely not ack those patches (at least not in the form I've
> seen, assuming certain eviction order of the bytes in a cacheline). The
> reason is that it is incredibly fragile, highly dependent on the CPU
> microarchitecture and interconnects. Assuming that you ever only have a
> single SoC with this device, you may get away with #ifdefs in the
> driver. But if you support two or more SoCs with different behaviours,
> you'd have to make run-time decisions in the driver or run-time code
> patching. We are very keen on single kernel binary image/drivers and
> architecturally compliant code (the cacheable mapping hacks are well
> outside the architecture behaviour).
> 

Let's put this particular point on hold for now; I would like to focus
on getting the basic functions merged ASAP. I have removed the comment in
question from the next revision of the patchset (it sort of happened
naturally when I applied your other comments). I have submitted the
patches to our automated test system for sanity checking and I will send
a new patchset once I get the results.

Thanks again for your comments - they have been very useful and have 
improved the quality of the code for sure.

>>>> diff --git a/drivers/soc/fsl/qbman/dpaa_sys.h b/drivers/soc/fsl/qbman/dpaa_sys.h
>>>> index 81a9a5e..0a1d573 100644
>>>> --- a/drivers/soc/fsl/qbman/dpaa_sys.h
>>>> +++ b/drivers/soc/fsl/qbman/dpaa_sys.h
>>>> @@ -51,12 +51,12 @@
>>>>    
>>>>    static inline void dpaa_flush(void *p)
>>>>    {
>>>> +	/*
>>>> +	 * Only PPC needs to flush the cache currently - on ARM the mapping
>>>> +	 * is non cacheable
>>>> +	 */
>>>>    #ifdef CONFIG_PPC
>>>>    	flush_dcache_range((unsigned long)p, (unsigned long)p+64);
>>>> -#elif defined(CONFIG_ARM)
>>>> -	__cpuc_flush_dcache_area(p, 64);
>>>> -#elif defined(CONFIG_ARM64)
>>>> -	__flush_dcache_area(p, 64);
>>>>    #endif
>>>>    }
>>>
>>> Dropping the private API cache maintenance is fine and the memory is WC
>>> now for ARM (mapping to Normal NonCacheable). However, do you require
>>> any barriers here? Normal NC doesn't guarantee any ordering.
>>
>> The barrier is done in the code where the command is formed. We follow
>> this pattern
>> a) Zero the command cache line (the device never reacts to a 0 command
>> verb so a cast out of this will have no effect)
>> b) Fill in everything in the command except the command verb (byte 0)
>> c) Execute a memory barrier
>> d) Set the command verb (byte 0)
>> e) Flush the command
>> If a castout happens between d) and e) doesn't matter since it was about
>> to be flushed anyway .  Any castout before d) will not cause HW to
>> process the command because verb is still 0. The barrier at c) prevents
>> reordering so the HW cannot see the verb set before the command is formed.
> 
> I think that's fine, the dpaa_flush() can be a no-op with non-cacheable
> memory (I had forgotten the details).
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-09-18 18:49 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-24 20:37 [v4 00/11] soc/fsl/qbman: Enable QBMan on ARM Platforms Roy Pledge
2017-08-24 20:37 ` [v4 01/11] soc/fsl/qbman: Use shared-dma-pool for BMan private memory allocations Roy Pledge
2017-09-14 13:46   ` Catalin Marinas
2017-08-24 20:37 ` [v4 02/11] soc/fsl/qbman: Use shared-dma-pool for QMan " Roy Pledge
2017-08-24 20:37 ` [v4 03/11] dt-bindings: soc/fsl: Update reserved memory binding for QBMan Roy Pledge
2017-09-14 13:47   ` Catalin Marinas
2017-08-24 20:37 ` [v4 04/11] soc/fsl/qbman: Drop set/clear_bits usage Roy Pledge
2017-08-24 20:37 ` [v4 05/11] soc/fsl/qbman: Drop L1_CACHE_BYTES compile time check Roy Pledge
2017-09-14 13:49   ` Catalin Marinas
2017-09-14 18:30     ` Roy Pledge
2017-08-24 20:37 ` [v4 06/11] soc/fsl/qbman: Fix ARM32 typo Roy Pledge
2017-08-24 20:37 ` [v4 07/11] soc/fsl/qbman: Rework portal mapping calls for ARM/PPC Roy Pledge
2017-09-14 14:00   ` Catalin Marinas
2017-09-14 19:07     ` Roy Pledge
2017-09-15 21:49       ` Catalin Marinas
2017-09-18 18:48         ` Roy Pledge
2017-08-24 20:37 ` [v4 08/11] soc/fsl/qbman: add QMAN_REV32 Roy Pledge
2017-08-24 20:37 ` [v4 09/11] soc/fsl/qbman: different register offsets on ARM Roy Pledge
2017-08-24 20:37 ` [v4 10/11] soc/fsl/qbman: Add missing headers " Roy Pledge
2017-08-24 20:37 ` [v4 11/11] fsl/soc/qbman: Enable FSL_LAYERSCAPE config " Roy Pledge
