All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture
@ 2014-06-12  5:08 Zhen Lei
  2014-06-12  5:08 ` [PATCH RFC v2 1/3] iommu/arm: Adjust code to facilitate support arm smmu variants Zhen Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Zhen Lei @ 2014-06-12  5:08 UTC (permalink / raw)
  To: linux-arm-kernel

Changes in v2:

- Split Hisilicon smmu implementation in a separate file, hisi-smmu.c
- Refactor arm-smmu.c. Some direct call hardware dependent functions replaced
  with hooks. And move common struct and marco definition into arm-smmu.h
- Merge the description of Hisilicon private properties into arm,smmu.txt

Hisilicon smmu is similar to arm-smmu, some code can be direct reused. For
example: map and unmap, device tree configuration, and the software framework.
I found that, abstract about 11 hardware dependent functions in arm-smmu.c is
enough . Abstract means use hooks to replace the direct call of functions. Now,
if need to support Hisilicon SMMU or others arm smmu variants, just selective
rewrite these hooks. And I add a dt_cfg_probe hook, to allow each variant parse
their hardware special configuration. Finally, flush_pgtable is a special case,
hardware independent but maybe referenced. So, total 13 hooks.

Additional:
Arnd Bergmann told me to pay attention to the discussion of "devicetree: Add
generic IOMMU device tree bindings". But, this subject is still discussing, no
conclusion now. Besides, exclude the Hisilicon private properties, all keep the
same to arm-smmu. So, I think this version is not relate to the discussion now.

Zhen Lei (3):
  iommu/arm: Adjust code to facilitate support arm smmu variants
  iommu/hisilicon: Add support for Hisilicon Ltd. System MMU
    architecture
  documentation/iommu: Add description of Hisilicon SMMU private
    binding

 .../devicetree/bindings/iommu/arm,smmu.txt         |  16 +
 drivers/iommu/Kconfig                              |   7 +
 drivers/iommu/Makefile                             |   1 +
 drivers/iommu/arm-smmu.c                           | 190 +++---
 drivers/iommu/arm-smmu.h                           | 165 +++++
 drivers/iommu/hisi-smmu.c                          | 662 +++++++++++++++++++++
 6 files changed, 927 insertions(+), 114 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu.h
 create mode 100644 drivers/iommu/hisi-smmu.c

--
1.8.0

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 1/3] iommu/arm: Adjust code to facilitate support arm smmu variants
  2014-06-12  5:08 [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Zhen Lei
@ 2014-06-12  5:08 ` Zhen Lei
  2014-06-12  5:08 ` [PATCH RFC v2 2/3] iommu/hisilicon: Add support for Hisilicon Ltd. System MMU architecture Zhen Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Zhen Lei @ 2014-06-12  5:08 UTC (permalink / raw)
  To: linux-arm-kernel

There is no harmful for original arm-smmu process. A variant can appropriate
rewrite hooks according to the hardware.

1. Pick out hardware dependent functions, replace direct call with hooks
2. Move common struct and marco definition into arm-smmu.h
3. flush_pgtable is a special case, hardware independent but maybe referenced

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/arm-smmu.c | 186 ++++++++++++++++++-----------------------------
 drivers/iommu/arm-smmu.h | 154 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 226 insertions(+), 114 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu.h

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1599354..413a1f2 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -46,15 +46,7 @@
 #include <linux/amba/bus.h>

 #include <asm/pgalloc.h>
-
-/* Maximum number of stream IDs assigned to a single device */
-#define MAX_MASTER_STREAMIDS		MAX_PHANDLE_ARGS
-
-/* Maximum number of context banks per SMMU */
-#define ARM_SMMU_MAX_CBS		128
-
-/* Maximum number of mapping groups per SMMU */
-#define ARM_SMMU_MAX_SMRS		128
+#include "arm-smmu.h"

 /* SMMU global address space */
 #define ARM_SMMU_GR0(smmu)		((smmu)->base)
@@ -323,96 +315,12 @@

 #define FSYNR0_WNR			(1 << 4)

-struct arm_smmu_smr {
-	u8				idx;
-	u16				mask;
-	u16				id;
-};
-
-struct arm_smmu_master {
-	struct device_node		*of_node;
-
-	/*
-	 * The following is specific to the master's position in the
-	 * SMMU chain.
-	 */
-	struct rb_node			node;
-	int				num_streamids;
-	u16				streamids[MAX_MASTER_STREAMIDS];
-
-	/*
-	 * We only need to allocate these on the root SMMU, as we
-	 * configure unmatched streams to bypass translation.
-	 */
-	struct arm_smmu_smr		*smrs;
-};
-
-struct arm_smmu_device {
-	struct device			*dev;
-	struct device_node		*parent_of_node;
-
-	void __iomem			*base;
-	unsigned long			size;
-	unsigned long			pagesize;
-
-#define ARM_SMMU_FEAT_COHERENT_WALK	(1 << 0)
-#define ARM_SMMU_FEAT_STREAM_MATCH	(1 << 1)
-#define ARM_SMMU_FEAT_TRANS_S1		(1 << 2)
-#define ARM_SMMU_FEAT_TRANS_S2		(1 << 3)
-#define ARM_SMMU_FEAT_TRANS_NESTED	(1 << 4)
-	u32				features;
-
-#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
-	u32				options;
-	int				version;
-
-	u32				num_context_banks;
-	u32				num_s2_context_banks;
-	DECLARE_BITMAP(context_map, ARM_SMMU_MAX_CBS);
-	atomic_t			irptndx;
-
-	u32				num_mapping_groups;
-	DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
-
-	unsigned long			input_size;
-	unsigned long			s1_output_size;
-	unsigned long			s2_output_size;
-
-	u32				num_global_irqs;
-	u32				num_context_irqs;
-	unsigned int			*irqs;
-
-	struct list_head		list;
-	struct rb_root			masters;
-};
-
-struct arm_smmu_cfg {
-	struct arm_smmu_device		*smmu;
-	u8				cbndx;
-	u8				irptndx;
-	u32				cbar;
-	pgd_t				*pgd;
-};
-#define INVALID_IRPTNDX			0xff
-
 #define ARM_SMMU_CB_ASID(cfg)		((cfg)->cbndx)
 #define ARM_SMMU_CB_VMID(cfg)		((cfg)->cbndx + 1)

-struct arm_smmu_domain {
-	/*
-	 * A domain can span across multiple, chained SMMUs and requires
-	 * all devices within the domain to follow the same translation
-	 * path.
-	 */
-	struct arm_smmu_device		*leaf_smmu;
-	struct arm_smmu_cfg		root_cfg;
-	phys_addr_t			output_mask;
-
-	spinlock_t			lock;
-};
-
 static DEFINE_SPINLOCK(arm_smmu_devices_lock);
 static LIST_HEAD(arm_smmu_devices);
+static struct arm_smmu_hwdep_ops smmu_hwdep_ops;

 struct arm_smmu_option_prop {
 	u32 opt;
@@ -555,6 +463,12 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
 	clear_bit(idx, map);
 }

+static int arm_smmu_alloc_context(struct arm_smmu_device *smmu,
+			int start, int end, struct arm_smmu_master *master)
+{
+	return __arm_smmu_alloc_bitmap(smmu->context_map, start, end);
+}
+
 /* Wait for any pending TLB invalidations to complete */
 static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 {
@@ -590,7 +504,7 @@ static void arm_smmu_tlb_inv_context(struct arm_smmu_cfg *cfg)
 			       base + ARM_SMMU_GR0_TLBIVMID);
 	}

-	arm_smmu_tlb_sync(smmu);
+	smmu_hwdep_ops.tlb_sync(smmu);
 }

 static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
@@ -859,6 +773,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
 	struct arm_smmu_device *smmu, *parent;
+	struct arm_smmu_master *master;

 	/*
 	 * Walk the SMMU chain to find the root device for this chain.
@@ -873,7 +788,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		smmu_domain->output_mask &= (1ULL << smmu->s2_output_size) - 1;
 	} while ((parent = find_parent_smmu(smmu)));

-	if (!find_smmu_master(smmu, dev->of_node)) {
+	master = find_smmu_master(smmu, dev->of_node);
+	if (!master) {
 		dev_err(dev, "unable to find root SMMU for device\n");
 		return -ENODEV;
 	}
@@ -893,8 +809,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		start = smmu->num_s2_context_banks;
 	}

-	ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
-				      smmu->num_context_banks);
+	ret = smmu_hwdep_ops.alloc_context(smmu, start,
+					smmu->num_context_banks, master);
 	if (IS_ERR_VALUE(ret))
 		return ret;

@@ -907,7 +823,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	}

 	irq = smmu->irqs[smmu->num_global_irqs + root_cfg->irptndx];
-	ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED,
+	ret = request_irq(irq, smmu_hwdep_ops.context_fault, IRQF_SHARED,
 			  "arm-smmu-context-fault", domain);
 	if (IS_ERR_VALUE(ret)) {
 		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
@@ -917,7 +833,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	}

 	root_cfg->smmu = smmu;
-	arm_smmu_init_context_bank(smmu_domain);
+	smmu_hwdep_ops.init_context_bank(smmu_domain);
 	return ret;

 out_free_context:
@@ -925,21 +841,29 @@ out_free_context:
 	return ret;
 }

+static void arm_smmu_destroy_context_bank(struct arm_smmu_domain *smmu_domain)
+{
+	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
+	struct arm_smmu_device *smmu = root_cfg->smmu;
+	void __iomem *cb_base;
+
+	/* Disable the context bank and nuke the TLB before freeing it. */
+	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, root_cfg->cbndx);
+	writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
+	smmu_hwdep_ops.tlb_inv_context(root_cfg);
+}
+
 static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;
 	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
 	struct arm_smmu_device *smmu = root_cfg->smmu;
-	void __iomem *cb_base;
 	int irq;

 	if (!smmu)
 		return;

-	/* Disable the context bank and nuke the TLB before freeing it. */
-	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, root_cfg->cbndx);
-	writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
-	arm_smmu_tlb_inv_context(root_cfg);
+	smmu_hwdep_ops.destroy_context_bank(smmu_domain);

 	if (root_cfg->irptndx != INVALID_IRPTNDX) {
 		irq = smmu->irqs[smmu->num_global_irqs + root_cfg->irptndx];
@@ -1227,7 +1151,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	if (!master)
 		return -ENODEV;

-	return arm_smmu_domain_add_master(smmu_domain, master);
+	return smmu_hwdep_ops.domain_add_master(smmu_domain, master);

 err_unlock:
 	spin_unlock_irqrestore(&smmu_domain->lock, flags);
@@ -1241,7 +1165,7 @@ static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)

 	master = find_smmu_master(smmu_domain->leaf_smmu, dev->of_node);
 	if (master)
-		arm_smmu_domain_remove_master(smmu_domain, master);
+		smmu_hwdep_ops.domain_remove_master(smmu_domain, master);
 }

 static bool arm_smmu_pte_is_contiguous_range(unsigned long addr,
@@ -1498,7 +1422,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 	struct arm_smmu_domain *smmu_domain = domain->priv;

 	ret = arm_smmu_handle_mapping(smmu_domain, iova, 0, size, 0);
-	arm_smmu_tlb_inv_context(&smmu_domain->root_cfg);
+	smmu_hwdep_ops.tlb_inv_context(&smmu_domain->root_cfg);
 	return ret ? 0 : size;
 }

@@ -1625,7 +1549,7 @@ static struct iommu_ops arm_smmu_ops = {
 			   PAGE_SIZE),
 };

-static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
+static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
 {
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	void __iomem *cb_base;
@@ -1672,8 +1596,10 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	reg &= ~(sCR0_BSU_MASK << sCR0_BSU_SHIFT);

 	/* Push the button */
-	arm_smmu_tlb_sync(smmu);
+	smmu_hwdep_ops.tlb_sync(smmu);
 	writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+
+	return 0;
 }

 static int arm_smmu_id_size_to_bits(int size)
@@ -1839,6 +1765,27 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	return 0;
 }

+static struct arm_smmu_hwdep_ops smmu_hwdep_ops = {
+	.alloc_context		= arm_smmu_alloc_context,
+	.flush_pgtable		= arm_smmu_flush_pgtable,
+	.tlb_sync		= arm_smmu_tlb_sync,
+	.tlb_inv_context	= arm_smmu_tlb_inv_context,
+	.context_fault		= arm_smmu_context_fault,
+	.global_fault		= arm_smmu_global_fault,
+	.init_context_bank	= arm_smmu_init_context_bank,
+	.destroy_context_bank	= arm_smmu_destroy_context_bank,
+	.domain_add_master	= arm_smmu_domain_add_master,
+	.domain_remove_master	= arm_smmu_domain_remove_master,
+	.device_reset		= arm_smmu_device_reset,
+	.device_cfg_probe	= arm_smmu_device_cfg_probe,
+	.dt_cfg_probe		= NULL,
+};
+
+void __attribute__((weak))
+arm_smmu_hwdep_ops_override(struct arm_smmu_hwdep_ops *ops)
+{
+}
+
 static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 {
 	struct resource *res;
@@ -1916,7 +1863,13 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	if ((dev_node = of_parse_phandle(dev->of_node, "smmu-parent", 0)))
 		smmu->parent_of_node = dev_node;

-	err = arm_smmu_device_cfg_probe(smmu);
+	if (smmu_hwdep_ops.dt_cfg_probe) {
+		err = smmu_hwdep_ops.dt_cfg_probe(smmu, dev);
+		if (err)
+			goto out_put_parent;
+	}
+
+	err = smmu_hwdep_ops.device_cfg_probe(smmu);
 	if (err)
 		goto out_put_parent;

@@ -1933,7 +1886,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)

 	for (i = 0; i < smmu->num_global_irqs; ++i) {
 		err = request_irq(smmu->irqs[i],
-				  arm_smmu_global_fault,
+				  smmu_hwdep_ops.global_fault,
 				  IRQF_SHARED,
 				  "arm-smmu global fault",
 				  smmu);
@@ -1949,7 +1902,10 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	list_add(&smmu->list, &arm_smmu_devices);
 	spin_unlock(&arm_smmu_devices_lock);

-	arm_smmu_device_reset(smmu);
+	err = smmu_hwdep_ops.device_reset(smmu);
+	if (err)
+		goto out_free_irqs;
+
 	return 0;

 out_free_irqs:
@@ -2006,7 +1962,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 		free_irq(smmu->irqs[i], smmu);

 	/* Turn the thing off */
-	writel(sCR0_CLIENTPD,ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
 	return 0;
 }

@@ -2035,6 +1991,8 @@ static int __init arm_smmu_init(void)
 {
 	int ret;

+	arm_smmu_hwdep_ops_override(&smmu_hwdep_ops);
+
 	ret = platform_driver_register(&arm_smmu_driver);
 	if (ret)
 		return ret;
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
new file mode 100644
index 0000000..79366ee
--- /dev/null
+++ b/drivers/iommu/arm-smmu.h
@@ -0,0 +1,154 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ * Author: Zhen Lei <thunder.leizhen@huawei.com>
+ *
+ */
+
+#ifndef ARM_SMMU_H
+#define ARM_SMMU_H
+
+#include <linux/iommu.h>
+#include <linux/of.h>
+
+/* Maximum number of stream IDs assigned to a single device */
+#define MAX_MASTER_STREAMIDS		MAX_PHANDLE_ARGS
+
+/* Maximum number of context banks per SMMU */
+#define ARM_SMMU_MAX_CBS		128
+
+/* Maximum number of mapping groups per SMMU */
+#define ARM_SMMU_MAX_SMRS		128
+
+struct arm_smmu_smr {
+	u8				idx;
+	u16				mask;
+	u16				id;
+};
+
+struct arm_smmu_master {
+	struct device_node		*of_node;
+
+	/*
+	 * The following is specific to the master's position in the
+	 * SMMU chain.
+	 */
+	struct rb_node			node;
+	int				num_streamids;
+	u16				streamids[MAX_MASTER_STREAMIDS];
+
+	/*
+	 * We only need to allocate these on the root SMMU, as we
+	 * configure unmatched streams to bypass translation.
+	 */
+	struct arm_smmu_smr		*smrs;
+};
+
+struct arm_smmu_device {
+	struct device			*dev;
+	struct device_node		*parent_of_node;
+
+	void __iomem			*base;
+	unsigned long			size;
+	unsigned long			pagesize;
+
+#define ARM_SMMU_FEAT_COHERENT_WALK	(1 << 0)
+#define ARM_SMMU_FEAT_STREAM_MATCH	(1 << 1)
+#define ARM_SMMU_FEAT_TRANS_S1		(1 << 2)
+#define ARM_SMMU_FEAT_TRANS_S2		(1 << 3)
+#define ARM_SMMU_FEAT_TRANS_NESTED	(1 << 4)
+	u32				features;
+
+#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+	u32				options;
+	int				version;
+
+	u32				num_context_banks;
+	u32				num_s2_context_banks;
+	DECLARE_BITMAP(context_map, ARM_SMMU_MAX_CBS);
+	atomic_t			irptndx;
+
+	u32				num_mapping_groups;
+	DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
+
+	unsigned long			input_size;
+	unsigned long			s1_output_size;
+	unsigned long			s2_output_size;
+
+	u32				num_global_irqs;
+	u32				num_context_irqs;
+	unsigned int			*irqs;
+
+	struct list_head		list;
+	struct rb_root			masters;
+};
+
+struct arm_smmu_cfg {
+	struct arm_smmu_device		*smmu;
+	u8				cbndx;
+	u8				irptndx;
+	u32				cbar;
+	pgd_t				*pgd;
+};
+#define INVALID_IRPTNDX			0xff
+
+struct arm_smmu_domain {
+	/*
+	 * A domain can span across multiple, chained SMMUs and requires
+	 * all devices within the domain to follow the same translation
+	 * path.
+	 */
+	struct arm_smmu_device		*leaf_smmu;
+	struct arm_smmu_cfg		root_cfg;
+	phys_addr_t			output_mask;
+
+	spinlock_t			lock;
+};
+
+/**
+ * struct arm_smmu_hwdep_ops - arm smmu hardware dependent ops
+ * @alloc_context: alloc a free context bank
+ * @flush_pgtable: flush page table. reference only, don't override
+ * @tlb_sync: sync smmu tlb operation
+ * @tlb_inv_context: invalid smmu context bank tlb
+ * @context_fault: context fault handler
+ * @global_fault: global fault handler
+ * @init_context_bank: init a context bank
+ * @destroy_context_bank: disable a context bank and invalid the TLBs
+ * @domain_add_master: add a master into a domain
+ * @domain_remove_master: remove a master from a domain
+ * @device_reset: reset a smmu
+ * @device_cfg_probe: probe hardware configuration
+ * @dt_cfg_probe: probe device tree configuration(extends, hardware dependent)
+ */
+struct arm_smmu_hwdep_ops {
+	int (*alloc_context)(struct arm_smmu_device *smmu,
+			int start, int end, struct arm_smmu_master *master);
+	void (*flush_pgtable)(struct arm_smmu_device *smmu, void *addr,
+				   size_t size);
+	void (*tlb_sync)(struct arm_smmu_device *smmu);
+	void (*tlb_inv_context)(struct arm_smmu_cfg *cfg);
+	irqreturn_t (*context_fault)(int irq, void *dev);
+	irqreturn_t (*global_fault)(int irq, void *dev);
+	void (*init_context_bank)(struct arm_smmu_domain *smmu_domain);
+	void (*destroy_context_bank)(struct arm_smmu_domain *smmu_domain);
+	int (*domain_add_master)(struct arm_smmu_domain *smmu_domain,
+					struct arm_smmu_master *master);
+	void (*domain_remove_master)(struct arm_smmu_domain *smmu_domain,
+					struct arm_smmu_master *master);
+	int (*device_reset)(struct arm_smmu_device *smmu);
+	int (*device_cfg_probe)(struct arm_smmu_device *smmu);
+	int (*dt_cfg_probe)(struct arm_smmu_device *smmu, struct device *dev);
+};
+
+#endif
--
1.8.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 2/3] iommu/hisilicon: Add support for Hisilicon Ltd. System MMU architecture
  2014-06-12  5:08 [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Zhen Lei
  2014-06-12  5:08 ` [PATCH RFC v2 1/3] iommu/arm: Adjust code to facilitate support arm smmu variants Zhen Lei
@ 2014-06-12  5:08 ` Zhen Lei
  2014-06-12  5:08 ` [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding Zhen Lei
  2014-06-16 16:37 ` [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Will Deacon
  3 siblings, 0 replies; 16+ messages in thread
From: Zhen Lei @ 2014-06-12  5:08 UTC (permalink / raw)
  To: linux-arm-kernel

Here is the major hardware difference compare to arm-smmu specification:
1. Only have global register space 0, no GR1. Actually, some context bank
registers have been moved into GR0 to optimize hardware logic.

2. StreamID is 16 bits, highest 8 bits is VMID, lowest 8 bits is ASID. StreamID
match is not support, so direct use VMID and ASID to index context bank. First
use VMID to index stage2 context bank, then use ASID to index stage1 context
bank. In fact, max 256 stage2 context banks, each stage2 context bank relate to
256 stage1 context banks.
|-----------------|            |-----------------|
|stage2 CB VMID0  |----------->|stage1 CB ASID0  |
|-----------------|            |-----------------|
|   ......        |            |   ......        |
|-----------------|            |-----------------|
|stage2 CB VMID255|-----|      |stage2 CB ASID255|
|-----------------|     |      |-----------------|
                        |
                        |
                        |
                        |----->|-----------------|
                               |stage1 CB ASID0  |
                               |-----------------|
                               |   ......        |
                               |-----------------|
                               |stage2 CB ASID255|
                               |-----------------|

3. The base address of stage2 context bank is stored in SMMU_CFG_S2CTBAR, and
the base address of stage1 context bank is stored in S2_S1CTBAR(locate in
stage2 context bank).

4. All context bank fault share 8 groups of context fault registers. That is,
max record 8 context faults. Fault syndrome register recorded StreamID to help
software determine which context bank issue fault.

5. When choose stage1 translation and stage2 bypass mode, the register sequence
impact output attribute is: S1_SCTLR, CBAR, S2CR(for arm-smmu, process S2CR
first). This issue a problem, because total 256 stage1 CBs share a stage2 CB
when VMID=0. If some devices use bypass mode(use device built-in attributes),
and some devices use map mode(use page table entry specified attributes), smmu
can not work properly. The avoidance scheme is occupy another stage2 CB(S2CR)
to support bypass mode.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/Kconfig     |   7 +
 drivers/iommu/Makefile    |   1 +
 drivers/iommu/arm-smmu.c  |   4 +
 drivers/iommu/arm-smmu.h  |  11 +
 drivers/iommu/hisi-smmu.c | 662 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 685 insertions(+)
 create mode 100644 drivers/iommu/hisi-smmu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d260605..ef4e851 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -305,4 +305,11 @@ config ARM_SMMU
 	  Say Y here if your SoC includes an IOMMU device implementing
 	  the ARM SMMU architecture.

+config HISI_SMMU
+	bool "Hisilicon Ltd. System MMU (SMMU) Support"
+	depends on ARM_SMMU
+	help
+	  Say Y here if your SoC includes an IOMMU device implementing
+	  the Hisilicon SMMU architecture.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 8893bad..e06e36e 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o msm_iommu_dev.o
 obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
+obj-$(CONFIG_HISI_SMMU) += hisi-smmu.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += iova.o intel-iommu.o
 obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 413a1f2..c952d72 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -968,6 +968,9 @@ static void arm_smmu_domain_destroy(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = domain->priv;

+	if (smmu_domain->num_of_masters)
+		dev_err(smmu_domain->leaf_smmu->dev, "destroy domain with active dev!\n");
+
 	/*
 	 * Free the domain resources. We assume that all devices have
 	 * already been detached.
@@ -1972,6 +1975,7 @@ static struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,smmu-v2", },
 	{ .compatible = "arm,mmu-400", },
 	{ .compatible = "arm,mmu-500", },
+	{ .compatible = "hisilicon,smmu-v1", },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 79366ee..2941b39 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -24,8 +24,12 @@
 /* Maximum number of stream IDs assigned to a single device */
 #define MAX_MASTER_STREAMIDS		MAX_PHANDLE_ARGS

+#ifdef CONFIG_HISI_SMMU
+#define ARM_SMMU_MAX_CBS		256
+#else
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS		128
+#endif

 /* Maximum number of mapping groups per SMMU */
 #define ARM_SMMU_MAX_SMRS		128
@@ -58,6 +62,12 @@ struct arm_smmu_device {
 	struct device			*dev;
 	struct device_node		*parent_of_node;

+#ifdef CONFIG_HISI_SMMU
+	void __iomem			*s1cbt;
+	void __iomem			*s2cbt;
+	u8				cb_mtcfg[ARM_SMMU_MAX_CBS];
+#endif
+
 	void __iomem			*base;
 	unsigned long			size;
 	unsigned long			pagesize;
@@ -113,6 +123,7 @@ struct arm_smmu_domain {
 	phys_addr_t			output_mask;

 	spinlock_t			lock;
+	int				num_of_masters;
 };

 /**
diff --git a/drivers/iommu/hisi-smmu.c b/drivers/iommu/hisi-smmu.c
new file mode 100644
index 0000000..5a2035e
--- /dev/null
+++ b/drivers/iommu/hisi-smmu.c
@@ -0,0 +1,662 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2014 Hisilicon Limited
+ *
+ * Author: Zhen Lei <thunder.leizhen@huawei.com>
+ *
+ * Hisilicon smmu-v1 hardware dependent implemention, a arm smmu variant
+ *
+ */
+
+#define pr_fmt(fmt) "hisi-smmu: " fmt
+
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/of.h>
+#include <linux/spinlock.h>
+
+#include "arm-smmu.h"
+
+/* SMMU global address space */
+#define SMMU_GR0(smmu)			((smmu)->base)
+
+#define SMMU_OS_VMID			0
+#define SMMU_CB_NUMIRPT			8
+#define SMMU_S1CBT_SIZE			0x10000
+#define SMMU_S2CBT_SIZE			0x2000
+#define SMMU_S1CBT_SHIFT		16
+#define SMMU_S2CBT_SHIFT		12
+
+#define SMMU_CTRL_CR0			0x0
+#define SMMU_CTRL_ACR			0x8
+#define SMMU_CFG_S2CTBAR		0xc
+#define SMMU_IDR0			0x10
+#define SMMU_IDR1			0x14
+#define SMMU_IDR2			0x18
+#define SMMU_HIS_GFAR_LOW		0x20
+#define SMMU_HIS_GFAR_HIGH		0x24
+#define SMMU_RINT_GFSR			0x28
+#define SMMU_RINT_GFSYNR		0x2c
+#define SMMU_CFG_GFIM			0x30
+#define SMMU_CFG_CBF			0x34
+#define SMMU_TLBIALL			0x40
+#define SMMU_TLBIVMID			0x44
+#define SMMU_TLBISID			0x48
+#define SMMU_TLBIVA_LOW			0x4c
+#define SMMU_TLBIVA_HIGH		0x50
+#define SMMU_TLBGSYNC			0x54
+#define SMMU_TLBGSTATUS			0x58
+#define SMMU_CXTIALL			0x60
+#define SMMU_CXTIVMID			0x64
+#define SMMU_CXTISID			0x68
+#define SMMU_CXTGSYNC			0x6c
+#define SMMU_CXTGSTATUS			0x70
+#define SMMU_RINT_CB_FSR(n)		(0x100 + ((n) << 2))
+#define SMMU_RINT_CB_FSYNR(n)		(0x120 + ((n) << 2))
+#define SMMU_HIS_CB_FAR_LOW(n)		(0x140 + ((n) << 2))
+#define SMMU_HIS_CB_FAR_HIGH(n)		(0x144 + ((n) << 2))
+#define SMMU_CTRL_CB_RESUME(n)		(0x180 + ((n) << 2))
+
+#define SMMU_CB_S2CR(n)			(0x0  + ((n) << 5))
+#define SMMU_CB_CBAR(n)			(0x4  + ((n) << 5))
+#define SMMU_CB_S1CTBAR(n)		(0x18 + ((n) << 5))
+
+#define SMMU_S1_MAIR0			0x0
+#define SMMU_S1_MAIR1			0x4
+#define SMMU_S1_TTBR0_L			0x8
+#define SMMU_S1_TTBR0_H			0xc
+#define SMMU_S1_TTBR1_L			0x10
+#define SMMU_S1_TTBR1_H			0x14
+#define SMMU_S1_TTBCR			0x18
+#define SMMU_S1_SCTLR			0x1c
+
+#define CFG_CBF_S1_ORGN_WA		(1 << 12)
+#define CFG_CBF_S1_IRGN_WA		(1 << 10)
+#define CFG_CBF_S1_SHCFG_IS		(3 << 8)
+#define CFG_CBF_S2_ORGN_WA		(1 << 4)
+#define CFG_CBF_S2_IRGN_WA		(1 << 2)
+#define CFG_CBF_S2_SHCFG_IS		(3 << 0)
+
+/* Configuration registers */
+#define sCR0_CLIENTPD			(1 << 0)
+#define sCR0_GFRE			(1 << 1)
+#define sCR0_GFIE			(1 << 2)
+#define sCR0_GCFGFRE			(1 << 4)
+#define sCR0_GCFGFIE			(1 << 5)
+
+#if (PAGE_SIZE == SZ_4K)
+#define sACR_WC_EN			(7 << 0)
+#elif (PAGE_SIZE == SZ_64K)
+#define sACR_WC_EN			(3 << 5)
+#else
+#define sACR_WC_EN			0
+#endif
+
+#define ID0_S1TS			(1 << 30)
+#define ID0_S2TS			(1 << 29)
+#define ID0_NTS				(1 << 28)
+#define ID0_PTFS_SHIFT			24
+#define ID0_PTFS_MASK			0x2
+#define ID0_PTFS_V8_ONLY		0x2
+#define ID0_CTTW			(1 << 14)
+
+#define ID2_OAS_SHIFT			8
+#define ID2_OAS_MASK			0xff
+#define ID2_IAS_SHIFT			0
+#define ID2_IAS_MASK			0xff
+
+#define S2CR_TYPE_SHIFT			16
+#define S2CR_TYPE_MASK			0x3
+#define S2CR_TYPE_TRANS			(0 << S2CR_TYPE_SHIFT)
+#define S2CR_TYPE_BYPASS		(1 << S2CR_TYPE_SHIFT)
+#define S2CR_TYPE_FAULT			(2 << S2CR_TYPE_SHIFT)
+#define S2CR_SHCFG_NS			(3 << 8)
+#define S2CR_MTCFG			(1 << 11)
+#define S2CR_MEMATTR_OIWB		(0xf << 12)
+#define S2CR_MTSH_WEAKEST		(S2CR_SHCFG_NS | \
+				S2CR_MTCFG | S2CR_MEMATTR_OIWB)
+
+/* Context bank attribute registers */
+#define CBAR_VMID_SHIFT			0
+#define CBAR_VMID_MASK			0xff
+#define CBAR_S1_BPSHCFG_SHIFT		8
+#define CBAR_S1_BPSHCFG_MASK		3
+#define CBAR_S1_BPSHCFG_NSH		3
+#define CBAR_S1_MEMATTR_SHIFT		12
+#define CBAR_S1_MEMATTR_MASK		0xf
+#define CBAR_S1_MEMATTR_WB		0xf
+#define CBAR_TYPE_SHIFT			16
+#define CBAR_TYPE_MASK			0x3
+#define CBAR_TYPE_S2_TRANS		(0 << CBAR_TYPE_SHIFT)
+#define CBAR_TYPE_S1_TRANS_S2_BYPASS	(1 << CBAR_TYPE_SHIFT)
+#define CBAR_TYPE_S1_TRANS_S2_FAULT	(2 << CBAR_TYPE_SHIFT)
+#define CBAR_TYPE_S1_TRANS_S2_TRANS	(3 << CBAR_TYPE_SHIFT)
+#define CBAR_IRPTNDX_SHIFT		24
+#define CBAR_IRPTNDX_MASK		0xff
+
+#define SMMU_CB_BASE(smmu)	((smmu)->s1cbt)
+#define SMMU_CB(smmu, n)	((n) << 5)
+#define SMMU_CB_SID(cfg)	(((u16)SMMU_OS_VMID << 8) | ((cfg)->cbndx))
+
+#define sTLBGSTATUS_GSACTIVE	(1 << 0)
+#define TLB_LOOP_TIMEOUT	1000000	/* 1s! */
+
+#define SCTLR_WACFG_WA		(2 << 26)
+#define SCTLR_RACFG_RA		(2 << 24)
+#define SCTLR_SHCFG_IS		(2 << 22)
+#define SCTLR_MTCFG		(1 << 20)
+#define SCTLR_MEMATTR_WB	(0xf << 16)
+#define SCTLR_MEMATTR_NC	(0x5 << 16)
+#define SCTLR_MEMATTR_NGNRE	(0x1 << 16)
+#define SCTLR_CACHE_WBRAWA	(SCTLR_WACFG_WA | SCTLR_RACFG_RA | \
+			SCTLR_SHCFG_IS | SCTLR_MTCFG | SCTLR_MEMATTR_WB)
+#define SCTLR_CACHE_NC		(SCTLR_SHCFG_IS | \
+			SCTLR_MTCFG | SCTLR_MEMATTR_NC)
+#define SCTLR_CACHE_NGNRE	(SCTLR_SHCFG_IS | \
+			SCTLR_MTCFG | SCTLR_MEMATTR_NGNRE)
+
+#define SCTLR_CFCFG			(1 << 7)
+#define SCTLR_CFIE			(1 << 6)
+#define SCTLR_CFRE			(1 << 5)
+#define SCTLR_E				(1 << 4)
+#define SCTLR_AFED			(1 << 3)
+#define SCTLR_M				(1 << 0)
+#define SCTLR_EAE_SBOP			(SCTLR_AFED)
+
+#define RESUME_RETRY			(0 << 0)
+#define RESUME_TERMINATE		(1 << 0)
+
+#define TTBCR_TG0_4K			(0 << 14)
+#define TTBCR_TG0_64K			(3 << 14)
+
+#define TTBCR_SH0_SHIFT			12
+#define TTBCR_SH0_MASK			0x3
+#define TTBCR_SH_NS			0
+#define TTBCR_SH_OS			2
+#define TTBCR_SH_IS			3
+#define TTBCR_ORGN0_SHIFT		10
+#define TTBCR_IRGN0_SHIFT		8
+#define TTBCR_RGN_MASK			0x3
+#define TTBCR_RGN_NC			0
+#define TTBCR_RGN_WBWA			1
+#define TTBCR_RGN_WT			2
+#define TTBCR_RGN_WB			3
+#define TTBCR_T1SZ_SHIFT		16
+#define TTBCR_T0SZ_SHIFT		0
+#define TTBCR_SZ_MASK			0xf
+
+#define MAIR_ATTR_SHIFT(n)		((n) << 3)
+#define MAIR_ATTR_MASK			0xff
+#define MAIR_ATTR_DEVICE		0x04
+#define MAIR_ATTR_NC			0x44
+#define MAIR_ATTR_WBRWA			0xff
+#define MAIR_ATTR_IDX_NC		0
+#define MAIR_ATTR_IDX_CACHE		1
+#define MAIR_ATTR_IDX_DEV		2
+
+#define FSR_MULTI		(1 << 31)
+#define FSR_EF			(1 << 4)
+#define FSR_PF			(1 << 3)
+#define FSR_AFF			(1 << 2)
+#define FSR_TF			(1 << 1)
+#define FSR_IGN			(FSR_AFF)
+#define FSR_FAULT		(FSR_MULTI | FSR_EF | FSR_PF | FSR_TF | FSR_IGN)
+
+#define FSYNR0_ASID(n)			(0xff & ((n) >> 24))
+#define FSYNR0_VMID(n)			(0xff & ((n) >> 16))
+#define FSYNR0_WNR			(1 << 4)
+#define FSYNR0_SS			(1 << 2)
+#define FSYNR0_CF			(1 << 0)
+
+
+static u32 hisi_bypass_vmid = 0xff;
+static struct arm_smmu_hwdep_ops smmu_hwdep_ops_bak;
+
+
+static int hisi_smmu_alloc_context(struct arm_smmu_device *smmu,
+			int start, int end, struct arm_smmu_master *master)
+{
+	if (master)
+		start = master->streamids[0];
+
+	if (smmu->cb_mtcfg[start])
+		return -ENOSPC;
+
+	return smmu_hwdep_ops_bak.alloc_context(smmu, start, end, master);
+}
+
+static void hisi_smmu_tlb_sync(struct arm_smmu_device *smmu)
+{
+	int count = 0;
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+
+	writel_relaxed(0, gr0_base + SMMU_TLBGSYNC);
+	while (readl_relaxed(gr0_base + SMMU_TLBGSTATUS)
+	       & sTLBGSTATUS_GSACTIVE) {
+		cpu_relax();
+		if (++count == TLB_LOOP_TIMEOUT) {
+			dev_err_ratelimited(smmu->dev,
+			"TLB sync timed out -- SMMU may be deadlocked\n");
+			return;
+		}
+		udelay(1);
+	}
+}
+
+static void hisi_smmu_tlb_inv_context(struct arm_smmu_cfg *cfg)
+{
+	struct arm_smmu_device *smmu = cfg->smmu;
+
+	writel_relaxed(SMMU_CB_SID(cfg), SMMU_GR0(smmu) + SMMU_CXTISID);
+
+	hisi_smmu_tlb_sync(smmu);
+}
+
+static irqreturn_t hisi_smmu_context_fault(int irq, void *dev)
+{
+	int i, flags, ret = IRQ_NONE, num_unhandled = 0;
+	u32 fsr, far, fsynr, resume;
+	unsigned long iova;
+	struct iommu_domain *domain = dev;
+	struct arm_smmu_domain *smmu_domain = domain->priv;
+	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
+	struct arm_smmu_device *smmu = root_cfg->smmu;
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+
+	for (i = 0; i < SMMU_CB_NUMIRPT; i++) {
+		fsynr = readl_relaxed(gr0_base + SMMU_RINT_CB_FSYNR(i));
+		if (!(fsynr & FSYNR0_CF) ||
+		    (FSYNR0_VMID(fsynr) != SMMU_OS_VMID) ||
+		    (root_cfg->cbndx != FSYNR0_ASID(fsynr)))
+			continue;
+
+		fsr = readl_relaxed(gr0_base + SMMU_RINT_CB_FSR(i));
+		if (fsr & FSR_IGN)
+			dev_err_ratelimited(smmu->dev,
+					    "Unexpected context fault (fsr 0x%u)\n",
+					    fsr);
+
+		flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+		far = readl_relaxed(gr0_base + SMMU_HIS_CB_FAR_LOW(i));
+		iova = far;
+#ifdef CONFIG_64BIT
+		far = readl_relaxed(gr0_base + SMMU_HIS_CB_FAR_HIGH(i));
+		iova |= ((unsigned long)far << 32);
+#endif
+
+		if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+			ret = IRQ_HANDLED;
+			resume = RESUME_RETRY;
+		} else {
+			dev_err_ratelimited(smmu->dev,
+			    "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+			    iova, fsynr, FSYNR0_ASID(fsynr));
+			num_unhandled++;
+			resume = RESUME_TERMINATE;
+		}
+
+		/* Clear the faulting FSR */
+		writel(fsr, gr0_base + SMMU_RINT_CB_FSR(i));
+
+		/* Retry or terminate any stalled transactions */
+		if (fsynr & FSYNR0_SS)
+			writel_relaxed(resume, gr0_base + SMMU_CTRL_CB_RESUME(i));
+	}
+
+	/*
+	 * If any fault unhandled, treat IRQ_NONE, although some maybe handled.
+	 */
+	if (num_unhandled)
+		ret = IRQ_NONE;
+
+	return ret;
+}
+
+static irqreturn_t hisi_smmu_global_fault(int irq, void *dev)
+{
+	u32 gfsr, gfsynr0;
+	struct arm_smmu_device *smmu = dev;
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+
+	gfsr = readl_relaxed(gr0_base + SMMU_RINT_GFSR);
+	if (!gfsr)
+		return IRQ_NONE;
+
+	gfsynr0 = readl_relaxed(gr0_base + SMMU_RINT_GFSYNR);
+
+	dev_err_ratelimited(smmu->dev,
+		"Unexpected global fault, this could be serious\n");
+	dev_err_ratelimited(smmu->dev,
+		"\tGFSR 0x%08x, GFSYNR0 0x%08x\n", gfsr, gfsynr0);
+
+	writel(gfsr, gr0_base + SMMU_RINT_GFSR);
+	return IRQ_HANDLED;
+}
+
+static void hisi_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
+{
+	u32 reg;
+	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
+	struct arm_smmu_device *smmu = root_cfg->smmu;
+	void __iomem *cb_base;
+
+	cb_base = SMMU_CB_BASE(smmu) + SMMU_CB(smmu, root_cfg->cbndx);
+
+	/* TTBR0 */
+	smmu_hwdep_ops_bak.flush_pgtable(smmu, root_cfg->pgd,
+			       PTRS_PER_PGD * sizeof(pgd_t));
+	reg = __pa(root_cfg->pgd);
+	writel_relaxed(reg, cb_base + SMMU_S1_TTBR0_L);
+	reg = (phys_addr_t)__pa(root_cfg->pgd) >> 32;
+	writel_relaxed(reg, cb_base + SMMU_S1_TTBR0_H);
+
+	/*
+	 * TTBCR
+	 * We use long descriptor, with inner-shareable WBWA tables in TTBR0.
+	 */
+	if (PAGE_SIZE == SZ_4K)
+		reg = TTBCR_TG0_4K;
+	else
+		reg = TTBCR_TG0_64K;
+
+	reg |= (64 - smmu->s1_output_size) << TTBCR_T0SZ_SHIFT;
+
+	reg |= (TTBCR_SH_IS << TTBCR_SH0_SHIFT) |
+	       (TTBCR_RGN_WBWA << TTBCR_ORGN0_SHIFT) |
+	       (TTBCR_RGN_WBWA << TTBCR_IRGN0_SHIFT);
+	writel_relaxed(reg, cb_base + SMMU_S1_TTBCR);
+
+	reg = (MAIR_ATTR_NC << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_NC)) |
+	      (MAIR_ATTR_WBRWA << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_CACHE)) |
+	      (MAIR_ATTR_DEVICE << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_DEV));
+	writel_relaxed(reg, cb_base + SMMU_S1_MAIR0);
+
+	/* SCTLR */
+	reg = SCTLR_CFCFG | SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP;
+#ifdef __BIG_ENDIAN
+	reg |= SCTLR_E;
+#endif
+	writel_relaxed(reg, cb_base + SMMU_S1_SCTLR);
+}
+
+static void hisi_smmu_destroy_context_bank(struct arm_smmu_domain *smmu_domain)
+{
+	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
+	struct arm_smmu_device *smmu = root_cfg->smmu;
+	void __iomem *cb_base;
+
+	/* Disable the context bank and nuke the TLB before freeing it. */
+	cb_base = SMMU_CB_BASE(smmu) + SMMU_CB(smmu, root_cfg->cbndx);
+	writel_relaxed(0, cb_base + SMMU_S1_SCTLR);
+	hisi_smmu_tlb_inv_context(root_cfg);
+}
+
+static int hisi_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
+				      struct arm_smmu_master *master)
+{
+	unsigned long flags;
+
+	if (SMMU_CB_SID(&smmu_domain->root_cfg) != master->streamids[0]) {
+		dev_err(smmu_domain->leaf_smmu->dev, "Too many sid attached\n");
+		return -ENODEV;
+	}
+
+	spin_lock_irqsave(&smmu_domain->lock, flags);
+	smmu_domain->num_of_masters++;
+	spin_unlock_irqrestore(&smmu_domain->lock, flags);
+
+	return 0;
+}
+
+static void hisi_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
+					  struct arm_smmu_master *master)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&smmu_domain->lock, flags);
+	smmu_domain->num_of_masters--;
+	spin_unlock_irqrestore(&smmu_domain->lock, flags);
+}
+
+static int hisi_smmu_device_reset(struct arm_smmu_device *smmu)
+{
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+	void __iomem *cb_base;
+	int i = 0;
+	u32 reg;
+
+	/* Clear Global FSR */
+	reg = readl_relaxed(gr0_base + SMMU_RINT_GFSR);
+	writel(reg, gr0_base + SMMU_RINT_GFSR);
+
+	/* unmask all global interrupt */
+	writel_relaxed(0, gr0_base + SMMU_CFG_GFIM);
+
+	reg  = CFG_CBF_S1_ORGN_WA | CFG_CBF_S1_IRGN_WA | CFG_CBF_S1_SHCFG_IS;
+	reg |= CFG_CBF_S2_ORGN_WA | CFG_CBF_S2_IRGN_WA | CFG_CBF_S2_SHCFG_IS;
+	writel_relaxed(reg, gr0_base + SMMU_CFG_CBF);
+
+	/* stage 2 context bank table */
+	reg = readl_relaxed(gr0_base + SMMU_CFG_S2CTBAR);
+	smmu->s2cbt = devm_ioremap(smmu->dev,
+			(phys_addr_t)reg << SMMU_S2CBT_SHIFT, SMMU_S2CBT_SIZE);
+	if (!smmu->s2cbt) {
+		pr_err("Failed to ioremap SnCB table\n");
+		return -ENOMEM;
+	}
+
+	/* stage 1 context bank table */
+	reg = readl_relaxed(smmu->s2cbt + SMMU_CB_S1CTBAR(SMMU_OS_VMID));
+	smmu->s1cbt = devm_ioremap(smmu->dev,
+			(phys_addr_t)reg << SMMU_S1CBT_SHIFT, SMMU_S1CBT_SIZE);
+	if (!smmu->s1cbt) {
+		pr_err("Failed to ioremap SnCB table\n");
+		return -ENOMEM;
+	}
+
+	/* Make sure all context banks are disabled */
+	for (i = 0; i < smmu->num_context_banks; i++) {
+		cb_base = SMMU_CB_BASE(smmu) + SMMU_CB(smmu, i);
+
+		switch (smmu->cb_mtcfg[i]) {
+		case 1:
+			reg = SCTLR_CACHE_WBRAWA;
+			break;
+		case 2:
+			reg = SCTLR_CACHE_NC;
+			break;
+		case 3:
+			reg = SCTLR_CACHE_NGNRE;
+			break;
+		default:
+			reg = 0;
+			break;
+		}
+
+		writel_relaxed(reg, cb_base + SMMU_S1_SCTLR);
+	}
+
+	/* Clear CB_FSR  */
+	for (i = 0; i < SMMU_CB_NUMIRPT; i++)
+		writel_relaxed(FSR_FAULT, gr0_base + SMMU_RINT_CB_FSR(i));
+
+	/*
+	 * Use the weakest attribute, so no impact stage 1 output attribute.
+	 */
+	reg = CBAR_TYPE_S1_TRANS_S2_BYPASS |
+		(CBAR_S1_BPSHCFG_NSH << CBAR_S1_BPSHCFG_SHIFT) |
+		(CBAR_S1_MEMATTR_WB << CBAR_S1_MEMATTR_SHIFT);
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_CBAR(SMMU_OS_VMID));
+
+	/* Bypass need use another S2CR */
+	reg = S2CR_TYPE_BYPASS | S2CR_MTSH_WEAKEST;
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_S2CR(hisi_bypass_vmid));
+
+	/* Mark S2CR as translation */
+	reg = S2CR_TYPE_TRANS | S2CR_MTSH_WEAKEST;
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_S2CR(SMMU_OS_VMID));
+
+	/* Invalidate the TLB, just in case */
+	writel_relaxed(SMMU_OS_VMID, gr0_base + SMMU_TLBIVMID);
+	hisi_smmu_tlb_sync(smmu);
+
+	writel_relaxed(sACR_WC_EN, gr0_base + SMMU_CTRL_ACR);
+
+	/* Enable fault reporting */
+	reg  = (sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE);
+	reg &= ~sCR0_CLIENTPD;
+
+	writel_relaxed(reg, gr0_base + SMMU_CTRL_CR0);
+	dsb();
+
+	return 0;
+}
+
+static int hisi_smmu_id_size_to_bits(unsigned long size)
+{
+	int i;
+
+	for (i = 7; i >= 0; i--)
+		if ((size >> i) & 0x1)
+			break;
+
+	return 32 + 4 * (i + 1);
+}
+
+static int hisi_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
+{
+	unsigned long size;
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+	u32 id;
+
+	dev_notice(smmu->dev, "probing hardware configuration...\n");
+
+	smmu->version = 1;
+
+	/* ID0 */
+	id = readl_relaxed(gr0_base + SMMU_IDR0);
+#ifndef CONFIG_64BIT
+	if (((id >> ID0_PTFS_SHIFT) & ID0_PTFS_MASK) == ID0_PTFS_V8_ONLY) {
+		dev_err(smmu->dev, "\tno v7 descriptor support!\n");
+		return -ENODEV;
+	}
+#endif
+
+	if (id & ID0_NTS) {
+		smmu->features |= ARM_SMMU_FEAT_TRANS_NESTED;
+		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
+		smmu->features |= ARM_SMMU_FEAT_TRANS_S2;
+		dev_notice(smmu->dev, "\tnested translation\n");
+	} else if (id & ID0_S1TS) {
+		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
+		dev_notice(smmu->dev, "\tstage 1 translation\n");
+	}
+
+	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
+		dev_err(smmu->dev, "\tstage 1 translation not support!\n");
+		return -ENODEV;
+	}
+
+	if (id & ID0_CTTW) {
+		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
+		dev_notice(smmu->dev, "\tcoherent table walk\n");
+	}
+
+	smmu->num_context_banks = ARM_SMMU_MAX_CBS;
+
+	/* ID2 */
+	id = readl_relaxed(gr0_base + SMMU_IDR2);
+	size = hisi_smmu_id_size_to_bits((id >> ID2_IAS_SHIFT) & ID2_IAS_MASK);
+
+	smmu->input_size = min_t(unsigned long, VA_BITS, size);
+
+	/* The stage-2 output mask is also applied for bypass */
+	size = hisi_smmu_id_size_to_bits((id >> ID2_OAS_SHIFT) & ID2_OAS_MASK);
+	smmu->s2_output_size = min_t(unsigned long, PHYS_MASK_SHIFT, size);
+
+	/*
+	 * Stage-1 output limited by stage-2 input size due to pgd
+	 * allocation (PTRS_PER_PGD).
+	 */
+#ifdef CONFIG_64BIT
+	smmu->s1_output_size = min_t(unsigned long, VA_BITS, size);
+#else
+	smmu->s1_output_size = min(32UL, size);
+#endif
+
+	dev_notice(smmu->dev,
+		   "\t%lu-bit VA, %lu-bit IPA, %lu-bit PA\n",
+		   smmu->input_size,
+		   smmu->s1_output_size, smmu->s2_output_size);
+
+	return 0;
+}
+
+static int hisi_dt_cfg_probe(struct arm_smmu_device *smmu, struct device *dev)
+{
+	int i, ret;
+	const __be32 *prop;
+	int len;
+
+	/*
+	 * some devices may not support bring cache attributes, but want
+	 * specified cache attributes. Here list three common cases:
+	 * 1, cahceable, WBRAWA
+	 * 2, non-cacheable
+	 * 3, device, nGnRE
+	 */
+	prop = of_get_property(dev->of_node, "smmu-force-memtype", &len);
+	for (i = 0; prop && (i < (len / 4) - 1); i += 2) {
+		int cbidx;
+
+		cbidx = of_read_number(&prop[i], 1);
+		if (cbidx >= ARM_SMMU_MAX_CBS) {
+			dev_err(dev, "invalid StreamID %d\n", cbidx);
+			return -ENODEV;
+		}
+
+		ret = hisi_smmu_alloc_context(smmu, cbidx, cbidx + 1, NULL);
+		if (IS_ERR_VALUE(ret)) {
+			dev_err(dev, "conflict StreamID %d\n", cbidx);
+			return ret;
+		}
+
+		smmu->cb_mtcfg[cbidx] = (u8)of_read_number(&prop[i + 1], 1);
+		if (!smmu->cb_mtcfg[cbidx])
+			smmu->cb_mtcfg[cbidx] = 0xff;
+	}
+
+	of_property_read_u32(dev->of_node,
+				"smmu-bypass-vmid", &hisi_bypass_vmid);
+
+	return 0;
+}
+
+void arm_smmu_hwdep_ops_override(struct arm_smmu_hwdep_ops *ops)
+{
+	memcpy(&smmu_hwdep_ops_bak, ops, sizeof(*ops));
+
+	ops->alloc_context	= hisi_smmu_alloc_context;
+	ops->tlb_sync		= hisi_smmu_tlb_sync;
+	ops->context_fault	= hisi_smmu_context_fault;
+	ops->global_fault	= hisi_smmu_global_fault;
+	ops->init_context_bank	= hisi_smmu_init_context_bank;
+	ops->destroy_context_bank = hisi_smmu_destroy_context_bank;
+	ops->domain_add_master	= hisi_smmu_domain_add_master;
+	ops->domain_remove_master = hisi_smmu_domain_remove_master;
+	ops->device_reset	= hisi_smmu_device_reset;
+	ops->device_cfg_probe	= hisi_smmu_device_cfg_probe;
+	ops->dt_cfg_probe	= hisi_dt_cfg_probe;
+}
--
1.8.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-12  5:08 [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Zhen Lei
  2014-06-12  5:08 ` [PATCH RFC v2 1/3] iommu/arm: Adjust code to facilitate support arm smmu variants Zhen Lei
  2014-06-12  5:08 ` [PATCH RFC v2 2/3] iommu/hisilicon: Add support for Hisilicon Ltd. System MMU architecture Zhen Lei
@ 2014-06-12  5:08 ` Zhen Lei
  2014-06-16 16:39   ` Will Deacon
  2014-06-16 16:37 ` [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Will Deacon
  3 siblings, 1 reply; 16+ messages in thread
From: Zhen Lei @ 2014-06-12  5:08 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds a description of private properties for the Hisilicon
System MMU architecture.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index f284b99..75b1351 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -15,6 +15,7 @@ conditions.
                         "arm,smmu-v2"
                         "arm,mmu-400"
                         "arm,mmu-500"
+                        "hisilicon,smmu-v1"

                   depending on the particular implementation and/or the
                   version of the architecture implemented.
@@ -54,6 +55,21 @@ conditions.
                   aliases of secure registers have to be used during
                   SMMU configuration.

+** Hisilicon SMMU private properties:
+
+- smmu-force-memtype : A list of StreamIDs which not translate address but
+                  translate attributes. The StreamIDs list here can not be
+                  used for map(translation) mode again.
+                  StreamID first, then the type list below:
+                  1, cahceable, WBRAWA, Normal outer and inner write-back
+                  2, non-cacheable, Normal outer and inner non-cacheable
+                  3, device, nGnRE
+                  others, bypass
+
+- smmu-bypass-vmid   : Specify which context bank is used for bypass mode.
+                  If omit, vmid=255 is default. If bypass and map mode can
+                  share a same S2CR, config vmid=0.
+
 Example:

         smmu {
--
1.8.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture
  2014-06-12  5:08 [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Zhen Lei
                   ` (2 preceding siblings ...)
  2014-06-12  5:08 ` [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding Zhen Lei
@ 2014-06-16 16:37 ` Will Deacon
  2014-06-17  6:32   ` leizhen
  3 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2014-06-16 16:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi there,

On Thu, Jun 12, 2014 at 06:08:09AM +0100, Zhen Lei wrote:
> Changes in v2:
> 
> - Split Hisilicon smmu implementation in a separate file, hisi-smmu.c
> - Refactor arm-smmu.c. Some direct call hardware dependent functions replaced
>   with hooks. And move common struct and marco definition into arm-smmu.h
> - Merge the description of Hisilicon private properties into arm,smmu.txt
> 
> Hisilicon smmu is similar to arm-smmu, some code can be direct reused. For
> example: map and unmap, device tree configuration, and the software framework.
> I found that, abstract about 11 hardware dependent functions in arm-smmu.c is
> enough . Abstract means use hooks to replace the direct call of functions. Now,
> if need to support Hisilicon SMMU or others arm smmu variants, just selective
> rewrite these hooks. And I add a dt_cfg_probe hook, to allow each variant parse
> their hardware special configuration. Finally, flush_pgtable is a special case,
> hardware independent but maybe referenced. So, total 13 hooks.

The fundamental question here is whether or not your SMMU implementation is
intended to follow the ARM SMMU architecture specification. Given the
changes you've highlighted, it clearly doesn't comply, so then it comes down
to how much code can actually be re-used between arm-smmu.c and hisi-smmu.c.

With your current split, I can still see plenty of duplication between the
two files (e.g. bits in the SCTLR register). I also recognise a fair number
of lines in hisi-smmu.c that I wrote originally.

So, is this supposed to be architecturally compliant with the ARM SMMU spec
or is it something completely independent?

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-12  5:08 ` [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding Zhen Lei
@ 2014-06-16 16:39   ` Will Deacon
  2014-06-17  7:50     ` leizhen
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2014-06-16 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
> This patch adds a description of private properties for the Hisilicon
> System MMU architecture.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index f284b99..75b1351 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -15,6 +15,7 @@ conditions.
>                          "arm,smmu-v2"
>                          "arm,mmu-400"
>                          "arm,mmu-500"
> +                        "hisilicon,smmu-v1"
> 
>                    depending on the particular implementation and/or the
>                    version of the architecture implemented.
> @@ -54,6 +55,21 @@ conditions.
>                    aliases of secure registers have to be used during
>                    SMMU configuration.
> 
> +** Hisilicon SMMU private properties:
> +
> +- smmu-force-memtype : A list of StreamIDs which not translate address but
> +                  translate attributes. The StreamIDs list here can not be
> +                  used for map(translation) mode again.
> +                  StreamID first, then the type list below:
> +                  1, cahceable, WBRAWA, Normal outer and inner write-back
> +                  2, non-cacheable, Normal outer and inner non-cacheable
> +                  3, device, nGnRE
> +                  others, bypass
> +
> +- smmu-bypass-vmid   : Specify which context bank is used for bypass mode.
> +                  If omit, vmid=255 is default. If bypass and map mode can
> +                  share a same S2CR, config vmid=0.

These don't feel like they belong in the device-tree to me. Is the list of
StreamIDs described in smmu-force-memtype a property of the hardware
configuration, or a software policy to allow you to generate cacheable
traffic for particular masters?

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture
  2014-06-16 16:37 ` [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Will Deacon
@ 2014-06-17  6:32   ` leizhen
  2014-06-17  9:27     ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: leizhen @ 2014-06-17  6:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 2014/6/17 0:37, Will Deacon wrote:
> Hi there,
> 
> On Thu, Jun 12, 2014 at 06:08:09AM +0100, Zhen Lei wrote:
>> Changes in v2:
>>
>> - Split Hisilicon smmu implementation in a separate file, hisi-smmu.c
>> - Refactor arm-smmu.c. Some direct call hardware dependent functions replaced
>>   with hooks. And move common struct and marco definition into arm-smmu.h
>> - Merge the description of Hisilicon private properties into arm,smmu.txt
>>
>> Hisilicon smmu is similar to arm-smmu, some code can be direct reused. For
>> example: map and unmap, device tree configuration, and the software framework.
>> I found that, abstract about 11 hardware dependent functions in arm-smmu.c is
>> enough . Abstract means use hooks to replace the direct call of functions. Now,
>> if need to support Hisilicon SMMU or others arm smmu variants, just selective
>> rewrite these hooks. And I add a dt_cfg_probe hook, to allow each variant parse
>> their hardware special configuration. Finally, flush_pgtable is a special case,
>> hardware independent but maybe referenced. So, total 13 hooks.
> 
> The fundamental question here is whether or not your SMMU implementation is
> intended to follow the ARM SMMU architecture specification. Given the
> changes you've highlighted, it clearly doesn't comply, so then it comes down
> to how much code can actually be re-used between arm-smmu.c and hisi-smmu.c.
> 
> With your current split, I can still see plenty of duplication between the
> two files (e.g. bits in the SCTLR register). I also recognise a fair number
> of lines in hisi-smmu.c that I wrote originally.
> 
> So, is this supposed to be architecturally compliant with the ARM SMMU spec
> or is it something completely independent?
> 
> Will
> 
> .
> 
Yeah, it doesn't comply, and finally will not comply too. So, I said before:
"similar to" and "arm-smmu variant" maybe inappropriate. It looks like a new
smmu hardware implementation.

I also want to reuse your code as much as possible. But in order to keep your
code flow clearly, I try to make as few changes as possible. I think make
differentiate granule too small is not well. Because all of our codes maybe
changed in future, and may any other smmu variants to reuse arm-smmu.c.

Especially, SCTLR register, the hardware dependent feature. I think not
suitable for reuse. A bit field differentiation may need a marco to seperate,
make the code ugly.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-16 16:39   ` Will Deacon
@ 2014-06-17  7:50     ` leizhen
  2014-06-17  9:11       ` Varun Sethi
  2014-06-17  9:33       ` Will Deacon
  0 siblings, 2 replies; 16+ messages in thread
From: leizhen @ 2014-06-17  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 2014/6/17 0:39, Will Deacon wrote:
> On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
>> This patch adds a description of private properties for the Hisilicon
>> System MMU architecture.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> index f284b99..75b1351 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> @@ -15,6 +15,7 @@ conditions.
>>                          "arm,smmu-v2"
>>                          "arm,mmu-400"
>>                          "arm,mmu-500"
>> +                        "hisilicon,smmu-v1"
>>
>>                    depending on the particular implementation and/or the
>>                    version of the architecture implemented.
>> @@ -54,6 +55,21 @@ conditions.
>>                    aliases of secure registers have to be used during
>>                    SMMU configuration.
>>
>> +** Hisilicon SMMU private properties:
>> +
>> +- smmu-force-memtype : A list of StreamIDs which not translate address but
>> +                  translate attributes. The StreamIDs list here can not be
>> +                  used for map(translation) mode again.
>> +                  StreamID first, then the type list below:
>> +                  1, cahceable, WBRAWA, Normal outer and inner write-back
>> +                  2, non-cacheable, Normal outer and inner non-cacheable
>> +                  3, device, nGnRE
>> +                  others, bypass
>> +
>> +- smmu-bypass-vmid   : Specify which context bank is used for bypass mode.
>> +                  If omit, vmid=255 is default. If bypass and map mode can
>> +                  share a same S2CR, config vmid=0.
> 
> These don't feel like they belong in the device-tree to me. Is the list of
> StreamIDs described in smmu-force-memtype a property of the hardware
> configuration, or a software policy to allow you to generate cacheable
> traffic for particular masters?
> 
> Will
> 
> .
> 
OK, I will put these description into a sperate file, hisilicon,smmu.txt
and mark a reference to arm,smmu.txt

The latter case. Some masters driver want use cacheable(WB) attribute to access
memory, but the masters can not bring cacheable attribute. So, can not use
bypass(or transaction) mode. In fact, the master driver can use iommu_map to
create map and specify IOMMU_CACHE. But maybe the driver does not want to
map, if the memory access is very dynamically, frequently map and unmap will
decrease performance.

Actually, I think smmu-force-memtype maybe suitable for arm-smmu too. But now,
if nobody declear need it, I will just implement it in hisi-smmu.c

A big issue should be discussed now. When a master use bypass mode to access
memory, this means it can access all non-secure memory. So, if a master can
be operated in user mode, that means a user process can access any kernel
memory. In ARM smmu-v2 specification, SMMU_S2CRn.PRIVCFG is used for determine
output priviledge attribute. But if use bypass mode and output attribute is
Unprivileged, specification have no mention about where to decide memory access
is permitted or not. Do you think is a spec design bug or Implemention
defined, or find some description anywhere.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-17  7:50     ` leizhen
@ 2014-06-17  9:11       ` Varun Sethi
  2014-06-17  9:33       ` Will Deacon
  1 sibling, 0 replies; 16+ messages in thread
From: Varun Sethi @ 2014-06-17  9:11 UTC (permalink / raw)
  To: linux-arm-kernel



> -----Original Message-----
> From: linux-arm-kernel [mailto:linux-arm-kernel-
> bounces at lists.infradead.org] On Behalf Of leizhen
> Sent: Tuesday, June 17, 2014 1:20 PM
> To: Will Deacon
> Cc: huxinwei at huawei.com; Catalin Marinas; Zefan Li; linux-arm-kernel
> Subject: Re: [PATCH RFC v2 3/3] documentation/iommu: Add description of
> Hisilicon SMMU private binding
> 
> On 2014/6/17 0:39, Will Deacon wrote:
> > On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
> >> This patch adds a description of private properties for the Hisilicon
> >> System MMU architecture.
> >>
> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >> ---
> >>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16
> >> ++++++++++++++++
> >>  1 file changed, 16 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> index f284b99..75b1351 100644
> >> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> @@ -15,6 +15,7 @@ conditions.
> >>                          "arm,smmu-v2"
> >>                          "arm,mmu-400"
> >>                          "arm,mmu-500"
> >> +                        "hisilicon,smmu-v1"
> >>
> >>                    depending on the particular implementation and/or
> the
> >>                    version of the architecture implemented.
> >> @@ -54,6 +55,21 @@ conditions.
> >>                    aliases of secure registers have to be used during
> >>                    SMMU configuration.
> >>
> >> +** Hisilicon SMMU private properties:
> >> +
> >> +- smmu-force-memtype : A list of StreamIDs which not translate
> address but
> >> +                  translate attributes. The StreamIDs list here can
> not be
> >> +                  used for map(translation) mode again.
> >> +                  StreamID first, then the type list below:
> >> +                  1, cahceable, WBRAWA, Normal outer and inner write-
> back
> >> +                  2, non-cacheable, Normal outer and inner non-
> cacheable
> >> +                  3, device, nGnRE
> >> +                  others, bypass
> >> +
> >> +- smmu-bypass-vmid   : Specify which context bank is used for bypass
> mode.
> >> +                  If omit, vmid=255 is default. If bypass and map
> mode can
> >> +                  share a same S2CR, config vmid=0.
> >
> > These don't feel like they belong in the device-tree to me. Is the
> > list of StreamIDs described in smmu-force-memtype a property of the
> > hardware configuration, or a software policy to allow you to generate
> > cacheable traffic for particular masters?
> >
> > Will
> >
> > .
> >
> OK, I will put these description into a sperate file, hisilicon,smmu.txt
> and mark a reference to arm,smmu.txt
> 
> The latter case. Some masters driver want use cacheable(WB) attribute to
> access memory, but the masters can not bring cacheable attribute. So, can
> not use bypass(or transaction) mode. In fact, the master driver can use
> iommu_map to create map and specify IOMMU_CACHE. But maybe the driver
> does not want to map, if the memory access is very dynamically,
> frequently map and unmap will decrease performance.
> 
> Actually, I think smmu-force-memtype maybe suitable for arm-smmu too. But
> now, if nobody declear need it, I will just implement it in hisi-smmu.c
> 
> A big issue should be discussed now. When a master use bypass mode to
> access memory, this means it can access all non-secure memory. So, if a
> master can be operated in user mode, that means a user process can access
> any kernel memory. In ARM smmu-v2 specification, SMMU_S2CRn.PRIVCFG is
> used for determine output priviledge attribute. But if use bypass mode
> and output attribute is Unprivileged, specification have no mention about
> where to decide memory access is permitted or not. Do you think is a spec
> design bug or Implemention defined, or find some description anywhere.
> 
You should be using VFIO framework for user space I/O. VFIO uses the iommu_api to setup IOMMU for user space access. This allows you to restrict user space I/O access.

-Varun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture
  2014-06-17  6:32   ` leizhen
@ 2014-06-17  9:27     ` Will Deacon
  0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2014-06-17  9:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 17, 2014 at 07:32:47AM +0100, leizhen wrote:
> On 2014/6/17 0:37, Will Deacon wrote:
> > On Thu, Jun 12, 2014 at 06:08:09AM +0100, Zhen Lei wrote:
> >> Changes in v2:
> >>
> >> - Split Hisilicon smmu implementation in a separate file, hisi-smmu.c
> >> - Refactor arm-smmu.c. Some direct call hardware dependent functions replaced
> >>   with hooks. And move common struct and marco definition into arm-smmu.h
> >> - Merge the description of Hisilicon private properties into arm,smmu.txt
> >>
> >> Hisilicon smmu is similar to arm-smmu, some code can be direct reused. For
> >> example: map and unmap, device tree configuration, and the software framework.
> >> I found that, abstract about 11 hardware dependent functions in arm-smmu.c is
> >> enough . Abstract means use hooks to replace the direct call of functions. Now,
> >> if need to support Hisilicon SMMU or others arm smmu variants, just selective
> >> rewrite these hooks. And I add a dt_cfg_probe hook, to allow each variant parse
> >> their hardware special configuration. Finally, flush_pgtable is a special case,
> >> hardware independent but maybe referenced. So, total 13 hooks.
> > 
> > The fundamental question here is whether or not your SMMU implementation is
> > intended to follow the ARM SMMU architecture specification. Given the
> > changes you've highlighted, it clearly doesn't comply, so then it comes down
> > to how much code can actually be re-used between arm-smmu.c and hisi-smmu.c.
> > 
> > With your current split, I can still see plenty of duplication between the
> > two files (e.g. bits in the SCTLR register). I also recognise a fair number
> > of lines in hisi-smmu.c that I wrote originally.
> > 
> > So, is this supposed to be architecturally compliant with the ARM SMMU spec
> > or is it something completely independent?
> > 
> > Will
> > 
> > .
> > 
> Yeah, it doesn't comply, and finally will not comply too. So, I said before:
> "similar to" and "arm-smmu variant" maybe inappropriate. It looks like a new
> smmu hardware implementation.

Well, apart from the parts that are identical, sure.

> I also want to reuse your code as much as possible. But in order to keep your
> code flow clearly, I try to make as few changes as possible. I think make
> differentiate granule too small is not well. Because all of our codes maybe
> changed in future, and may any other smmu variants to reuse arm-smmu.c.
> 
> Especially, SCTLR register, the hardware dependent feature. I think not
> suitable for reuse. A bit field differentiation may need a marco to seperate,
> make the code ugly.

I disagree. You have a register with a same name, with the same bits, in the
same positions. That's hopefully not just luck, so the code should be shared
between the drivers. I'm not at all happy with simply duplicating what we
have and bodging it for your purposes -- that totally defeats the point of
having an architecture, even if you're not following it entirely.

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-17  7:50     ` leizhen
  2014-06-17  9:11       ` Varun Sethi
@ 2014-06-17  9:33       ` Will Deacon
  2014-06-18  1:28         ` leizhen
  1 sibling, 1 reply; 16+ messages in thread
From: Will Deacon @ 2014-06-17  9:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 17, 2014 at 08:50:20AM +0100, leizhen wrote:
> On 2014/6/17 0:39, Will Deacon wrote:
> > On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
> >> This patch adds a description of private properties for the Hisilicon
> >> System MMU architecture.
> >>
> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >> ---
> >>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16 ++++++++++++++++
> >>  1 file changed, 16 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> index f284b99..75b1351 100644
> >> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >> @@ -15,6 +15,7 @@ conditions.
> >>                          "arm,smmu-v2"
> >>                          "arm,mmu-400"
> >>                          "arm,mmu-500"
> >> +                        "hisilicon,smmu-v1"
> >>
> >>                    depending on the particular implementation and/or the
> >>                    version of the architecture implemented.
> >> @@ -54,6 +55,21 @@ conditions.
> >>                    aliases of secure registers have to be used during
> >>                    SMMU configuration.
> >>
> >> +** Hisilicon SMMU private properties:
> >> +
> >> +- smmu-force-memtype : A list of StreamIDs which not translate address but
> >> +                  translate attributes. The StreamIDs list here can not be
> >> +                  used for map(translation) mode again.
> >> +                  StreamID first, then the type list below:
> >> +                  1, cahceable, WBRAWA, Normal outer and inner write-back
> >> +                  2, non-cacheable, Normal outer and inner non-cacheable
> >> +                  3, device, nGnRE
> >> +                  others, bypass
> >> +
> >> +- smmu-bypass-vmid   : Specify which context bank is used for bypass mode.
> >> +                  If omit, vmid=255 is default. If bypass and map mode can
> >> +                  share a same S2CR, config vmid=0.
> > 
> > These don't feel like they belong in the device-tree to me. Is the list of
> > StreamIDs described in smmu-force-memtype a property of the hardware
> > configuration, or a software policy to allow you to generate cacheable
> > traffic for particular masters?
> > 
> > Will
> > 
> > .
> > 
> OK, I will put these description into a sperate file, hisilicon,smmu.txt
> and mark a reference to arm,smmu.txt

No, don't do that. Delete the properties instead. I'm not a device-tree
expert, but these don't feel like things we should be describing there at
all.

> The latter case. Some masters driver want use cacheable(WB) attribute to access
> memory, but the masters can not bring cacheable attribute. So, can not use
> bypass(or transaction) mode. In fact, the master driver can use iommu_map to
> create map and specify IOMMU_CACHE. But maybe the driver does not want to
> map, if the memory access is very dynamically, frequently map and unmap will
> decrease performance.

You seem to be highlighting a perceived deficiency in the IOMMU API which
you're attempting to work-around with new device-tree properties. Instead,
why not propose an extension to the IOMMU API in Linux?

> Actually, I think smmu-force-memtype maybe suitable for arm-smmu too. But now,
> if nobody declear need it, I will just implement it in hisi-smmu.c

I don't want to see hisi-smmu.c at all. You need to make your driver fit
into the code we already have.

Do you have a specification available describing the hardware you have
created?

> A big issue should be discussed now. When a master use bypass mode to access
> memory, this means it can access all non-secure memory. So, if a master can
> be operated in user mode, that means a user process can access any kernel
> memory. In ARM smmu-v2 specification, SMMU_S2CRn.PRIVCFG is used for determine
> output priviledge attribute. But if use bypass mode and output attribute is
> Unprivileged, specification have no mention about where to decide memory access
> is permitted or not. Do you think is a spec design bug or Implemention
> defined, or find some description anywhere.

There are various overrides for the privilege controls but, as Varun said,
you should be using VFIO for userspace device access.

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-17  9:33       ` Will Deacon
@ 2014-06-18  1:28         ` leizhen
  2014-06-18 12:03           ` Varun Sethi
  2014-06-18 13:32           ` Will Deacon
  0 siblings, 2 replies; 16+ messages in thread
From: leizhen @ 2014-06-18  1:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 2014/6/17 17:33, Will Deacon wrote:
> On Tue, Jun 17, 2014 at 08:50:20AM +0100, leizhen wrote:
>> On 2014/6/17 0:39, Will Deacon wrote:
>>> On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
>>>> This patch adds a description of private properties for the Hisilicon
>>>> System MMU architecture.
>>>>
>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>> ---
>>>>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16 ++++++++++++++++
>>>>  1 file changed, 16 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>> index f284b99..75b1351 100644
>>>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>> @@ -15,6 +15,7 @@ conditions.
>>>>                          "arm,smmu-v2"
>>>>                          "arm,mmu-400"
>>>>                          "arm,mmu-500"
>>>> +                        "hisilicon,smmu-v1"
>>>>
>>>>                    depending on the particular implementation and/or the
>>>>                    version of the architecture implemented.
>>>> @@ -54,6 +55,21 @@ conditions.
>>>>                    aliases of secure registers have to be used during
>>>>                    SMMU configuration.
>>>>
>>>> +** Hisilicon SMMU private properties:
>>>> +
>>>> +- smmu-force-memtype : A list of StreamIDs which not translate address but
>>>> +                  translate attributes. The StreamIDs list here can not be
>>>> +                  used for map(translation) mode again.
>>>> +                  StreamID first, then the type list below:
>>>> +                  1, cahceable, WBRAWA, Normal outer and inner write-back
>>>> +                  2, non-cacheable, Normal outer and inner non-cacheable
>>>> +                  3, device, nGnRE
>>>> +                  others, bypass
>>>> +
>>>> +- smmu-bypass-vmid   : Specify which context bank is used for bypass mode.
>>>> +                  If omit, vmid=255 is default. If bypass and map mode can
>>>> +                  share a same S2CR, config vmid=0.
>>>
>>> These don't feel like they belong in the device-tree to me. Is the list of
>>> StreamIDs described in smmu-force-memtype a property of the hardware
>>> configuration, or a software policy to allow you to generate cacheable
>>> traffic for particular masters?
>>>
>>> Will
>>>
>>> .
>>>
>> OK, I will put these description into a sperate file, hisilicon,smmu.txt
>> and mark a reference to arm,smmu.txt
> 
> No, don't do that. Delete the properties instead. I'm not a device-tree
> expert, but these don't feel like things we should be describing there at
> all.
> 
>> The latter case. Some masters driver want use cacheable(WB) attribute to access
>> memory, but the masters can not bring cacheable attribute. So, can not use
>> bypass(or transaction) mode. In fact, the master driver can use iommu_map to
>> create map and specify IOMMU_CACHE. But maybe the driver does not want to
>> map, if the memory access is very dynamically, frequently map and unmap will
>> decrease performance.
> 
> You seem to be highlighting a perceived deficiency in the IOMMU API which
> you're attempting to work-around with new device-tree properties. Instead,
> why not propose an extension to the IOMMU API in Linux?
> 

The private properties or new IOMMU APIs, just in order to optimize performance
or simplify master driver code, I will consider it later. I think it's good to
support base functions first.

Sorry, I just only known ARM SMMU and Hisilicon SMMU, not familiar with others.
Add a common API may meet more difficulty.

If I have time, I will try to do it.

>> Actually, I think smmu-force-memtype maybe suitable for arm-smmu too. But now,
>> if nobody declear need it, I will just implement it in hisi-smmu.c
> 
> I don't want to see hisi-smmu.c at all. You need to make your driver fit
> into the code we already have.
> 
> Do you have a specification available describing the hardware you have
> created?
> 

Although I have the Hisilicon SMMU specification, but it was written in Chinese.
I have told this to hardware engineers.

OK. I will try my best to merge two drivers into one file. Maybe need to use many
#if#else.

>> A big issue should be discussed now. When a master use bypass mode to access
>> memory, this means it can access all non-secure memory. So, if a master can
>> be operated in user mode, that means a user process can access any kernel
>> memory. In ARM smmu-v2 specification, SMMU_S2CRn.PRIVCFG is used for determine
>> output priviledge attribute. But if use bypass mode and output attribute is
>> Unprivileged, specification have no mention about where to decide memory access
>> is permitted or not. Do you think is a spec design bug or Implemention
>> defined, or find some description anywhere.
> 
> There are various overrides for the privilege controls but, as Varun said,
> you should be using VFIO for userspace device access.
> 
> Will
> 
> .
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-18  1:28         ` leizhen
@ 2014-06-18 12:03           ` Varun Sethi
  2014-06-18 12:34             ` leizhen
  2014-06-18 13:32           ` Will Deacon
  1 sibling, 1 reply; 16+ messages in thread
From: Varun Sethi @ 2014-06-18 12:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will/Zhen,

> -----Original Message-----
> From: linux-arm-kernel [mailto:linux-arm-kernel-
> bounces at lists.infradead.org] On Behalf Of leizhen
> Sent: Wednesday, June 18, 2014 6:59 AM
> To: Will Deacon
> Cc: huxinwei at huawei.com; Catalin Marinas; Zefan Li; linux-arm-kernel
> Subject: Re: [PATCH RFC v2 3/3] documentation/iommu: Add description of
> Hisilicon SMMU private binding
> 
> On 2014/6/17 17:33, Will Deacon wrote:
> > On Tue, Jun 17, 2014 at 08:50:20AM +0100, leizhen wrote:
> >> On 2014/6/17 0:39, Will Deacon wrote:
> >>> On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
> >>>> This patch adds a description of private properties for the
> >>>> Hisilicon System MMU architecture.
> >>>>
> >>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >>>> ---
> >>>>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16
> >>>> ++++++++++++++++
> >>>>  1 file changed, 16 insertions(+)
> >>>>
> >>>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >>>> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >>>> index f284b99..75b1351 100644
> >>>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >>>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> >>>> @@ -15,6 +15,7 @@ conditions.
> >>>>                          "arm,smmu-v2"
> >>>>                          "arm,mmu-400"
> >>>>                          "arm,mmu-500"
> >>>> +                        "hisilicon,smmu-v1"
> >>>>
> >>>>                    depending on the particular implementation and/or
> the
> >>>>                    version of the architecture implemented.
> >>>> @@ -54,6 +55,21 @@ conditions.
> >>>>                    aliases of secure registers have to be used
> during
> >>>>                    SMMU configuration.
> >>>>
> >>>> +** Hisilicon SMMU private properties:
> >>>> +
> >>>> +- smmu-force-memtype : A list of StreamIDs which not translate
> address but
> >>>> +                  translate attributes. The StreamIDs list here can
> not be
> >>>> +                  used for map(translation) mode again.
> >>>> +                  StreamID first, then the type list below:
> >>>> +                  1, cahceable, WBRAWA, Normal outer and inner
> write-back
> >>>> +                  2, non-cacheable, Normal outer and inner non-
> cacheable
> >>>> +                  3, device, nGnRE
> >>>> +                  others, bypass
> >>>> +
> >>>> +- smmu-bypass-vmid   : Specify which context bank is used for
> bypass mode.
> >>>> +                  If omit, vmid=255 is default. If bypass and map
> mode can
> >>>> +                  share a same S2CR, config vmid=0.
> >>>
> >>> These don't feel like they belong in the device-tree to me. Is the
> >>> list of StreamIDs described in smmu-force-memtype a property of the
> >>> hardware configuration, or a software policy to allow you to
> >>> generate cacheable traffic for particular masters?
> >>>
> >>> Will
> >>>
> >>> .
> >>>
> >> OK, I will put these description into a sperate file,
> >> hisilicon,smmu.txt and mark a reference to arm,smmu.txt
> >
> > No, don't do that. Delete the properties instead. I'm not a
> > device-tree expert, but these don't feel like things we should be
> > describing there at all.
> >
> >> The latter case. Some masters driver want use cacheable(WB) attribute
> >> to access memory, but the masters can not bring cacheable attribute.
> >> So, can not use bypass(or transaction) mode. In fact, the master
> >> driver can use iommu_map to create map and specify IOMMU_CACHE. But
> >> maybe the driver does not want to map, if the memory access is very
> >> dynamically, frequently map and unmap will decrease performance.
> >
> > You seem to be highlighting a perceived deficiency in the IOMMU API
> > which you're attempting to work-around with new device-tree
> > properties. Instead, why not propose an extension to the IOMMU API in
> Linux?
> >
> 
> The private properties or new IOMMU APIs, just in order to optimize
> performance or simplify master driver code, I will consider it later. I
> think it's good to support base functions first.
> 
> Sorry, I just only known ARM SMMU and Hisilicon SMMU, not familiar with
> others.
> Add a common API may meet more difficulty.
> 
> If I have time, I will try to do it.
> 
[Sethi Varun-B16395] We can introduce SMMU specific attributes (to use with iommu set attribute API call) for setting memory cacheability attributes. These could be used for setting the MAIR per context bank. Actually, we would require this for out platform. However, I am not sure how memory attribute translation would work if S2CR is set for bypass?

-Varun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-18 12:03           ` Varun Sethi
@ 2014-06-18 12:34             ` leizhen
  0 siblings, 0 replies; 16+ messages in thread
From: leizhen @ 2014-06-18 12:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 2014/6/18 20:03, Varun Sethi wrote:
> Hi Will/Zhen,
> 
>> -----Original Message-----
>> From: linux-arm-kernel [mailto:linux-arm-kernel-
>> bounces at lists.infradead.org] On Behalf Of leizhen
>> Sent: Wednesday, June 18, 2014 6:59 AM
>> To: Will Deacon
>> Cc: huxinwei at huawei.com; Catalin Marinas; Zefan Li; linux-arm-kernel
>> Subject: Re: [PATCH RFC v2 3/3] documentation/iommu: Add description of
>> Hisilicon SMMU private binding
>>
>> On 2014/6/17 17:33, Will Deacon wrote:
>>> On Tue, Jun 17, 2014 at 08:50:20AM +0100, leizhen wrote:
>>>> On 2014/6/17 0:39, Will Deacon wrote:
>>>>> On Thu, Jun 12, 2014 at 06:08:12AM +0100, Zhen Lei wrote:
>>>>>> This patch adds a description of private properties for the
>>>>>> Hisilicon System MMU architecture.
>>>>>>
>>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>>>> ---
>>>>>>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 16
>>>>>> ++++++++++++++++
>>>>>>  1 file changed, 16 insertions(+)
>>>>>>
>>>>>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>>>> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>>>> index f284b99..75b1351 100644
>>>>>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>>>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>>>>> @@ -15,6 +15,7 @@ conditions.
>>>>>>                          "arm,smmu-v2"
>>>>>>                          "arm,mmu-400"
>>>>>>                          "arm,mmu-500"
>>>>>> +                        "hisilicon,smmu-v1"
>>>>>>
>>>>>>                    depending on the particular implementation and/or
>> the
>>>>>>                    version of the architecture implemented.
>>>>>> @@ -54,6 +55,21 @@ conditions.
>>>>>>                    aliases of secure registers have to be used
>> during
>>>>>>                    SMMU configuration.
>>>>>>
>>>>>> +** Hisilicon SMMU private properties:
>>>>>> +
>>>>>> +- smmu-force-memtype : A list of StreamIDs which not translate
>> address but
>>>>>> +                  translate attributes. The StreamIDs list here can
>> not be
>>>>>> +                  used for map(translation) mode again.
>>>>>> +                  StreamID first, then the type list below:
>>>>>> +                  1, cahceable, WBRAWA, Normal outer and inner
>> write-back
>>>>>> +                  2, non-cacheable, Normal outer and inner non-
>> cacheable
>>>>>> +                  3, device, nGnRE
>>>>>> +                  others, bypass
>>>>>> +
>>>>>> +- smmu-bypass-vmid   : Specify which context bank is used for
>> bypass mode.
>>>>>> +                  If omit, vmid=255 is default. If bypass and map
>> mode can
>>>>>> +                  share a same S2CR, config vmid=0.
>>>>>
>>>>> These don't feel like they belong in the device-tree to me. Is the
>>>>> list of StreamIDs described in smmu-force-memtype a property of the
>>>>> hardware configuration, or a software policy to allow you to
>>>>> generate cacheable traffic for particular masters?
>>>>>
>>>>> Will
>>>>>
>>>>> .
>>>>>
>>>> OK, I will put these description into a sperate file,
>>>> hisilicon,smmu.txt and mark a reference to arm,smmu.txt
>>>
>>> No, don't do that. Delete the properties instead. I'm not a
>>> device-tree expert, but these don't feel like things we should be
>>> describing there at all.
>>>
>>>> The latter case. Some masters driver want use cacheable(WB) attribute
>>>> to access memory, but the masters can not bring cacheable attribute.
>>>> So, can not use bypass(or transaction) mode. In fact, the master
>>>> driver can use iommu_map to create map and specify IOMMU_CACHE. But
>>>> maybe the driver does not want to map, if the memory access is very
>>>> dynamically, frequently map and unmap will decrease performance.
>>>
>>> You seem to be highlighting a perceived deficiency in the IOMMU API
>>> which you're attempting to work-around with new device-tree
>>> properties. Instead, why not propose an extension to the IOMMU API in
>> Linux?
>>>
>>
>> The private properties or new IOMMU APIs, just in order to optimize
>> performance or simplify master driver code, I will consider it later. I
>> think it's good to support base functions first.
>>
>> Sorry, I just only known ARM SMMU and Hisilicon SMMU, not familiar with
>> others.
>> Add a common API may meet more difficulty.
>>
>> If I have time, I will try to do it.
>>
> [Sethi Varun-B16395] We can introduce SMMU specific attributes (to use with iommu set attribute API call) for setting memory cacheability attributes. These could be used for setting the MAIR per context bank. Actually, we would require this for out platform. However, I am not sure how memory attribute translation would work if S2CR is set for bypass?
> 
> -Varun
> 
> 
> 
> .
> 
The ARM SMMU v2 specification describes that:
If the transaction is successfully matched in the Stream mapping table, SMMU_S2CRn determines the initial
context. If SMMU_S2CRn specifies Bypass mode:
? The address is not translated, meaning that the output address is identical to the input address.
? Output attributes are a function of the input attributes and the SMMU_S2CRn fields.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-18  1:28         ` leizhen
  2014-06-18 12:03           ` Varun Sethi
@ 2014-06-18 13:32           ` Will Deacon
  2014-06-19  1:58             ` leizhen
  1 sibling, 1 reply; 16+ messages in thread
From: Will Deacon @ 2014-06-18 13:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jun 18, 2014 at 02:28:30AM +0100, leizhen wrote:
> On 2014/6/17 17:33, Will Deacon wrote:
> > On Tue, Jun 17, 2014 at 08:50:20AM +0100, leizhen wrote:
> >> The latter case. Some masters driver want use cacheable(WB) attribute to access
> >> memory, but the masters can not bring cacheable attribute. So, can not use
> >> bypass(or transaction) mode. In fact, the master driver can use iommu_map to
> >> create map and specify IOMMU_CACHE. But maybe the driver does not want to
> >> map, if the memory access is very dynamically, frequently map and unmap will
> >> decrease performance.
> > 
> > You seem to be highlighting a perceived deficiency in the IOMMU API which
> > you're attempting to work-around with new device-tree properties. Instead,
> > why not propose an extension to the IOMMU API in Linux?
> > 
> 
> The private properties or new IOMMU APIs, just in order to optimize performance
> or simplify master driver code, I will consider it later. I think it's good to
> support base functions first.
> 
> Sorry, I just only known ARM SMMU and Hisilicon SMMU, not familiar with others.
> Add a common API may meet more difficulty.
> 
> If I have time, I will try to do it.

Ok, so we're agreeing to drop those properties for now?

> >> Actually, I think smmu-force-memtype maybe suitable for arm-smmu too. But now,
> >> if nobody declear need it, I will just implement it in hisi-smmu.c
> > 
> > I don't want to see hisi-smmu.c at all. You need to make your driver fit
> > into the code we already have.
> > 
> > Do you have a specification available describing the hardware you have
> > created?
> > 
> 
> Although I have the Hisilicon SMMU specification, but it was written in Chinese.
> I have told this to hardware engineers.

Having access to a specification I can understand would help to assess
exactly what you've gone and built.

> OK. I will try my best to merge two drivers into one file. Maybe need to use many
> #if#else.

The fact of the matter is that you've got a device that isn't
architecturally compliant. That leaves me in a fairly undesirable position
with a small set of options:

  (1) We can duplicate lots of the code we already have and you end up with a
      separate driver. This is obviously bad because of the code duplication,
      associated maintenance headaches, driver divergence, etc. It also brings
      into question the point of having a driver written to the architecture
      when the hardware has gone off and done something different.

  (2) We try to fit your SMMU into the existing driver. I'd like to see what
      this looks like, but if it's as much #ifdeffery as you suggest, then
      that's also bad for many of the same reasons as above.

  (3) We don't support your device in the Linux kernel. We could just treat
      your SMMU as a broken ARM SMMU implementation and not support it. If
      this was a CPU instead of an SMMU this would unquestionably be the
      correct approach.

Unless you can come up with something compelling for point (2), I'm going to
err on the side of (3). So, please, *please*, don't do a rushed job of
merging the drivers. You need to convince me why I should bother supporting
this device in my driver, which means using clean abstractions and extending
the code we already have. If this isn't possible, then perhaps you'll
consider following the architecture next time around.

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding
  2014-06-18 13:32           ` Will Deacon
@ 2014-06-19  1:58             ` leizhen
  0 siblings, 0 replies; 16+ messages in thread
From: leizhen @ 2014-06-19  1:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 2014/6/18 21:32, Will Deacon wrote:
> On Wed, Jun 18, 2014 at 02:28:30AM +0100, leizhen wrote:
>> On 2014/6/17 17:33, Will Deacon wrote:
>>> On Tue, Jun 17, 2014 at 08:50:20AM +0100, leizhen wrote:
>>>> The latter case. Some masters driver want use cacheable(WB) attribute to access
>>>> memory, but the masters can not bring cacheable attribute. So, can not use
>>>> bypass(or transaction) mode. In fact, the master driver can use iommu_map to
>>>> create map and specify IOMMU_CACHE. But maybe the driver does not want to
>>>> map, if the memory access is very dynamically, frequently map and unmap will
>>>> decrease performance.
>>>
>>> You seem to be highlighting a perceived deficiency in the IOMMU API which
>>> you're attempting to work-around with new device-tree properties. Instead,
>>> why not propose an extension to the IOMMU API in Linux?
>>>
>>
>> The private properties or new IOMMU APIs, just in order to optimize performance
>> or simplify master driver code, I will consider it later. I think it's good to
>> support base functions first.
>>
>> Sorry, I just only known ARM SMMU and Hisilicon SMMU, not familiar with others.
>> Add a common API may meet more difficulty.
>>
>> If I have time, I will try to do it.
> 
> Ok, so we're agreeing to drop those properties for now?
> 
>>>> Actually, I think smmu-force-memtype maybe suitable for arm-smmu too. But now,
>>>> if nobody declear need it, I will just implement it in hisi-smmu.c
>>>
>>> I don't want to see hisi-smmu.c at all. You need to make your driver fit
>>> into the code we already have.
>>>
>>> Do you have a specification available describing the hardware you have
>>> created?
>>>
>>
>> Although I have the Hisilicon SMMU specification, but it was written in Chinese.
>> I have told this to hardware engineers.
> 
> Having access to a specification I can understand would help to assess
> exactly what you've gone and built.
> 
>> OK. I will try my best to merge two drivers into one file. Maybe need to use many
>> #if#else.
> 
> The fact of the matter is that you've got a device that isn't
> architecturally compliant. That leaves me in a fairly undesirable position
> with a small set of options:
> 
>   (1) We can duplicate lots of the code we already have and you end up with a
>       separate driver. This is obviously bad because of the code duplication,
>       associated maintenance headaches, driver divergence, etc. It also brings
>       into question the point of having a driver written to the architecture
>       when the hardware has gone off and done something different.
> 
>   (2) We try to fit your SMMU into the existing driver. I'd like to see what
>       this looks like, but if it's as much #ifdeffery as you suggest, then
>       that's also bad for many of the same reasons as above.
> 
>   (3) We don't support your device in the Linux kernel. We could just treat
>       your SMMU as a broken ARM SMMU implementation and not support it. If
>       this was a CPU instead of an SMMU this would unquestionably be the
>       correct approach.
> 
> Unless you can come up with something compelling for point (2), I'm going to
> err on the side of (3). So, please, *please*, don't do a rushed job of
> merging the drivers. You need to convince me why I should bother supporting
> this device in my driver, which means using clean abstractions and extending
> the code we already have. If this isn't possible, then perhaps you'll
> consider following the architecture next time around.
> 
> Will
> 
> .
> 
Yeah, I konw. I also a software engineer. A victim of this SMMU.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-06-19  1:58 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-12  5:08 [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Zhen Lei
2014-06-12  5:08 ` [PATCH RFC v2 1/3] iommu/arm: Adjust code to facilitate support arm smmu variants Zhen Lei
2014-06-12  5:08 ` [PATCH RFC v2 2/3] iommu/hisilicon: Add support for Hisilicon Ltd. System MMU architecture Zhen Lei
2014-06-12  5:08 ` [PATCH RFC v2 3/3] documentation/iommu: Add description of Hisilicon SMMU private binding Zhen Lei
2014-06-16 16:39   ` Will Deacon
2014-06-17  7:50     ` leizhen
2014-06-17  9:11       ` Varun Sethi
2014-06-17  9:33       ` Will Deacon
2014-06-18  1:28         ` leizhen
2014-06-18 12:03           ` Varun Sethi
2014-06-18 12:34             ` leizhen
2014-06-18 13:32           ` Will Deacon
2014-06-19  1:58             ` leizhen
2014-06-16 16:37 ` [PATCH RFC v2 0/3] Add support for Hisilicon SMMU architecture Will Deacon
2014-06-17  6:32   ` leizhen
2014-06-17  9:27     ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.