* [PATCH v9 00/14] enable HiP04 SoC
@ 2014-05-20 13:10 Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 01/14] ARM: debug: add HiP04 debug uart Haojian Zhuang
                   ` (13 more replies)
  0 siblings, 14 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC
  To: linux-arm-kernel

Changelog:
v9:
  * Remove delay workaround in mcpm implementation.
  * Clean up the gic driver.
  * Rename vgic_cpu_nr_lr to vgic_cpu_hw_cfg in vgic driver.
  * Always use the high word of vgic_cpu_hw_cfg for the GICH_APR offset, so
    the arm64 implementation is updated as well.
  * Drop "irq: gic: use mask field in GICC_IAR" patch since it's merged.

v7:
  * Remove hip04_smp_init_ops().
  * Remove CONFIG_ARCH_HIP04 in hisilicon.c since hip04_smp_init_ops() is
    removed.

v6:
  * Split the HiP04 enabling patch into separate patches for documentation,
    mcpm & HiP04.
  * Move reset operation in HiP04 MCPM implementation.
  * Remove ARCH_MULTI_V7_NONLPAE & ARCH_MULTI_V7_LPAE according to Olof's
    comment.

v5:
  * Add ARCH_MULTI_V7_NONLPAE to avoid changing too many things in Kconfig.
  * Use memreserve in DTS.
  * Remove L2 reset operation in mcpm implementation.
  * Re-use nr_lr field to cover HIP04 GICH_APR implementation.
  * Add more comments.

v4:
  * Add multi_v7_lpae_defconfig.
  * Select CONFIG_ARCH_MULTI_V7_LPAE if CONFIG_ARCH_MULTI_V7 is selected.
  * Only ARMADA_XP is ARCH_MULTI_V7_LPAE, other ARMADA chips are ARCH_MULTI_V7.
  * Remove gich_lr0 variable since we can calculate offset of GICH_LR0 from
    GICH_APR.
  * Cleanup GIC driver to support HiP04 GIC.
  * Cleanup HiP04 mcpm implementation.

v3:
  * Replace CONFIG_ARCH_MULTI_V7 by CONFIG_ARCH_MULTI_V7_LPAE in some SoC.
  * Update MCPM code based on Dave's patch.
  * Remove MCPM node in DTS file. Use sysctrl & fabric node instead.
  * Move hardcoding on bootwrapper into DTS file.
  * Append the CONFIG_MCPM_QUAD_CLUSTER for HiP04.
  * Fix the return value from gic_get_cpumask() if it's used in standard gic.
  * Add the vgic support on HiP04 GIC.
  * Add virtualization support in HiP04 defconfig.

v2:
  * Append ARCH_MULTI_V7_LPAE configuration. Define ARCH_MULTI_V7 to only
    support non-LPAE platform.
  * Add documentation for the DT binding.
  * Append ARCH_HISI in hi3xxx_defconfig.
  * Enable errata 798181 for HiP04 SoC.
  * Add PMU support.

Haojian Zhuang (13):
  ARM: debug: add HiP04 debug uart
  irq: gic: support hip04 gic
  ARM: mcpm: support 4 clusters
  ARM: hisi: add ARCH_HISI
  ARM: hisi: enable MCPM implementation
  ARM: hisi: enable HiP04
  document: dt: add the binding on HiP04
  document: dt: add the binding on HiP04 clock
  ARM: dts: append hip04 dts
  ARM: config: append lpae configuration
  ARM: config: append hip04_defconfig
  ARM: config: select ARCH_HISI in hi3xxx_defconfig
  virt: arm: support hip04 gic

Kefeng Wang (1):
  ARM: hisi: enable erratum 798181 of A15 on HiP04

 Documentation/devicetree/bindings/arm/gic.txt      |   1 +
 .../bindings/arm/hisilicon/hisilicon.txt           |  21 ++
 .../devicetree/bindings/clock/hip04-clock.txt      |  20 ++
 arch/arm/Kconfig                                   |   9 +
 arch/arm/Kconfig.debug                             |  10 +
 arch/arm/Makefile                                  |   2 +-
 arch/arm/boot/dts/Makefile                         |   1 +
 arch/arm/boot/dts/hip04-d01.dts                    |  39 +++
 arch/arm/boot/dts/hip04.dtsi                       | 260 +++++++++++++++
 arch/arm/configs/hi3xxx_defconfig                  |   2 +
 arch/arm/configs/hip04_defconfig                   |  74 +++++
 arch/arm/configs/multi_v7_lpae_defconfig           | 351 +++++++++++++++++++++
 arch/arm/include/asm/mcpm.h                        |   5 +
 arch/arm/kernel/asm-offsets.c                      |   2 +-
 arch/arm/kvm/interrupts_head.S                     |  29 +-
 arch/arm/mach-hisi/Kconfig                         |  27 +-
 arch/arm/mach-hisi/Makefile                        |   1 +
 arch/arm/mach-hisi/hisilicon.c                     |   9 +
 arch/arm/mach-hisi/platmcpm.c                      | 309 ++++++++++++++++++
 arch/arm64/kernel/asm-offsets.c                    |   2 +-
 arch/arm64/kvm/hyp.S                               |  28 +-
 drivers/irqchip/irq-gic.c                          | 159 +++++++---
 include/kvm/arm_vgic.h                             |   7 +-
 include/linux/irqchip/arm-gic.h                    |   6 +
 virt/kvm/arm/vgic.c                                |  48 ++-
 25 files changed, 1352 insertions(+), 70 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/clock/hip04-clock.txt
 create mode 100644 arch/arm/boot/dts/hip04-d01.dts
 create mode 100644 arch/arm/boot/dts/hip04.dtsi
 create mode 100644 arch/arm/configs/hip04_defconfig
 create mode 100644 arch/arm/configs/multi_v7_lpae_defconfig
 create mode 100644 arch/arm/mach-hisi/platmcpm.c

-- 
1.9.1


* [PATCH v9 01/14] ARM: debug: add HiP04 debug uart
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 02/14] irq: gic: support hip04 gic Haojian Zhuang
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC
  To: linux-arm-kernel

Add support for the Hisilicon HiP04 debug UART.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/Kconfig.debug | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 4a2fc0b..5a311af 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -223,6 +223,14 @@ choice
 		  Say Y here if you want kernel low-level debugging support
 		  on HI3716 UART.
 
+	config DEBUG_HIP04_UART
+		bool "Hisilicon HiP04 Debug UART"
+		depends on ARCH_HIP04
+		select DEBUG_UART_8250
+		help
+		  Say Y here if you want kernel low-level debugging support
+		  on HIP04 UART.
+
 	config DEBUG_HIGHBANK_UART
 		bool "Kernel low-level debugging messages via Highbank UART"
 		depends on ARCH_HIGHBANK
@@ -1044,6 +1052,7 @@ config DEBUG_UART_PHYS
 	default 0xd4017000 if DEBUG_MMP_UART2
 	default 0xd4018000 if DEBUG_MMP_UART3
 	default 0xe0000000 if ARCH_SPEAR13XX
+	default 0xe4007000 if DEBUG_HIP04_UART
 	default 0xf0000be0 if ARCH_EBSA110
 	default 0xf1012000 if DEBUG_MVEBU_UART_ALTERNATE
 	default 0xf1012000 if ARCH_DOVE || ARCH_KIRKWOOD || ARCH_MV78XX0 || \
@@ -1076,6 +1085,7 @@ config DEBUG_UART_VIRT
 	default 0xf4090000 if ARCH_LPC32XX
 	default 0xf4200000 if ARCH_GEMINI
 	default 0xf7fc9000 if DEBUG_BERLIN_UART
+	default 0xf8007000 if DEBUG_HIP04_UART
 	default 0xf8009000 if DEBUG_VEXPRESS_UART0_CA9
 	default 0xf8090000 if DEBUG_VEXPRESS_UART0_RS1
 	default 0xfb009000 if DEBUG_REALVIEW_STD_PORT
-- 
1.9.1


* [PATCH v9 02/14] irq: gic: support hip04 gic
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 01/14] ARM: debug: add HiP04 debug uart Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-21 10:15   ` Marc Zyngier
  2014-06-21  1:54   ` Jason Cooper
  2014-05-20 13:10 ` [PATCH v9 03/14] ARM: mcpm: support 4 clusters Haojian Zhuang
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC
  To: linux-arm-kernel

The HiP04 GIC differs from the standard ARM GIC in two ways.

* The HiP04 GIC supports at most 16 cores, while the ARM GIC supports at
most 8. The layout of the GIC_DIST_TARGET registers therefore differs,
since the CPU interface field for each interrupt grows from 8 bits to
16 bits.

* The HiP04 GIC supports at most 510 interrupts, while the ARM GIC
supports at most 1020.
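
As an illustration only (not part of the patch), the effect of the wider
CPU interface field on GIC_DIST_TARGET addressing can be modelled in plain
userspace C. The helper names below are hypothetical; the arithmetic
mirrors irq_to_target_reg() and irq_to_core_shift() from the diff, and
0x800 is assumed to be the GIC_DIST_TARGET offset as defined in
include/linux/irqchip/arm-gic.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed distributor offset of the target register bank. */
#define GIC_DIST_TARGET 0x800u

/* Byte offset of the 32-bit target register that holds interrupt irq. */
static uint32_t target_reg_offset(uint32_t nr_cpu_if, uint32_t irq)
{
	/* 8 CPU interfaces on the ARM GIC, 16 on the HiP04 GIC. */
	assert(nr_cpu_if == 8 || nr_cpu_if == 16);

	/* 4 interrupts per register with 8-bit fields, 2 with 16-bit. */
	uint32_t irqs_per_reg = 32 / nr_cpu_if;

	return GIC_DIST_TARGET + (irq / irqs_per_reg) * 4;
}

/* Bit position of the CPU target field for irq inside that register. */
static uint32_t target_field_shift(uint32_t nr_cpu_if, uint32_t irq)
{
	uint32_t irqs_per_reg = 32 / nr_cpu_if;

	return (irq % irqs_per_reg) * nr_cpu_if;
}

/* Mask covering the CPU target field for irq inside that register. */
static uint32_t target_field_mask(uint32_t nr_cpu_if, uint32_t irq)
{
	return ((1u << nr_cpu_if) - 1u) << target_field_shift(nr_cpu_if, irq);
}
```

For interrupt 33, the 8-interface layout gives register offset 0x820 with
field mask 0x0000ff00, while the 16-interface layout gives register offset
0x840 with field mask 0xffff0000.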

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 Documentation/devicetree/bindings/arm/gic.txt |   1 +
 drivers/irqchip/irq-gic.c                     | 159 ++++++++++++++++++++------
 2 files changed, 124 insertions(+), 36 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
index 5573c08..150f7d6 100644
--- a/Documentation/devicetree/bindings/arm/gic.txt
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -16,6 +16,7 @@ Main node required properties:
 	"arm,cortex-a9-gic"
 	"arm,cortex-a7-gic"
 	"arm,arm11mp-gic"
+	"hisilicon,hip04-gic"
 - interrupt-controller : Identifies the node as an interrupt controller
 - #interrupt-cells : Specifies the number of cells needed to encode an
   interrupt source.  The type shall be a <u32> and the value shall be 3.
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index f711fb6..64af475 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -68,6 +68,7 @@ struct gic_chip_data {
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
+	u32 nr_cpu_if;
 };
 
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
@@ -76,9 +77,11 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
  * by the GIC itself.
+ *
+ * Hisilicon HiP04 extends the number of CPU interface from 8 to 16.
  */
-#define NR_GIC_CPU_IF 8
-static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
+#define NR_GIC_CPU_IF	16
+static u16 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
 
 /*
  * Supported arch specific GIC irq extension.
@@ -241,20 +244,62 @@ static int gic_retrigger(struct irq_data *d)
 	return 0;
 }
 
+static bool gic_is_standard(struct gic_chip_data *gic_data)
+{
+	return (gic_data->nr_cpu_if == 8);
+}
+
+static u32 irqs_per_target_reg(struct gic_chip_data *gic_data)
+{
+	return (32 / gic_data->nr_cpu_if);
+}
+
+/* i is the index of interrupt */
+static u32 irq_to_target_reg(struct gic_chip_data *gic_data, u32 i)
+{
+	if (gic_is_standard(gic_data))
+		i = i & ~3U;
+	else
+		i = (i << 1) & ~3U;
+	return (i + GIC_DIST_TARGET);
+}
+
 #ifdef CONFIG_SMP
+static u32 irq_to_core_shift(struct irq_data *d)
+{
+	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
+	unsigned int i = gic_irq(d);
+
+	if (gic_is_standard(gic_data))
+		return ((i % 4) << 3);
+	return ((i % 2) << 4);
+}
+
+static u32 irq_to_core_mask(struct irq_data *d)
+{
+	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
+	u32 mask;
+	/* ARM GIC, nr_cpu_if == 8; HiP04 GIC, nr_cpu_if == 16 */
+	mask = (1 << gic_data->nr_cpu_if) - 1;
+	return (mask << irq_to_core_shift(d));
+}
+
 static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 			    bool force)
 {
-	void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
-	unsigned int shift = (gic_irq(d) % 4) * 8;
+	void __iomem *reg;
+	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
+	unsigned int shift = irq_to_core_shift(d);
 	unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
 	u32 val, mask, bit;
 
-	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
+	if (cpu >= gic_data->nr_cpu_if || cpu >= nr_cpu_ids)
 		return -EINVAL;
 
+	reg = gic_dist_base(d) + irq_to_target_reg(gic_data, gic_irq(d));
+
 	raw_spin_lock(&irq_controller_lock);
-	mask = 0xff << shift;
+	mask = irq_to_core_mask(d);
 	bit = gic_cpu_map[cpu] << shift;
 	val = readl_relaxed(reg) & ~mask;
 	writel_relaxed(val | bit, reg);
@@ -354,15 +399,20 @@ void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 	irq_set_chained_handler(irq, gic_handle_cascade_irq);
 }
 
-static u8 gic_get_cpumask(struct gic_chip_data *gic)
+static u16 gic_get_cpumask(struct gic_chip_data *gic)
 {
 	void __iomem *base = gic_data_dist_base(gic);
 	u32 mask, i;
 
-	for (i = mask = 0; i < 32; i += 4) {
-		mask = readl_relaxed(base + GIC_DIST_TARGET + i);
+	/*
+	 * ARM GIC uses 8 registers for interrupt 0-31,
+	 * HiP04 GIC uses 16 registers for interrupt 0-31.
+	 */
+	for (i = mask = 0; i < 32; i++) {
+		mask = readl_relaxed(base + irq_to_target_reg(gic, i));
 		mask |= mask >> 16;
-		mask |= mask >> 8;
+		if (gic_is_standard(gic))
+			mask |= mask >> 8;
 		if (mask)
 			break;
 	}
@@ -370,6 +420,10 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 	if (!mask)
 		pr_crit("GIC CPU mask not found - kernel will fail to boot.\n");
 
+	/* ARM GIC needs 8-bit cpu mask, HiP04 GIC needs 16-bit cpu mask. */
+	if (gic_is_standard(gic))
+		mask &= 0xff;
+
 	return mask;
 }
 
@@ -392,10 +446,11 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 	 * Set all global interrupts to this CPU only.
 	 */
 	cpumask = gic_get_cpumask(gic);
-	cpumask |= cpumask << 8;
+	if (gic_is_standard(gic))
+		cpumask |= cpumask << 8;
 	cpumask |= cpumask << 16;
-	for (i = 32; i < gic_irqs; i += 4)
-		writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
+	for (i = 32; i < gic_irqs; i++)
+		writel_relaxed(cpumask, base + irq_to_target_reg(gic, i));
 
 	/*
 	 * Set priority on all global interrupts.
@@ -423,7 +478,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	/*
 	 * Get what the GIC says our CPU mask is.
 	 */
-	BUG_ON(cpu >= NR_GIC_CPU_IF);
+	BUG_ON(cpu >= gic->nr_cpu_if);
 	cpu_mask = gic_get_cpumask(gic);
 	gic_cpu_map[cpu] = cpu_mask;
 
@@ -431,7 +486,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	 * Clear our mask from the other map entries in case they're
 	 * still undefined.
 	 */
-	for (i = 0; i < NR_GIC_CPU_IF; i++)
+	for (i = 0; i < gic->nr_cpu_if; i++)
 		if (i != cpu)
 			gic_cpu_map[i] &= ~cpu_mask;
 
@@ -467,7 +522,7 @@ void gic_cpu_if_down(void)
  */
 static void gic_dist_save(unsigned int gic_nr)
 {
-	unsigned int gic_irqs;
+	unsigned int gic_irqs, target_reg = 0;
 	void __iomem *dist_base;
 	int i;
 
@@ -484,9 +539,11 @@ static void gic_dist_save(unsigned int gic_nr)
 		gic_data[gic_nr].saved_spi_conf[i] =
 			readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
 
-	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+	for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
+		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
 		gic_data[gic_nr].saved_spi_target[i] =
-			readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
+			readl_relaxed(dist_base + target_reg);
+	}
 
 	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
 		gic_data[gic_nr].saved_spi_enable[i] =
@@ -502,7 +559,7 @@ static void gic_dist_save(unsigned int gic_nr)
  */
 static void gic_dist_restore(unsigned int gic_nr)
 {
-	unsigned int gic_irqs;
+	unsigned int gic_irqs, target_reg = 0;
 	unsigned int i;
 	void __iomem *dist_base;
 
@@ -525,9 +582,11 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(0xa0a0a0a0,
 			dist_base + GIC_DIST_PRI + i * 4);
 
-	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+	for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
+		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
 		writel_relaxed(gic_data[gic_nr].saved_spi_target[i],
-			dist_base + GIC_DIST_TARGET + i * 4);
+			dist_base + target_reg);
+	}
 
 	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
@@ -665,9 +724,19 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
-
+	/*
+	 * CPUTargetList -- bit[23:16] in GIC_DIST_SOFTINT in ARM GIC.
+	 *                  bit[23:8] in GIC_DIST_SOFTINT in HiP04 GIC.
+	 * NSATT -- bit[15] in GIC_DIST_SOFTINT in ARM GIC.
+	 *          bit[7] in GIC_DIST_SOFTINT in HiP04 GIC.
+	 * this always happens on GIC0
+	 */
+	if (gic_is_standard(&gic_data[0]))
+		map = map << 16;
+	else
+		map = map << 8;
+	writel_relaxed(map | irq,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
 }
 #endif
@@ -681,10 +750,15 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
  */
 void gic_send_sgi(unsigned int cpu_id, unsigned int irq)
 {
-	BUG_ON(cpu_id >= NR_GIC_CPU_IF);
+	BUG_ON(cpu_id >= gic_data[0].nr_cpu_if);
 	cpu_id = 1 << cpu_id;
 	/* this always happens on GIC0 */
-	writel_relaxed((cpu_id << 16) | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	if (gic_is_standard(&gic_data[0]))
+		cpu_id = cpu_id << 16;
+	else
+		cpu_id = cpu_id << 8;
+	writel_relaxed(cpu_id | irq,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 }
 
 /*
@@ -700,7 +774,7 @@ int gic_get_cpu_id(unsigned int cpu)
 {
 	unsigned int cpu_bit;
 
-	if (cpu >= NR_GIC_CPU_IF)
+	if (cpu >= gic_data[0].nr_cpu_if)
 		return -1;
 	cpu_bit = gic_cpu_map[cpu];
 	if (cpu_bit & (cpu_bit - 1))
@@ -747,13 +821,14 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * CPU interface and migrate them to the new CPU interface.
 	 * We skip DIST_TARGET 0 to 7 as they are read-only.
 	 */
-	for (i = 8; i < DIV_ROUND_UP(gic_irqs, 4); i++) {
-		val = readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
+	for (i = 8; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
+		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
+		val = readl_relaxed(dist_base + target_reg);
 		active_mask = val & cur_target_mask;
 		if (active_mask) {
 			val &= ~active_mask;
 			val |= ror32(active_mask, ror_val);
-			writel_relaxed(val, dist_base + GIC_DIST_TARGET + i*4);
+			writel_relaxed(val, dist_base + target_reg);
 		}
 	}
 
@@ -931,7 +1006,7 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
 	irq_hw_number_t hwirq_base;
 	struct gic_chip_data *gic;
 	int gic_irqs, irq_base, i;
-	int nr_routable_irqs;
+	int nr_routable_irqs, max_nr_irq;
 
 	BUG_ON(gic_nr >= MAX_GIC_NR);
 
@@ -967,12 +1042,22 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
 		gic_set_base_accessor(gic, gic_get_common_base);
 	}
 
+	if (of_device_is_compatible(node, "hisilicon,hip04-gic")) {
+		/* HiP04 GIC supports 16 CPUs at most */
+		gic->nr_cpu_if = 16;
+		max_nr_irq = 510;
+	} else {
+		/* ARM/Qualcomm GIC supports 8 CPUs at most */
+		gic->nr_cpu_if = 8;
+		max_nr_irq = 1020;
+	}
+
 	/*
 	 * Initialize the CPU interface map to all CPUs.
 	 * It will be refined as each CPU probes its ID.
 	 */
-	for (i = 0; i < NR_GIC_CPU_IF; i++)
-		gic_cpu_map[i] = 0xff;
+	for (i = 0; i < gic->nr_cpu_if; i++)
+		gic_cpu_map[i] = (1 << gic->nr_cpu_if) - 1;
 
 	/*
 	 * For primary GICs, skip over SGIs.
@@ -988,12 +1073,13 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
 
 	/*
 	 * Find out how many interrupts are supported.
-	 * The GIC only supports up to 1020 interrupt sources.
+	 * The ARM/Qualcomm GIC only supports up to 1020 interrupt sources.
+	 * The HiP04 GIC only supports up to 510 interrupt sources.
 	 */
 	gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;
 	gic_irqs = (gic_irqs + 1) * 32;
-	if (gic_irqs > 1020)
-		gic_irqs = 1020;
+	if (gic_irqs > max_nr_irq)
+		gic_irqs = max_nr_irq;
 	gic->gic_irqs = gic_irqs;
 
 	gic_irqs -= hwirq_base; /* calculate # of irqs to allocate */
@@ -1069,6 +1155,7 @@ gic_of_init(struct device_node *node, struct device_node *parent)
 }
 IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);
 IRQCHIP_DECLARE(cortex_a9_gic, "arm,cortex-a9-gic", gic_of_init);
+IRQCHIP_DECLARE(hip04_gic, "hisilicon,hip04-gic", gic_of_init);
 IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
 IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);
 
-- 
1.9.1


* [PATCH v9 03/14] ARM: mcpm: support 4 clusters
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 01/14] ARM: debug: add HiP04 debug uart Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 02/14] irq: gic: support hip04 gic Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 04/14] ARM: hisi: add ARCH_HISI Haojian Zhuang
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC
  To: linux-arm-kernel

Add the CONFIG_MCPM_QUAD_CLUSTER option to raise the maximum number of
clusters from 2 to 4.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
---
 arch/arm/Kconfig            | 9 +++++++++
 arch/arm/include/asm/mcpm.h | 5 +++++
 2 files changed, 14 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index ab438cb..6ce4a49 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1564,6 +1564,15 @@ config MCPM
 	  for (multi-)cluster based systems, such as big.LITTLE based
 	  systems.
 
+config MCPM_QUAD_CLUSTER
+	bool
+	depends on MCPM
+	help
+	  To avoid wasting resources unnecessarily, MCPM only supports up
+	  to 2 clusters by default.
+	  Platforms with 3 or 4 clusters that use MCPM must select this
+	  option to allow the additional clusters to be managed.
+
 config BIG_LITTLE
 	bool "big.LITTLE support (Experimental)"
 	depends on CPU_V7 && SMP
diff --git a/arch/arm/include/asm/mcpm.h b/arch/arm/include/asm/mcpm.h
index 608516e..fc8d70d 100644
--- a/arch/arm/include/asm/mcpm.h
+++ b/arch/arm/include/asm/mcpm.h
@@ -20,7 +20,12 @@
  * to consider dynamic allocation.
  */
 #define MAX_CPUS_PER_CLUSTER	4
+
+#ifdef CONFIG_MCPM_QUAD_CLUSTER
+#define MAX_NR_CLUSTERS		4
+#else
 #define MAX_NR_CLUSTERS		2
+#endif
 
 #ifndef __ASSEMBLY__
 
-- 
1.9.1


* [PATCH v9 04/14] ARM: hisi: add ARCH_HISI
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (2 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 03/14] ARM: mcpm: support 4 clusters Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 05/14] ARM: hisi: enable MCPM implementation Haojian Zhuang
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC
  To: linux-arm-kernel

Since multiple ARCH configurations will be added to the mach-hisi
directory, add ARCH_HISI as the common configuration shared by the
different ARCH options in mach-hisi.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/Makefile          |  2 +-
 arch/arm/mach-hisi/Kconfig | 16 ++++++++++++++--
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 41c1931..4c2798a 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -154,7 +154,7 @@ machine-$(CONFIG_ARCH_EP93XX)		+= ep93xx
 machine-$(CONFIG_ARCH_EXYNOS)		+= exynos
 machine-$(CONFIG_ARCH_GEMINI)		+= gemini
 machine-$(CONFIG_ARCH_HIGHBANK)		+= highbank
-machine-$(CONFIG_ARCH_HI3xxx)		+= hisi
+machine-$(CONFIG_ARCH_HISI)		+= hisi
 machine-$(CONFIG_ARCH_INTEGRATOR)	+= integrator
 machine-$(CONFIG_ARCH_IOP13XX)		+= iop13xx
 machine-$(CONFIG_ARCH_IOP32X)		+= iop32x
diff --git a/arch/arm/mach-hisi/Kconfig b/arch/arm/mach-hisi/Kconfig
index feee4db..da16efd 100644
--- a/arch/arm/mach-hisi/Kconfig
+++ b/arch/arm/mach-hisi/Kconfig
@@ -1,8 +1,16 @@
-config ARCH_HI3xxx
-	bool "Hisilicon Hi36xx/Hi37xx family" if ARCH_MULTI_V7
+config ARCH_HISI
+	bool "Hisilicon SoC Support"
+	depends on ARCH_MULTIPLATFORM
 	select ARM_AMBA
 	select ARM_GIC
 	select ARM_TIMER_SP804
+
+if ARCH_HISI
+
+menu "Hisilicon platform type"
+
+config ARCH_HI3xxx
+	bool "Hisilicon Hi36xx/Hi37xx family" if ARCH_MULTI_V7
 	select CACHE_L2X0
 	select HAVE_ARM_SCU if SMP
 	select HAVE_ARM_TWD if SMP
@@ -10,3 +18,7 @@ config ARCH_HI3xxx
 	select PINCTRL_SINGLE
 	help
 	  Support for Hisilicon Hi36xx/Hi37xx processor family
+
+endmenu
+
+endif
-- 
1.9.1


* [PATCH v9 05/14] ARM: hisi: enable MCPM implementation
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (3 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 04/14] ARM: hisi: add ARCH_HISI Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-21  1:29   ` Nicolas Pitre
  2014-05-20 13:10 ` [PATCH v9 06/14] ARM: hisi: enable HiP04 Haojian Zhuang
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC
  To: linux-arm-kernel

The Hisilicon HiP04 SoC contains multiple CPU clusters. Use the MCPM
framework to manage power on the HiP04 SoC.
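
As an illustration only (not part of the patch), the per-CPU use counting
that hip04_mcpm_power_up() and hip04_mcpm_power_down() perform on
hip04_cpu_table can be modelled in plain userspace C. The names below are
hypothetical, and locking and all register accesses are left out; the
sketch only shows when the hardware reset lines would need to change.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_CLUSTERS		4
#define MODEL_MAX_CPUS_PER_CLUSTER	4

/* Use count per CPU: 0 means down, 1 means up, 2 means a pending power_up. */
static int model_use_count[MODEL_MAX_CLUSTERS][MODEL_MAX_CPUS_PER_CLUSTER];

/* Returns true when the caller would have to release the core from reset. */
static bool model_power_up(unsigned int cpu, unsigned int cluster)
{
	assert(cluster < MODEL_MAX_CLUSTERS &&
	       cpu < MODEL_MAX_CPUS_PER_CLUSTER);

	/* Only the 0 -> 1 transition touches the hardware. */
	return model_use_count[cluster][cpu]++ == 0;
}

/* Returns true when the caller would have to put the core into reset. */
static bool model_power_down(unsigned int cpu, unsigned int cluster)
{
	assert(cluster < MODEL_MAX_CLUSTERS &&
	       cpu < MODEL_MAX_CPUS_PER_CLUSTER);
	assert(model_use_count[cluster][cpu] > 0);

	model_use_count[cluster][cpu]--;

	/* A count of 1 here means a power_up request raced ahead of us. */
	return model_use_count[cluster][cpu] == 0;
}
```

In this model a power_up request that arrives while the CPU is still on
its way down leaves the count at 1 after the decrement, so the reset is
skipped, matching the skip_reset path in the patch.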

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/mach-hisi/Makefile   |   1 +
 arch/arm/mach-hisi/platmcpm.c | 310 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 311 insertions(+)
 create mode 100644 arch/arm/mach-hisi/platmcpm.c

diff --git a/arch/arm/mach-hisi/Makefile b/arch/arm/mach-hisi/Makefile
index 2ae1b59..e7a8640 100644
--- a/arch/arm/mach-hisi/Makefile
+++ b/arch/arm/mach-hisi/Makefile
@@ -3,4 +3,5 @@
 #
 
 obj-y	+= hisilicon.o
+obj-$(CONFIG_MCPM)		+= platmcpm.o
 obj-$(CONFIG_SMP)		+= platsmp.o hotplug.o
diff --git a/arch/arm/mach-hisi/platmcpm.c b/arch/arm/mach-hisi/platmcpm.c
new file mode 100644
index 0000000..b991e82
--- /dev/null
+++ b/arch/arm/mach-hisi/platmcpm.c
@@ -0,0 +1,310 @@
+/*
+ * Copyright (c) 2013-2014 Linaro Ltd.
+ * Copyright (c) 2013-2014 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ */
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+#include <asm/mcpm.h>
+
+#include "core.h"
+
+/* bits definition in SC_CPU_RESET_REQ[x]/SC_CPU_RESET_DREQ[x]
+ * 1 -- unreset; 0 -- reset
+ */
+#define CORE_RESET_BIT(x)		(1 << x)
+#define NEON_RESET_BIT(x)		(1 << (x + 4))
+#define CORE_DEBUG_RESET_BIT(x)		(1 << (x + 9))
+#define CLUSTER_L2_RESET_BIT		(1 << 8)
+#define CLUSTER_DEBUG_RESET_BIT		(1 << 13)
+
+/*
+ * bits definition in SC_CPU_RESET_STATUS[x]
+ * 1 -- reset status; 0 -- unreset status
+ */
+#define CORE_RESET_STATUS(x)		(1 << x)
+#define NEON_RESET_STATUS(x)		(1 << (x + 4))
+#define CORE_DEBUG_RESET_STATUS(x)	(1 << (x + 9))
+#define CLUSTER_L2_RESET_STATUS		(1 << 8)
+#define CLUSTER_DEBUG_RESET_STATUS	(1 << 13)
+#define CORE_WFI_STATUS(x)		(1 << (x + 16))
+#define CORE_WFE_STATUS(x)		(1 << (x + 20))
+#define CORE_DEBUG_ACK(x)		(1 << (x + 24))
+
+#define SC_CPU_RESET_REQ(x)		(0x520 + (x << 3))	/* reset */
+#define SC_CPU_RESET_DREQ(x)		(0x524 + (x << 3))	/* unreset */
+#define SC_CPU_RESET_STATUS(x)		(0x1520 + (x << 3))
+
+#define FAB_SF_MODE			0x0c
+#define FAB_SF_INVLD			0x10
+
+/* bits definition in FB_SF_INVLD */
+#define FB_SF_INVLD_START		(1 << 8)
+
+#define HIP04_MAX_CLUSTERS		4
+#define HIP04_MAX_CPUS_PER_CLUSTER	4
+
+#define POLL_MSEC	10
+#define TIMEOUT_MSEC	1000
+
+struct hip04_secondary_cpu_data {
+	u32	bootwrapper_phys;
+	u32	bootwrapper_size;
+	u32	bootwrapper_magic;
+	u32	relocation_entry;
+	u32	relocation_size;
+};
+
+static void __iomem *relocation, *sysctrl, *fabric;
+static int hip04_cpu_table[HIP04_MAX_CLUSTERS][HIP04_MAX_CPUS_PER_CLUSTER];
+static DEFINE_SPINLOCK(boot_lock);
+static struct hip04_secondary_cpu_data hip04_boot;
+
+static bool hip04_cluster_down(unsigned int cluster)
+{
+	int i;
+
+	for (i = 0; i < HIP04_MAX_CPUS_PER_CLUSTER; i++)
+		if (hip04_cpu_table[cluster][i])
+			return false;
+	return true;
+}
+
+static void hip04_set_snoop_filter(unsigned int cluster, unsigned int on)
+{
+	unsigned long data;
+
+	if (!fabric)
+		BUG();
+	data = readl_relaxed(fabric + FAB_SF_MODE);
+	if (on)
+		data |= 1 << cluster;
+	else
+		data &= ~(1 << cluster);
+	writel_relaxed(data, fabric + FAB_SF_MODE);
+	while (1) {
+		if (data == readl_relaxed(fabric + FAB_SF_MODE))
+			break;
+	}
+}
+
+static int hip04_mcpm_power_up(unsigned int cpu, unsigned int cluster)
+{
+	unsigned long data, mask;
+
+	if (!relocation || !sysctrl)
+		return -ENODEV;
+	if (cluster >= HIP04_MAX_CLUSTERS || cpu >= HIP04_MAX_CPUS_PER_CLUSTER)
+		return -EINVAL;
+
+	spin_lock_irq(&boot_lock);
+
+	if (hip04_cpu_table[cluster][cpu]) {
+		hip04_cpu_table[cluster][cpu]++;
+		spin_unlock_irq(&boot_lock);
+		return 0;
+	}
+
+	writel_relaxed(hip04_boot.bootwrapper_phys, relocation);
+	writel_relaxed(hip04_boot.bootwrapper_magic, relocation + 4);
+	writel_relaxed(virt_to_phys(mcpm_entry_point), relocation + 8);
+	writel_relaxed(0, relocation + 12);
+
+	if (hip04_cluster_down(cluster)) {
+		data = CLUSTER_DEBUG_RESET_BIT;
+		writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
+		do {
+			mask = CLUSTER_DEBUG_RESET_STATUS;
+			data = readl_relaxed(sysctrl + \
+					     SC_CPU_RESET_STATUS(cluster));
+		} while (data & mask);
+		hip04_set_snoop_filter(cluster, 1);
+	}
+
+	hip04_cpu_table[cluster][cpu]++;
+
+	data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
+	       CORE_DEBUG_RESET_BIT(cpu);
+	writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
+	spin_unlock_irq(&boot_lock);
+	msleep(POLL_MSEC);
+
+	return 0;
+}
+
+static void hip04_mcpm_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster, data = 0;
+	bool skip_reset = false;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	__mcpm_cpu_going_down(cpu, cluster);
+
+	spin_lock(&boot_lock);
+	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
+	hip04_cpu_table[cluster][cpu]--;
+	if (hip04_cpu_table[cluster][cpu] == 1) {
+		/* A power_up request went ahead of us. */
+		skip_reset = true;
+	} else if (hip04_cpu_table[cluster][cpu] > 1) {
+		pr_err("Cluster %d CPU%d boots multiple times\n", cluster, cpu);
+		BUG();
+	}
+	spin_unlock(&boot_lock);
+
+	v7_exit_coherency_flush(louis);
+
+	__mcpm_cpu_down(cpu, cluster);
+
+	if (!skip_reset) {
+		data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
+		       CORE_DEBUG_RESET_BIT(cpu);
+		writel_relaxed(data, sysctrl + SC_CPU_RESET_REQ(cluster));
+	}
+}
+
+static int hip04_mcpm_wait_for_powerdown(unsigned int cpu, unsigned int cluster)
+{
+	unsigned int data, tries;
+
+	BUG_ON(cluster >= HIP04_MAX_CLUSTERS ||
+	       cpu >= HIP04_MAX_CPUS_PER_CLUSTER);
+
+	for (tries = 0; tries < TIMEOUT_MSEC / POLL_MSEC; tries++) {
+		data = readl_relaxed(sysctrl + SC_CPU_RESET_STATUS(cluster));
+		if (!(data & CORE_RESET_STATUS(cpu))) {
+			msleep(POLL_MSEC);
+			continue;
+		}
+		return 0;
+	}
+	return -ETIMEDOUT;
+}
+
+static void hip04_mcpm_powered_up(void)
+{
+	if (!relocation)
+		return;
+	spin_lock(&boot_lock);
+	writel_relaxed(0, relocation);
+	writel_relaxed(0, relocation + 4);
+	writel_relaxed(0, relocation + 8);
+	writel_relaxed(0, relocation + 12);
+	spin_unlock(&boot_lock);
+}
+
+static const struct mcpm_platform_ops hip04_mcpm_ops = {
+	.power_up		= hip04_mcpm_power_up,
+	.power_down		= hip04_mcpm_power_down,
+	.wait_for_powerdown	= hip04_mcpm_wait_for_powerdown,
+	.powered_up		= hip04_mcpm_powered_up,
+};
+
+static bool __init hip04_cpu_table_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	if (cluster >= HIP04_MAX_CLUSTERS ||
+	    cpu >= HIP04_MAX_CPUS_PER_CLUSTER) {
+		pr_err("%s: boot CPU is out of bound!\n", __func__);
+		return false;
+	}
+	hip04_set_snoop_filter(cluster, 1);
+	hip04_cpu_table[cluster][cpu] = 1;
+	return true;
+}
+
+static int __init hip04_mcpm_init(void)
+{
+	struct device_node *np, *np_fab;
+	int ret = -ENODEV;
+
+	np = of_find_compatible_node(NULL, NULL, "hisilicon,sysctrl");
+	if (!np)
+		goto err;
+	np_fab = of_find_compatible_node(NULL, NULL, "hisilicon,hip04-fabric");
+	if (!np_fab)
+		goto err;
+
+	if (of_property_read_u32(np, "bootwrapper-phys",
+				 &hip04_boot.bootwrapper_phys)) {
+		pr_err("failed to get bootwrapper-phys\n");
+		ret = -EINVAL;
+		goto err;
+	}
+	if (of_property_read_u32(np, "bootwrapper-size",
+				 &hip04_boot.bootwrapper_size)) {
+		pr_err("failed to get bootwrapper-size\n");
+		ret = -EINVAL;
+		goto err;
+	}
+	if (of_property_read_u32(np, "bootwrapper-magic",
+				 &hip04_boot.bootwrapper_magic)) {
+		pr_err("failed to get bootwrapper-magic\n");
+		ret = -EINVAL;
+		goto err;
+	}
+	if (of_property_read_u32(np, "relocation-entry",
+				 &hip04_boot.relocation_entry)) {
+		pr_err("failed to get relocation-entry\n");
+		ret = -EINVAL;
+		goto err;
+	}
+	if (of_property_read_u32(np, "relocation-size",
+				 &hip04_boot.relocation_size)) {
+		pr_err("failed to get relocation-size\n");
+		ret = -EINVAL;
+		goto err;
+	}
+
+	relocation = ioremap(hip04_boot.relocation_entry,
+			     hip04_boot.relocation_size);
+	if (!relocation) {
+		pr_err("failed to map relocation space\n");
+		ret = -ENOMEM;
+		goto err;
+	}
+	sysctrl = of_iomap(np, 0);
+	if (!sysctrl) {
+		pr_err("failed to get sysctrl base\n");
+		ret = -ENOMEM;
+		goto err_sysctrl;
+	}
+	fabric = of_iomap(np_fab, 0);
+	if (!fabric) {
+		pr_err("failed to get fabric base\n");
+		ret = -ENOMEM;
+		goto err_fabric;
+	}
+
+	if (!hip04_cpu_table_init())
+		return -EINVAL;
+	ret = mcpm_platform_register(&hip04_mcpm_ops);
+	if (!ret) {
+		mcpm_sync_init(NULL);
+		pr_info("HiP04 MCPM initialized\n");
+	}
+	mcpm_smp_set_ops();
+	return ret;
+err_fabric:
+	iounmap(sysctrl);
+err_sysctrl:
+	iounmap(relocation);
+err:
+	return ret;
+}
+early_initcall(hip04_mcpm_init);
-- 
1.9.1

* [PATCH v9 06/14] ARM: hisi: enable HiP04
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (4 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 05/14] ARM: hisi: enable MCPM implementation Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 07/14] document: dt: add the binding on HiP04 Haojian Zhuang
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Support the HiP04 SoC, which has 16 cores. It relies on the MCPM
framework.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/mach-hisi/Kconfig     | 10 +++++++++-
 arch/arm/mach-hisi/hisilicon.c |  9 +++++++++
 arch/arm/mach-hisi/platmcpm.c  |  1 -
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mach-hisi/Kconfig b/arch/arm/mach-hisi/Kconfig
index da16efd..3f1ece7 100644
--- a/arch/arm/mach-hisi/Kconfig
+++ b/arch/arm/mach-hisi/Kconfig
@@ -17,7 +17,15 @@ config ARCH_HI3xxx
 	select PINCTRL
 	select PINCTRL_SINGLE
 	help
-	  Support for Hisilicon Hi36xx/Hi37xx processor family
+	  Support for Hisilicon Hi36xx/Hi37xx SoC family
+
+config ARCH_HIP04
+	bool "Hisilicon HiP04 Cortex A15 family" if ARCH_MULTI_V7
+	select HAVE_ARM_ARCH_TIMER
+	select MCPM if SMP
+	select MCPM_QUAD_CLUSTER if SMP
+	help
+	  Support for Hisilicon HiP04 SoC family
 
 endmenu
 
diff --git a/arch/arm/mach-hisi/hisilicon.c b/arch/arm/mach-hisi/hisilicon.c
index 741faf3..a9f648f 100644
--- a/arch/arm/mach-hisi/hisilicon.c
+++ b/arch/arm/mach-hisi/hisilicon.c
@@ -88,3 +88,12 @@ DT_MACHINE_START(HI3620, "Hisilicon Hi3620 (Flattened Device Tree)")
 	.smp		= smp_ops(hi3xxx_smp_ops),
 	.restart	= hi3xxx_restart,
 MACHINE_END
+
+static const char *hip04_compat[] __initconst = {
+	"hisilicon,hip04-d01",
+	NULL,
+};
+
+DT_MACHINE_START(HIP04, "Hisilicon HiP04 (Flattened Device Tree)")
+	.dt_compat	= hip04_compat,
+MACHINE_END
diff --git a/arch/arm/mach-hisi/platmcpm.c b/arch/arm/mach-hisi/platmcpm.c
index b991e82..2bda12b 100644
--- a/arch/arm/mach-hisi/platmcpm.c
+++ b/arch/arm/mach-hisi/platmcpm.c
@@ -134,7 +134,6 @@ static int hip04_mcpm_power_up(unsigned int cpu, unsigned int cluster)
 	       CORE_DEBUG_RESET_BIT(cpu);
 	writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
 	spin_unlock_irq(&boot_lock);
-	msleep(POLL_MSEC);
 
 	return 0;
 }
-- 
1.9.1

* [PATCH v9 07/14] document: dt: add the binding on HiP04
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (5 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 06/14] ARM: hisi: enable HiP04 Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 08/14] document: dt: add the binding on HiP04 clock Haojian Zhuang
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Add bootwrapper-phys, bootwrapper-size, bootwrapper-magic properties for
Hisilicon HiP04 SoC.
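
The binding asks for the boot wrapper area to be covered by a /memreserve/
entry. A minimal sketch of that containment check follows, in plain C for
illustration only. The helper name bootwrapper_is_reserved() is hypothetical
and is not part of this patch; the example values in the note are the ones
used in the HiP04 DTS of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when the
 * boot wrapper region [phys, phys + size) lies entirely inside the
 * reserved region [res_base, res_base + res_size).
 */
static bool bootwrapper_is_reserved(uint64_t res_base, uint64_t res_size,
				    uint32_t phys, uint32_t size)
{
	uint64_t start = phys;
	uint64_t end = start + size;	/* 64-bit sum of two u32 values */

	return start >= res_base && end <= res_base + res_size;
}
```

With /memreserve/ 0x10c00000 0x00010000 and bootwrapper-phys = <0x10c00000>,
bootwrapper-size = <0x10000>, the check passes; a smaller reservation would
fail it.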

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 .../devicetree/bindings/arm/hisilicon/hisilicon.txt | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt b/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
index df0a452..5024992 100644
--- a/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
+++ b/Documentation/devicetree/bindings/arm/hisilicon/hisilicon.txt
@@ -4,6 +4,10 @@ Hisilicon Platforms Device Tree Bindings
 Hi4511 Board
 Required root node properties:
 	- compatible = "hisilicon,hi3620-hi4511";
+HiP04 D01 Board
+Required root node properties:
+	- compatible = "hisilicon,hip04-d01";
+
 
 Hisilicon system controller
 
@@ -19,6 +23,15 @@ Optional properties:
 		If reg value is not zero, cpun exit wfi and go
 - resume-offset : offset in sysctrl for notifying cpu0 when resume
 - reboot-offset : offset in sysctrl for system reboot
+- relocation-entry : relocation address of secondary cpu boot code
+- relocation-size : relocation size of secondary cpu boot code
+- bootwrapper-phys : physical address of boot wrapper
+- bootwrapper-size : size of boot wrapper
+- bootwrapper-magic : magic number for secondary cpu in boot wrapper
+The memory area [bootwrapper-phys : bootwrapper-phys+bootwrapper-size]
+should be reserved. Reserve it with a /memreserve/ entry in the DTS file.
+bootwrapper-phys, bootwrapper-size and bootwrapper-magic are used in the
+HiP04 DTS file.
 
 Example:
 
@@ -31,6 +44,7 @@ Example:
 		reboot-offset = <0x4>;
 	};
 
+
 PCTRL: Peripheral misc control register
 
 Required Properties:
@@ -44,3 +58,10 @@ Example:
 		compatible = "hisilicon,pctrl";
 		reg = <0xfca09000 0x1000>;
 	};
+
+
+Fabric:
+
+Required Properties:
+- compatible: "hisilicon,hip04-fabric";
+- reg: Address and size of Fabric
-- 
1.9.1

* [PATCH v9 08/14] document: dt: add the binding on HiP04 clock
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (6 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 07/14] document: dt: add the binding on HiP04 Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 09/14] ARM: dts: append hip04 dts Haojian Zhuang
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Add the DT binding for the Hisilicon HiP04 clock driver.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 .../devicetree/bindings/clock/hip04-clock.txt        | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/clock/hip04-clock.txt

diff --git a/Documentation/devicetree/bindings/clock/hip04-clock.txt b/Documentation/devicetree/bindings/clock/hip04-clock.txt
new file mode 100644
index 0000000..4d31ae3
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/hip04-clock.txt
@@ -0,0 +1,20 @@
+* Hisilicon HiP04 Clock Controller
+
+The HiP04 clock controller generates and supplies clocks to various
+controllers within the HiP04 SoC.
+
+Required Properties:
+
+- compatible: should be one of the following.
+  - "hisilicon,hip04-clock" - controller compatible with HiP04 SoC.
+
+- reg: physical base address of the controller and length of memory mapped
+  region.
+
+- #clock-cells: should be 1.
+
+
+Each clock is assigned an identifier and client nodes use this identifier
+to specify the clock which they consume.
+
+All these identifiers can be found in <dt-bindings/clock/hip04-clock.h>.
-- 
1.9.1

* [PATCH v9 09/14] ARM: dts: append hip04 dts
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (7 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 08/14] document: dt: add the binding on HiP04 clock Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 10/14] ARM: config: append lpae configuration Haojian Zhuang
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Add hip04-d01.dts & hip04.dtsi for hip04 SoC platform.
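
The cpu node reg values in hip04.dtsi follow the MPIDR layout: the cluster
number sits in affinity level 1 (bits [15:8]) and the core number in affinity
level 0 (bits [7:0]). A small sketch of that decoding follows, in plain C for
illustration. The helper names are hypothetical, and the "4 cores per cluster"
constant is taken from the cpu-map in this DTS.

```c
#include <stdint.h>

#define AFF_BITS		8u
#define AFF_MASK		((1u << AFF_BITS) - 1u)
#define CORES_PER_CLUSTER	4u	/* from the cpu-map in hip04.dtsi */

/* Core number within its cluster: affinity level 0. */
static unsigned int reg_to_cpu(uint32_t reg)
{
	return reg & AFF_MASK;
}

/* Cluster number: affinity level 1. */
static unsigned int reg_to_cluster(uint32_t reg)
{
	return (reg >> AFF_BITS) & AFF_MASK;
}

/* Index n of the matching CPUn label in the DTS. */
static unsigned int reg_to_index(uint32_t reg)
{
	return reg_to_cluster(reg) * CORES_PER_CLUSTER + reg_to_cpu(reg);
}
```

For example, reg = <0x203> decodes to cluster 2, core 3, which is the node
labelled CPU11.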

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/boot/dts/Makefile      |   1 +
 arch/arm/boot/dts/hip04-d01.dts |  39 ++++++
 arch/arm/boot/dts/hip04.dtsi    | 260 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 300 insertions(+)
 create mode 100644 arch/arm/boot/dts/hip04-d01.dts
 create mode 100644 arch/arm/boot/dts/hip04.dtsi

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 35c146f..7119bca 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -80,6 +80,7 @@ dtb-$(CONFIG_ARCH_EXYNOS) += exynos4210-origen.dtb \
 dtb-$(CONFIG_ARCH_HI3xxx) += hi3620-hi4511.dtb
 dtb-$(CONFIG_ARCH_HIGHBANK) += highbank.dtb \
 	ecx-2000.dtb
+dtb-$(CONFIG_ARCH_HIP04) += hip04-d01.dtb
 dtb-$(CONFIG_ARCH_INTEGRATOR) += integratorap.dtb \
 	integratorcp.dtb
 dtb-$(CONFIG_ARCH_KEYSTONE) += k2hk-evm.dtb \
diff --git a/arch/arm/boot/dts/hip04-d01.dts b/arch/arm/boot/dts/hip04-d01.dts
new file mode 100644
index 0000000..661c8e5
--- /dev/null
+++ b/arch/arm/boot/dts/hip04-d01.dts
@@ -0,0 +1,39 @@
+/*
+ *  Copyright (C) 2013-2014 Linaro Ltd.
+ *  Author: Haojian Zhuang <haojian.zhuang@linaro.org>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License version 2 as
+ *  published by the Free Software Foundation.
+ */
+
+/dts-v1/;
+
+/* For bootwrapper */
+/memreserve/ 0x10c00000 0x00010000;
+
+#include "hip04.dtsi"
+
+/ {
+	/* memory bus is 64-bit */
+	#address-cells = <2>;
+	#size-cells = <2>;
+	model = "Hisilicon D01 Development Board";
+	compatible = "hisilicon,hip04-d01";
+
+	memory@00000000,10000000 {
+		device_type = "memory";
+		reg = <0x00000000 0x10000000 0x00000000 0xc0000000>;
+	};
+
+	memory@00000004,c0000000 {
+		device_type = "memory";
+		reg = <0x00000004 0xc0000000 0x00000003 0x40000000>;
+	};
+
+	soc {
+		uart0: uart@4007000 {
+			status = "ok";
+		};
+	};
+};
diff --git a/arch/arm/boot/dts/hip04.dtsi b/arch/arm/boot/dts/hip04.dtsi
new file mode 100644
index 0000000..00a1ba2
--- /dev/null
+++ b/arch/arm/boot/dts/hip04.dtsi
@@ -0,0 +1,260 @@
+/*
+ * Hisilicon Ltd. HiP04 SoC
+ *
+ * Copyright (C) 2013-2014 Hisilicon Ltd.
+ * Copyright (C) 2013-2014 Linaro Ltd.
+ *
+ * Author: Haojian Zhuang <haojian.zhuang@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <dt-bindings/clock/hip04-clock.h>
+
+/ {
+	/* memory bus is 64-bit */
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	aliases {
+		serial0 = &uart0;
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpu-map {
+			cluster0 {
+				core0 {
+					cpu = <&CPU0>;
+				};
+				core1 {
+					cpu = <&CPU1>;
+				};
+				core2 {
+					cpu = <&CPU2>;
+				};
+				core3 {
+					cpu = <&CPU3>;
+				};
+			};
+			cluster1 {
+				core0 {
+					cpu = <&CPU4>;
+				};
+				core1 {
+					cpu = <&CPU5>;
+				};
+				core2 {
+					cpu = <&CPU6>;
+				};
+				core3 {
+					cpu = <&CPU7>;
+				};
+			};
+			cluster2 {
+				core0 {
+					cpu = <&CPU8>;
+				};
+				core1 {
+					cpu = <&CPU9>;
+				};
+				core2 {
+					cpu = <&CPU10>;
+				};
+				core3 {
+					cpu = <&CPU11>;
+				};
+			};
+			cluster3 {
+				core0 {
+					cpu = <&CPU12>;
+				};
+				core1 {
+					cpu = <&CPU13>;
+				};
+				core2 {
+					cpu = <&CPU14>;
+				};
+				core3 {
+					cpu = <&CPU15>;
+				};
+			};
+		};
+		CPU0: cpu@0 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0>;
+		};
+		CPU1: cpu@1 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <1>;
+		};
+		CPU2: cpu@2 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <2>;
+		};
+		CPU3: cpu@3 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <3>;
+		};
+		CPU4: cpu@100 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x100>;
+		};
+		CPU5: cpu@101 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x101>;
+		};
+		CPU6: cpu@102 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x102>;
+		};
+		CPU7: cpu@103 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x103>;
+		};
+		CPU8: cpu@200 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x200>;
+		};
+		CPU9: cpu@201 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x201>;
+		};
+		CPU10: cpu@202 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x202>;
+		};
+		CPU11: cpu@203 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x203>;
+		};
+		CPU12: cpu@300 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x300>;
+		};
+		CPU13: cpu@301 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x301>;
+		};
+		CPU14: cpu@302 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x302>;
+		};
+		CPU15: cpu@303 {
+			device_type = "cpu";
+			compatible = "arm,cortex-a15";
+			reg = <0x303>;
+		};
+	};
+
+	clock: clock {
+		compatible = "hisilicon,hip04-clock";
+		/* dummy register.
+		 * Don't need to access clock registers since they're
+		 * configured in firmware already.
+		 */
+		reg = <0 0 0 0x1000>;
+		#clock-cells = <1>;
+	};
+
+	timer {
+		compatible = "arm,armv7-timer";
+		interrupt-parent = <&gic>;
+		interrupts = <1 13 0xf08>,
+			     <1 14 0xf08>,
+			     <1 11 0xf08>,
+			     <1 10 0xf08>;
+	};
+
+	soc {
+		/* It's a 32-bit SoC. */
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "simple-bus";
+		interrupt-parent = <&gic>;
+		ranges = <0 0 0xe0000000 0x10000000>;
+
+		gic: interrupt-controller@c01000 {
+			compatible = "hisilicon,hip04-gic";
+			#interrupt-cells = <3>;
+			#address-cells = <0>;
+			interrupt-controller;
+			interrupts = <1 9 0xf04>;
+
+			reg = <0xc01000 0x1000>, <0xc02000 0x1000>,
+			      <0xc04000 0x2000>, <0xc06000 0x2000>;
+		};
+
+		sysctrl: sysctrl {
+			compatible = "hisilicon,sysctrl";
+			reg = <0x3e00000 0x00100000>;
+			relocation-entry = <0xe0000100>;
+			relocation-size = <0x1000>;
+			bootwrapper-phys = <0x10c00000>;
+			bootwrapper-size = <0x10000>;
+			bootwrapper-magic = <0xa5a5a5a5>;
+		};
+
+		fabric: fabric {
+			compatible = "hisilicon,hip04-fabric";
+			reg = <0x302a000 0x1000>;
+		};
+
+		dual_timer0: dual_timer@3000000 {
+			compatible = "arm,sp804", "arm,primecell";
+			reg = <0x3000000 0x1000>;
+			interrupts = <0 224 4>;
+			clocks = <&clock HIP04_CLK_50M>;
+			clock-names = "apb_pclk";
+		};
+
+		arm-pmu {
+			compatible = "arm,cortex-a15-pmu";
+			interrupts = <0 64 4>,
+				     <0 65 4>,
+				     <0 66 4>,
+				     <0 67 4>,
+				     <0 68 4>,
+				     <0 69 4>,
+				     <0 70 4>,
+				     <0 71 4>,
+				     <0 72 4>,
+				     <0 73 4>,
+				     <0 74 4>,
+				     <0 75 4>,
+				     <0 76 4>,
+				     <0 77 4>,
+				     <0 78 4>,
+				     <0 79 4>;
+		};
+
+		uart0: uart@4007000 {
+			compatible = "snps,dw-apb-uart";
+			reg = <0x4007000 0x1000>;
+			interrupts = <0 381 4>;
+			clocks = <&clock HIP04_CLK_168M>;
+			clock-names = "uartclk";
+			reg-shift = <2>;
+			status = "disabled";
+		};
+	};
+};
-- 
1.9.1

* [PATCH v9 10/14] ARM: config: append lpae configuration
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (8 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 09/14] ARM: dts: append hip04 dts Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:52   ` Gregory CLEMENT
                     ` (2 more replies)
  2014-05-20 13:10 ` [PATCH v9 11/14] ARM: config: append hip04_defconfig Haojian Zhuang
                   ` (3 subsequent siblings)
  13 siblings, 3 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Add multi_v7_lpae_defconfig. In this default configuration,
CONFIG_ARCH_MULTI_V6 is disabled and CONFIG_ARM_LPAE is enabled.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/configs/multi_v7_lpae_defconfig | 351 +++++++++++++++++++++++++++++++
 1 file changed, 351 insertions(+)
 create mode 100644 arch/arm/configs/multi_v7_lpae_defconfig

diff --git a/arch/arm/configs/multi_v7_lpae_defconfig b/arch/arm/configs/multi_v7_lpae_defconfig
new file mode 100644
index 0000000..59fcefc
--- /dev/null
+++ b/arch/arm/configs/multi_v7_lpae_defconfig
@@ -0,0 +1,351 @@
+CONFIG_SYSVIPC=y
+CONFIG_FHANDLE=y
+CONFIG_IRQ_DOMAIN_DEBUG=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_EMBEDDED=y
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_ARCH_MULTI_V6 is not set
+CONFIG_ARCH_MULTI_V7=y
+CONFIG_ARM_LPAE=y
+CONFIG_ARCH_MVEBU=y
+CONFIG_MACH_ARMADA_370=y
+CONFIG_MACH_ARMADA_375=y
+CONFIG_MACH_ARMADA_38X=y
+CONFIG_MACH_ARMADA_XP=y
+CONFIG_MACH_DOVE=y
+CONFIG_ARCH_BCM=y
+CONFIG_ARCH_BCM_5301X=y
+CONFIG_ARCH_BCM_MOBILE=y
+CONFIG_ARCH_BERLIN=y
+CONFIG_MACH_BERLIN_BG2=y
+CONFIG_MACH_BERLIN_BG2CD=y
+CONFIG_GPIO_PCA953X=y
+CONFIG_ARCH_HIGHBANK=y
+CONFIG_ARCH_HISI=y
+CONFIG_ARCH_HI3xxx=y
+CONFIG_ARCH_HIP04=y
+CONFIG_ARCH_KEYSTONE=y
+CONFIG_ARCH_MXC=y
+CONFIG_MACH_IMX51_DT=y
+CONFIG_SOC_IMX53=y
+CONFIG_SOC_IMX6Q=y
+CONFIG_SOC_IMX6SL=y
+CONFIG_SOC_VF610=y
+CONFIG_ARCH_OMAP3=y
+CONFIG_ARCH_OMAP4=y
+CONFIG_SOC_OMAP5=y
+CONFIG_SOC_AM33XX=y
+CONFIG_SOC_DRA7XX=y
+CONFIG_SOC_AM43XX=y
+CONFIG_ARCH_QCOM=y
+CONFIG_ARCH_MSM8X60=y
+CONFIG_ARCH_MSM8960=y
+CONFIG_ARCH_MSM8974=y
+CONFIG_ARCH_ROCKCHIP=y
+CONFIG_ARCH_SOCFPGA=y
+CONFIG_PLAT_SPEAR=y
+CONFIG_ARCH_SPEAR13XX=y
+CONFIG_MACH_SPEAR1310=y
+CONFIG_MACH_SPEAR1340=y
+CONFIG_ARCH_STI=y
+CONFIG_ARCH_SUNXI=y
+CONFIG_ARCH_SIRF=y
+CONFIG_ARCH_TEGRA=y
+CONFIG_ARCH_TEGRA_2x_SOC=y
+CONFIG_ARCH_TEGRA_3x_SOC=y
+CONFIG_ARCH_TEGRA_114_SOC=y
+CONFIG_ARCH_TEGRA_124_SOC=y
+CONFIG_TEGRA_EMC_SCALING_ENABLE=y
+CONFIG_ARCH_U8500=y
+CONFIG_MACH_HREFV60=y
+CONFIG_MACH_SNOWBALL=y
+CONFIG_MACH_UX500_DT=y
+CONFIG_ARCH_VEXPRESS=y
+CONFIG_ARCH_VEXPRESS_CA9X4=y
+CONFIG_ARCH_VIRT=y
+CONFIG_ARCH_WM8850=y
+CONFIG_ARCH_ZYNQ=y
+CONFIG_NEON=y
+CONFIG_TRUSTED_FOUNDATIONS=y
+CONFIG_PCI=y
+CONFIG_PCI_MSI=y
+CONFIG_PCI_MVEBU=y
+CONFIG_PCI_TEGRA=y
+CONFIG_SMP=y
+CONFIG_HIGHMEM=y
+CONFIG_HIGHPTE=y
+CONFIG_CMA=y
+CONFIG_ARM_APPENDED_DTB=y
+CONFIG_ARM_ATAG_DTB_COMPAT=y
+CONFIG_KEXEC=y
+CONFIG_CPU_FREQ=y
+CONFIG_CPU_FREQ_STAT_DETAILS=y
+CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
+CONFIG_CPU_IDLE=y
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_INET=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+CONFIG_IP_PNP_RARP=y
+CONFIG_IPV6_ROUTER_PREF=y
+CONFIG_IPV6_OPTIMISTIC_DAD=y
+CONFIG_INET6_AH=m
+CONFIG_INET6_ESP=m
+CONFIG_INET6_IPCOMP=m
+CONFIG_IPV6_MIP6=m
+CONFIG_IPV6_TUNNEL=m
+CONFIG_IPV6_MULTIPLE_TABLES=y
+CONFIG_CFG80211=m
+CONFIG_MAC80211=m
+CONFIG_RFKILL=y
+CONFIG_RFKILL_INPUT=y
+CONFIG_RFKILL_GPIO=y
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DMA_CMA=y
+CONFIG_CMA_SIZE_MBYTES=64
+CONFIG_OMAP_OCP2SCP=y
+CONFIG_MTD=y
+CONFIG_MTD_M25P80=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_ICS932S401=y
+CONFIG_APDS9802ALS=y
+CONFIG_ISL29003=y
+CONFIG_BLK_DEV_SD=y
+CONFIG_BLK_DEV_SR=y
+CONFIG_SCSI_MULTI_LUN=y
+CONFIG_ATA=y
+CONFIG_SATA_AHCI_PLATFORM=y
+CONFIG_SATA_HIGHBANK=y
+CONFIG_SATA_MV=y
+CONFIG_NETDEVICES=y
+CONFIG_SUN4I_EMAC=y
+CONFIG_NET_CALXEDA_XGMAC=y
+CONFIG_MV643XX_ETH=y
+CONFIG_MVNETA=y
+CONFIG_KS8851=y
+CONFIG_R8169=y
+CONFIG_SMSC911X=y
+CONFIG_STMMAC_ETH=y
+CONFIG_TI_CPSW=y
+CONFIG_AT803X_PHY=y
+CONFIG_MARVELL_PHY=y
+CONFIG_ICPLUS_PHY=y
+CONFIG_USB_PEGASUS=y
+CONFIG_USB_USBNET=y
+CONFIG_USB_NET_SMSC75XX=y
+CONFIG_USB_NET_SMSC95XX=y
+CONFIG_BRCMFMAC=m
+CONFIG_RT2X00=m
+CONFIG_RT2800USB=m
+CONFIG_INPUT_EVDEV=y
+CONFIG_KEYBOARD_GPIO=y
+CONFIG_KEYBOARD_TEGRA=y
+CONFIG_KEYBOARD_SPEAR=y
+CONFIG_KEYBOARD_CROS_EC=y
+CONFIG_MOUSE_PS2_ELANTECH=y
+CONFIG_INPUT_MISC=y
+CONFIG_INPUT_MPU3050=y
+CONFIG_SERIO_AMBAKMI=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_DW=y
+CONFIG_SERIAL_AMBA_PL011=y
+CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+CONFIG_SERIAL_SIRFSOC=y
+CONFIG_SERIAL_SIRFSOC_CONSOLE=y
+CONFIG_SERIAL_TEGRA=y
+CONFIG_SERIAL_IMX=y
+CONFIG_SERIAL_IMX_CONSOLE=y
+CONFIG_SERIAL_MSM=y
+CONFIG_SERIAL_MSM_CONSOLE=y
+CONFIG_SERIAL_VT8500=y
+CONFIG_SERIAL_VT8500_CONSOLE=y
+CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_SERIAL_OMAP=y
+CONFIG_SERIAL_OMAP_CONSOLE=y
+CONFIG_SERIAL_XILINX_PS_UART=y
+CONFIG_SERIAL_XILINX_PS_UART_CONSOLE=y
+CONFIG_SERIAL_FSL_LPUART=y
+CONFIG_SERIAL_FSL_LPUART_CONSOLE=y
+CONFIG_SERIAL_ST_ASC=y
+CONFIG_SERIAL_ST_ASC_CONSOLE=y
+CONFIG_I2C_CHARDEV=y
+CONFIG_I2C_MUX=y
+CONFIG_I2C_MUX_PCA954x=y
+CONFIG_I2C_MUX_PINCTRL=y
+CONFIG_I2C_DESIGNWARE_PLATFORM=y
+CONFIG_I2C_MV64XXX=y
+CONFIG_I2C_SIRF=y
+CONFIG_I2C_TEGRA=y
+CONFIG_SPI=y
+CONFIG_SPI_OMAP24XX=y
+CONFIG_SPI_ORION=y
+CONFIG_SPI_PL022=y
+CONFIG_SPI_SIRF=y
+CONFIG_SPI_TEGRA114=y
+CONFIG_SPI_TEGRA20_SFLASH=y
+CONFIG_SPI_TEGRA20_SLINK=y
+CONFIG_PINCTRL_AS3722=y
+CONFIG_PINCTRL_PALMAS=y
+CONFIG_GPIO_SYSFS=y
+CONFIG_GPIO_GENERIC_PLATFORM=y
+CONFIG_GPIO_PCA953X_IRQ=y
+CONFIG_GPIO_TWL4030=y
+CONFIG_GPIO_PALMAS=y
+CONFIG_GPIO_TPS6586X=y
+CONFIG_GPIO_TPS65910=y
+CONFIG_BATTERY_SBS=y
+CONFIG_CHARGER_TPS65090=y
+CONFIG_POWER_RESET_AS3722=y
+CONFIG_POWER_RESET_GPIO=y
+CONFIG_SENSORS_LM90=y
+CONFIG_THERMAL=y
+CONFIG_DOVE_THERMAL=y
+CONFIG_ARMADA_THERMAL=y
+CONFIG_WATCHDOG=y
+CONFIG_ORION_WATCHDOG=y
+CONFIG_MFD_AS3722=y
+CONFIG_MFD_CROS_EC=y
+CONFIG_MFD_CROS_EC_SPI=y
+CONFIG_MFD_MAX8907=y
+CONFIG_MFD_PALMAS=y
+CONFIG_MFD_TPS65090=y
+CONFIG_MFD_TPS6586X=y
+CONFIG_MFD_TPS65910=y
+CONFIG_REGULATOR_VIRTUAL_CONSUMER=y
+CONFIG_REGULATOR_AB8500=y
+CONFIG_REGULATOR_AS3722=y
+CONFIG_REGULATOR_GPIO=y
+CONFIG_REGULATOR_MAX8907=y
+CONFIG_REGULATOR_PALMAS=y
+CONFIG_REGULATOR_TPS51632=y
+CONFIG_REGULATOR_TPS62360=y
+CONFIG_REGULATOR_TPS65090=y
+CONFIG_REGULATOR_TPS6586X=y
+CONFIG_REGULATOR_TPS65910=y
+CONFIG_REGULATOR_TWL4030=y
+CONFIG_REGULATOR_VEXPRESS=y
+CONFIG_MEDIA_SUPPORT=y
+CONFIG_MEDIA_CAMERA_SUPPORT=y
+CONFIG_MEDIA_USB_SUPPORT=y
+CONFIG_USB_VIDEO_CLASS=y
+CONFIG_USB_GSPCA=y
+CONFIG_DRM=y
+CONFIG_DRM_TEGRA=y
+CONFIG_DRM_PANEL_SIMPLE=y
+CONFIG_FB_ARMCLCD=y
+CONFIG_FB_WM8505=y
+CONFIG_FB_SIMPLE=y
+CONFIG_BACKLIGHT_LCD_SUPPORT=y
+CONFIG_BACKLIGHT_CLASS_DEVICE=y
+CONFIG_BACKLIGHT_PWM=y
+CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_SOUND=y
+CONFIG_SND=y
+CONFIG_SND_SOC=y
+CONFIG_SND_SOC_TEGRA=y
+CONFIG_SND_SOC_TEGRA_RT5640=y
+CONFIG_SND_SOC_TEGRA_WM8753=y
+CONFIG_SND_SOC_TEGRA_WM8903=y
+CONFIG_SND_SOC_TEGRA_TRIMSLICE=y
+CONFIG_SND_SOC_TEGRA_ALC5632=y
+CONFIG_SND_SOC_TEGRA_MAX98090=y
+CONFIG_USB=y
+CONFIG_USB_XHCI_HCD=y
+CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_EHCI_TEGRA=y
+CONFIG_USB_EHCI_HCD_PLATFORM=y
+CONFIG_USB_ISP1760_HCD=y
+CONFIG_USB_STORAGE=y
+CONFIG_USB_CHIPIDEA=y
+CONFIG_USB_CHIPIDEA_HOST=y
+CONFIG_AB8500_USB=y
+CONFIG_OMAP_USB3=y
+CONFIG_SAMSUNG_USB2PHY=y
+CONFIG_SAMSUNG_USB3PHY=y
+CONFIG_USB_GPIO_VBUS=y
+CONFIG_USB_ISP1301=y
+CONFIG_USB_MXS_PHY=y
+CONFIG_MMC=y
+CONFIG_MMC_BLOCK_MINORS=16
+CONFIG_MMC_ARMMMCI=y
+CONFIG_MMC_SDHCI=y
+CONFIG_MMC_SDHCI_ESDHC_IMX=y
+CONFIG_MMC_SDHCI_TEGRA=y
+CONFIG_MMC_SDHCI_DOVE=y
+CONFIG_MMC_SDHCI_SPEAR=y
+CONFIG_MMC_SDHCI_BCM_KONA=y
+CONFIG_MMC_OMAP=y
+CONFIG_MMC_OMAP_HS=y
+CONFIG_MMC_MVSDIO=y
+CONFIG_EDAC=y
+CONFIG_EDAC_MM_EDAC=y
+CONFIG_EDAC_HIGHBANK_MC=y
+CONFIG_EDAC_HIGHBANK_L2=y
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_DRV_AS3722=y
+CONFIG_RTC_DRV_MAX8907=y
+CONFIG_RTC_DRV_PALMAS=y
+CONFIG_RTC_DRV_TWL4030=y
+CONFIG_RTC_DRV_TPS6586X=y
+CONFIG_RTC_DRV_TPS65910=y
+CONFIG_RTC_DRV_EM3027=y
+CONFIG_RTC_DRV_PL031=y
+CONFIG_RTC_DRV_VT8500=y
+CONFIG_RTC_DRV_MV=y
+CONFIG_RTC_DRV_TEGRA=y
+CONFIG_DMADEVICES=y
+CONFIG_DW_DMAC=y
+CONFIG_MV_XOR=y
+CONFIG_TEGRA20_APB_DMA=y
+CONFIG_STE_DMA40=y
+CONFIG_SIRF_DMA=y
+CONFIG_TI_EDMA=y
+CONFIG_PL330_DMA=y
+CONFIG_IMX_SDMA=y
+CONFIG_IMX_DMA=y
+CONFIG_MXS_DMA=y
+CONFIG_DMA_OMAP=y
+CONFIG_STAGING=y
+CONFIG_SENSORS_ISL29018=y
+CONFIG_SENSORS_ISL29028=y
+CONFIG_MFD_NVEC=y
+CONFIG_KEYBOARD_NVEC=y
+CONFIG_SERIO_NVEC_PS2=y
+CONFIG_NVEC_POWER=y
+CONFIG_COMMON_CLK_QCOM=y
+CONFIG_MSM_GCC_8660=y
+CONFIG_MSM_MMCC_8960=y
+CONFIG_MSM_MMCC_8974=y
+CONFIG_TEGRA_IOMMU_GART=y
+CONFIG_TEGRA_IOMMU_SMMU=y
+CONFIG_MEMORY=y
+CONFIG_IIO=y
+CONFIG_AK8975=y
+CONFIG_PWM=y
+CONFIG_PWM_TEGRA=y
+CONFIG_PWM_VT8500=y
+CONFIG_OMAP_USB2=y
+CONFIG_EXT4_FS=y
+CONFIG_VFAT_FS=y
+CONFIG_TMPFS=y
+CONFIG_SQUASHFS=y
+CONFIG_SQUASHFS_LZO=y
+CONFIG_SQUASHFS_XZ=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_ROOT_NFS=y
+CONFIG_PRINTK_TIME=y
+CONFIG_DEBUG_FS=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_CRYPTO_DEV_TEGRA_AES=y
-- 
1.9.1

* [PATCH v9 11/14] ARM: config: append hip04_defconfig
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (9 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 10/14] ARM: config: append lpae configuration Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 12/14] ARM: config: select ARCH_HISI in hi3xxx_defconfig Haojian Zhuang
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Add hip04_defconfig to select the HiP04 SoC configuration.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/configs/hip04_defconfig | 74 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)
 create mode 100644 arch/arm/configs/hip04_defconfig

diff --git a/arch/arm/configs/hip04_defconfig b/arch/arm/configs/hip04_defconfig
new file mode 100644
index 0000000..5c471b2
--- /dev/null
+++ b/arch/arm/configs/hip04_defconfig
@@ -0,0 +1,74 @@
+CONFIG_SYSVIPC=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_IRQ_DOMAIN_DEBUG=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_RD_GZIP=y
+# CONFIG_ARCH_MULTI_V6 is not set
+CONFIG_ARCH_MULTI_V7=y
+CONFIG_ARCH_HISI=y
+CONFIG_ARCH_HIP04=y
+CONFIG_ARM_LPAE=y
+CONFIG_SMP=y
+CONFIG_NR_CPUS=16
+CONFIG_MCPM=y
+CONFIG_MCPM_QUAD_CLUSTER=y
+CONFIG_PREEMPT=y
+CONFIG_AEABI=y
+CONFIG_ARM_APPENDED_DTB=y
+CONFIG_ARM_ATAG_DTB_COMPAT=y
+CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER=y
+CONFIG_HIGHMEM=y
+CONFIG_VFP=y
+CONFIG_NEON=y
+CONFIG_NET=y
+CONFIG_UNIX=y
+CONFIG_INET=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_BLK_DEV=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BLK_DEV_RAM_COUNT=16
+CONFIG_BLK_DEV_RAM_SIZE=4096
+CONFIG_BLK_DEV_SD=y
+CONFIG_ATA=y
+CONFIG_SATA_AHCI_PLATFORM=y
+CONFIG_NETDEVICES=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_NR_UARTS=2
+CONFIG_SERIAL_8250_RUNTIME_UARTS=2
+CONFIG_SERIAL_8250_DW=y
+CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_I2C_DESIGNWARE_PLATFORM=y
+CONFIG_PINCTRL_SINGLE=y
+CONFIG_GPIO_GENERIC_PLATFORM=y
+CONFIG_REGULATOR_GPIO=y
+CONFIG_DRM=y
+CONFIG_FB_SIMPLE=y
+CONFIG_RTC_CLASS=y
+CONFIG_EXT4_FS=y
+CONFIG_TMPFS=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_ROOT_NFS=y
+CONFIG_PRINTK_TIME=y
+CONFIG_DEBUG_FS=y
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DEBUG_LL=y
+CONFIG_DEBUG_UART_8250=y
+CONFIG_EARLY_PRINTK=y
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_DEBUG_USER=y
+CONFIG_VIRTUALIZATION=y
+CONFIG_KVM=y
+CONFIG_KVM_ARM_HOST=y
+CONFIG_KVM_ARM_MAX_VCPUS=4
+CONFIG_KVM_ARM_VGIC=y
-- 
1.9.1

* [PATCH v9 12/14] ARM: config: select ARCH_HISI in hi3xxx_defconfig
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (10 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 11/14] ARM: config: append hip04_defconfig Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 13/14] ARM: hisi: enable erratum 798181 of A15 on HiP04 Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 14/14] virt: arm: support hip04 gic Haojian Zhuang
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

Since ARCH_HISI is added as the common configuration for both ARCH_HI3xxx
and ARCH_HIP04, enable it in hi3xxx_defconfig.

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/configs/hi3xxx_defconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/configs/hi3xxx_defconfig b/arch/arm/configs/hi3xxx_defconfig
index f186bdf..553e1b6 100644
--- a/arch/arm/configs/hi3xxx_defconfig
+++ b/arch/arm/configs/hi3xxx_defconfig
@@ -3,10 +3,12 @@ CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_RD_LZMA=y
+CONFIG_ARCH_HISI=y
 CONFIG_ARCH_HI3xxx=y
 CONFIG_SMP=y
 CONFIG_PREEMPT=y
 CONFIG_AEABI=y
+CONFIG_HIGHMEM=y
 CONFIG_ARM_APPENDED_DTB=y
 CONFIG_NET=y
 CONFIG_UNIX=y
-- 
1.9.1

* [PATCH v9 13/14] ARM: hisi: enable erratum 798181 of A15 on HiP04
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (11 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 12/14] ARM: config: select ARCH_HISI in hi3xxx_defconfig Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:10 ` [PATCH v9 14/14] virt: arm: support hip04 gic Haojian Zhuang
  13 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

From: Kefeng Wang <kefeng.wang@linaro.org>

Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15
erratum 798181 (TLBI/DSB operations)) introduced a workaround for
Cortex-A15 erratum 798181. Enable it for HiP04 (Cortex-A15 r3p2).

Signed-off-by: Kefeng Wang <kefeng.wang@linaro.org>
Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/mach-hisi/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/mach-hisi/Kconfig b/arch/arm/mach-hisi/Kconfig
index 3f1ece7..0a5a49d 100644
--- a/arch/arm/mach-hisi/Kconfig
+++ b/arch/arm/mach-hisi/Kconfig
@@ -21,6 +21,7 @@ config ARCH_HI3xxx
 
 config ARCH_HIP04
 	bool "Hisilicon HiP04 Cortex A15 family" if ARCH_MULTI_V7
+	select ARM_ERRATA_798181 if SMP
 	select HAVE_ARM_ARCH_TIMER
 	select MCPM if SMP
 	select MCPM_QUAD_CLUSTER if SMP
-- 
1.9.1

* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
                   ` (12 preceding siblings ...)
  2014-05-20 13:10 ` [PATCH v9 13/14] ARM: hisi: enable erratum 798181 of A15 on HiP04 Haojian Zhuang
@ 2014-05-20 13:10 ` Haojian Zhuang
  2014-05-20 13:34   ` Haojian Zhuang
                     ` (2 more replies)
  13 siblings, 3 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:10 UTC (permalink / raw)
  To: linux-arm-kernel

In the standard ARM GIC, the GICH_APR offset is 0xf0 and the GICH_LR0
offset is 0x100. In the HiP04 GIC, the GICH_APR offset is 0x70 and the
GICH_LR0 offset is 0x80.

Reuse the nr_lr field in struct vgic_cpu and rename it to hw_cfg.
Bits [31:16] store the GICH_APR offset and bits [15:0] store the real
number of list registers. The high half is always set, for the standard
GIC as well, so the arm64 world-switch code is updated too. The GICH_LR0
offset is not stored; it is derived as GICH_APR + 0x10.
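
As a minimal sketch of the layout described above (not part of the
patch): the helper names below are hypothetical, the mask and shift
values mirror the HWCFG_* constants this patch adds, and the 0x10
distance between GICH_APR and GICH_LR0 comes from the offsets quoted
above.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the HWCFG_* constants added by this patch. */
#define HWCFG_NR_LR_MASK	0xffffu
#define HWCFG_APR_SHIFT		16

/* Distance from GICH_APR to GICH_LR0 (0x100 - 0xf0 == 0x80 - 0x70). */
#define APR_TO_LR0		0x10u

/* Hypothetical helper: pack the APR offset and the LR count into one word. */
static inline uint32_t hw_cfg_pack(uint32_t apr_offset, uint32_t nr_lr)
{
	/* Both fields must fit in 16 bits, otherwise they would overlap. */
	assert(apr_offset <= 0xffffu && nr_lr <= HWCFG_NR_LR_MASK);
	return (apr_offset << HWCFG_APR_SHIFT) | nr_lr;
}

/* Low half: number of list registers. */
static inline uint32_t hw_cfg_nr_lr(uint32_t hw_cfg)
{
	return hw_cfg & HWCFG_NR_LR_MASK;
}

/* High half: GICH_APR offset. */
static inline uint32_t hw_cfg_apr_offset(uint32_t hw_cfg)
{
	return hw_cfg >> HWCFG_APR_SHIFT;
}

/* GICH_LR0 is not stored; it is derived from the APR offset. */
static inline uint32_t hw_cfg_lr0_offset(uint32_t hw_cfg)
{
	return hw_cfg_apr_offset(hw_cfg) + APR_TO_LR0;
}
```

With four list registers, the HiP04 word would be 0x00700004 and the
standard GIC word 0x00f00004 under this scheme.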

Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
---
 arch/arm/kernel/asm-offsets.c   |  2 +-
 arch/arm/kvm/interrupts_head.S  | 29 +++++++++++++++++++------
 arch/arm64/kernel/asm-offsets.c |  2 +-
 arch/arm64/kvm/hyp.S            | 28 ++++++++++++++++++------
 include/kvm/arm_vgic.h          |  7 ++++--
 include/linux/irqchip/arm-gic.h |  6 ++++++
 virt/kvm/arm/vgic.c             | 48 +++++++++++++++++++++++++++++------------
 7 files changed, 92 insertions(+), 30 deletions(-)

diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 85598b5..166cc98 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -189,7 +189,7 @@ int main(void)
   DEFINE(VGIC_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_elrsr));
   DEFINE(VGIC_CPU_APR,		offsetof(struct vgic_cpu, vgic_apr));
   DEFINE(VGIC_CPU_LR,		offsetof(struct vgic_cpu, vgic_lr));
-  DEFINE(VGIC_CPU_NR_LR,	offsetof(struct vgic_cpu, nr_lr));
+  DEFINE(VGIC_CPU_HW_CFG,	offsetof(struct vgic_cpu, hw_cfg));
 #ifdef CONFIG_KVM_ARM_TIMER
   DEFINE(VCPU_TIMER_CNTV_CTL,	offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_ctl));
   DEFINE(VCPU_TIMER_CNTV_CVAL,	offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_cval));
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 76af9302..9fbbf99 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -419,7 +419,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	ldr	r7, [r2, #GICH_EISR1]
 	ldr	r8, [r2, #GICH_ELRSR0]
 	ldr	r9, [r2, #GICH_ELRSR1]
-	ldr	r10, [r2, #GICH_APR]
+	ldr	r10, [r11, #VGIC_CPU_HW_CFG]
+	mov	r10, r10, lsr #HWCFG_APR_SHIFT
+	ldr	r10, [r2, r10]
 
 	str	r3, [r11, #VGIC_CPU_HCR]
 	str	r4, [r11, #VGIC_CPU_VMCR]
@@ -435,9 +437,15 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	str	r5, [r2, #GICH_HCR]
 
 	/* Save list registers */
-	add	r2, r2, #GICH_LR0
+	ldr	r4, [r11, #VGIC_CPU_HW_CFG]
+	mov	r10, r4, lsr #HWCFG_APR_SHIFT
+	/* the offset between GICH_APR & GICH_LR0 is 0x10 */
+	add	r10, r10, #0x10
+	add	r2, r2, r10
 	add	r3, r11, #VGIC_CPU_LR
-	ldr	r4, [r11, #VGIC_CPU_NR_LR]
+	/* Get NR_LR from VGIC_CPU_HW_CFG */
+	ldr	r6, =HWCFG_NR_LR_MASK
+	and	r4, r4, r6
 1:	ldr	r6, [r2], #4
 	str	r6, [r3], #4
 	subs	r4, r4, #1
@@ -469,12 +477,21 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 
 	str	r3, [r2, #GICH_HCR]
 	str	r4, [r2, #GICH_VMCR]
-	str	r8, [r2, #GICH_APR]
+	ldr	r6, [r11, #VGIC_CPU_HW_CFG]
+	mov	r6, r6, lsr #HWCFG_APR_SHIFT
+	str	r8, [r2, r6]
 
 	/* Restore list registers */
-	add	r2, r2, #GICH_LR0
+	ldr	r4, [r11, #VGIC_CPU_HW_CFG]
+	mov	r6, r4, lsr #HWCFG_APR_SHIFT
+	/* the offset between GICH_APR & GICH_LR0 is 0x10 */
+	add	r6, r6, #0x10
+	/* get offset of GICH_LR0 */
+	add	r2, r2, r6
+	/* Get NR_LR from VGIC_CPU_HW_CFG */
 	add	r3, r11, #VGIC_CPU_LR
-	ldr	r4, [r11, #VGIC_CPU_NR_LR]
+	ldr	r6, =HWCFG_NR_LR_MASK
+	and	r4, r4, r6
 1:	ldr	r6, [r3], #4
 	str	r6, [r2], #4
 	subs	r4, r4, #1
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 646f888..2422358 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -136,7 +136,7 @@ int main(void)
   DEFINE(VGIC_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_elrsr));
   DEFINE(VGIC_CPU_APR,		offsetof(struct vgic_cpu, vgic_apr));
   DEFINE(VGIC_CPU_LR,		offsetof(struct vgic_cpu, vgic_lr));
-  DEFINE(VGIC_CPU_NR_LR,	offsetof(struct vgic_cpu, nr_lr));
+  DEFINE(VGIC_CPU_HW_CFG,	offsetof(struct vgic_cpu, hw_cfg));
   DEFINE(KVM_VTTBR,		offsetof(struct kvm, arch.vttbr));
   DEFINE(KVM_VGIC_VCTRL,	offsetof(struct kvm, arch.vgic.vctrl_base));
 #endif
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 2c56012..a4a8b3d 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -402,7 +402,9 @@ __kvm_hyp_code_start:
 	ldr	w8, [x2, #GICH_EISR1]
 	ldr	w9, [x2, #GICH_ELRSR0]
 	ldr	w10, [x2, #GICH_ELRSR1]
-	ldr	w11, [x2, #GICH_APR]
+	ldr	w11, [x3, #VGIC_CPU_HW_CFG]
+	mov	w11, w11, lsr #HWCFG_APR_SHIFT
+	ldr	w11, [x2, w10]
 CPU_BE(	rev	w4,  w4  )
 CPU_BE(	rev	w5,  w5  )
 CPU_BE(	rev	w6,  w6  )
@@ -425,8 +427,13 @@ CPU_BE(	rev	w11, w11 )
 	str	wzr, [x2, #GICH_HCR]
 
 	/* Save list registers */
-	add	x2, x2, #GICH_LR0
-	ldr	w4, [x3, #VGIC_CPU_NR_LR]
+	ldr	w4, [x3, #VGIC_CPU_HW_CFG]
+	mov	w6, w4, lsr #HWCFG_APR_SHIFT
+	ldr	w7, =HWCFG_NR_LR_MASK
+	and	w4, w4, w7
+	/* the offset between GICH_APR and GICH_LR0 is 0x10 */
+	add	w6, w6, 0x10
+	add	x2, x2, w6
 	add	x3, x3, #VGIC_CPU_LR
 1:	ldr	w5, [x2], #4
 CPU_BE(	rev	w5, w5 )
@@ -461,11 +468,20 @@ CPU_BE(	rev	w6, w6 )
 
 	str	w4, [x2, #GICH_HCR]
 	str	w5, [x2, #GICH_VMCR]
-	str	w6, [x2, #GICH_APR]
+	ldr	w4, [x3, #VGIC_CPU_HW_CFG]
+	mov	w4, w4, #HWCFG_APR_SHIFT
+	str	w6, [x2, w4]
 
 	/* Restore list registers */
-	add	x2, x2, #GICH_LR0
-	ldr	w4, [x3, #VGIC_CPU_NR_LR]
+	ldr	w4, [x3, #VGIC_CPU_HW_CFG]
+	mov	w6, w4, #HWCFG_APR_SHIFT
+	/* the offset between GICH_APR and GICH_LR0 is 0x10 */
+	add	w6, w6, #0x10
+	/* get offset of GICH_LR0 */
+	add	x2, x2, w6
+	/* get NR_LR from VGIC_CPU_HW_CFG */
+	ldr	w6, =HWCFG_NR_LR_MASK
+	and	w4, w4, w6
 	add	x3, x3, #VGIC_CPU_LR
 1:	ldr	w5, [x3], #4
 CPU_BE(	rev	w5, w5 )
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index f27000f..eba4b51 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -122,8 +122,11 @@ struct vgic_cpu {
 	/* Bitmap of used/free list registers */
 	DECLARE_BITMAP(	lr_used, VGIC_MAX_LRS);
 
-	/* Number of list registers on this CPU */
-	int		nr_lr;
+	/*
+	 * bit[31:16]: GICH_APR offset
+	 * bit[15:0]:  Number of list registers on this CPU
+	 */
+	u32		hw_cfg;
 
 	/* CPU vif control registers for world switch */
 	u32		vgic_hcr;
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 45e2d8c..b055f92 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -49,6 +49,8 @@
 #define GICH_ELRSR1 			0x34
 #define GICH_APR			0xf0
 #define GICH_LR0			0x100
+#define HIP04_GICH_APR			0x70
+/* GICH_LR0 offset in HiP04 is 0x80 */
 
 #define GICH_HCR_EN			(1 << 0)
 #define GICH_HCR_UIE			(1 << 1)
@@ -73,6 +75,10 @@
 #define GICH_MISR_EOI			(1 << 0)
 #define GICH_MISR_U			(1 << 1)
 
+#define HWCFG_NR_LR_MASK	0xffff
+#define HWCFG_APR_SHIFT		16
+#define HWCFG_APR_MASK		(0xffff << HWCFG_APR_SHIFT)
+
 #ifndef __ASSEMBLY__
 
 struct device_node;
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 47b2983..4c0c1e9 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -76,6 +76,8 @@
 #define IMPLEMENTER_ARM		0x43b
 #define GICC_ARCH_VERSION_V2	0x2
 
+#define vgic_nr_lr(vcpu)	(vcpu->hw_cfg & HWCFG_NR_LR_MASK)
+
 /* Physical address of vgic virtual cpu interface */
 static phys_addr_t vgic_vcpu_base;
 
@@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_update_state(struct kvm *kvm);
 static void vgic_kick_vcpus(struct kvm *kvm);
 static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
-static u32 vgic_nr_lr;
+static u32 vgic_hw_cfg;
 
 static unsigned int vgic_maint_irq;
 
@@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	int vcpu_id = vcpu->vcpu_id;
 	int i, irq, source_cpu;
-	u32 *lr;
+	u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);
 
-	for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+	for_each_set_bit(i, vgic_cpu->lr_used, nr_lr) {
 		lr = &vgic_cpu->vgic_lr[i];
 		irq = LR_IRQID(*lr);
 		source_cpu = LR_CPUID(*lr);
@@ -1005,8 +1007,9 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	int lr;
+	int nr_lr = vgic_nr_lr(vgic_cpu);
 
-	for_each_set_bit(lr, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+	for_each_set_bit(lr, vgic_cpu->lr_used, nr_lr) {
 		int irq = vgic_cpu->vgic_lr[lr] & GICH_LR_VIRTUALID;
 
 		if (!vgic_irq_is_enabled(vcpu, irq)) {
@@ -1025,6 +1028,7 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	int lr;
+	int nr_lr = vgic_nr_lr(vgic_cpu);
 
 	/* Sanitize the input... */
 	BUG_ON(sgi_source_id & ~7);
@@ -1046,9 +1050,8 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 	}
 
 	/* Try to use another LR for this interrupt */
-	lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used,
-			       vgic_cpu->nr_lr);
-	if (lr >= vgic_cpu->nr_lr)
+	lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used, nr_lr);
+	if (lr >= nr_lr)
 		return false;
 
 	kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
@@ -1181,9 +1184,10 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 		 * active bit.
 		 */
 		int lr, irq;
+		int nr_lr = vgic_nr_lr(vgic_cpu);
 
 		for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_eisr,
-				 vgic_cpu->nr_lr) {
+				 nr_lr) {
 			irq = vgic_cpu->vgic_lr[lr] & GICH_LR_VIRTUALID;
 
 			vgic_irq_clear_active(vcpu, irq);
@@ -1221,13 +1225,13 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
 	int lr, pending;
+	int nr_lr = vgic_nr_lr(vgic_cpu);
 	bool level_pending;
 
 	level_pending = vgic_process_maintenance(vcpu);
 
 	/* Clear mappings for empty LRs */
-	for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
-			 vgic_cpu->nr_lr) {
+	for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr, nr_lr) {
 		int irq;
 
 		if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
@@ -1241,8 +1245,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 
 	/* Check if we still have something up our sleeve... */
 	pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
-				      vgic_cpu->nr_lr);
-	if (level_pending || pending < vgic_cpu->nr_lr)
+				      nr_lr);
+	if (level_pending || pending < nr_lr)
 		set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
 }
 
@@ -1438,7 +1442,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
 	 */
 	vgic_cpu->vgic_vmcr = 0;
 
-	vgic_cpu->nr_lr = vgic_nr_lr;
+	vgic_cpu->hw_cfg = vgic_hw_cfg;
 	vgic_cpu->vgic_hcr = GICH_HCR_EN; /* Get the show on the road... */
 
 	return 0;
@@ -1470,17 +1474,32 @@ static struct notifier_block vgic_cpu_nb = {
 	.notifier_call = vgic_cpu_notify,
 };
 
+static const struct of_device_id of_vgic_ids[] = {
+	{
+		.compatible = "arm,cortex-a15-gic",
+		.data = (void *)GICH_APR,
+	}, {
+		.compatible = "hisilicon,hip04-gic",
+		.data = (void *)HIP04_GICH_APR,
+	}, {
+	},
+};
+
 int kvm_vgic_hyp_init(void)
 {
 	int ret;
 	struct resource vctrl_res;
 	struct resource vcpu_res;
+	const struct of_device_id *match;
+	u32 vgic_nr_lr;
 
-	vgic_node = of_find_compatible_node(NULL, NULL, "arm,cortex-a15-gic");
+	vgic_node = of_find_matching_node_and_match(NULL, of_vgic_ids, &match);
 	if (!vgic_node) {
 		kvm_err("error: no compatible vgic node in DT\n");
 		return -ENODEV;
 	}
+	/* High word of vgic_hw_cfg is the offset of GICH_APR. */
+	vgic_hw_cfg = (unsigned int)match->data << HWCFG_APR_SHIFT;
 
 	vgic_maint_irq = irq_of_parse_and_map(vgic_node, 0);
 	if (!vgic_maint_irq) {
@@ -1517,6 +1536,7 @@ int kvm_vgic_hyp_init(void)
 
 	vgic_nr_lr = readl_relaxed(vgic_vctrl_base + GICH_VTR);
 	vgic_nr_lr = (vgic_nr_lr & 0x3f) + 1;
+	vgic_hw_cfg |= vgic_nr_lr;
 
 	ret = create_hyp_io_mappings(vgic_vctrl_base,
 				     vgic_vctrl_base + resource_size(&vctrl_res),
-- 
1.9.1


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 13:10 ` [PATCH v9 14/14] virt: arm: support hip04 gic Haojian Zhuang
@ 2014-05-20 13:34   ` Haojian Zhuang
  2014-05-20 13:44   ` Christoffer Dall
  2014-05-21 13:11   ` Marc Zyngier
  2 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 20 May 2014 21:10, Haojian Zhuang <haojian.zhuang@linaro.org> wrote:
> In ARM standard GIC, GICH_APR offset is 0xf0 & GICH_LR0 offset is 0x100.
> In HiP04 GIC, GICH_APR offset is 0x70 & GICH_LR0 offset is 0x80.
>
> Now reuse the nr_lr field in struct vgic_cpu. Bit[31:16] is used to store
> GICH_APR offset in HiP04, and bit[15:0] is used to store real nr_lr
> variable. In ARM standard GIC, don't set bit[31:16]. So we could avoid
> to change the VGIC implementation in arm64.
>
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  arch/arm/kernel/asm-offsets.c   |  2 +-
>  arch/arm/kvm/interrupts_head.S  | 29 +++++++++++++++++++------
>  arch/arm64/kernel/asm-offsets.c |  2 +-
>  arch/arm64/kvm/hyp.S            | 28 ++++++++++++++++++------
>  include/kvm/arm_vgic.h          |  7 ++++--
>  include/linux/irqchip/arm-gic.h |  6 ++++++
>  virt/kvm/arm/vgic.c             | 48 +++++++++++++++++++++++++++++------------
>  7 files changed, 92 insertions(+), 30 deletions(-)
>
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 85598b5..166cc98 100644
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 2c56012..a4a8b3d 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -402,7 +402,9 @@ __kvm_hyp_code_start:
>         ldr     w8, [x2, #GICH_EISR1]
>         ldr     w9, [x2, #GICH_ELRSR0]
>         ldr     w10, [x2, #GICH_ELRSR1]
> -       ldr     w11, [x2, #GICH_APR]
> +       ldr     w11, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w11, w11, lsr #HWCFG_APR_SHIFT
> +       ldr     w11, [x2, w10]
>  CPU_BE(        rev     w4,  w4  )
>  CPU_BE(        rev     w5,  w5  )
>  CPU_BE(        rev     w6,  w6  )
> @@ -425,8 +427,13 @@ CPU_BE(    rev     w11, w11 )
>         str     wzr, [x2, #GICH_HCR]
>
>         /* Save list registers */
> -       add     x2, x2, #GICH_LR0
> -       ldr     w4, [x3, #VGIC_CPU_NR_LR]
> +       ldr     w4, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w6, w4, lsr #HWCFG_APR_SHIFT
> +       ldr     w7, =HWCFG_NR_LR_MASK
> +       and     w4, w4, w7
> +       /* the offset between GICH_APR and GICH_LR0 is 0x10 */
> +       add     w6, w6, 0x10
> +       add     x2, x2, w6
>         add     x3, x3, #VGIC_CPU_LR
>  1:     ldr     w5, [x2], #4
>  CPU_BE(        rev     w5, w5 )
> @@ -461,11 +468,20 @@ CPU_BE(   rev     w6, w6 )
>
>         str     w4, [x2, #GICH_HCR]
>         str     w5, [x2, #GICH_VMCR]
> -       str     w6, [x2, #GICH_APR]
> +       ldr     w4, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w4, w4, #HWCFG_APR_SHIFT
> +       str     w6, [x2, w4]

Oh, I just found that this is wrong.

Marc,

How should I handle this case? Do I need to use another x{n} register
here? If so, how do I convert the value from a 32-bit register to a
64-bit register?

Regards
Haojian
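
One possible answer, as a sketch rather than what the series ended up
doing: on AArch64 a write to a W register zeroes the upper 32 bits of
the matching X register, so once the shift has been done in w4 the
offset can be used as x4 in a 64-bit add or register-offset access. The
C model below shows the same arithmetic; the function name and its
parameters are hypothetical.

```c
#include <stdint.h>

#define HWCFG_APR_SHIFT	16

/*
 * Hypothetical model of the world-switch address computation:
 * vctrl_base plays the role of x2 and hw_cfg is the 32-bit word
 * loaded from VGIC_CPU_HW_CFG.
 */
static inline uint64_t gich_apr_addr(uint64_t vctrl_base, uint32_t hw_cfg)
{
	/* 32-bit logical shift right: keeps only the APR offset. */
	uint32_t apr_offset = hw_cfg >> HWCFG_APR_SHIFT;

	/* Zero-extend to 64 bits before adding to the 64-bit base. */
	return vctrl_base + (uint64_t)apr_offset;
}
```

The explicit cast marks the point where the 32-bit value becomes a
64-bit one; no sign extension is involved because the offset is
unsigned.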


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 13:10 ` [PATCH v9 14/14] virt: arm: support hip04 gic Haojian Zhuang
  2014-05-20 13:34   ` Haojian Zhuang
@ 2014-05-20 13:44   ` Christoffer Dall
  2014-05-20 13:52     ` Haojian Zhuang
  2014-05-21 13:11   ` Marc Zyngier
  2 siblings, 1 reply; 36+ messages in thread
From: Christoffer Dall @ 2014-05-20 13:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 20, 2014 at 09:10:27PM +0800, Haojian Zhuang wrote:
> In ARM standard GIC, GICH_APR offset is 0xf0 & GICH_LR0 offset is 0x100.
> In HiP04 GIC, GICH_APR offset is 0x70 & GICH_LR0 offset is 0x80.
> 
> Now reuse the nr_lr field in struct vgic_cpu. Bit[31:16] is used to store
> GICH_APR offset in HiP04, and bit[15:0] is used to store real nr_lr
> variable. In ARM standard GIC, don't set bit[31:16]. So we could avoid
> to change the VGIC implementation in arm64.
> 
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  arch/arm/kernel/asm-offsets.c   |  2 +-
>  arch/arm/kvm/interrupts_head.S  | 29 +++++++++++++++++++------
>  arch/arm64/kernel/asm-offsets.c |  2 +-
>  arch/arm64/kvm/hyp.S            | 28 ++++++++++++++++++------
>  include/kvm/arm_vgic.h          |  7 ++++--
>  include/linux/irqchip/arm-gic.h |  6 ++++++
>  virt/kvm/arm/vgic.c             | 48 +++++++++++++++++++++++++++++------------
>  7 files changed, 92 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 85598b5..166cc98 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -189,7 +189,7 @@ int main(void)
>    DEFINE(VGIC_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_elrsr));
>    DEFINE(VGIC_CPU_APR,		offsetof(struct vgic_cpu, vgic_apr));
>    DEFINE(VGIC_CPU_LR,		offsetof(struct vgic_cpu, vgic_lr));
> -  DEFINE(VGIC_CPU_NR_LR,	offsetof(struct vgic_cpu, nr_lr));
> +  DEFINE(VGIC_CPU_HW_CFG,	offsetof(struct vgic_cpu, hw_cfg));
>  #ifdef CONFIG_KVM_ARM_TIMER
>    DEFINE(VCPU_TIMER_CNTV_CTL,	offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_ctl));
>    DEFINE(VCPU_TIMER_CNTV_CVAL,	offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_cval));
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index 76af9302..9fbbf99 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -419,7 +419,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  	ldr	r7, [r2, #GICH_EISR1]
>  	ldr	r8, [r2, #GICH_ELRSR0]
>  	ldr	r9, [r2, #GICH_ELRSR1]
> -	ldr	r10, [r2, #GICH_APR]
> +	ldr	r10, [r11, #VGIC_CPU_HW_CFG]
> +	mov	r10, r10, lsr #HWCFG_APR_SHIFT
> +	ldr	r10, [r2, r10]
>  
>  	str	r3, [r11, #VGIC_CPU_HCR]
>  	str	r4, [r11, #VGIC_CPU_VMCR]
> @@ -435,9 +437,15 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  	str	r5, [r2, #GICH_HCR]
>  
>  	/* Save list registers */
> -	add	r2, r2, #GICH_LR0
> +	ldr	r4, [r11, #VGIC_CPU_HW_CFG]
> +	mov	r10, r4, lsr #HWCFG_APR_SHIFT
> +	/* the offset between GICH_APR & GICH_LR0 is 0x10 */
> +	add	r10, r10, #0x10
> +	add	r2, r2, r10
>  	add	r3, r11, #VGIC_CPU_LR
> -	ldr	r4, [r11, #VGIC_CPU_NR_LR]
> +	/* Get NR_LR from VGIC_CPU_HW_CFG */
> +	ldr	r6, =HWCFG_NR_LR_MASK
> +	and	r4, r4, r6
>  1:	ldr	r6, [r2], #4
>  	str	r6, [r3], #4
>  	subs	r4, r4, #1
> @@ -469,12 +477,21 @@ vcpu	.req	r0		@ vcpu pointer always in r0
>  
>  	str	r3, [r2, #GICH_HCR]
>  	str	r4, [r2, #GICH_VMCR]
> -	str	r8, [r2, #GICH_APR]
> +	ldr	r6, [r11, #VGIC_CPU_HW_CFG]
> +	mov	r6, r6, lsr #HWCFG_APR_SHIFT
> +	str	r8, [r2, r6]
>  
>  	/* Restore list registers */
> -	add	r2, r2, #GICH_LR0
> +	ldr	r4, [r11, #VGIC_CPU_HW_CFG]
> +	mov	r6, r4, lsr #HWCFG_APR_SHIFT
> +	/* the offset between GICH_APR & GICH_LR0 is 0x10 */
> +	add	r6, r6, #0x10
> +	/* get offset of GICH_LR0 */
> +	add	r2, r2, r6
> +	/* Get NR_LR from VGIC_CPU_HW_CFG */
>  	add	r3, r11, #VGIC_CPU_LR
> -	ldr	r4, [r11, #VGIC_CPU_NR_LR]
> +	ldr	r6, =HWCFG_NR_LR_MASK
> +	and	r4, r4, r6
>  1:	ldr	r6, [r3], #4
>  	str	r6, [r2], #4
>  	subs	r4, r4, #1
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 646f888..2422358 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -136,7 +136,7 @@ int main(void)
>    DEFINE(VGIC_CPU_ELRSR,	offsetof(struct vgic_cpu, vgic_elrsr));
>    DEFINE(VGIC_CPU_APR,		offsetof(struct vgic_cpu, vgic_apr));
>    DEFINE(VGIC_CPU_LR,		offsetof(struct vgic_cpu, vgic_lr));
> -  DEFINE(VGIC_CPU_NR_LR,	offsetof(struct vgic_cpu, nr_lr));
> +  DEFINE(VGIC_CPU_HW_CFG,	offsetof(struct vgic_cpu, hw_cfg));
>    DEFINE(KVM_VTTBR,		offsetof(struct kvm, arch.vttbr));
>    DEFINE(KVM_VGIC_VCTRL,	offsetof(struct kvm, arch.vgic.vctrl_base));
>  #endif
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 2c56012..a4a8b3d 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -402,7 +402,9 @@ __kvm_hyp_code_start:
>  	ldr	w8, [x2, #GICH_EISR1]
>  	ldr	w9, [x2, #GICH_ELRSR0]
>  	ldr	w10, [x2, #GICH_ELRSR1]
> -	ldr	w11, [x2, #GICH_APR]
> +	ldr	w11, [x3, #VGIC_CPU_HW_CFG]
> +	mov	w11, w11, lsr #HWCFG_APR_SHIFT
> +	ldr	w11, [x2, w10]
>  CPU_BE(	rev	w4,  w4  )
>  CPU_BE(	rev	w5,  w5  )
>  CPU_BE(	rev	w6,  w6  )
> @@ -425,8 +427,13 @@ CPU_BE(	rev	w11, w11 )
>  	str	wzr, [x2, #GICH_HCR]
>  
>  	/* Save list registers */
> -	add	x2, x2, #GICH_LR0
> -	ldr	w4, [x3, #VGIC_CPU_NR_LR]
> +	ldr	w4, [x3, #VGIC_CPU_HW_CFG]
> +	mov	w6, w4, lsr #HWCFG_APR_SHIFT
> +	ldr	w7, =HWCFG_NR_LR_MASK
> +	and	w4, w4, w7
> +	/* the offset between GICH_APR and GICH_LR0 is 0x10 */
> +	add	w6, w6, 0x10
> +	add	x2, x2, w6
>  	add	x3, x3, #VGIC_CPU_LR
>  1:	ldr	w5, [x2], #4
>  CPU_BE(	rev	w5, w5 )
> @@ -461,11 +468,20 @@ CPU_BE(	rev	w6, w6 )
>  
>  	str	w4, [x2, #GICH_HCR]
>  	str	w5, [x2, #GICH_VMCR]
> -	str	w6, [x2, #GICH_APR]
> +	ldr	w4, [x3, #VGIC_CPU_HW_CFG]
> +	mov	w4, w4, #HWCFG_APR_SHIFT
> +	str	w6, [x2, w4]
>  
>  	/* Restore list registers */
> -	add	x2, x2, #GICH_LR0
> -	ldr	w4, [x3, #VGIC_CPU_NR_LR]
> +	ldr	w4, [x3, #VGIC_CPU_HW_CFG]
> +	mov	w6, w4, #HWCFG_APR_SHIFT
> +	/* the offset between GICH_APR and GICH_LR0 is 0x10 */
> +	add	w6, w6, #0x10
> +	/* get offset of GICH_LR0 */
> +	add	x2, x2, w6
> +	/* get NR_LR from VGIC_CPU_HW_CFG */
> +	ldr	w6, =HWCFG_NR_LR_MASK
> +	and	w4, w4, w6
>  	add	x3, x3, #VGIC_CPU_LR
>  1:	ldr	w5, [x3], #4
>  CPU_BE(	rev	w5, w5 )
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index f27000f..eba4b51 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -122,8 +122,11 @@ struct vgic_cpu {
>  	/* Bitmap of used/free list registers */
>  	DECLARE_BITMAP(	lr_used, VGIC_MAX_LRS);
>  
> -	/* Number of list registers on this CPU */
> -	int		nr_lr;
> +	/*
> +	 * bit[31:16]: GICH_APR offset
> +	 * bit[15:0]:  Number of list registers on this CPU
> +	 */
> +	u32		hw_cfg;
>  
>  	/* CPU vif control registers for world switch */
>  	u32		vgic_hcr;
> diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
> index 45e2d8c..b055f92 100644
> --- a/include/linux/irqchip/arm-gic.h
> +++ b/include/linux/irqchip/arm-gic.h
> @@ -49,6 +49,8 @@
>  #define GICH_ELRSR1 			0x34
>  #define GICH_APR			0xf0
>  #define GICH_LR0			0x100
> +#define HIP04_GICH_APR			0x70
> +/* GICH_LR0 offset in HiP04 is 0x80 */
>  
>  #define GICH_HCR_EN			(1 << 0)
>  #define GICH_HCR_UIE			(1 << 1)
> @@ -73,6 +75,10 @@
>  #define GICH_MISR_EOI			(1 << 0)
>  #define GICH_MISR_U			(1 << 1)
>  
> +#define HWCFG_NR_LR_MASK	0xffff
> +#define HWCFG_APR_SHIFT		16
> +#define HWCFG_APR_MASK		(0xffff << HWCFG_APR_SHIFT)
> +
>  #ifndef __ASSEMBLY__
>  
>  struct device_node;
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 47b2983..4c0c1e9 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -76,6 +76,8 @@
>  #define IMPLEMENTER_ARM		0x43b
>  #define GICC_ARCH_VERSION_V2	0x2
>  
> +#define vgic_nr_lr(vcpu)	(vcpu->hw_cfg & HWCFG_NR_LR_MASK)
> +
>  /* Physical address of vgic virtual cpu interface */
>  static phys_addr_t vgic_vcpu_base;
>  
> @@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>  static void vgic_update_state(struct kvm *kvm);
>  static void vgic_kick_vcpus(struct kvm *kvm);
>  static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
> -static u32 vgic_nr_lr;
> +static u32 vgic_hw_cfg;
>  
>  static unsigned int vgic_maint_irq;
>  
> @@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  	int vcpu_id = vcpu->vcpu_id;
>  	int i, irq, source_cpu;
> -	u32 *lr;
> +	u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);

This is static for any system post-boot, right?  Can't we set this
global variable once like we did before instead of having to define
these extra variables and do the bit manipulation all over the place?

-Christoffer
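
A rough sketch of that suggestion, assuming the values really are fixed
after boot: decode the packed word once at init time into plain fields
and let the C code read them directly. The struct and function names
below are hypothetical and are not taken from the vgic code.

```c
#include <stdint.h>

#define HWCFG_NR_LR_MASK	0xffffu
#define HWCFG_APR_SHIFT		16

/* Hypothetical: system-wide values, decoded once during init. */
struct vgic_hw_params {
	uint32_t nr_lr;		/* number of list registers */
	uint32_t apr_offset;	/* GICH_APR offset for this GIC flavour */
};

static struct vgic_hw_params vgic_hw;

/* Call once at init; later readers use vgic_hw.nr_lr without masking. */
static inline void vgic_hw_params_init(uint32_t hw_cfg)
{
	vgic_hw.nr_lr = hw_cfg & HWCFG_NR_LR_MASK;
	vgic_hw.apr_offset = hw_cfg >> HWCFG_APR_SHIFT;
}

/* Example consumer: bounds-check a list register index. */
static inline int vgic_lr_index_valid(int lr)
{
	return lr >= 0 && (uint32_t)lr < vgic_hw.nr_lr;
}
```

This only removes the masking from the C paths. The hyp assembly would
still need the offset from a location it can reach, which this sketch
does not address.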


* [PATCH v9 10/14] ARM: config: append lpae configuration
  2014-05-20 13:10 ` [PATCH v9 10/14] ARM: config: append lpae configuration Haojian Zhuang
@ 2014-05-20 13:52   ` Gregory CLEMENT
  2014-05-20 14:08   ` Alexandre Belloni
  2014-05-20 18:19   ` Olof Johansson
  2 siblings, 0 replies; 36+ messages in thread
From: Gregory CLEMENT @ 2014-05-20 13:52 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Haojian,

On 20/05/2014 15:10, Haojian Zhuang wrote:
> Append multi_v7_lpae_config. In this default configuration,
> CONFIG_ARCH_MULTI_V6 is disabled. CONFIG_ARM_LPAE is enabled.
> 
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  arch/arm/configs/multi_v7_lpae_defconfig | 351 +++++++++++++++++++++++++++++++
>  1 file changed, 351 insertions(+)
>  create mode 100644 arch/arm/configs/multi_v7_lpae_defconfig
> 
> diff --git a/arch/arm/configs/multi_v7_lpae_defconfig b/arch/arm/configs/multi_v7_lpae_defconfig
> new file mode 100644
> index 0000000..59fcefc
> --- /dev/null
> +++ b/arch/arm/configs/multi_v7_lpae_defconfig
> @@ -0,0 +1,351 @@

[...]


> +CONFIG_MACH_ARMADA_370=y
> +CONFIG_MACH_ARMADA_375=y
> +CONFIG_MACH_ARMADA_38X=y
> +CONFIG_MACH_ARMADA_XP=y
> +CONFIG_MACH_DOVE=y

Among them, only ARMADA_XP is LPAE-capable.


Thanks,

Gregory



-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 13:44   ` Christoffer Dall
@ 2014-05-20 13:52     ` Haojian Zhuang
  2014-05-20 14:01       ` Christoffer Dall
  0 siblings, 1 reply; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 13:52 UTC (permalink / raw)
  To: linux-arm-kernel

On 20 May 2014 21:44, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Tue, May 20, 2014 at 09:10:27PM +0800, Haojian Zhuang wrote:
>> In ARM standard GIC, GICH_APR offset is 0xf0 & GICH_LR0 offset is 0x100.
>> In HiP04 GIC, GICH_APR offset is 0x70 & GICH_LR0 offset is 0x80.
>>
>> Now reuse the nr_lr field in struct vgic_cpu. Bit[31:16] is used to store
>> GICH_APR offset in HiP04, and bit[15:0] is used to store real nr_lr
>> variable. In ARM standard GIC, don't set bit[31:16]. So we could avoid
>> to change the VGIC implementation in arm64.
>>
>> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
>> ---
>>  arch/arm/kernel/asm-offsets.c   |  2 +-
>>  arch/arm/kvm/interrupts_head.S  | 29 +++++++++++++++++++------
>>  arch/arm64/kernel/asm-offsets.c |  2 +-
>>  arch/arm64/kvm/hyp.S            | 28 ++++++++++++++++++------
>>  include/kvm/arm_vgic.h          |  7 ++++--
>>  include/linux/irqchip/arm-gic.h |  6 ++++++
>>  virt/kvm/arm/vgic.c             | 48 +++++++++++++++++++++++++++++------------
>>  7 files changed, 92 insertions(+), 30 deletions(-)
>>
>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>> index 85598b5..166cc98 100644
>> --- a/arch/arm/kernel/asm-offsets.c
>> +++ b/arch/arm/kernel/asm-offsets.c
>> @@ -189,7 +189,7 @@ int main(void)
>>    DEFINE(VGIC_CPU_ELRSR,     offsetof(struct vgic_cpu, vgic_elrsr));
>>    DEFINE(VGIC_CPU_APR,               offsetof(struct vgic_cpu, vgic_apr));
>>    DEFINE(VGIC_CPU_LR,                offsetof(struct vgic_cpu, vgic_lr));
>> -  DEFINE(VGIC_CPU_NR_LR,     offsetof(struct vgic_cpu, nr_lr));
>> +  DEFINE(VGIC_CPU_HW_CFG,    offsetof(struct vgic_cpu, hw_cfg));
>>  #ifdef CONFIG_KVM_ARM_TIMER
>>    DEFINE(VCPU_TIMER_CNTV_CTL,        offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_ctl));
>>    DEFINE(VCPU_TIMER_CNTV_CVAL,       offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_cval));
>> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
>> index 76af9302..9fbbf99 100644
>> --- a/arch/arm/kvm/interrupts_head.S
>> +++ b/arch/arm/kvm/interrupts_head.S
>> @@ -419,7 +419,9 @@ vcpu      .req    r0              @ vcpu pointer always in r0
>>       ldr     r7, [r2, #GICH_EISR1]
>>       ldr     r8, [r2, #GICH_ELRSR0]
>>       ldr     r9, [r2, #GICH_ELRSR1]
>> -     ldr     r10, [r2, #GICH_APR]
>> +     ldr     r10, [r11, #VGIC_CPU_HW_CFG]
>> +     mov     r10, r10, lsr #HWCFG_APR_SHIFT
>> +     ldr     r10, [r2, r10]
>>
>>       str     r3, [r11, #VGIC_CPU_HCR]
>>       str     r4, [r11, #VGIC_CPU_VMCR]
>> @@ -435,9 +437,15 @@ vcpu     .req    r0              @ vcpu pointer always in r0
>>       str     r5, [r2, #GICH_HCR]
>>
>>       /* Save list registers */
>> -     add     r2, r2, #GICH_LR0
>> +     ldr     r4, [r11, #VGIC_CPU_HW_CFG]
>> +     mov     r10, r4, lsr #HWCFG_APR_SHIFT
>> +     /* the offset between GICH_APR & GICH_LR0 is 0x10 */
>> +     add     r10, r10, #0x10
>> +     add     r2, r2, r10
>>       add     r3, r11, #VGIC_CPU_LR
>> -     ldr     r4, [r11, #VGIC_CPU_NR_LR]
>> +     /* Get NR_LR from VGIC_CPU_HW_CFG */
>> +     ldr     r6, =HWCFG_NR_LR_MASK
>> +     and     r4, r4, r6
>>  1:   ldr     r6, [r2], #4
>>       str     r6, [r3], #4
>>       subs    r4, r4, #1
>> @@ -469,12 +477,21 @@ vcpu    .req    r0              @ vcpu pointer always in r0
>>
>>       str     r3, [r2, #GICH_HCR]
>>       str     r4, [r2, #GICH_VMCR]
>> -     str     r8, [r2, #GICH_APR]
>> +     ldr     r6, [r11, #VGIC_CPU_HW_CFG]
>> +     mov     r6, r6, lsr #HWCFG_APR_SHIFT
>> +     str     r8, [r2, r6]
>>
>>       /* Restore list registers */
>> -     add     r2, r2, #GICH_LR0
>> +     ldr     r4, [r11, #VGIC_CPU_HW_CFG]
>> +     mov     r6, r4, lsr #HWCFG_APR_SHIFT
>> +     /* the offset between GICH_APR & GICH_LR0 is 0x10 */
>> +     add     r6, r6, #0x10
>> +     /* get offset of GICH_LR0 */
>> +     add     r2, r2, r6
>> +     /* Get NR_LR from VGIC_CPU_HW_CFG */
>>       add     r3, r11, #VGIC_CPU_LR
>> -     ldr     r4, [r11, #VGIC_CPU_NR_LR]
>> +     ldr     r6, =HWCFG_NR_LR_MASK
>> +     and     r4, r4, r6
>>  1:   ldr     r6, [r3], #4
>>       str     r6, [r2], #4
>>       subs    r4, r4, #1
>> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
>> index 646f888..2422358 100644
>> --- a/arch/arm64/kernel/asm-offsets.c
>> +++ b/arch/arm64/kernel/asm-offsets.c
>> @@ -136,7 +136,7 @@ int main(void)
>>    DEFINE(VGIC_CPU_ELRSR,     offsetof(struct vgic_cpu, vgic_elrsr));
>>    DEFINE(VGIC_CPU_APR,               offsetof(struct vgic_cpu, vgic_apr));
>>    DEFINE(VGIC_CPU_LR,                offsetof(struct vgic_cpu, vgic_lr));
>> -  DEFINE(VGIC_CPU_NR_LR,     offsetof(struct vgic_cpu, nr_lr));
>> +  DEFINE(VGIC_CPU_HW_CFG,    offsetof(struct vgic_cpu, hw_cfg));
>>    DEFINE(KVM_VTTBR,          offsetof(struct kvm, arch.vttbr));
>>    DEFINE(KVM_VGIC_VCTRL,     offsetof(struct kvm, arch.vgic.vctrl_base));
>>  #endif
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index 2c56012..a4a8b3d 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -402,7 +402,9 @@ __kvm_hyp_code_start:
>>       ldr     w8, [x2, #GICH_EISR1]
>>       ldr     w9, [x2, #GICH_ELRSR0]
>>       ldr     w10, [x2, #GICH_ELRSR1]
>> -     ldr     w11, [x2, #GICH_APR]
>> +     ldr     w11, [x3, #VGIC_CPU_HW_CFG]
>> +     mov     w11, w11, lsr #HWCFG_APR_SHIFT
>> +     ldr     w11, [x2, w10]
>>  CPU_BE(      rev     w4,  w4  )
>>  CPU_BE(      rev     w5,  w5  )
>>  CPU_BE(      rev     w6,  w6  )
>> @@ -425,8 +427,13 @@ CPU_BE(  rev     w11, w11 )
>>       str     wzr, [x2, #GICH_HCR]
>>
>>       /* Save list registers */
>> -     add     x2, x2, #GICH_LR0
>> -     ldr     w4, [x3, #VGIC_CPU_NR_LR]
>> +     ldr     w4, [x3, #VGIC_CPU_HW_CFG]
>> +     mov     w6, w4, lsr #HWCFG_APR_SHIFT
>> +     ldr     w7, =HWCFG_NR_LR_MASK
>> +     and     w4, w4, w7
>> +     /* the offset between GICH_APR and GICH_LR0 is 0x10 */
>> +     add     w6, w6, 0x10
>> +     add     x2, x2, w6
>>       add     x3, x3, #VGIC_CPU_LR
>>  1:   ldr     w5, [x2], #4
>>  CPU_BE(      rev     w5, w5 )
>> @@ -461,11 +468,20 @@ CPU_BE( rev     w6, w6 )
>>
>>       str     w4, [x2, #GICH_HCR]
>>       str     w5, [x2, #GICH_VMCR]
>> -     str     w6, [x2, #GICH_APR]
>> +     ldr     w4, [x3, #VGIC_CPU_HW_CFG]
>> +     mov     w4, w4, #HWCFG_APR_SHIFT
>> +     str     w6, [x2, w4]
>>
>>       /* Restore list registers */
>> -     add     x2, x2, #GICH_LR0
>> -     ldr     w4, [x3, #VGIC_CPU_NR_LR]
>> +     ldr     w4, [x3, #VGIC_CPU_HW_CFG]
>> +     mov     w6, w4, #HWCFG_APR_SHIFT
>> +     /* the offset between GICH_APR and GICH_LR0 is 0x10 */
>> +     add     w6, w6, #0x10
>> +     /* get offset of GICH_LR0 */
>> +     add     x2, x2, w6
>> +     /* get NR_LR from VGIC_CPU_HW_CFG */
>> +     ldr     w6, =HWCFG_NR_LR_MASK
>> +     and     w4, w4, w6
>>       add     x3, x3, #VGIC_CPU_LR
>>  1:   ldr     w5, [x3], #4
>>  CPU_BE(      rev     w5, w5 )
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index f27000f..eba4b51 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -122,8 +122,11 @@ struct vgic_cpu {
>>       /* Bitmap of used/free list registers */
>>       DECLARE_BITMAP( lr_used, VGIC_MAX_LRS);
>>
>> -     /* Number of list registers on this CPU */
>> -     int             nr_lr;
>> +     /*
>> +      * bit[31:16]: GICH_APR offset
>> +      * bit[15:0]:  Number of list registers on this CPU
>> +      */
>> +     u32             hw_cfg;
>>
>>       /* CPU vif control registers for world switch */
>>       u32             vgic_hcr;
>> diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
>> index 45e2d8c..b055f92 100644
>> --- a/include/linux/irqchip/arm-gic.h
>> +++ b/include/linux/irqchip/arm-gic.h
>> @@ -49,6 +49,8 @@
>>  #define GICH_ELRSR1                  0x34
>>  #define GICH_APR                     0xf0
>>  #define GICH_LR0                     0x100
>> +#define HIP04_GICH_APR                       0x70
>> +/* GICH_LR0 offset in HiP04 is 0x80 */
>>
>>  #define GICH_HCR_EN                  (1 << 0)
>>  #define GICH_HCR_UIE                 (1 << 1)
>> @@ -73,6 +75,10 @@
>>  #define GICH_MISR_EOI                        (1 << 0)
>>  #define GICH_MISR_U                  (1 << 1)
>>
>> +#define HWCFG_NR_LR_MASK     0xffff
>> +#define HWCFG_APR_SHIFT              16
>> +#define HWCFG_APR_MASK               (0xffff << HWCFG_APR_SHIFT)
>> +
>>  #ifndef __ASSEMBLY__
>>
>>  struct device_node;
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 47b2983..4c0c1e9 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -76,6 +76,8 @@
>>  #define IMPLEMENTER_ARM              0x43b
>>  #define GICC_ARCH_VERSION_V2 0x2
>>
>> +#define vgic_nr_lr(vcpu)     (vcpu->hw_cfg & HWCFG_NR_LR_MASK)
>> +
>>  /* Physical address of vgic virtual cpu interface */
>>  static phys_addr_t vgic_vcpu_base;
>>
>> @@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>>  static void vgic_update_state(struct kvm *kvm);
>>  static void vgic_kick_vcpus(struct kvm *kvm);
>>  static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
>> -static u32 vgic_nr_lr;
>> +static u32 vgic_hw_cfg;
>>
>>  static unsigned int vgic_maint_irq;
>>
>> @@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>       int vcpu_id = vcpu->vcpu_id;
>>       int i, irq, source_cpu;
>> -     u32 *lr;
>> +     u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);
>
> This is static for any system post-boot, right?  Can't we set this
> global variable once like we did before instead of having to define
> these extra variables and do the bit manipulation all over the place?
>
> -Christoffer

I tried defining a global gich_apr variable before, but Marc didn't agree
with that. He suggested using vgic_cpu_nr_lr to hold both the GICH_APR
offset and nr_lr.

Adding a gich_apr variable would be the simpler implementation.

Regards
Haojian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 13:52     ` Haojian Zhuang
@ 2014-05-20 14:01       ` Christoffer Dall
  2014-05-20 14:16         ` Haojian Zhuang
  0 siblings, 1 reply; 36+ messages in thread
From: Christoffer Dall @ 2014-05-20 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 20, 2014 at 09:52:53PM +0800, Haojian Zhuang wrote:
> On 20 May 2014 21:44, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Tue, May 20, 2014 at 09:10:27PM +0800, Haojian Zhuang wrote:

[...]

> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> index 47b2983..4c0c1e9 100644
> >> --- a/virt/kvm/arm/vgic.c
> >> +++ b/virt/kvm/arm/vgic.c
> >> @@ -76,6 +76,8 @@
> >>  #define IMPLEMENTER_ARM              0x43b
> >>  #define GICC_ARCH_VERSION_V2 0x2
> >>
> >> +#define vgic_nr_lr(vcpu)     (vcpu->hw_cfg & HWCFG_NR_LR_MASK)
> >> +
> >>  /* Physical address of vgic virtual cpu interface */
> >>  static phys_addr_t vgic_vcpu_base;
> >>
> >> @@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
> >>  static void vgic_update_state(struct kvm *kvm);
> >>  static void vgic_kick_vcpus(struct kvm *kvm);
> >>  static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
> >> -static u32 vgic_nr_lr;
> >> +static u32 vgic_hw_cfg;
> >>
> >>  static unsigned int vgic_maint_irq;
> >>
> >> @@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
> >>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >>       int vcpu_id = vcpu->vcpu_id;
> >>       int i, irq, source_cpu;
> >> -     u32 *lr;
> >> +     u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);
> >
> > This is static for any system post-boot, right?  Can't we set this
> > global variable once like we did before instead of having to define
> > these extra variables and do the bit manipulation all over the place?
> >
> > -Christoffer
> 
> I tried to define a global gich_apr variable before. But Marc didn't agree on
> that. He suggested to use vgic_cpu_nr_lr to save both GICH_APR offset
> and nr_lr.
> 
> Adding gich_apr variable should be the simpler implementation.
> 
You're talking about storing this information on the vgic_cpu struct,
which is accessed on every world-switch path.  There, you don't want
two memory accesses.

Here, on the other hand, you're in host kernel land, and you can do your
bit-shuffling once, and always access a single static variable like we
did before, which will simplify the C-code.
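
A minimal sketch of that idea, as plain userspace C rather than kernel
code. The two masks and the field layout come from the patch; the helper
names vgic_pack_hw_cfg() and vgic_cache_hw_cfg() and the vgic_apr_offset
variable are hypothetical and only here for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Layout taken from the patch: bits [31:16] hold the GICH_APR offset,
 * bits [15:0] hold the number of list registers. */
#define HWCFG_NR_LR_MASK	0xffffu
#define HWCFG_APR_SHIFT		16

/* File-scope caches, unpacked once at init time (names are assumptions). */
static uint32_t vgic_nr_lr;
static uint32_t vgic_apr_offset;

/* Hypothetical helper: build the packed word kept in struct vgic_cpu. */
static uint32_t vgic_pack_hw_cfg(uint32_t apr_offset, uint32_t nr_lr)
{
	assert(nr_lr <= HWCFG_NR_LR_MASK);
	assert(apr_offset <= 0xffffu);
	return (apr_offset << HWCFG_APR_SHIFT) | nr_lr;
}

/* Hypothetical init hook: do the bit manipulation exactly once. */
static void vgic_cache_hw_cfg(uint32_t hw_cfg)
{
	vgic_nr_lr = hw_cfg & HWCFG_NR_LR_MASK;
	vgic_apr_offset = hw_cfg >> HWCFG_APR_SHIFT;
}
```

With something like this, the C functions keep reading a single static
vgic_nr_lr, and only the world-switch assembly touches the packed word.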

Makes sense?

-Christoffer

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 10/14] ARM: config: append lpae configuration
  2014-05-20 13:10 ` [PATCH v9 10/14] ARM: config: append lpae configuration Haojian Zhuang
  2014-05-20 13:52   ` Gregory CLEMENT
@ 2014-05-20 14:08   ` Alexandre Belloni
  2014-05-20 18:19   ` Olof Johansson
  2 siblings, 0 replies; 36+ messages in thread
From: Alexandre Belloni @ 2014-05-20 14:08 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 20/05/2014 at 21:10:23 +0800, Haojian Zhuang wrote :
> Append multi_v7_lpae_config. In this default configuration,
> CONFIG_ARCH_MULTI_V6 is disabled. CONFIG_ARM_LPAE is enabled.
> 
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  arch/arm/configs/multi_v7_lpae_defconfig | 351 +++++++++++++++++++++++++++++++
>  1 file changed, 351 insertions(+)
>  create mode 100644 arch/arm/configs/multi_v7_lpae_defconfig
> 
> diff --git a/arch/arm/configs/multi_v7_lpae_defconfig b/arch/arm/configs/multi_v7_lpae_defconfig
> new file mode 100644
> index 0000000..59fcefc
> --- /dev/null
> +++ b/arch/arm/configs/multi_v7_lpae_defconfig
> @@ -0,0 +1,351 @@
> +CONFIG_SYSVIPC=y
> +CONFIG_FHANDLE=y
> +CONFIG_IRQ_DOMAIN_DEBUG=y
> +CONFIG_NO_HZ=y
> +CONFIG_HIGH_RES_TIMERS=y
> +CONFIG_BLK_DEV_INITRD=y
> +CONFIG_EMBEDDED=y
> +CONFIG_MODULES=y
> +CONFIG_MODULE_UNLOAD=y
> +CONFIG_PARTITION_ADVANCED=y
> +# CONFIG_ARCH_MULTI_V6 is not set
> +CONFIG_ARCH_MULTI_V7=y
> +CONFIG_ARM_LPAE=y
> +CONFIG_ARCH_MVEBU=y
> +CONFIG_MACH_ARMADA_370=y
> +CONFIG_MACH_ARMADA_375=y
> +CONFIG_MACH_ARMADA_38X=y
> +CONFIG_MACH_ARMADA_XP=y
> +CONFIG_MACH_DOVE=y
> +CONFIG_ARCH_BCM=y
> +CONFIG_ARCH_BCM_5301X=y
> +CONFIG_ARCH_BCM_MOBILE=y
> +CONFIG_ARCH_BERLIN=y
> +CONFIG_MACH_BERLIN_BG2=y
> +CONFIG_MACH_BERLIN_BG2CD=y

Only BG2 may be LPAE capable (and that is probably not the case). I
would suggest leaving ARCH_BERLIN out.


-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 14:01       ` Christoffer Dall
@ 2014-05-20 14:16         ` Haojian Zhuang
  2014-05-20 15:05           ` Christoffer Dall
  0 siblings, 1 reply; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 14:16 UTC (permalink / raw)
  To: linux-arm-kernel

On 20 May 2014 22:01, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Tue, May 20, 2014 at 09:52:53PM +0800, Haojian Zhuang wrote:
>> On 20 May 2014 21:44, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>> > On Tue, May 20, 2014 at 09:10:27PM +0800, Haojian Zhuang wrote:
>
> [...]
>
>> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> >> index 47b2983..4c0c1e9 100644
>> >> --- a/virt/kvm/arm/vgic.c
>> >> +++ b/virt/kvm/arm/vgic.c
>> >> @@ -76,6 +76,8 @@
>> >>  #define IMPLEMENTER_ARM              0x43b
>> >>  #define GICC_ARCH_VERSION_V2 0x2
>> >>
>> >> +#define vgic_nr_lr(vcpu)     (vcpu->hw_cfg & HWCFG_NR_LR_MASK)
>> >> +
>> >>  /* Physical address of vgic virtual cpu interface */
>> >>  static phys_addr_t vgic_vcpu_base;
>> >>
>> >> @@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>> >>  static void vgic_update_state(struct kvm *kvm);
>> >>  static void vgic_kick_vcpus(struct kvm *kvm);
>> >>  static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
>> >> -static u32 vgic_nr_lr;
>> >> +static u32 vgic_hw_cfg;
>> >>
>> >>  static unsigned int vgic_maint_irq;
>> >>
>> >> @@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>> >>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> >>       int vcpu_id = vcpu->vcpu_id;
>> >>       int i, irq, source_cpu;
>> >> -     u32 *lr;
>> >> +     u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);
>> >
>> > This is static for any system post-boot, right?  Can't we set this
>> > global variable once like we did before instead of having to define
>> > these extra variables and do the bit manipulation all over the place?
>> >
>> > -Christoffer
>>
>> I tried to define a global gich_apr variable before. But Marc didn't agree on
>> that. He suggested to use vgic_cpu_nr_lr to save both GICH_APR offset
>> and nr_lr.
>>
>> Adding gich_apr variable should be the simpler implementation.
>>
> You're talking about storing this information on the vgic_cpu struct,
> which is accessed on every world-switch path.  There, you don't want
> two memory accesses.
>
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 76af9302..b27e43f 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -419,7 +419,9 @@ vcpu        .req    r0              @ vcpu pointer always in r0
        ldr     r7, [r2, #GICH_EISR1]
        ldr     r8, [r2, #GICH_ELRSR0]
        ldr     r9, [r2, #GICH_ELRSR1]
-       ldr     r10, [r2, #GICH_APR]
+       ldr     r10, =gich_apr
+       ldr     r10, [r10]
+       ldr     r10, [r2, r10]

        str     r3, [r11, #VGIC_CPU_HCR]
        str     r4, [r11, #VGIC_CPU_VMCR]
@@ -435,7 +437,11 @@ vcpu       .req    r0              @ vcpu pointer always in r0
        str     r5, [r2, #GICH_HCR]

        /* Save list registers */
-       add     r2, r2, #GICH_LR0
+       ldr     r10, =gich_apr
+       ldr     r10, [r10]
+       /* the offset between GICH_APR & GICH_LR0 is 0x10 */
+       add     r10, r10, #0x10
+       add     r2, r2, r10
        add     r3, r11, #VGIC_CPU_LR
        ldr     r4, [r11, #VGIC_CPU_NR_LR]
 1:     ldr     r6, [r2], #4
@@ -469,10 +475,16 @@ vcpu      .req    r0              @ vcpu pointer always in r0

        str     r3, [r2, #GICH_HCR]
        str     r4, [r2, #GICH_VMCR]
-       str     r8, [r2, #GICH_APR]
+       ldr     r6, =gich_apr
+       ldr     r6, [r6]
+       str     r8, [r2, r6]

        /* Restore list registers */
-       add     r2, r2, #GICH_LR0
+       ldr     r6, =gich_apr
+       ldr     r6, [r6]
+       /* the offset between GICH_APR & GICH_LR0 is 0x10 */
+       add     r6, r6, #0x10
+       add     r2, r2, r6
        add     r3, r11, #VGIC_CPU_LR
        ldr     r4, [r11, #VGIC_CPU_NR_LR]
 1:     ldr     r6, [r3], #4
@@ -618,3 +630,7 @@ vcpu        .req    r0              @ vcpu pointer always in r0
 .macro load_vcpu
        mrc     p15, 4, vcpu, c13, c0, 2        @ HTPIDR
 .endm
+
+       .global gich_apr
+gich_apr:
+       .long   GICH_APR

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 47b2983..6bf31db 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1470,17 +1470,30 @@ static struct notifier_block vgic_cpu_nb = {
        .notifier_call = vgic_cpu_notify,
 };

+static const struct of_device_id of_vgic_ids[] = {
+       {
+               .compatible = "arm,cortex-a15-gic",
+               .data = (void *)GICH_APR,
+       }, {
+               .compatible = "hisilicon,hip04-gic",
+               .data = (void *)HIP04_GICH_APR,
+       }, {
+       },
+};
+
 int kvm_vgic_hyp_init(void)
 {
        int ret;
        struct resource vctrl_res;
        struct resource vcpu_res;
+       const struct of_device_id *match;

-       vgic_node = of_find_compatible_node(NULL, NULL, "arm,cortex-a15-gic");
+       vgic_node = of_find_matching_node_and_match(NULL, of_vgic_ids, &match);
        if (!vgic_node) {
                kvm_err("error: no compatible vgic node in DT\n");
                return -ENODEV;
        }
+       gich_apr = (unsigned int)match->data;

        vgic_maint_irq = irq_of_parse_and_map(vgic_node, 0);
        if (!vgic_maint_irq) {

This is the gich_apr implementation for arm32.

We don't need to add or change anything in struct vgic_cpu, and both the
assembly code and the C code become much simpler.
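
To make the lookup concrete, here is a rough userspace sketch of the same
idea. The offsets and compatible strings are the ones from the diff above;
struct vgic_match, vgic_lookup_apr() and vgic_lr0_offset() are made-up
stand-ins for the of_device_id machinery, not real kernel interfaces:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GICH_APR		0xf0
#define HIP04_GICH_APR		0x70
#define GICH_APR_TO_LR0		0x10	/* LR0 sits 0x10 above APR on both */

/* Hypothetical stand-in for struct of_device_id, for illustration only. */
struct vgic_match {
	const char *compatible;
	uint32_t apr_offset;
};

static const struct vgic_match vgic_matches[] = {
	{ "arm,cortex-a15-gic",  GICH_APR },
	{ "hisilicon,hip04-gic", HIP04_GICH_APR },
};

/* Return 0 and fill *apr on a match, -1 if the string is unknown. */
static int vgic_lookup_apr(const char *compatible, uint32_t *apr)
{
	size_t i;

	assert(compatible && apr);
	for (i = 0; i < sizeof(vgic_matches) / sizeof(vgic_matches[0]); i++) {
		if (strcmp(vgic_matches[i].compatible, compatible) == 0) {
			*apr = vgic_matches[i].apr_offset;
			return 0;
		}
	}
	return -1;
}

/* GICH_LR0 offset derived from the APR offset, as the assembly does. */
static uint32_t vgic_lr0_offset(uint32_t apr)
{
	return apr + GICH_APR_TO_LR0;
}
```

The single offset picked at init time is enough to reach both GICH_APR and
GICH_LR0, which is why no per-vcpu field is needed in this variant.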

> Here, on the other hand, you're in host kernel land, and you can do your
> bit-shuffling once, and always access a single static variable like we
> did before, which will simplify the C-code.
>

There is no bit-shuffling in the gich_apr implementation. Is that right?

Regards
Haojian

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 14:16         ` Haojian Zhuang
@ 2014-05-20 15:05           ` Christoffer Dall
  2014-05-20 15:39             ` Haojian Zhuang
  0 siblings, 1 reply; 36+ messages in thread
From: Christoffer Dall @ 2014-05-20 15:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 20, 2014 at 10:16:22PM +0800, Haojian Zhuang wrote:
> On 20 May 2014 22:01, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Tue, May 20, 2014 at 09:52:53PM +0800, Haojian Zhuang wrote:
> >> On 20 May 2014 21:44, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >> > On Tue, May 20, 2014 at 09:10:27PM +0800, Haojian Zhuang wrote:
> >
> > [...]
> >
> >> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> >> >> index 47b2983..4c0c1e9 100644
> >> >> --- a/virt/kvm/arm/vgic.c
> >> >> +++ b/virt/kvm/arm/vgic.c
> >> >> @@ -76,6 +76,8 @@
> >> >>  #define IMPLEMENTER_ARM              0x43b
> >> >>  #define GICC_ARCH_VERSION_V2 0x2
> >> >>
> >> >> +#define vgic_nr_lr(vcpu)     (vcpu->hw_cfg & HWCFG_NR_LR_MASK)
> >> >> +
> >> >>  /* Physical address of vgic virtual cpu interface */
> >> >>  static phys_addr_t vgic_vcpu_base;
> >> >>
> >> >> @@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
> >> >>  static void vgic_update_state(struct kvm *kvm);
> >> >>  static void vgic_kick_vcpus(struct kvm *kvm);
> >> >>  static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
> >> >> -static u32 vgic_nr_lr;
> >> >> +static u32 vgic_hw_cfg;
> >> >>
> >> >>  static unsigned int vgic_maint_irq;
> >> >>
> >> >> @@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
> >> >>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >> >>       int vcpu_id = vcpu->vcpu_id;
> >> >>       int i, irq, source_cpu;
> >> >> -     u32 *lr;
> >> >> +     u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);
> >> >
> >> > This is static for any system post-boot, right?  Can't we set this
> >> > global variable once like we did before instead of having to define
> >> > these extra variables and do the bit manipulation all over the place?
> >> >
> >> > -Christoffer
> >>
> >> I tried to define a global gich_apr variable before. But Marc didn't agree on
> >> that. He suggested to use vgic_cpu_nr_lr to save both GICH_APR offset
> >> and nr_lr.
> >>
> >> Adding gich_apr variable should be the simpler implementation.
> >>
> > You're talking about storing this information on the vgic_cpu struct,
> > which is accessed on every world-switch path.  There, you don't want
> > two memory accesses.
> >
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index 76af9302..b27e43f 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -419,7 +419,9 @@ vcpu        .req    r0              @ vcpu pointer always in r0
>         ldr     r7, [r2, #GICH_EISR1]
>         ldr     r8, [r2, #GICH_ELRSR0]
>         ldr     r9, [r2, #GICH_ELRSR1]
> -       ldr     r10, [r2, #GICH_APR]
> +       ldr     r10, =gich_apr
> +       ldr     r10, [r10]
> +       ldr     r10, [r2, r10]
> 
>         str     r3, [r11, #VGIC_CPU_HCR]
>         str     r4, [r11, #VGIC_CPU_VMCR]
> @@ -435,7 +437,11 @@ vcpu       .req    r0              @ vcpu pointer always in r0
>         str     r5, [r2, #GICH_HCR]
> 
>         /* Save list registers */
> -       add     r2, r2, #GICH_LR0
> +       ldr     r10, =gich_apr
> +       ldr     r10, [r10]
> +       /* the offset between GICH_APR & GICH_LR0 is 0x10 */
> +       add     r10, r10, #0x10
> +       add     r2, r2, r10
>         add     r3, r11, #VGIC_CPU_LR
>         ldr     r4, [r11, #VGIC_CPU_NR_LR]
>  1:     ldr     r6, [r2], #4
> @@ -469,10 +475,16 @@ vcpu      .req    r0              @ vcpu pointer always in r0
> 
>         str     r3, [r2, #GICH_HCR]
>         str     r4, [r2, #GICH_VMCR]
> -       str     r8, [r2, #GICH_APR]
> +       ldr     r6, =gich_apr
> +       ldr     r6, [r6]
> +       str     r8, [r2, r6]
> 
>         /* Restore list registers */
> -       add     r2, r2, #GICH_LR0
> +       ldr     r6, =gich_apr
> +       ldr     r6, [r6]
> +       /* the offset between GICH_APR & GICH_LR0 is 0x10 */
> +       add     r6, r6, #0x10
> +       add     r2, r2, r6
>         add     r3, r11, #VGIC_CPU_LR
>         ldr     r4, [r11, #VGIC_CPU_NR_LR]
>  1:     ldr     r6, [r3], #4
> @@ -618,3 +630,7 @@ vcpu        .req    r0              @ vcpu pointer always in r0
>  .macro load_vcpu
>         mrc     p15, 4, vcpu, c13, c0, 2        @ HTPIDR
>  .endm
> +
> +       .global gich_apr
> +gich_apr:
> +       .long   GICH_APR
> 
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 47b2983..6bf31db 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1470,17 +1470,30 @@ static struct notifier_block vgic_cpu_nb = {
>         .notifier_call = vgic_cpu_notify,
>  };
> 
> +static const struct of_device_id of_vgic_ids[] = {
> +       {
> +               .compatible = "arm,cortex-a15-gic",
> +               .data = (void *)GICH_APR,
> +       }, {
> +               .compatible = "hisilicon,hip04-gic",
> +               .data = (void *)HIP04_GICH_APR,
> +       }, {
> +       },
> +};
> +
>  int kvm_vgic_hyp_init(void)
>  {
>         int ret;
>         struct resource vctrl_res;
>         struct resource vcpu_res;
> +       const struct of_device_id *match;
> 
> -       vgic_node = of_find_compatible_node(NULL, NULL, "arm,cortex-a15-gic");
> +       vgic_node = of_find_matching_node_and_match(NULL, of_vgic_ids, &match);
>         if (!vgic_node) {
>                 kvm_err("error: no compatible vgic node in DT\n");
>                 return -ENODEV;
>         }
> +       gich_apr = (unsigned int)match->data;
> 
>         vgic_maint_irq = irq_of_parse_and_map(vgic_node, 0);
>         if (!vgic_maint_irq) {
> 
> It's the implementation of gich_apr in arm32.
> 
> We needn't add or change anything in struct vgic_cpu. And both the
> assembly code and the code could be much easier.
> 

But we do end up with an extra memory access from EL2 in the critical
path, and I believe Marc's concern here is that if we cross a cache
line, this might really hurt performance.

> > Here, on the other hand, you're in host kernel land, and you can do your
> > bit-shuffling once, and always access a single static variable like we
> > did before, which will simplify the C-code.
> >
> 
> No bit-shuffling in gich_apr implementation. Is it right?
> 

I would like to see us avoid allocating that extra nr_lr variable in
every function mucking with list registers in the C-file.

I would need to look at the data structure size and profile the
world-switch code to properly evaluate if it's worth packing the values
in a single field, so I'll let Marc comment on this one.

-Christoffer

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 15:05           ` Christoffer Dall
@ 2014-05-20 15:39             ` Haojian Zhuang
  2014-05-21  9:02               ` Christoffer Dall
  0 siblings, 1 reply; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-20 15:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 20 May 2014 23:05, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Tue, May 20, 2014 at 10:16:22PM +0800, Haojian Zhuang wrote:
>> On 20 May 2014 22:01, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>> It's the implementation of gich_apr in arm32.
>>
>> We needn't add or change anything in struct vgic_cpu. And both the
>> assembly code and the code could be much easier.
>>
>
> But we do end up with an extra memory access from EL2 in the critical
> path, and I believe Marc's concern here is that if we cross a cache
> line, this might really hurt performance.
>

Sorry, could we cross a cache line, or a TLB entry?

I think your concern is about crossing TLB entries, for the reasons
below.

1. If the problem were crossing a cache line, it would be caused by having
too many instructions. Both packing nr_lr and using gich_apr add some
instructions, and packing nr_lr needs slightly more.

2. The ldr instruction here is a pseudo-instruction, so it is assembled
into a PC-relative load. Since I put gich_apr in interrupts_head.S, the
gich_apr variable ends up before __kvm_hyp_code_start, so the load may
cross TLB entries.
How about declaring gich_apr after __kvm_cpu_return in interrupts.S?
Since save_vgic_state & restore_vgic_state are each used only once,
declaring gich_apr just after the code could avoid crossing a TLB entry.

Regards
Haojian

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 10/14] ARM: config: append lpae configuration
  2014-05-20 13:10 ` [PATCH v9 10/14] ARM: config: append lpae configuration Haojian Zhuang
  2014-05-20 13:52   ` Gregory CLEMENT
  2014-05-20 14:08   ` Alexandre Belloni
@ 2014-05-20 18:19   ` Olof Johansson
  2 siblings, 0 replies; 36+ messages in thread
From: Olof Johansson @ 2014-05-20 18:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 20, 2014 at 6:10 AM, Haojian Zhuang
<haojian.zhuang@linaro.org> wrote:
> Append multi_v7_lpae_config. In this default configuration,
> CONFIG_ARCH_MULTI_V6 is disabled. CONFIG_ARM_LPAE is enabled.
>
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  arch/arm/configs/multi_v7_lpae_defconfig | 351 +++++++++++++++++++++++++++++++
>  1 file changed, 351 insertions(+)
>  create mode 100644 arch/arm/configs/multi_v7_lpae_defconfig

This has to be done with more care than just taking v7_defconfig and
enabling LPAE. There are lots of platforms enabled in this config that
would never boot with LPAE on.


-Olof

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 05/14] ARM: hisi: enable MCPM implementation
  2014-05-20 13:10 ` [PATCH v9 05/14] ARM: hisi: enable MCPM implementation Haojian Zhuang
@ 2014-05-21  1:29   ` Nicolas Pitre
  2014-05-21  1:48     ` Haojian Zhuang
  0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Pitre @ 2014-05-21  1:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 20 May 2014, Haojian Zhuang wrote:

> Multiple CPU clusters are used in Hisilicon HiP04 SoC. Now use MCPM
> framework to manage power on HiP04 SoC.

There are still unresolved issues with this patch.

[...]
> +static int hip04_mcpm_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	unsigned long data, mask;
> +
> +	if (!relocation || !sysctrl)
> +		return -ENODEV;
> +	if (cluster >= HIP04_MAX_CLUSTERS || cpu >= HIP04_MAX_CPUS_PER_CLUSTER)
> +		return -EINVAL;
> +
> +	spin_lock_irq(&boot_lock);
> +
> +	if (hip04_cpu_table[cluster][cpu]) {
> +		hip04_cpu_table[cluster][cpu]++;
> +		spin_unlock_irq(&boot_lock);
> +		return 0;
> +	}
> +
> +	writel_relaxed(hip04_boot.bootwrapper_phys, relocation);
> +	writel_relaxed(hip04_boot.bootwrapper_magic, relocation + 4);
> +	writel_relaxed(virt_to_phys(mcpm_entry_point), relocation + 8);
> +	writel_relaxed(0, relocation + 12);
> +
> +	if (hip04_cluster_down(cluster)) {
> +		data = CLUSTER_DEBUG_RESET_BIT;
> +		writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
> +		do {
> +			mask = CLUSTER_DEBUG_RESET_STATUS;
> +			data = readl_relaxed(sysctrl + \
> +					     SC_CPU_RESET_STATUS(cluster));
> +		} while (data & mask);
> +		hip04_set_snoop_filter(cluster, 1);
> +	}

Sorry to insist, but I want to repeat the question I asked during the 
previous review as I consider this is important, especially if you want 
to support deep C-States with cpuidle later.  This also has implications 
if you ever want to turn off snoops in hip04_mcpm_power_down() when all 
CPUs in a cluster are down.

You said:

| But it fails on my platform if I execute snooping setup on the new
| CPU.

I then asked:

| It fails how?  I want to make sure if the problem is with the hardware 
| design or the code.
| 
| The assembly code could be wrong.  Are you sure this is not the actual 
| reason?
| 
| Is there some documentation for this stuff?

I also see that the snoop filter is enabled from hip04_cpu_table_init() 
for the CPU that is actually executing that code.  So that must work 
somehow...

> +
> +	hip04_cpu_table[cluster][cpu]++;
> +
> +	data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
> +	       CORE_DEBUG_RESET_BIT(cpu);
> +	writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
> +	spin_unlock_irq(&boot_lock);
> +	msleep(POLL_MSEC);

Your cover letter for this series mentioned this:

| v9:
|   * Remove delay workaround in mcpm implementation.

Why is it still there?

> +	return 0;
> +}
> +
> +static void hip04_mcpm_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster, data = 0;
> +	bool skip_reset = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	spin_lock(&boot_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	hip04_cpu_table[cluster][cpu]--;
> +	if (hip04_cpu_table[cluster][cpu] == 1) {
> +		/* A power_up request went ahead of us. */
> +		skip_reset = true;
> +	} else if (hip04_cpu_table[cluster][cpu] > 1) {
> +		pr_err("Cluster %d CPU%d boots multiple times\n", cluster, cpu);
> +		BUG();
> +	}
> +	spin_unlock(&boot_lock);
> +
> +	v7_exit_coherency_flush(louis);
> +
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_reset) {
> +		data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
> +		       CORE_DEBUG_RESET_BIT(cpu);
> +		writel_relaxed(data, sysctrl + SC_CPU_RESET_REQ(cluster));
> +	}
> +}

As I mentioned already this is going to race with the power_up() method.  
Let me illustrate the problem:

	* CPU 0		* CPU 1
	-------		-------
	* mcpm_cpu_power_down()
	* hip04_mcpm_power_down()
	* spin_lock(&boot_lock); [lock acquired]
			* mcpm_cpu_power_up(cpu = 0)
			* spin_lock(&boot_lock); [blocked]
	* hip04_cpu_table[cluster][cpu]--; [value down to 0]
	* skip_reset = false
	* spin_unlock(&boot_lock);
			* spin_lock(&boot_lock); [now succeeds]
	* v7_exit_coherency_flush(louis); [takes a while to complete]
			* hip04_cpu_table[cluster][cpu]++; [value back to 1]
			* bring CPU0 out of reset
	* put CPU0 into reset
			* spin_unlock(&boot_lock);

Here you end up with CPU0 in reset while hip04_cpu_table[cluster][cpu] 
for CPU0 is equal to 1.  The CPU will therefore never start again as 
further calls to power_up() won't see hip04_cpu_table equal to 0 
anymore.
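
The interleaving above can be replayed deterministically with a small toy
model. This is only a sketch in userspace C: struct cpu_model and the
helper names are hypothetical, there is no real locking, and "in_reset"
assumes the reset takes effect as soon as the bit is written, which is
exactly the open question below:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's state; not the real driver data structures. */
struct cpu_model {
	int use_count;		/* stands in for hip04_cpu_table[cluster][cpu] */
	bool in_reset;		/* true once the reset request has been written */
};

/* power_down(), part 1: the section run under boot_lock. */
static bool power_down_locked(struct cpu_model *c)
{
	c->use_count--;
	return c->use_count == 1;	/* skip_reset */
}

/* power_down(), part 2: runs after the lock is dropped and the flush. */
static void power_down_unlocked(struct cpu_model *c, bool skip_reset)
{
	if (!skip_reset)
		c->in_reset = true;
}

/* power_up(): returns early if the count is already non-zero. */
static void power_up(struct cpu_model *c)
{
	if (c->use_count) {
		c->use_count++;
		return;
	}
	c->use_count++;
	c->in_reset = false;
}

/* Replay the interleaving from the table above; returns the end state. */
static struct cpu_model replay_race(void)
{
	struct cpu_model c = { .use_count = 1, .in_reset = false };
	bool skip_reset;

	skip_reset = power_down_locked(&c);	/* CPU0: count 1 -> 0 */
	power_up(&c);				/* CPU1: count 0 -> 1, out of reset */
	power_down_unlocked(&c, skip_reset);	/* CPU0: reset asserted late */
	power_up(&c);				/* later request: early return */
	return c;
}
```

In this model the final power_up() only bumps the count and never releases
the reset, so the CPU stays down with a non-zero use count.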

So... I'm asking again: are you absolutely certain that the CPU reset is 
applied at the very moment the corresponding bit is set?  Isn't it 
applied only when the CPU does execute a WFI like most other platforms? 


Nicolas

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v9 05/14] ARM: hisi: enable MCPM implementation
  2014-05-21  1:29   ` Nicolas Pitre
@ 2014-05-21  1:48     ` Haojian Zhuang
  2014-05-21  2:06       ` Nicolas Pitre
  0 siblings, 1 reply; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-21  1:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 21 May 2014 09:29, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Tue, 20 May 2014, Haojian Zhuang wrote:
>
>> Multiple CPU clusters are used in Hisilicon HiP04 SoC. Now use MCPM
>> framework to manage power on HiP04 SoC.
>
> There are still unresolved issues with this patch.
>
> [...]
>> +static int hip04_mcpm_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +     unsigned long data, mask;
>> +
>> +     if (!relocation || !sysctrl)
>> +             return -ENODEV;
>> +     if (cluster >= HIP04_MAX_CLUSTERS || cpu >= HIP04_MAX_CPUS_PER_CLUSTER)
>> +             return -EINVAL;
>> +
>> +     spin_lock_irq(&boot_lock);
>> +
>> +     if (hip04_cpu_table[cluster][cpu]) {
>> +             hip04_cpu_table[cluster][cpu]++;
>> +             spin_unlock_irq(&boot_lock);
>> +             return 0;
>> +     }
>> +
>> +     writel_relaxed(hip04_boot.bootwrapper_phys, relocation);
>> +     writel_relaxed(hip04_boot.bootwrapper_magic, relocation + 4);
>> +     writel_relaxed(virt_to_phys(mcpm_entry_point), relocation + 8);
>> +     writel_relaxed(0, relocation + 12);
>> +
>> +     if (hip04_cluster_down(cluster)) {
>> +             data = CLUSTER_DEBUG_RESET_BIT;
>> +             writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
>> +             do {
>> +                     mask = CLUSTER_DEBUG_RESET_STATUS;
>> +                     data = readl_relaxed(sysctrl + \
>> +                                          SC_CPU_RESET_STATUS(cluster));
>> +             } while (data & mask);
>> +             hip04_set_snoop_filter(cluster, 1);
>> +     }
>
> Sorry to insist, but I want to repeat the question I asked during the
> previous review as I consider this is important, especially if you want
> to support deep C-States with cpuidle later.  This also has implications
> if you ever want to turn off snoops in hip04_mcpm_power_down() when all
> CPUs in a cluster are down.
>
> You said:
>
> | But it fails on my platform if I execute snooping setup on the new
> | CPU.
>
> I then asked:
>
> | It fails how?  I want to make sure if the problem is with the hardware
> | design or the code.
> |
> | The assembly code could be wrong.  Are you sure this is not the actual
> | reason?
> |
> | Is there some documentation for this stuff?
>
> I also see that the snoop filter is enabled from hip04_cpu_table_init()
> for the CPU that is actually executing that code.  So that must work
> somehow...
>

Cluster0 is special: it works even if I don't enable its snoop filter.
I'll check with the Hisilicon engineers why it behaves differently. The
snoop filter configuration is a black box to me.

>> +
>> +     hip04_cpu_table[cluster][cpu]++;
>> +
>> +     data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
>> +            CORE_DEBUG_RESET_BIT(cpu);
>> +     writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
>> +     spin_unlock_irq(&boot_lock);
>> +     msleep(POLL_MSEC);
>
> Your cover letter for this series mentioned this:
>
> | v9:
> |   * Remove delay workaround in mcpm implementation.
>
> Why is it still there?
>
Sorry. I cherry-picked the wrong commit id.

>> +     return 0;
>> +}
>> +
>> +static void hip04_mcpm_power_down(void)
>> +{
>> +     unsigned int mpidr, cpu, cluster, data = 0;
>> +     bool skip_reset = false;
>> +
>> +     mpidr = read_cpuid_mpidr();
>> +     cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +     cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +     __mcpm_cpu_going_down(cpu, cluster);
>> +
>> +     spin_lock(&boot_lock);
>> +     BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>> +     hip04_cpu_table[cluster][cpu]--;
>> +     if (hip04_cpu_table[cluster][cpu] == 1) {
>> +             /* A power_up request went ahead of us. */
>> +             skip_reset = true;
>> +     } else if (hip04_cpu_table[cluster][cpu] > 1) {
>> +             pr_err("Cluster %d CPU%d boots multiple times\n", cluster, cpu);
>> +             BUG();
>> +     }
>> +     spin_unlock(&boot_lock);
>> +
>> +     v7_exit_coherency_flush(louis);
>> +
>> +     __mcpm_cpu_down(cpu, cluster);
>> +
>> +     if (!skip_reset) {
>> +             data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
>> +                    CORE_DEBUG_RESET_BIT(cpu);
>> +             writel_relaxed(data, sysctrl + SC_CPU_RESET_REQ(cluster));
>> +     }
>> +}
>
> As I mentioned already this is going to race with the power_up() method.
> Let me illustrate the problem:
>
>         * CPU 0         * CPU 1
>         -------         -------
>         * mcpm_cpu_power_down()
>         * hip04_mcpm_power_down()
>         * spin_lock(&boot_lock); [lock acquired]
>                         * mcpm_cpu_power_up(cpu = 0)
>                         * spin_lock(&boot_lock); [blocked]
>         * hip04_cpu_table[cluster][cpu]--; [value down to 0]
>         * skip_reset = false
>         * spin_unlock(&boot_lock);
>                         * spin_lock(&boot_lock); [now succeeds]
>         * v7_exit_coherency_flush(louis); [takes a while to complete]
>                         * hip04_cpu_table[cluster][cpu]++; [value back to 1]
>                         * bring CPU0 out of reset
>         * put CPU0 into reset
>                         * spin_unlock(&boot_lock);
>
> Here you end up with CPU0 in reset while hip04_cpu_table[cluster][cpu]
> for CPU0 is equal to 1.  The CPU will therefore never start again as
> further calls to power_up() won't see hip04_cpu_table equal to 0
> anymore.
>
> So... I'm asking again: are you absolutely certain that the CPU reset is
> applied at the very moment the corresponding bit is set?  Isn't it
> applied only when the CPU does execute a WFI like most other platforms?
>

If I put CPU0 into reset from wait_for_powerdown(), which runs on CPU1 or
another CPU, this race doesn't exist. Is that right?

As I recall, I could check hip04_cpu_table[cluster][cpu] and CPU0's WFI
status in wait_for_powerdown().

Regards
Haojian


* [PATCH v9 05/14] ARM: hisi: enable MCPM implementation
  2014-05-21  1:48     ` Haojian Zhuang
@ 2014-05-21  2:06       ` Nicolas Pitre
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Pitre @ 2014-05-21  2:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 21 May 2014, Haojian Zhuang wrote:

> On 21 May 2014 09:29, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > On Tue, 20 May 2014, Haojian Zhuang wrote:
> >
> >> Multiple CPU clusters are used in Hisilicon HiP04 SoC. Now use MCPM
> >> framework to manage power on HiP04 SoC.
> >
> > There are still unresolved issues with this patch.
> >
> > [...]
> >> +static int hip04_mcpm_power_up(unsigned int cpu, unsigned int cluster)
> >> +{
> >> +     unsigned long data, mask;
> >> +
> >> +     if (!relocation || !sysctrl)
> >> +             return -ENODEV;
> >> +     if (cluster >= HIP04_MAX_CLUSTERS || cpu >= HIP04_MAX_CPUS_PER_CLUSTER)
> >> +             return -EINVAL;
> >> +
> >> +     spin_lock_irq(&boot_lock);
> >> +
> >> +     if (hip04_cpu_table[cluster][cpu]) {
> >> +             hip04_cpu_table[cluster][cpu]++;
> >> +             spin_unlock_irq(&boot_lock);
> >> +             return 0;
> >> +     }
> >> +
> >> +     writel_relaxed(hip04_boot.bootwrapper_phys, relocation);
> >> +     writel_relaxed(hip04_boot.bootwrapper_magic, relocation + 4);
> >> +     writel_relaxed(virt_to_phys(mcpm_entry_point), relocation + 8);
> >> +     writel_relaxed(0, relocation + 12);
> >> +
> >> +     if (hip04_cluster_down(cluster)) {
> >> +             data = CLUSTER_DEBUG_RESET_BIT;
> >> +             writel_relaxed(data, sysctrl + SC_CPU_RESET_DREQ(cluster));
> >> +             do {
> >> +                     mask = CLUSTER_DEBUG_RESET_STATUS;
> >> +                     data = readl_relaxed(sysctrl + \
> >> +                                          SC_CPU_RESET_STATUS(cluster));
> >> +             } while (data & mask);
> >> +             hip04_set_snoop_filter(cluster, 1);
> >> +     }
> >
> > Sorry to insist, but I want to repeat the question I asked during the
> > previous review as I consider this is important, especially if you want
> > to support deep C-States with cpuidle later.  This also has implications
> > if you ever want to turn off snoops in hip04_mcpm_power_down() when all
> > CPUs in a cluster are down.
> >
> > You said:
> >
> > | But it fails on my platform if I execute snooping setup on the new
> > | CPU.
> >
> > I then asked:
> >
> > | It fails how?  I want to make sure if the problem is with the hardware
> > | design or the code.
> > |
> > | The assembly code could be wrong.  Are you sure this is not the actual
> > | reason?
> > |
> > | Is there some documentation for this stuff?
> >
> > I also see that the snoop filter is enabled from hip04_cpu_table_init()
> > for the CPU that is actually executing that code.  So that must work
> > somehow...
> >
> 
> Cluster0 is very special. If I didn't enable snoop filter of cluster0, it also
> works. I'll check with Hisilicon guys why it's different. The configuration
> of snoop filter is a black box to me.

If you could get more info or some documentation about it that would be 
great.

> >> +static void hip04_mcpm_power_down(void)
> >> +{
> >> +     unsigned int mpidr, cpu, cluster, data = 0;
> >> +     bool skip_reset = false;
> >> +
> >> +     mpidr = read_cpuid_mpidr();
> >> +     cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> >> +     cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> >> +
> >> +     __mcpm_cpu_going_down(cpu, cluster);
> >> +
> >> +     spin_lock(&boot_lock);
> >> +     BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> >> +     hip04_cpu_table[cluster][cpu]--;
> >> +     if (hip04_cpu_table[cluster][cpu] == 1) {
> >> +             /* A power_up request went ahead of us. */
> >> +             skip_reset = true;
> >> +     } else if (hip04_cpu_table[cluster][cpu] > 1) {
> >> +             pr_err("Cluster %d CPU%d boots multiple times\n", cluster, cpu);
> >> +             BUG();
> >> +     }
> >> +     spin_unlock(&boot_lock);
> >> +
> >> +     v7_exit_coherency_flush(louis);
> >> +
> >> +     __mcpm_cpu_down(cpu, cluster);
> >> +
> >> +     if (!skip_reset) {
> >> +             data = CORE_RESET_BIT(cpu) | NEON_RESET_BIT(cpu) | \
> >> +                    CORE_DEBUG_RESET_BIT(cpu);
> >> +             writel_relaxed(data, sysctrl + SC_CPU_RESET_REQ(cluster));
> >> +     }
> >> +}
> >
> > As I mentioned already this is going to race with the power_up() method.
> > Let me illustrate the problem:
> >
> >         * CPU 0         * CPU 1
> >         -------         -------
> >         * mcpm_cpu_power_down()
> >         * hip04_mcpm_power_down()
> >         * spin_lock(&boot_lock); [lock acquired]
> >                         * mcpm_cpu_power_up(cpu = 0)
> >                         * spin_lock(&boot_lock); [blocked]
> >         * hip04_cpu_table[cluster][cpu]--; [value down to 0]
> >         * skip_reset = false
> >         * spin_unlock(&boot_lock);
> >                         * spin_lock(&boot_lock); [now succeeds]
> >         * v7_exit_coherency_flush(louis); [takes a while to complete]
> >                         * hip04_cpu_table[cluster][cpu]++; [value back to 1]
> >                         * bring CPU0 out of reset
> >         * put CPU0 into reset
> >                         * spin_unlock(&boot_lock);
> >
> > Here you end up with CPU0 in reset while hip04_cpu_table[cluster][cpu]
> > for CPU0 is equal to 1.  The CPU will therefore never start again as
> > further calls to power_up() won't see hip04_cpu_table equal to 0
> > anymore.
> >
> > So... I'm asking again: are you absolutely certain that the CPU reset is
> > applied at the very moment the corresponding bit is set?  Isn't it
> > applied only when the CPU does execute a WFI like most other platforms?
> >
> 
> If I put CPU0 into reset mode in wait_for_powerdown() that is executed
> in CPU1 or other CPU, this issue doesn't exist. Is it right?

Only if:

1) the lock is taken,

2) hip04_cpu_table[cluster][cpu] is verified to still be 0, and

3) wait_for_powerdown() is actually called.

Here (3) is optional.  It is there only to satisfy the requirements for 
kexec to work properly.  In the case of cpuidle or the switcher this 
method is not used.
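
The first two conditions can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not the HiP04 code: the
model_*() functions are hypothetical names, reset_req[] stands in for the
sysctrl reset request register, and an atomic_flag stands in for boot_lock.
The point it shows is that the reset is requested by another CPU only while
the lock is held and only if the use count is still 0.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_MAX_CLUSTERS 4
#define MODEL_MAX_CPUS     4

/* Hypothetical stand-ins for boot_lock, hip04_cpu_table and the reset register. */
static atomic_flag boot_lock = ATOMIC_FLAG_INIT;
static int cpu_table[MODEL_MAX_CLUSTERS][MODEL_MAX_CPUS];
static unsigned int reset_req[MODEL_MAX_CLUSTERS];

static void model_lock(void)
{
	while (atomic_flag_test_and_set(&boot_lock))
		;	/* spin until the flag is released */
}

static void model_unlock(void)
{
	atomic_flag_clear(&boot_lock);
}

/* power_up side: bump the use count; release from reset on the 0 -> 1 edge. */
static void model_power_up(unsigned int cpu, unsigned int cluster)
{
	model_lock();
	if (cpu_table[cluster][cpu]++ == 0)
		reset_req[cluster] &= ~(1u << cpu);
	model_unlock();
}

/* power_down side, run on the dying CPU: only drop the use count. */
static void model_power_down(unsigned int cpu, unsigned int cluster)
{
	model_lock();
	cpu_table[cluster][cpu]--;
	model_unlock();
}

/*
 * Run on a different CPU.  Conditions (1) and (2): take the lock, then
 * request the reset only if the count is still 0.  Returns true if the
 * reset was requested.
 */
static bool model_reset_if_still_down(unsigned int cpu, unsigned int cluster)
{
	bool applied = false;

	model_lock();
	if (cpu_table[cluster][cpu] == 0) {
		reset_req[cluster] |= 1u << cpu;
		applied = true;
	}
	model_unlock();
	return applied;
}
```

With this ordering, a power_up() that slips in after the count dropped to 0
raises it back to 1 before the reset is considered, so the late reset is
skipped instead of leaving the CPU stuck in reset.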

I also would like to remind you that you still didn't answer my 
question.  :-)


Nicolas


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 15:39             ` Haojian Zhuang
@ 2014-05-21  9:02               ` Christoffer Dall
  2014-05-21  9:47                 ` Haojian Zhuang
  0 siblings, 1 reply; 36+ messages in thread
From: Christoffer Dall @ 2014-05-21  9:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, May 20, 2014 at 11:39:12PM +0800, Haojian Zhuang wrote:
> On 20 May 2014 23:05, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Tue, May 20, 2014 at 10:16:22PM +0800, Haojian Zhuang wrote:
> >> On 20 May 2014 22:01, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >> It's the implementation of gich_apr in arm32.
> >>
> >> We needn't add or change anything in struct vgic_cpu. And both the
> >> assembly code and the code could be much easier.
> >>
> >
> > But we do end up with an extra memory access from EL2 in the critical
> > path, and I believe Marc's concern here is that if we cross a cache
> > line, this might really hurt performance.
> >
> 
> Sorry. Do we may cross a cache line or a TLB entry?
> 
> I think that you're concerning to cross TLB entries. The reason is in
>  below.
> 
> 1. If the problem is on crossing cache line, it's caused by too much
> instructions. Either the packing nr_lr or the gich_apr adds some
> instructions. The packing nr_lr needs a little more instructions.

I don't see why this argument is valid.  If you have a separate
instruction and data cache, you may be loading from a different cache
line when placing the static value close to your instructions.  If you
add a variable to the vcpu struct, all of the fields may no longer fit
in a single data cache line and you may cause the memory subsystem to
have to fetch another cache line.  I believe the latter is Marc's
concern, and I suspect he would be equally concerned about the former.

I'm not too concerned about a TLB entry here, that works at a 4K
granularity and with the proper alignment of the struct and hyp code,
that shouldn't be a concern.  Without it, of course, there's a risk of
requiring another TLB entry as well.


> 
> 2. ldr instruction is a pseudo instruction. So it's parsed into operation
> on PC register.

Eh, it just means that it does a load relative from the PC address, and
if the offset is too far to be encoded in the immediate field, then it
does an indirect load through a literal pool, if I understand what you
are referring to.  In any case, there will be at least one actual ldr
instruction issued on the PE.

> Now I put gich_apr in interrupts_head.S, it results
> in gich_apr variable before __kvm_hyp_code_start. It may cross the
> TLB entries.
> How about to declare gich_apr after __kvm_cpu_return in interrupts.S?
> Since save_vgic_state & restore_vgic_state is only used once, declaring
> gich_apr just after the code could avoid crossing TLB entry.
> 

Again, all the fields in the vcpu struct are quite likely to be aligned
within a single data cache line, I don't believe that's the case if you
stick some data in between the hyp code.

-Christoffer


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-21  9:02               ` Christoffer Dall
@ 2014-05-21  9:47                 ` Haojian Zhuang
  2014-05-21  9:55                   ` Christoffer Dall
  0 siblings, 1 reply; 36+ messages in thread
From: Haojian Zhuang @ 2014-05-21  9:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 21 May 2014 17:02, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> On Tue, May 20, 2014 at 11:39:12PM +0800, Haojian Zhuang wrote:
>> On 20 May 2014 23:05, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>> > On Tue, May 20, 2014 at 10:16:22PM +0800, Haojian Zhuang wrote:
>> >> On 20 May 2014 22:01, Christoffer Dall <christoffer.dall@linaro.org> wrote:
>> >> It's the implementation of gich_apr in arm32.
>> >>
>> >> We needn't add or change anything in struct vgic_cpu. And both the
>> >> assembly code and the code could be much easier.
>> >>
>> >
>> > But we do end up with an extra memory access from EL2 in the critical
>> > path, and I believe Marc's concern here is that if we cross a cache
>> > line, this might really hurt performance.
>> >
>>
>> Sorry. Do we may cross a cache line or a TLB entry?
>>
>> I think that you're concerning to cross TLB entries. The reason is in
>>  below.
>>
>> 1. If the problem is on crossing cache line, it's caused by too much
>> instructions. Either the packing nr_lr or the gich_apr adds some
>> instructions. The packing nr_lr needs a little more instructions.
>
> I don't see why this argument is valid.  If you have a separate

I want to be clear about what I am missing.

> instruction and data cache, you may be loading from a different cache
> line when placing the static value close to your instructions.  If you
> add a variable to the vcpu struct, all of the fields may no longer fit
> in a single data cache line and you may cause the memory subsystem to
> have to fetch another cache line.  I believe the latter is Marc's

Yes, I forgot that the new gich_apr is the only variable in the assembly
code. So gich_apr will be loaded from a different cache line.

Then let's come back to packing hw_cfg.

Now the high word is used to store the offset of GICH_APR. The
unpacking operation is too complex to calculate the register offset,
especially in arm64 implementation.

How about changing the packing mechanism?

1. Add the definition of the encoding in arm-gic.h.

#define HIP04_GIC             (1 << 16)
#define HIP04_GICH_APR   0x70
#define HIP04_GICH_LR0    0x80

2. The code in save_vgic_state could be changed as below.

  ldr     r9, [r2, #GICH_ELRSR1]
+ldr     r10, [r3, #VGIC_CPU_HW_CFG]
+tst     r10, #HIP04_GIC
+ldreq  r10, [r2, #GICH_APR]
+ldrne  r10, [r2, #HIP04_GICH_APR]

Although I use conditional execution here, the code is simpler.

I think that the execution time of "ldr" and "ldreq" should be the same,
because the CPSR flags should already be ready.

The calculation is then avoided. Only three instructions are added
for both GICH_APR & GICH_LR0. The arm64 implementation
should be equally simple.

What do you think?

Regards
Haojian


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-21  9:47                 ` Haojian Zhuang
@ 2014-05-21  9:55                   ` Christoffer Dall
  0 siblings, 0 replies; 36+ messages in thread
From: Christoffer Dall @ 2014-05-21  9:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 21, 2014 at 05:47:00PM +0800, Haojian Zhuang wrote:
> On 21 May 2014 17:02, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> > On Tue, May 20, 2014 at 11:39:12PM +0800, Haojian Zhuang wrote:
> >> On 20 May 2014 23:05, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >> > On Tue, May 20, 2014 at 10:16:22PM +0800, Haojian Zhuang wrote:
> >> >> On 20 May 2014 22:01, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >> >> It's the implementation of gich_apr in arm32.
> >> >>
> >> >> We needn't add or change anything in struct vgic_cpu. And both the
> >> >> assembly code and the code could be much easier.
> >> >>
> >> >
> >> > But we do end up with an extra memory access from EL2 in the critical
> >> > path, and I believe Marc's concern here is that if we cross a cache
> >> > line, this might really hurt performance.
> >> >
> >>
> >> Sorry. Do we may cross a cache line or a TLB entry?
> >>
> >> I think that you're concerning to cross TLB entries. The reason is in
> >>  below.
> >>
> >> 1. If the problem is on crossing cache line, it's caused by too much
> >> instructions. Either the packing nr_lr or the gich_apr adds some
> >> instructions. The packing nr_lr needs a little more instructions.
> >
> > I don't see why this argument is valid.  If you have a separate
> 
> I want to make it clear what I missing.
> 
> > instruction and data cache, you may be loading from a different cache
> > line when placing the static value close to your instructions.  If you
> > add a variable to the vcpu struct, all of the fields may no longer fit
> > in a single data cache line and you may cause the memory subsystem to
> > have to fetch another cache line.  I believe the latter is Marc's
> 
> Yes, I forgot new gich_apr is the only variable in the assembly code.
> So the gich_apr will be load from a different cache line.
> 
> Then let's come back to packing hw_cfg.
> 
> Now the high word is used to store the offset of GICH_APR. The
> unpacking operation is too complex to calculate the register offset,
> especially in arm64 implementation.
> 
> How about changing the packing mechanism?
> 
> 1. Add the definition of enconding in arm-gic.h.
> 
> #define HIP04_GIC             (1 << 16)
> #define HIP04_GICH_APR   0x70
> #define HIP04_GICH_LR0    0x80
> 
> 2. The code in save_vgic_state could be changed in below.
> 
>   ldr     r9, [r2, #GICH_ELRSR1]
> +ldr     r10, [r3, #VGIC_CPU_HW_CFG]
> +tst     r10, #HIP04_GIC
> +ldreq  r10, [r2, #GICH_APR]
> +ldrne  r10, [r2, #HIP04_GICH_APR]
> 
> Although I used the condition checking at here, the code could
> be easier.
> 
> I think that the executing time on "ldr" and "ldreq" should be same,
> because CPCS should be ready
> 
> Then calculation is avoid. Only three instructions are appended
> for both GICH_APR & GICH_LR0. The implementation in arm64
> should be same & simple.
> 
I think you misunderstood my point.

Keep the assembly code as is, store the APR and the NR_LR in the HW_CFG
always, on all systems, and don't use any conditionals in the assembly
code (code is difficult to read, instruction prefetching and speculative
execution becomes difficult, etc.).

Only change something in the C-code.  Set a static variable there during
vgic_hyp_init and get rid of all the local variable declarations that
dereference the vgic_vcpu struct.
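
As a rough illustration of the packing being discussed, here is a minimal
C sketch. It assumes the layout from the cover letter (high word holds the
GICH_APR offset, low word holds the number of list registers). The helper
names hw_cfg_pack(), hw_cfg_nr_lr(), hw_cfg_apr_offset() and
model_vgic_hyp_init() are hypothetical, and 0x70 is the HiP04 offset
proposed earlier in this thread, not a documented value.

```c
#include <stdint.h>

#define GICH_APR		0xf0	/* standard GICv2 offset */
#define HIP04_GICH_APR		0x70	/* value proposed in this thread */

#define HW_CFG_NR_LR_MASK	0xffffu
#define HW_CFG_APR_SHIFT	16

/* Static copy of the packed value, set once at init time. */
static uint32_t hw_cfg;

/* Pack the GICH_APR offset (high word) and the LR count (low word). */
static uint32_t hw_cfg_pack(uint32_t apr_offset, uint32_t nr_lr)
{
	return (apr_offset << HW_CFG_APR_SHIFT) | (nr_lr & HW_CFG_NR_LR_MASK);
}

static uint32_t hw_cfg_nr_lr(uint32_t cfg)
{
	return cfg & HW_CFG_NR_LR_MASK;
}

static uint32_t hw_cfg_apr_offset(uint32_t cfg)
{
	return cfg >> HW_CFG_APR_SHIFT;
}

/*
 * Hypothetical init hook: the same packing is used on every system, so
 * the consumer never needs to test for HiP04.
 */
static void model_vgic_hyp_init(int is_hip04, uint32_t nr_lr)
{
	hw_cfg = hw_cfg_pack(is_hip04 ? HIP04_GICH_APR : GICH_APR, nr_lr);
}
```

Because the offset is always present in the high word, the consumer does a
shift and a mask unconditionally and the HiP04 special case stays in the
C init path.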

-Christoffer


* [PATCH v9 02/14] irq: gic: support hip04 gic
  2014-05-20 13:10 ` [PATCH v9 02/14] irq: gic: support hip04 gic Haojian Zhuang
@ 2014-05-21 10:15   ` Marc Zyngier
  2014-06-21  1:54   ` Jason Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2014-05-21 10:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Haojian,

On 20/05/14 14:10, Haojian Zhuang wrote:
> There's a little difference between ARM GIC and HiP04 GIC.
> 
> * HiP04 GIC could support 16 cores at most, and ARM GIC could support
> 8 cores at most. So the definitions of the GIC_DIST_TARGET registers are
> different since CPU interfaces are increased from 8-bit to 16-bit.
> 
> * HiP04 GIC could support 510 interrupts at most, and ARM GIC could
> support 1020 interrupts at most.
> 
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>

This is starting to look better. Comments below.

> ---
>  Documentation/devicetree/bindings/arm/gic.txt |   1 +
>  drivers/irqchip/irq-gic.c                     | 159 ++++++++++++++++++++------
>  2 files changed, 124 insertions(+), 36 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
> index 5573c08..150f7d6 100644
> --- a/Documentation/devicetree/bindings/arm/gic.txt
> +++ b/Documentation/devicetree/bindings/arm/gic.txt
> @@ -16,6 +16,7 @@ Main node required properties:
>         "arm,cortex-a9-gic"
>         "arm,cortex-a7-gic"
>         "arm,arm11mp-gic"
> +       "hisilicon,hip04-gic"
>  - interrupt-controller : Identifies the node as an interrupt controller
>  - #interrupt-cells : Specifies the number of cells needed to encode an
>    interrupt source.  The type shall be a <u32> and the value shall be 3.
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index f711fb6..64af475 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -68,6 +68,7 @@ struct gic_chip_data {
>  #ifdef CONFIG_GIC_NON_BANKED
>         void __iomem *(*get_base)(union gic_base *);
>  #endif
> +       u32 nr_cpu_if;
>  };
> 
>  static DEFINE_RAW_SPINLOCK(irq_controller_lock);
> @@ -76,9 +77,11 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>   * The GIC mapping of CPU interfaces does not necessarily match
>   * the logical CPU numbering.  Let's use a mapping as returned
>   * by the GIC itself.
> + *
> + * Hisilicon HiP04 extends the number of CPU interface from 8 to 16.
>   */
> -#define NR_GIC_CPU_IF 8
> -static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
> +#define NR_GIC_CPU_IF  16
> +static u16 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
> 
>  /*
>   * Supported arch specific GIC irq extension.
> @@ -241,20 +244,62 @@ static int gic_retrigger(struct irq_data *d)
>         return 0;
>  }
> 
> +static bool gic_is_standard(struct gic_chip_data *gic_data)
> +{
> +       return (gic_data->nr_cpu_if == 8);
> +}
> +
> +static u32 irqs_per_target_reg(struct gic_chip_data *gic_data)
> +{
> +       return (32 / gic_data->nr_cpu_if);
> +}
> +
> +/* i is the index of interrupt */

Not really. "i" *is* the HW interrupt number, as defined in the GIC spec.

> +static u32 irq_to_target_reg(struct gic_chip_data *gic_data, u32 i)
> +{
> +       if (gic_is_standard(gic_data))
> +               i = i & ~3U;
> +       else
> +               i = (i << 1) & ~3U;

I suggested an implementation in my previous comment that looked better.
Seeing the "& ~3U" on both side of the else is a sure sign that it can
be simplified.
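
One possible shape for such a simplification is sketched below. It is not
necessarily the implementation suggested in the earlier review, which is
not quoted here. It assumes nr_cpu_if is either 8 or 16 and derives the
register offset, the field shift and the field mask from that width alone.
The function names and the plain nr_cpu_if parameter are simplifications
for a stand-alone example.

```c
#include <stdint.h>

#define GIC_DIST_TARGET	0x800

/* Number of interrupts described by one 32-bit target register. */
static uint32_t irqs_per_target_reg(uint32_t nr_cpu_if)
{
	return 32 / nr_cpu_if;
}

/*
 * Each interrupt takes nr_cpu_if bits, i.e. nr_cpu_if / 8 bytes, so the
 * byte offset is scaled by that width and rounded down to a word.
 */
static uint32_t irq_to_target_reg(uint32_t nr_cpu_if, uint32_t hwirq)
{
	return GIC_DIST_TARGET + ((hwirq * nr_cpu_if / 8) & ~3U);
}

/* Bit position of this interrupt's field inside its target register. */
static uint32_t irq_to_core_shift(uint32_t nr_cpu_if, uint32_t hwirq)
{
	return (hwirq % irqs_per_target_reg(nr_cpu_if)) * nr_cpu_if;
}

/* Mask covering this interrupt's field inside its target register. */
static uint32_t irq_to_core_mask(uint32_t nr_cpu_if, uint32_t hwirq)
{
	return ((1U << nr_cpu_if) - 1) << irq_to_core_shift(nr_cpu_if, hwirq);
}
```

For hwirq 5 this gives offset 0x804 with an 8-bit field at bit 8 on a
standard GIC, and offset 0x808 with a 16-bit field at bit 16 when
nr_cpu_if is 16, matching the two branches in the patch.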

> +       return (i + GIC_DIST_TARGET);
> +}
> +
>  #ifdef CONFIG_SMP
> +static u32 irq_to_core_shift(struct irq_data *d)
> +{
> +       struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> +       unsigned int i = gic_irq(d);
> +
> +       if (gic_is_standard(gic_data))
> +               return ((i % 4) << 3);
> +       return ((i % 2) << 4);
> +}
> +
> +static u32 irq_to_core_mask(struct irq_data *d)
> +{
> +       struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> +       u32 mask;
> +       /* ARM GIC, nr_cpu_if == 8; HiP04 GIC, nr_cpu_if == 16 */
> +       mask = (1 << gic_data->nr_cpu_if) - 1;
> +       return (mask << irq_to_core_shift(d));
> +}
> +
>  static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>                             bool force)
>  {
> -       void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
> -       unsigned int shift = (gic_irq(d) % 4) * 8;
> +       void __iomem *reg;
> +       struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> +       unsigned int shift = irq_to_core_shift(d);
>         unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
>         u32 val, mask, bit;
> 
> -       if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
> +       if (cpu >= gic_data->nr_cpu_if || cpu >= nr_cpu_ids)
>                 return -EINVAL;
> 
> +       reg = gic_dist_base(d) + irq_to_target_reg(gic_data, gic_irq(d));
> +
>         raw_spin_lock(&irq_controller_lock);
> -       mask = 0xff << shift;
> +       mask = irq_to_core_mask(d);
>         bit = gic_cpu_map[cpu] << shift;
>         val = readl_relaxed(reg) & ~mask;
>         writel_relaxed(val | bit, reg);
> @@ -354,15 +399,20 @@ void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
>         irq_set_chained_handler(irq, gic_handle_cascade_irq);
>  }
> 
> -static u8 gic_get_cpumask(struct gic_chip_data *gic)
> +static u16 gic_get_cpumask(struct gic_chip_data *gic)
>  {
>         void __iomem *base = gic_data_dist_base(gic);
>         u32 mask, i;
> 
> -       for (i = mask = 0; i < 32; i += 4) {
> -               mask = readl_relaxed(base + GIC_DIST_TARGET + i);
> +       /*
> +        * ARM GIC uses 8 registers for interrupt 0-31,
> +        * HiP04 GIC uses 16 registers for interrupt 0-31.
> +        */
> +       for (i = mask = 0; i < 32; i++) {
> +               mask = readl_relaxed(base + irq_to_target_reg(gic, i));
>                 mask |= mask >> 16;
> -               mask |= mask >> 8;
> +               if (gic_is_standard(gic))
> +                       mask |= mask >> 8;
>                 if (mask)
>                         break;
>         }
> @@ -370,6 +420,10 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>         if (!mask)
>                 pr_crit("GIC CPU mask not found - kernel will fail to boot.\n");
> 
> +       /* ARM GIC needs 8-bit cpu mask, HiP04 GIC needs 16-bit cpu mask. */
> +       if (gic_is_standard(gic))
> +               mask &= 0xff;
> +
>         return mask;
>  }
> 
> @@ -392,10 +446,11 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>          * Set all global interrupts to this CPU only.
>          */
>         cpumask = gic_get_cpumask(gic);
> -       cpumask |= cpumask << 8;
> +       if (gic_is_standard(gic))
> +               cpumask |= cpumask << 8;
>         cpumask |= cpumask << 16;
> -       for (i = 32; i < gic_irqs; i += 4)
> -               writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
> +       for (i = 32; i < gic_irqs; i++)
> +               writel_relaxed(cpumask, base + irq_to_target_reg(gic, i));
> 
>         /*
>          * Set priority on all global interrupts.
> @@ -423,7 +478,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
>         /*
>          * Get what the GIC says our CPU mask is.
>          */
> -       BUG_ON(cpu >= NR_GIC_CPU_IF);
> +       BUG_ON(cpu >= gic->nr_cpu_if);
>         cpu_mask = gic_get_cpumask(gic);
>         gic_cpu_map[cpu] = cpu_mask;
> 
> @@ -431,7 +486,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
>          * Clear our mask from the other map entries in case they're
>          * still undefined.
>          */
> -       for (i = 0; i < NR_GIC_CPU_IF; i++)
> +       for (i = 0; i < gic->nr_cpu_if; i++)
>                 if (i != cpu)
>                         gic_cpu_map[i] &= ~cpu_mask;
> 
> @@ -467,7 +522,7 @@ void gic_cpu_if_down(void)
>   */
>  static void gic_dist_save(unsigned int gic_nr)
>  {
> -       unsigned int gic_irqs;
> +       unsigned int gic_irqs, target_reg = 0;

Why the "= 0"?

>         void __iomem *dist_base;
>         int i;
> 
> @@ -484,9 +539,11 @@ static void gic_dist_save(unsigned int gic_nr)
>                 gic_data[gic_nr].saved_spi_conf[i] =
>                         readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
> 
> -       for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
> +       for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> +               target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
>                 gic_data[gic_nr].saved_spi_target[i] =
> -                       readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
> +                       readl_relaxed(dist_base + target_reg);

I think you can lose the target_reg variable altogether.

> +       }
> 
>         for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
>                 gic_data[gic_nr].saved_spi_enable[i] =
> @@ -502,7 +559,7 @@ static void gic_dist_save(unsigned int gic_nr)
>   */
>  static void gic_dist_restore(unsigned int gic_nr)
>  {
> -       unsigned int gic_irqs;
> +       unsigned int gic_irqs, target_reg = 0;
>         unsigned int i;
>         void __iomem *dist_base;
> 
> @@ -525,9 +582,11 @@ static void gic_dist_restore(unsigned int gic_nr)
>                 writel_relaxed(0xa0a0a0a0,
>                         dist_base + GIC_DIST_PRI + i * 4);
> 
> -       for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
> +       for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> +               target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
>                 writel_relaxed(gic_data[gic_nr].saved_spi_target[i],
> -                       dist_base + GIC_DIST_TARGET + i * 4);
> +                       dist_base + target_reg);

Same here.

> +       }
> 
>         for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
>                 writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
> @@ -665,9 +724,19 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>          */
>         dmb(ishst);
> 
> -       /* this always happens on GIC0 */
> -       writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> -
> +       /*
> +        * CPUTargetList -- bit[23:16] in GIC_DIST_SOFTINT in ARM GIC.
> +        *                  bit[23:8] in GIC_DIST_SOFTINT in HiP04 GIC.
> +        * NSATT -- bit[15] in GIC_DIST_SOFTINT in ARM GIC.
> +        *          bit[7] in GIC_DIST_SOFTINT in HiP04 GIC.
> +        * this always happens on GIC0
> +        */
> +       if (gic_is_standard(&gic_data[0]))
> +               map = map << 16;
> +       else
> +               map = map << 8;
> +       writel_relaxed(map | irq,
> +                      gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>         raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
>  }
>  #endif
> @@ -681,10 +750,15 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>   */
>  void gic_send_sgi(unsigned int cpu_id, unsigned int irq)
>  {
> -       BUG_ON(cpu_id >= NR_GIC_CPU_IF);
> +       BUG_ON(cpu_id >= gic_data[0].nr_cpu_if);
>         cpu_id = 1 << cpu_id;
>         /* this always happens on GIC0 */
> -       writel_relaxed((cpu_id << 16) | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> +       if (gic_is_standard(&gic_data[0]))
> +               cpu_id = cpu_id << 16;
> +       else
> +               cpu_id = cpu_id << 8;
> +       writel_relaxed(cpu_id | irq,
> +                      gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  }
> 
>  /*
> @@ -700,7 +774,7 @@ int gic_get_cpu_id(unsigned int cpu)
>  {
>         unsigned int cpu_bit;
> 
> -       if (cpu >= NR_GIC_CPU_IF)
> +       if (cpu >= gic_data[0].nr_cpu_if)
>                 return -1;
>         cpu_bit = gic_cpu_map[cpu];
>         if (cpu_bit & (cpu_bit - 1))
> @@ -747,13 +821,14 @@ void gic_migrate_target(unsigned int new_cpu_id)
>          * CPU interface and migrate them to the new CPU interface.
>          * We skip DIST_TARGET 0 to 7 as they are read-only.
>          */
> -       for (i = 8; i < DIV_ROUND_UP(gic_irqs, 4); i++) {
> -               val = readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
> +       for (i = 8; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> +               target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
> +               val = readl_relaxed(dist_base + target_reg);
>                 active_mask = val & cur_target_mask;
>                 if (active_mask) {
>                         val &= ~active_mask;
>                         val |= ror32(active_mask, ror_val);
> -                       writel_relaxed(val, dist_base + GIC_DIST_TARGET + i*4);
> +                       writel_relaxed(val, dist_base + target_reg);
>                 }
>         }
> 
> @@ -931,7 +1006,7 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
>         irq_hw_number_t hwirq_base;
>         struct gic_chip_data *gic;
>         int gic_irqs, irq_base, i;
> -       int nr_routable_irqs;
> +       int nr_routable_irqs, max_nr_irq;
> 
>         BUG_ON(gic_nr >= MAX_GIC_NR);
> 
> @@ -967,12 +1042,22 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
>                 gic_set_base_accessor(gic, gic_get_common_base);
>         }
> 
> +       if (of_device_is_compatible(node, "hisilicon,hip04-gic")) {
> +               /* HiP04 GIC supports 16 CPUs at most */
> +               gic->nr_cpu_if = 16;
> +               max_nr_irq = 510;
> +       } else {
> +               /* ARM/Qualcomm GIC supports 8 CPUs at most */
> +               gic->nr_cpu_if = 8;
> +               max_nr_irq = 1020;
> +       }
> +
>         /*
>          * Initialize the CPU interface map to all CPUs.
>          * It will be refined as each CPU probes its ID.
>          */
> -       for (i = 0; i < NR_GIC_CPU_IF; i++)
> -               gic_cpu_map[i] = 0xff;
> +       for (i = 0; i < gic->nr_cpu_if; i++)
> +               gic_cpu_map[i] = (1 << gic->nr_cpu_if) - 1;
> 
>         /*
>          * For primary GICs, skip over SGIs.
> @@ -988,12 +1073,13 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
> 
>         /*
>          * Find out how many interrupts are supported.
> -        * The GIC only supports up to 1020 interrupt sources.
> +        * The ARM/Qualcomm GIC only supports up to 1020 interrupt sources.
> +        * The HiP04 GIC only supports up to 510 interrupt sources.
>          */
>         gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;
>         gic_irqs = (gic_irqs + 1) * 32;
> -       if (gic_irqs > 1020)
> -               gic_irqs = 1020;
> +       if (gic_irqs > max_nr_irq)
> +               gic_irqs = max_nr_irq;
>         gic->gic_irqs = gic_irqs;
> 
>         gic_irqs -= hwirq_base; /* calculate # of irqs to allocate */
> @@ -1069,6 +1155,7 @@ gic_of_init(struct device_node *node, struct device_node *parent)
>  }
>  IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);
>  IRQCHIP_DECLARE(cortex_a9_gic, "arm,cortex-a9-gic", gic_of_init);
> +IRQCHIP_DECLARE(hip04_gic, "hisilicon,hip04-gic", gic_of_init);
>  IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
>  IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);
> 
> --
> 1.9.1
> 
> 


-- 
Jazz is not dead. It just smells funny...


* [PATCH v9 14/14] virt: arm: support hip04 gic
  2014-05-20 13:10 ` [PATCH v9 14/14] virt: arm: support hip04 gic Haojian Zhuang
  2014-05-20 13:34   ` Haojian Zhuang
  2014-05-20 13:44   ` Christoffer Dall
@ 2014-05-21 13:11   ` Marc Zyngier
  2 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2014-05-21 13:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Haojian,

Christoffer has already commented heavily on some aspects of this, so
I'm going to stay clear of them. But there are a number of other issues
that I'd like to outline.

On Tue, May 20 2014 at  2:10:27 pm BST, Haojian Zhuang <haojian.zhuang@linaro.org> wrote:
> In ARM standard GIC, GICH_APR offset is 0xf0 & GICH_LR0 offset is 0x100.
> In HiP04 GIC, GICH_APR offset is 0x70 & GICH_LR0 offset is 0x80.
>
> Now reuse the nr_lr field in struct vgic_cpu. Bit[31:16] is used to store
> GICH_APR offset in HiP04, and bit[15:0] is used to store real nr_lr
> variable. In ARM standard GIC, don't set bit[31:16]. So we could avoid
> to change the VGIC implementation in arm64.
>
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  arch/arm/kernel/asm-offsets.c   |  2 +-
>  arch/arm/kvm/interrupts_head.S  | 29 +++++++++++++++++++------
>  arch/arm64/kernel/asm-offsets.c |  2 +-
>  arch/arm64/kvm/hyp.S            | 28 ++++++++++++++++++------
>  include/kvm/arm_vgic.h          |  7 ++++--
>  include/linux/irqchip/arm-gic.h |  6 ++++++
>  virt/kvm/arm/vgic.c             | 48 +++++++++++++++++++++++++++++------------
>  7 files changed, 92 insertions(+), 30 deletions(-)
>
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 85598b5..166cc98 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -189,7 +189,7 @@ int main(void)
>    DEFINE(VGIC_CPU_ELRSR,       offsetof(struct vgic_cpu, vgic_elrsr));
>    DEFINE(VGIC_CPU_APR,         offsetof(struct vgic_cpu, vgic_apr));
>    DEFINE(VGIC_CPU_LR,          offsetof(struct vgic_cpu, vgic_lr));
> -  DEFINE(VGIC_CPU_NR_LR,       offsetof(struct vgic_cpu, nr_lr));
> +  DEFINE(VGIC_CPU_HW_CFG,      offsetof(struct vgic_cpu, hw_cfg));
>  #ifdef CONFIG_KVM_ARM_TIMER
>    DEFINE(VCPU_TIMER_CNTV_CTL,  offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_ctl));
>    DEFINE(VCPU_TIMER_CNTV_CVAL, offsetof(struct kvm_vcpu, arch.timer_cpu.cntv_cval));
> diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
> index 76af9302..9fbbf99 100644
> --- a/arch/arm/kvm/interrupts_head.S
> +++ b/arch/arm/kvm/interrupts_head.S
> @@ -419,7 +419,9 @@ vcpu        .req    r0              @ vcpu pointer always in r0
>         ldr     r7, [r2, #GICH_EISR1]
>         ldr     r8, [r2, #GICH_ELRSR0]
>         ldr     r9, [r2, #GICH_ELRSR1]
> -       ldr     r10, [r2, #GICH_APR]
> +       ldr     r10, [r11, #VGIC_CPU_HW_CFG]
> +       mov     r10, r10, lsr #HWCFG_APR_SHIFT
> +       ldr     r10, [r2, r10]
>
>         str     r3, [r11, #VGIC_CPU_HCR]
>         str     r4, [r11, #VGIC_CPU_VMCR]
> @@ -435,9 +437,15 @@ vcpu       .req    r0              @ vcpu pointer always in r0
>         str     r5, [r2, #GICH_HCR]
>
>         /* Save list registers */
> -       add     r2, r2, #GICH_LR0
> +       ldr     r4, [r11, #VGIC_CPU_HW_CFG]

Can you find a way to avoid this reload?

> +       mov     r10, r4, lsr #HWCFG_APR_SHIFT
> +       /* the offset between GICH_APR & GICH_LR0 is 0x10 */
> +       add     r10, r10, #0x10
> +       add     r2, r2, r10
>         add     r3, r11, #VGIC_CPU_LR
> -       ldr     r4, [r11, #VGIC_CPU_NR_LR]
> +       /* Get NR_LR from VGIC_CPU_HW_CFG */
> +       ldr     r6, =HWCFG_NR_LR_MASK
> +       and     r4, r4, r6

Here, you're generating a memory access (loading HWCFG_NR_LR_MASK from
the constant pool), while the whole purpose of the exercise is to avoid
additional memory accesses.

Consider using the ubfx instruction to directly extract the information
you need. Actually, it is probably easier for me to directly show you
what I want to see (completely untested, of course):

diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index e4eaf30..77ddf87 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -410,6 +410,8 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 
 	/* Compute the address of struct vgic_cpu */
 	add	r11, vcpu, #VCPU_VGIC_CPU
+	/* Get HW configuration */
+	ldr	r12, [r11, #VGIC_CPU_HW_CFG]
 
 	/* Save all interesting registers */
 	ldr	r3, [r2, #GICH_HCR]
@@ -419,7 +421,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	ldr	r7, [r2, #GICH_EISR1]
 	ldr	r8, [r2, #GICH_ELRSR0]
 	ldr	r9, [r2, #GICH_ELRSR1]
-	ldr	r10, [r2, #GICH_APR]
+	/* Extract APR offset */
+	ubfx	r10, r12, #16, #16
+	ldr	r10, [r2, r10]
 
 	str	r3, [r11, #VGIC_V2_CPU_HCR]
 	str	r4, [r11, #VGIC_V2_CPU_VMCR]
@@ -434,10 +438,16 @@ vcpu	.req	r0		@ vcpu pointer always in r0
 	mov	r5, #0
 	str	r5, [r2, #GICH_HCR]
 
+	/* Compute GICH_LR0 address */
+	ubfx	r6, r12, #16, #16
+	add	r6, r6, #0x10
+	add	r2, r2, r6
+
+	/* Extract NR_LR */
+	ubfx	r4, r12, #0, #16
+
 	/* Save list registers */
-	add	r2, r2, #GICH_LR0
 	add	r3, r11, #VGIC_V2_CPU_LR
-	ldr	r4, [r11, #VGIC_CPU_NR_LR]
 1:	ldr	r6, [r2], #4
 	str	r6, [r3], #4
 	subs	r4, r4, #1

See? No additional memory access compared to the original code.
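
For reference, the field extraction done by ubfx above can be sketched in
plain C. This is only an illustration of the hw_cfg layout described in the
patch (bit[31:16] = GICH_APR offset, bit[15:0] = nr_lr). The helper names
hwcfg_pack(), ubfx32(), hwcfg_apr_offset(), hwcfg_lr0_offset() and
hwcfg_nr_lr() are hypothetical and do not exist in the kernel; the constants
are the ones quoted from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout quoted from the patch. */
#define HWCFG_NR_LR_MASK  0xffffu
#define HWCFG_APR_SHIFT   16

/* Register offsets quoted from the patch. */
#define GICH_APR          0xf0u
#define HIP04_GICH_APR    0x70u

/* Hypothetical helper: build hw_cfg the way kvm_vgic_hyp_init() does. */
static inline uint32_t hwcfg_pack(uint32_t apr_offset, uint32_t nr_lr)
{
	/* Both fields must fit in 16 bits, otherwise they would overlap. */
	assert(apr_offset <= 0xffffu && nr_lr <= HWCFG_NR_LR_MASK);
	return (apr_offset << HWCFG_APR_SHIFT) | nr_lr;
}

/* C equivalent of "ubfx rd, rn, #lsb, #width", valid for 0 < width < 32. */
static inline uint32_t ubfx32(uint32_t value, unsigned int lsb,
			      unsigned int width)
{
	return (value >> lsb) & ((1u << width) - 1u);
}

/* Hypothetical accessors mirroring the three extractions in the diff. */
static inline uint32_t hwcfg_apr_offset(uint32_t hw_cfg)
{
	return ubfx32(hw_cfg, 16, 16);
}

static inline uint32_t hwcfg_lr0_offset(uint32_t hw_cfg)
{
	/* GICH_LR0 sits 0x10 above GICH_APR in both layouts. */
	return hwcfg_apr_offset(hw_cfg) + 0x10u;
}

static inline uint32_t hwcfg_nr_lr(uint32_t hw_cfg)
{
	return ubfx32(hw_cfg, 0, 16);
}
```

With the standard offset this yields 0xf0 for GICH_APR and 0x100 for
GICH_LR0; with the HiP04 offset it yields 0x70 and 0x80. Everything is
derived from the single hw_cfg word that is already in a register.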

>  1:     ldr     r6, [r2], #4
>         str     r6, [r3], #4
>         subs    r4, r4, #1
> @@ -469,12 +477,21 @@ vcpu      .req    r0              @ vcpu pointer always in r0
>
>         str     r3, [r2, #GICH_HCR]
>         str     r4, [r2, #GICH_VMCR]
> -       str     r8, [r2, #GICH_APR]
> +       ldr     r6, [r11, #VGIC_CPU_HW_CFG]
> +       mov     r6, r6, lsr #HWCFG_APR_SHIFT
> +       str     r8, [r2, r6]
>
>         /* Restore list registers */
> -       add     r2, r2, #GICH_LR0
> +       ldr     r4, [r11, #VGIC_CPU_HW_CFG]
> +       mov     r6, r4, lsr #HWCFG_APR_SHIFT
> +       /* the offset between GICH_APR & GICH_LR0 is 0x10 */
> +       add     r6, r6, #0x10
> +       /* get offset of GICH_LR0 */
> +       add     r2, r2, r6
> +       /* Get NR_LR from VGIC_CPU_HW_CFG */
>         add     r3, r11, #VGIC_CPU_LR
> -       ldr     r4, [r11, #VGIC_CPU_NR_LR]
> +       ldr     r6, =HWCFG_NR_LR_MASK
> +       and     r4, r4, r6

See my comments above.

>  1:     ldr     r6, [r3], #4
>         str     r6, [r2], #4
>         subs    r4, r4, #1
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 646f888..2422358 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -136,7 +136,7 @@ int main(void)
>    DEFINE(VGIC_CPU_ELRSR,       offsetof(struct vgic_cpu, vgic_elrsr));
>    DEFINE(VGIC_CPU_APR,         offsetof(struct vgic_cpu, vgic_apr));
>    DEFINE(VGIC_CPU_LR,          offsetof(struct vgic_cpu, vgic_lr));
> -  DEFINE(VGIC_CPU_NR_LR,       offsetof(struct vgic_cpu, nr_lr));
> +  DEFINE(VGIC_CPU_HW_CFG,      offsetof(struct vgic_cpu, hw_cfg));
>    DEFINE(KVM_VTTBR,            offsetof(struct kvm, arch.vttbr));
>    DEFINE(KVM_VGIC_VCTRL,       offsetof(struct kvm, arch.vgic.vctrl_base));
>  #endif
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 2c56012..a4a8b3d 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -402,7 +402,9 @@ __kvm_hyp_code_start:
>         ldr     w8, [x2, #GICH_EISR1]
>         ldr     w9, [x2, #GICH_ELRSR0]
>         ldr     w10, [x2, #GICH_ELRSR1]
> -       ldr     w11, [x2, #GICH_APR]
> +       ldr     w11, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w11, w11, lsr #HWCFG_APR_SHIFT
> +       ldr     w11, [x2, w10]
>  CPU_BE(        rev     w4,  w4  )
>  CPU_BE(        rev     w5,  w5  )
>  CPU_BE(        rev     w6,  w6  )
> @@ -425,8 +427,13 @@ CPU_BE(    rev     w11, w11 )
>         str     wzr, [x2, #GICH_HCR]
>
>         /* Save list registers */
> -       add     x2, x2, #GICH_LR0
> -       ldr     w4, [x3, #VGIC_CPU_NR_LR]
> +       ldr     w4, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w6, w4, lsr #HWCFG_APR_SHIFT
> +       ldr     w7, =HWCFG_NR_LR_MASK

As there is no arm64 SoC with this GIC yet, don't bother hacking the
whole thing. Just extract the right field of hw_cfg, and this should be
enough. If one day someone builds such an insanity, we'll add the
necessary code.

> +       and     w4, w4, w7
> +       /* the offset between GICH_APR and GICH_LR0 is 0x10 */
> +       add     w6, w6, 0x10
> +       add     x2, x2, w6
>         add     x3, x3, #VGIC_CPU_LR
>  1:     ldr     w5, [x2], #4
>  CPU_BE(        rev     w5, w5 )
> @@ -461,11 +468,20 @@ CPU_BE(   rev     w6, w6 )
>
>         str     w4, [x2, #GICH_HCR]
>         str     w5, [x2, #GICH_VMCR]
> -       str     w6, [x2, #GICH_APR]
> +       ldr     w4, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w4, w4, #HWCFG_APR_SHIFT
> +       str     w6, [x2, w4]
>
>         /* Restore list registers */
> -       add     x2, x2, #GICH_LR0
> -       ldr     w4, [x3, #VGIC_CPU_NR_LR]
> +       ldr     w4, [x3, #VGIC_CPU_HW_CFG]
> +       mov     w6, w4, #HWCFG_APR_SHIFT
> +       /* the offset between GICH_APR and GICH_LR0 is 0x10 */
> +       add     w6, w6, #0x10
> +       /* get offset of GICH_LR0 */
> +       add     x2, x2, w6
> +       /* get NR_LR from VGIC_CPU_HW_CFG */
> +       ldr     w6, =HWCFG_NR_LR_MASK
> +       and     w4, w4, w6
>         add     x3, x3, #VGIC_CPU_LR

Same here.

>  1:     ldr     w5, [x3], #4
>  CPU_BE(        rev     w5, w5 )
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index f27000f..eba4b51 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -122,8 +122,11 @@ struct vgic_cpu {
>         /* Bitmap of used/free list registers */
>         DECLARE_BITMAP( lr_used, VGIC_MAX_LRS);
>
> -       /* Number of list registers on this CPU */
> -       int             nr_lr;
> +       /*
> +        * bit[31:16]: GICH_APR offset
> +        * bit[15:0]:  Number of list registers on this CPU
> +        */
> +       u32             hw_cfg;
>
>         /* CPU vif control registers for world switch */
>         u32             vgic_hcr;
> diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
> index 45e2d8c..b055f92 100644
> --- a/include/linux/irqchip/arm-gic.h
> +++ b/include/linux/irqchip/arm-gic.h
> @@ -49,6 +49,8 @@
>  #define GICH_ELRSR1                    0x34
>  #define GICH_APR                       0xf0
>  #define GICH_LR0                       0x100
> +#define HIP04_GICH_APR                 0x70
> +/* GICH_LR0 offset in HiP04 is 0x80 */
>
>  #define GICH_HCR_EN                    (1 << 0)
>  #define GICH_HCR_UIE                   (1 << 1)
> @@ -73,6 +75,10 @@
>  #define GICH_MISR_EOI                  (1 << 0)
>  #define GICH_MISR_U                    (1 << 1)
>
> +#define HWCFG_NR_LR_MASK       0xffff
> +#define HWCFG_APR_SHIFT                16
> +#define HWCFG_APR_MASK         (0xffff << HWCFG_APR_SHIFT)
> +
>  #ifndef __ASSEMBLY__
>
>  struct device_node;
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 47b2983..4c0c1e9 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -76,6 +76,8 @@
>  #define IMPLEMENTER_ARM                0x43b
>  #define GICC_ARCH_VERSION_V2   0x2
>
> +#define vgic_nr_lr(vcpu)       (vcpu->hw_cfg & HWCFG_NR_LR_MASK)
> +
>  /* Physical address of vgic virtual cpu interface */
>  static phys_addr_t vgic_vcpu_base;
>
> @@ -97,7 +99,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
>  static void vgic_update_state(struct kvm *kvm);
>  static void vgic_kick_vcpus(struct kvm *kvm);
>  static void vgic_dispatch_sgi(struct kvm_vcpu *vcpu, u32 reg);
> -static u32 vgic_nr_lr;
> +static u32 vgic_hw_cfg;
>
>  static unsigned int vgic_maint_irq;
>
> @@ -624,9 +626,9 @@ static void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
>         struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>         int vcpu_id = vcpu->vcpu_id;
>         int i, irq, source_cpu;
> -       u32 *lr;
> +       u32 *lr, nr_lr = vgic_nr_lr(vgic_cpu);
>
> -       for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
> +       for_each_set_bit(i, vgic_cpu->lr_used, nr_lr) {
>                 lr = &vgic_cpu->vgic_lr[i];
>                 irq = LR_IRQID(*lr);
>                 source_cpu = LR_CPUID(*lr);
> @@ -1005,8 +1007,9 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
>  {
>         struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>         int lr;
> +       int nr_lr = vgic_nr_lr(vgic_cpu);
>
> -       for_each_set_bit(lr, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
> +       for_each_set_bit(lr, vgic_cpu->lr_used, nr_lr) {
>                 int irq = vgic_cpu->vgic_lr[lr] & GICH_LR_VIRTUALID;
>
>                 if (!vgic_irq_is_enabled(vcpu, irq)) {
> @@ -1025,6 +1028,7 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>  {
>         struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>         int lr;
> +       int nr_lr = vgic_nr_lr(vgic_cpu);
>
>         /* Sanitize the input... */
>         BUG_ON(sgi_source_id & ~7);
> @@ -1046,9 +1050,8 @@ static bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>         }
>
>         /* Try to use another LR for this interrupt */
> -       lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used,
> -                              vgic_cpu->nr_lr);
> -       if (lr >= vgic_cpu->nr_lr)
> +       lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used, nr_lr);
> +       if (lr >= nr_lr)
>                 return false;
>
>         kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
> @@ -1181,9 +1184,10 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>                  * active bit.
>                  */
>                 int lr, irq;
> +               int nr_lr = vgic_nr_lr(vgic_cpu);
>
>                 for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_eisr,
> -                                vgic_cpu->nr_lr) {
> +                                nr_lr) {
>                         irq = vgic_cpu->vgic_lr[lr] & GICH_LR_VIRTUALID;
>
>                         vgic_irq_clear_active(vcpu, irq);
> @@ -1221,13 +1225,13 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>         struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>         struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>         int lr, pending;
> +       int nr_lr = vgic_nr_lr(vgic_cpu);
>         bool level_pending;
>
>         level_pending = vgic_process_maintenance(vcpu);
>
>         /* Clear mappings for empty LRs */
> -       for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr,
> -                        vgic_cpu->nr_lr) {
> +       for_each_set_bit(lr, (unsigned long *)vgic_cpu->vgic_elrsr, nr_lr) {
>                 int irq;
>
>                 if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
> @@ -1241,8 +1245,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>
>         /* Check if we still have something up our sleeve... */
>         pending = find_first_zero_bit((unsigned long *)vgic_cpu->vgic_elrsr,
> -                                     vgic_cpu->nr_lr);
> -       if (level_pending || pending < vgic_cpu->nr_lr)
> +                                     nr_lr);
> +       if (level_pending || pending < nr_lr)
>                 set_bit(vcpu->vcpu_id, &dist->irq_pending_on_cpu);
>  }
>
> @@ -1438,7 +1442,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
>          */
>         vgic_cpu->vgic_vmcr = 0;
>
> -       vgic_cpu->nr_lr = vgic_nr_lr;
> +       vgic_cpu->hw_cfg = vgic_hw_cfg;
>         vgic_cpu->vgic_hcr = GICH_HCR_EN; /* Get the show on the road... */
>
>         return 0;
> @@ -1470,17 +1474,32 @@ static struct notifier_block vgic_cpu_nb = {
>         .notifier_call = vgic_cpu_notify,
>  };
>
> +static const struct of_device_id of_vgic_ids[] = {
> +       {
> +               .compatible = "arm,cortex-a15-gic",
> +               .data = (void *)GICH_APR,
> +       }, {
> +               .compatible = "hisilicon,hip04-gic",
> +               .data = (void *)HIP04_GICH_APR,
> +       }, {
> +       },
> +};
> +
>  int kvm_vgic_hyp_init(void)
>  {
>         int ret;
>         struct resource vctrl_res;
>         struct resource vcpu_res;
> +       const struct of_device_id *match;
> +       u32 vgic_nr_lr;
>
> -       vgic_node = of_find_compatible_node(NULL, NULL, "arm,cortex-a15-gic");
> +       vgic_node = of_find_matching_node_and_match(NULL, of_vgic_ids, &match);
>         if (!vgic_node) {
>                 kvm_err("error: no compatible vgic node in DT\n");
>                 return -ENODEV;
>         }
> +       /* High word of vgic_hw_cfg is the offset of GICH_APR. */
> +       vgic_hw_cfg = (unsigned int)match->data << HWCFG_APR_SHIFT;
>
>         vgic_maint_irq = irq_of_parse_and_map(vgic_node, 0);
>         if (!vgic_maint_irq) {
> @@ -1517,6 +1536,7 @@ int kvm_vgic_hyp_init(void)
>
>         vgic_nr_lr = readl_relaxed(vgic_vctrl_base + GICH_VTR);
>         vgic_nr_lr = (vgic_nr_lr & 0x3f) + 1;
> +       vgic_hw_cfg |= vgic_nr_lr;
>
>         ret = create_hyp_io_mappings(vgic_vctrl_base,
>                                      vgic_vctrl_base + resource_size(&vctrl_res),
> --
> 1.9.1
>
>

On a separate note, I'm still waiting for an answer from you about how
the LRs differ between GICv2 and this implementation. We cannot possibly
enable KVM on this HW without knowing how it differs from the original
architecture.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny.


* [PATCH v9 02/14] irq: gic: support hip04 gic
  2014-05-20 13:10 ` [PATCH v9 02/14] irq: gic: support hip04 gic Haojian Zhuang
  2014-05-21 10:15   ` Marc Zyngier
@ 2014-06-21  1:54   ` Jason Cooper
  2014-07-08 22:40     ` Jason Cooper
  1 sibling, 1 reply; 36+ messages in thread
From: Jason Cooper @ 2014-06-21  1:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Haojian,

I'm working through my backlog ...

On Tue, May 20, 2014 at 09:10:15PM +0800, Haojian Zhuang wrote:
> There's a little difference between ARM GIC and HiP04 GIC.
> 
> * HiP04 GIC could support 16 cores at most, and ARM GIC could support
> 8 cores at most. So the definitions of GIC_DIST_TARGET registers are
> different since CPU interfaces are increased from 8-bit to 16-bit.
> 
> * HiP04 GIC could support 510 interrupts at most, and ARM GIC could
> support 1020 interrupts at most.
> 
> Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> ---
>  Documentation/devicetree/bindings/arm/gic.txt |   1 +
>  drivers/irqchip/irq-gic.c                     | 159 ++++++++++++++++++++------
>  2 files changed, 124 insertions(+), 36 deletions(-)
> 
...
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index f711fb6..64af475 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -68,6 +68,7 @@ struct gic_chip_data {
>  #ifdef CONFIG_GIC_NON_BANKED
>  	void __iomem *(*get_base)(union gic_base *);
>  #endif
> +	u32 nr_cpu_if;
>  };
>  
>  static DEFINE_RAW_SPINLOCK(irq_controller_lock);
> @@ -76,9 +77,11 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
>   * The GIC mapping of CPU interfaces does not necessarily match
>   * the logical CPU numbering.  Let's use a mapping as returned
>   * by the GIC itself.
> + *
> + * Hisilicon HiP04 extends the number of CPU interface from 8 to 16.
>   */
> -#define NR_GIC_CPU_IF 8
> -static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
> +#define NR_GIC_CPU_IF	16
> +static u16 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
>  
>  /*
>   * Supported arch specific GIC irq extension.
> @@ -241,20 +244,62 @@ static int gic_retrigger(struct irq_data *d)
>  	return 0;
>  }
>  
> +static bool gic_is_standard(struct gic_chip_data *gic_data)
> +{
> +	return (gic_data->nr_cpu_if == 8);
> +}
> +
> +static u32 irqs_per_target_reg(struct gic_chip_data *gic_data)
> +{
> +	return (32 / gic_data->nr_cpu_if);
> +}
> +
> +/* i is the index of interrupt */
> +static u32 irq_to_target_reg(struct gic_chip_data *gic_data, u32 i)
> +{
> +	if (gic_is_standard(gic_data))
> +		i = i & ~3U;
> +	else
> +		i = (i << 1) & ~3U;
> +	return (i + GIC_DIST_TARGET);
> +}
> +
>  #ifdef CONFIG_SMP
> +static u32 irq_to_core_shift(struct irq_data *d)
> +{
> +	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> +	unsigned int i = gic_irq(d);
> +
> +	if (gic_is_standard(gic_data))
> +		return ((i % 4) << 3);
> +	return ((i % 2) << 4);
> +}
> +
> +static u32 irq_to_core_mask(struct irq_data *d)
> +{
> +	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> +	u32 mask;
> +	/* ARM GIC, nr_cpu_if == 8; HiP04 GIC, nr_cpu_if == 16 */
> +	mask = (1 << gic_data->nr_cpu_if) - 1;
> +	return (mask << irq_to_core_shift(d));
> +}
> +
>  static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>  			    bool force)
>  {
> -	void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
> -	unsigned int shift = (gic_irq(d) % 4) * 8;
> +	void __iomem *reg;
> +	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> +	unsigned int shift = irq_to_core_shift(d);
>  	unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
>  	u32 val, mask, bit;
>  
> -	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
> +	if (cpu >= gic_data->nr_cpu_if || cpu >= nr_cpu_ids)
>  		return -EINVAL;
>  
> +	reg = gic_dist_base(d) + irq_to_target_reg(gic_data, gic_irq(d));
> +
>  	raw_spin_lock(&irq_controller_lock);
> -	mask = 0xff << shift;
> +	mask = irq_to_core_mask(d);

You still haven't addressed my comment from your previous version of
this series.

Please calculate the mask once at boottime and store it.  Then you can
use it in conjunction with the shift you calculate above.  This will
remove the needless calculation inside the spinlock:

  mask = (1 << gic_data->nr_cpu_if) - 1;

It will also remove the extra call to irq_data_get_irq_chip_data() since
you already did that in this function outside the spinlock.
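
Roughly what I have in mind, as a standalone sketch rather than a tested
patch. struct gic_cfg stands in for the relevant part of struct
gic_chip_data, and the cpu_if_mask field and the gic_cfg_init(),
target_shift() and target_update() helpers are hypothetical names used
only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct gic_chip_data. */
struct gic_cfg {
	uint32_t nr_cpu_if;    /* 8 for ARM GIC, 16 for HiP04 GIC */
	uint32_t cpu_if_mask;  /* hypothetical cached (1 << nr_cpu_if) - 1 */
};

/* Run once at init time, where the patch assigns nr_cpu_if. */
static inline void gic_cfg_init(struct gic_cfg *cfg, uint32_t nr_cpu_if)
{
	/* Only the two layouts discussed in this thread are handled. */
	assert(nr_cpu_if == 8 || nr_cpu_if == 16);
	cfg->nr_cpu_if = nr_cpu_if;
	cfg->cpu_if_mask = (1u << nr_cpu_if) - 1u;
}

/* Same arithmetic as irq_to_core_shift() in the patch. */
static inline uint32_t target_shift(const struct gic_cfg *cfg, uint32_t irq)
{
	uint32_t irqs_per_reg = 32u / cfg->nr_cpu_if;

	return (irq % irqs_per_reg) * cfg->nr_cpu_if;
}

/*
 * Value to write back to the target register: only a shift of the cached
 * mask is left to do in the section that would run under the lock.
 */
static inline uint32_t target_update(const struct gic_cfg *cfg, uint32_t old,
				     uint32_t irq, uint32_t cpu_bits)
{
	uint32_t shift = target_shift(cfg, irq);

	return (old & ~(cfg->cpu_if_mask << shift)) | (cpu_bits << shift);
}
```

The shift is computed from the pointer you already hold, and the mask
comes from a single cached field, so nothing is recomputed per call.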

>  	bit = gic_cpu_map[cpu] << shift;
>  	val = readl_relaxed(reg) & ~mask;
>  	writel_relaxed(val | bit, reg);
> @@ -354,15 +399,20 @@ void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
>  	irq_set_chained_handler(irq, gic_handle_cascade_irq);
>  }
>  
> -static u8 gic_get_cpumask(struct gic_chip_data *gic)
> +static u16 gic_get_cpumask(struct gic_chip_data *gic)
>  {
>  	void __iomem *base = gic_data_dist_base(gic);
>  	u32 mask, i;
>  
> -	for (i = mask = 0; i < 32; i += 4) {
> -		mask = readl_relaxed(base + GIC_DIST_TARGET + i);
> +	/*
> +	 * ARM GIC uses 8 registers for interrupt 0-31,
> +	 * HiP04 GIC uses 16 registers for interrupt 0-31.
> +	 */
> +	for (i = mask = 0; i < 32; i++) {
> +		mask = readl_relaxed(base + irq_to_target_reg(gic, i));
>  		mask |= mask >> 16;
> -		mask |= mask >> 8;
> +		if (gic_is_standard(gic))
> +			mask |= mask >> 8;
>  		if (mask)
>  			break;
>  	}
> @@ -370,6 +420,10 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>  	if (!mask)
>  		pr_crit("GIC CPU mask not found - kernel will fail to boot.\n");
>  
> +	/* ARM GIC needs 8-bit cpu mask, HiP04 GIC needs 16-bit cpu mask. */
> +	if (gic_is_standard(gic))
> +		mask &= 0xff;
> +
>  	return mask;
>  }
>  
> @@ -392,10 +446,11 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>  	 * Set all global interrupts to this CPU only.
>  	 */
>  	cpumask = gic_get_cpumask(gic);
> -	cpumask |= cpumask << 8;
> +	if (gic_is_standard(gic))
> +		cpumask |= cpumask << 8;
>  	cpumask |= cpumask << 16;
> -	for (i = 32; i < gic_irqs; i += 4)
> -		writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
> +	for (i = 32; i < gic_irqs; i++)
> +		writel_relaxed(cpumask, base + irq_to_target_reg(gic, i));
>  
>  	/*
>  	 * Set priority on all global interrupts.
> @@ -423,7 +478,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
>  	/*
>  	 * Get what the GIC says our CPU mask is.
>  	 */
> -	BUG_ON(cpu >= NR_GIC_CPU_IF);
> +	BUG_ON(cpu >= gic->nr_cpu_if);
>  	cpu_mask = gic_get_cpumask(gic);
>  	gic_cpu_map[cpu] = cpu_mask;
>  
> @@ -431,7 +486,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
>  	 * Clear our mask from the other map entries in case they're
>  	 * still undefined.
>  	 */
> -	for (i = 0; i < NR_GIC_CPU_IF; i++)
> +	for (i = 0; i < gic->nr_cpu_if; i++)
>  		if (i != cpu)
>  			gic_cpu_map[i] &= ~cpu_mask;
>  
> @@ -467,7 +522,7 @@ void gic_cpu_if_down(void)
>   */
>  static void gic_dist_save(unsigned int gic_nr)
>  {
> -	unsigned int gic_irqs;
> +	unsigned int gic_irqs, target_reg = 0;
>  	void __iomem *dist_base;
>  	int i;
>  
> @@ -484,9 +539,11 @@ static void gic_dist_save(unsigned int gic_nr)
>  		gic_data[gic_nr].saved_spi_conf[i] =
>  			readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
>  
> -	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
> +	for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> +		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
>  		gic_data[gic_nr].saved_spi_target[i] =
> -			readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
> +			readl_relaxed(dist_base + target_reg);
> +	}
>  
>  	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
>  		gic_data[gic_nr].saved_spi_enable[i] =
> @@ -502,7 +559,7 @@ static void gic_dist_save(unsigned int gic_nr)
>   */
>  static void gic_dist_restore(unsigned int gic_nr)
>  {
> -	unsigned int gic_irqs;
> +	unsigned int gic_irqs, target_reg = 0;
>  	unsigned int i;
>  	void __iomem *dist_base;
>  
> @@ -525,9 +582,11 @@ static void gic_dist_restore(unsigned int gic_nr)
>  		writel_relaxed(0xa0a0a0a0,
>  			dist_base + GIC_DIST_PRI + i * 4);
>  
> -	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
> +	for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> +		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
>  		writel_relaxed(gic_data[gic_nr].saved_spi_target[i],
> -			dist_base + GIC_DIST_TARGET + i * 4);
> +			dist_base + target_reg);
> +	}
>  
>  	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
>  		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
> @@ -665,9 +724,19 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>  	 */
>  	dmb(ishst);
>  
> -	/* this always happens on GIC0 */
> -	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> -
> +	/*
> +	 * CPUTargetList -- bit[23:16] in GIC_DIST_SOFTINT in ARM GIC.
> +	 *                  bit[23:8] in GIC_DIST_SOFTINT in HiP04 GIC.
> +	 * NSATT -- bit[15] in GIC_DIST_SOFTINT in ARM GIC.
> +	 *          bit[7] in GIC_DIST_SOFTINT in HiP04 GIC.
> +	 * this always happens on GIC0
> +	 */

> +	if (gic_is_standard(&gic_data[0]))
> +		map = map << 16;
> +	else
> +		map = map << 8;

Here's another place where you should store the shift and avoid the if()
logic inside the lock.

> +	writel_relaxed(map | irq,
> +		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
>  }
>  #endif
> @@ -681,10 +750,15 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>   */
>  void gic_send_sgi(unsigned int cpu_id, unsigned int irq)
>  {
> -	BUG_ON(cpu_id >= NR_GIC_CPU_IF);
> +	BUG_ON(cpu_id >= gic_data[0].nr_cpu_if);
>  	cpu_id = 1 << cpu_id;
>  	/* this always happens on GIC0 */
> -	writel_relaxed((cpu_id << 16) | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

> +	if (gic_is_standard(&gic_data[0]))
> +		cpu_id = cpu_id << 16;
> +	else
> +		cpu_id = cpu_id << 8;

You can use the stored shift here as well.
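
Something along these lines, again only a sketch under assumptions: the
sgi_shift field and the gic_sgi_cfg_init() and sgi_value() helpers are
hypothetical names, and the bit positions are the ones from the comment
in your patch (CPUTargetList in bit[23:16] for ARM GIC, bit[23:8] for
HiP04 GIC).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct gic_chip_data. */
struct gic_sgi_cfg {
	uint32_t nr_cpu_if;  /* 8 for ARM GIC, 16 for HiP04 GIC */
	uint32_t sgi_shift;  /* hypothetical cached CPUTargetList shift */
};

/* Run once at init time, next to the nr_cpu_if assignment. */
static inline void gic_sgi_cfg_init(struct gic_sgi_cfg *cfg,
				    uint32_t nr_cpu_if)
{
	/* Only the two layouts discussed in this thread are handled. */
	assert(nr_cpu_if == 8 || nr_cpu_if == 16);
	cfg->nr_cpu_if = nr_cpu_if;
	/* CPUTargetList ends at bit 23 in both layouts: 16 or 8. */
	cfg->sgi_shift = 24u - nr_cpu_if;
}

/* Value written to GIC_DIST_SOFTINT, with no per-call branch. */
static inline uint32_t sgi_value(const struct gic_sgi_cfg *cfg,
				 uint32_t cpu_map, uint32_t irq)
{
	return (cpu_map << cfg->sgi_shift) | irq;
}
```

Both gic_raise_softirq() and gic_send_sgi() could then share the same
expression instead of repeating the if/else on gic_is_standard().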

> +	writel_relaxed(cpu_id | irq,
> +		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>  }
>  
>  /*
> @@ -700,7 +774,7 @@ int gic_get_cpu_id(unsigned int cpu)
>  {
>  	unsigned int cpu_bit;
>  
> -	if (cpu >= NR_GIC_CPU_IF)
> +	if (cpu >= gic_data[0].nr_cpu_if)
>  		return -1;
>  	cpu_bit = gic_cpu_map[cpu];
>  	if (cpu_bit & (cpu_bit - 1))
> @@ -747,13 +821,14 @@ void gic_migrate_target(unsigned int new_cpu_id)
>  	 * CPU interface and migrate them to the new CPU interface.
>  	 * We skip DIST_TARGET 0 to 7 as they are read-only.
>  	 */
> -	for (i = 8; i < DIV_ROUND_UP(gic_irqs, 4); i++) {
> -		val = readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
> +	for (i = 8; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> +		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
> +		val = readl_relaxed(dist_base + target_reg);
>  		active_mask = val & cur_target_mask;
>  		if (active_mask) {
>  			val &= ~active_mask;
>  			val |= ror32(active_mask, ror_val);
> -			writel_relaxed(val, dist_base + GIC_DIST_TARGET + i*4);
> +			writel_relaxed(val, dist_base + target_reg);
>  		}
>  	}
>  
> @@ -931,7 +1006,7 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
>  	irq_hw_number_t hwirq_base;
>  	struct gic_chip_data *gic;
>  	int gic_irqs, irq_base, i;
> -	int nr_routable_irqs;
> +	int nr_routable_irqs, max_nr_irq;
>  
>  	BUG_ON(gic_nr >= MAX_GIC_NR);
>  
> @@ -967,12 +1042,22 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
>  		gic_set_base_accessor(gic, gic_get_common_base);
>  	}
>  
> +	if (of_device_is_compatible(node, "hisilicon,hip04-gic")) {
> +		/* HiP04 GIC supports 16 CPUs at most */
> +		gic->nr_cpu_if = 16;
> +		max_nr_irq = 510;
> +	} else {
> +		/* ARM/Qualcomm GIC supports 8 CPUs at most */
> +		gic->nr_cpu_if = 8;
> +		max_nr_irq = 1020;
> +	}
> +
>  	/*
>  	 * Initialize the CPU interface map to all CPUs.
>  	 * It will be refined as each CPU probes its ID.
>  	 */
> -	for (i = 0; i < NR_GIC_CPU_IF; i++)
> -		gic_cpu_map[i] = 0xff;
> +	for (i = 0; i < gic->nr_cpu_if; i++)
> +		gic_cpu_map[i] = (1 << gic->nr_cpu_if) - 1;
>  
>  	/*
>  	 * For primary GICs, skip over SGIs.
> @@ -988,12 +1073,13 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
>  
>  	/*
>  	 * Find out how many interrupts are supported.
> -	 * The GIC only supports up to 1020 interrupt sources.
> +	 * The ARM/Qualcomm GIC only supports up to 1020 interrupt sources.
> +	 * The HiP04 GIC only supports up to 510 interrupt sources.
>  	 */
>  	gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;
>  	gic_irqs = (gic_irqs + 1) * 32;
> -	if (gic_irqs > 1020)
> -		gic_irqs = 1020;
> +	if (gic_irqs > max_nr_irq)
> +		gic_irqs = max_nr_irq;
>  	gic->gic_irqs = gic_irqs;
>  
>  	gic_irqs -= hwirq_base; /* calculate # of irqs to allocate */
> @@ -1069,6 +1155,7 @@ gic_of_init(struct device_node *node, struct device_node *parent)
>  }
>  IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);
>  IRQCHIP_DECLARE(cortex_a9_gic, "arm,cortex-a9-gic", gic_of_init);
> +IRQCHIP_DECLARE(hip04_gic, "hisilicon,hip04-gic", gic_of_init);
>  IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
>  IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);

thx,

Jason.

* [PATCH v9 02/14] irq: gic: support hip04 gic
  2014-06-21  1:54   ` Jason Cooper
@ 2014-07-08 22:40     ` Jason Cooper
  2014-07-10  1:24       ` Haojian Zhuang
  0 siblings, 1 reply; 36+ messages in thread
From: Jason Cooper @ 2014-07-08 22:40 UTC (permalink / raw)
  To: linux-arm-kernel

Haojian,

I've set up a branch for all the changes to the gic driver this time
around.  Please keep an eye on:

  git://git.infradead.org/users/jcooper/linux.git irqchip/gic

And base your irqchip changes on top.

thx,

Jason.

On Fri, Jun 20, 2014 at 09:54:09PM -0400, Jason Cooper wrote:
> Hi Haojian,
> 
> I'm working through my backlog ...
> 
> On Tue, May 20, 2014 at 09:10:15PM +0800, Haojian Zhuang wrote:
> > > There are a few differences between the ARM GIC and the HiP04 GIC.
> > > 
> > > * The HiP04 GIC supports at most 16 cores, while the ARM GIC supports
> > > at most 8. So the definition of the GIC_DIST_TARGET registers differs,
> > > since the per-interrupt CPU interface field grows from 8 bits to 16 bits.
> > > 
> > > * The HiP04 GIC supports at most 510 interrupts, while the ARM GIC
> > > supports at most 1020.
> > 
> > Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
> > ---
> >  Documentation/devicetree/bindings/arm/gic.txt |   1 +
> >  drivers/irqchip/irq-gic.c                     | 159 ++++++++++++++++++++------
> >  2 files changed, 124 insertions(+), 36 deletions(-)
> > 
> ...
> > diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> > index f711fb6..64af475 100644
> > --- a/drivers/irqchip/irq-gic.c
> > +++ b/drivers/irqchip/irq-gic.c
> > @@ -68,6 +68,7 @@ struct gic_chip_data {
> >  #ifdef CONFIG_GIC_NON_BANKED
> >  	void __iomem *(*get_base)(union gic_base *);
> >  #endif
> > +	u32 nr_cpu_if;
> >  };
> >  
> >  static DEFINE_RAW_SPINLOCK(irq_controller_lock);
> > @@ -76,9 +77,11 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
> >   * The GIC mapping of CPU interfaces does not necessarily match
> >   * the logical CPU numbering.  Let's use a mapping as returned
> >   * by the GIC itself.
> > + *
> > + * Hisilicon HiP04 extends the number of CPU interface from 8 to 16.
> >   */
> > -#define NR_GIC_CPU_IF 8
> > -static u8 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
> > +#define NR_GIC_CPU_IF	16
> > +static u16 gic_cpu_map[NR_GIC_CPU_IF] __read_mostly;
> >  
> >  /*
> >   * Supported arch specific GIC irq extension.
> > @@ -241,20 +244,62 @@ static int gic_retrigger(struct irq_data *d)
> >  	return 0;
> >  }
> >  
> > +static bool gic_is_standard(struct gic_chip_data *gic_data)
> > +{
> > +	return (gic_data->nr_cpu_if == 8);
> > +}
> > +
> > +static u32 irqs_per_target_reg(struct gic_chip_data *gic_data)
> > +{
> > +	return (32 / gic_data->nr_cpu_if);
> > +}
> > +
> > +/* i is the index of interrupt */
> > +static u32 irq_to_target_reg(struct gic_chip_data *gic_data, u32 i)
> > +{
> > +	if (gic_is_standard(gic_data))
> > +		i = i & ~3U;
> > +	else
> > +		i = (i << 1) & ~3U;
> > +	return (i + GIC_DIST_TARGET);
> > +}
> > +
> >  #ifdef CONFIG_SMP
> > +static u32 irq_to_core_shift(struct irq_data *d)
> > +{
> > +	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> > +	unsigned int i = gic_irq(d);
> > +
> > +	if (gic_is_standard(gic_data))
> > +		return ((i % 4) << 3);
> > +	return ((i % 2) << 4);
> > +}
> > +
> > +static u32 irq_to_core_mask(struct irq_data *d)
> > +{
> > +	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> > +	u32 mask;
> > +	/* ARM GIC, nr_cpu_if == 8; HiP04 GIC, nr_cpu_if == 16 */
> > +	mask = (1 << gic_data->nr_cpu_if) - 1;
> > +	return (mask << irq_to_core_shift(d));
> > +}
> > +
> >  static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> >  			    bool force)
> >  {
> > -	void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
> > -	unsigned int shift = (gic_irq(d) % 4) * 8;
> > +	void __iomem *reg;
> > +	struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
> > +	unsigned int shift = irq_to_core_shift(d);
> >  	unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
> >  	u32 val, mask, bit;
> >  
> > -	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
> > +	if (cpu >= gic_data->nr_cpu_if || cpu >= nr_cpu_ids)
> >  		return -EINVAL;
> >  
> > +	reg = gic_dist_base(d) + irq_to_target_reg(gic_data, gic_irq(d));
> > +
> >  	raw_spin_lock(&irq_controller_lock);
> > -	mask = 0xff << shift;
> > +	mask = irq_to_core_mask(d);
> 
> You still haven't addressed my comment from your previous version of
> this series.
> 
> Please calculate the mask once at boottime and store it.  Then you can
> use it in conjunction with the shift you calculate above.  This will
> remove the needless calculation inside the spinlock:
> 
>   mask = (1 << gic_data->nr_cpu_if) - 1;
> 
> It will also remove the extra call to irq_data_get_irq_chip_data() since
> you already did that in this function outside the spinlock.
> 
> >  	bit = gic_cpu_map[cpu] << shift;
> >  	val = readl_relaxed(reg) & ~mask;
> >  	writel_relaxed(val | bit, reg);
> > @@ -354,15 +399,20 @@ void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
> >  	irq_set_chained_handler(irq, gic_handle_cascade_irq);
> >  }
> >  
> > -static u8 gic_get_cpumask(struct gic_chip_data *gic)
> > +static u16 gic_get_cpumask(struct gic_chip_data *gic)
> >  {
> >  	void __iomem *base = gic_data_dist_base(gic);
> >  	u32 mask, i;
> >  
> > -	for (i = mask = 0; i < 32; i += 4) {
> > -		mask = readl_relaxed(base + GIC_DIST_TARGET + i);
> > +	/*
> > +	 * ARM GIC uses 8 registers for interrupt 0-31,
> > +	 * HiP04 GIC uses 16 registers for interrupt 0-31.
> > +	 */
> > +	for (i = mask = 0; i < 32; i++) {
> > +		mask = readl_relaxed(base + irq_to_target_reg(gic, i));
> >  		mask |= mask >> 16;
> > -		mask |= mask >> 8;
> > +		if (gic_is_standard(gic))
> > +			mask |= mask >> 8;
> >  		if (mask)
> >  			break;
> >  	}
> > @@ -370,6 +420,10 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
> >  	if (!mask)
> >  		pr_crit("GIC CPU mask not found - kernel will fail to boot.\n");
> >  
> > +	/* ARM GIC needs 8-bit cpu mask, HiP04 GIC needs 16-bit cpu mask. */
> > +	if (gic_is_standard(gic))
> > +		mask &= 0xff;
> > +
> >  	return mask;
> >  }
> >  
> > @@ -392,10 +446,11 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
> >  	 * Set all global interrupts to this CPU only.
> >  	 */
> >  	cpumask = gic_get_cpumask(gic);
> > -	cpumask |= cpumask << 8;
> > +	if (gic_is_standard(gic))
> > +		cpumask |= cpumask << 8;
> >  	cpumask |= cpumask << 16;
> > -	for (i = 32; i < gic_irqs; i += 4)
> > -		writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
> > +	for (i = 32; i < gic_irqs; i++)
> > +		writel_relaxed(cpumask, base + irq_to_target_reg(gic, i));
> >  
> >  	/*
> >  	 * Set priority on all global interrupts.
> > @@ -423,7 +478,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
> >  	/*
> >  	 * Get what the GIC says our CPU mask is.
> >  	 */
> > -	BUG_ON(cpu >= NR_GIC_CPU_IF);
> > +	BUG_ON(cpu >= gic->nr_cpu_if);
> >  	cpu_mask = gic_get_cpumask(gic);
> >  	gic_cpu_map[cpu] = cpu_mask;
> >  
> > @@ -431,7 +486,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
> >  	 * Clear our mask from the other map entries in case they're
> >  	 * still undefined.
> >  	 */
> > -	for (i = 0; i < NR_GIC_CPU_IF; i++)
> > +	for (i = 0; i < gic->nr_cpu_if; i++)
> >  		if (i != cpu)
> >  			gic_cpu_map[i] &= ~cpu_mask;
> >  
> > @@ -467,7 +522,7 @@ void gic_cpu_if_down(void)
> >   */
> >  static void gic_dist_save(unsigned int gic_nr)
> >  {
> > -	unsigned int gic_irqs;
> > +	unsigned int gic_irqs, target_reg = 0;
> >  	void __iomem *dist_base;
> >  	int i;
> >  
> > @@ -484,9 +539,11 @@ static void gic_dist_save(unsigned int gic_nr)
> >  		gic_data[gic_nr].saved_spi_conf[i] =
> >  			readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
> >  
> > -	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
> > +	for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> > +		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
> >  		gic_data[gic_nr].saved_spi_target[i] =
> > -			readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
> > +			readl_relaxed(dist_base + target_reg);
> > +	}
> >  
> >  	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
> >  		gic_data[gic_nr].saved_spi_enable[i] =
> > @@ -502,7 +559,7 @@ static void gic_dist_save(unsigned int gic_nr)
> >   */
> >  static void gic_dist_restore(unsigned int gic_nr)
> >  {
> > -	unsigned int gic_irqs;
> > +	unsigned int gic_irqs, target_reg = 0;
> >  	unsigned int i;
> >  	void __iomem *dist_base;
> >  
> > @@ -525,9 +582,11 @@ static void gic_dist_restore(unsigned int gic_nr)
> >  		writel_relaxed(0xa0a0a0a0,
> >  			dist_base + GIC_DIST_PRI + i * 4);
> >  
> > -	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
> > +	for (i = 0; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> > +		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
> >  		writel_relaxed(gic_data[gic_nr].saved_spi_target[i],
> > -			dist_base + GIC_DIST_TARGET + i * 4);
> > +			dist_base + target_reg);
> > +	}
> >  
> >  	for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
> >  		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
> > @@ -665,9 +724,19 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
> >  	 */
> >  	dmb(ishst);
> >  
> > -	/* this always happens on GIC0 */
> > -	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> > -
> > +	/*
> > +	 * CPUTargetList -- bit[23:16] in GIC_DIST_SOFTINT in ARM GIC.
> > +	 *                  bit[23:8] in GIC_DIST_SOFTINT in HiP04 GIC.
> > +	 * NSATT -- bit[15] in GIC_DIST_SOFTINT in ARM GIC.
> > +	 *          bit[7] in GIC_DIST_SOFTINT in HiP04 GIC.
> > +	 * this always happens on GIC0
> > +	 */
> 
> > +	if (gic_is_standard(&gic_data[0]))
> > +		map = map << 16;
> > +	else
> > +		map = map << 8;
> 
> Here's another place where you should store the shift and avoid the if()
> logic inside the lock.
> 
> > +	writel_relaxed(map | irq,
> > +		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> >  	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
> >  }
> >  #endif
> > @@ -681,10 +750,15 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
> >   */
> >  void gic_send_sgi(unsigned int cpu_id, unsigned int irq)
> >  {
> > -	BUG_ON(cpu_id >= NR_GIC_CPU_IF);
> > +	BUG_ON(cpu_id >= gic_data[0].nr_cpu_if);
> >  	cpu_id = 1 << cpu_id;
> >  	/* this always happens on GIC0 */
> > -	writel_relaxed((cpu_id << 16) | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> 
> > +	if (gic_is_standard(&gic_data[0]))
> > +		cpu_id = cpu_id << 16;
> > +	else
> > +		cpu_id = cpu_id << 8;
> 
> You can use the stored shift here as well.
> 
> > +	writel_relaxed(cpu_id | irq,
> > +		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
> >  }
> >  
> >  /*
> > @@ -700,7 +774,7 @@ int gic_get_cpu_id(unsigned int cpu)
> >  {
> >  	unsigned int cpu_bit;
> >  
> > -	if (cpu >= NR_GIC_CPU_IF)
> > +	if (cpu >= gic_data[0].nr_cpu_if)
> >  		return -1;
> >  	cpu_bit = gic_cpu_map[cpu];
> >  	if (cpu_bit & (cpu_bit - 1))
> > @@ -747,13 +821,14 @@ void gic_migrate_target(unsigned int new_cpu_id)
> >  	 * CPU interface and migrate them to the new CPU interface.
> >  	 * We skip DIST_TARGET 0 to 7 as they are read-only.
> >  	 */
> > -	for (i = 8; i < DIV_ROUND_UP(gic_irqs, 4); i++) {
> > -		val = readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);
> > +	for (i = 8; i < gic_irqs; i += irqs_per_target_reg(&gic_data[gic_nr])) {
> > +		target_reg = irq_to_target_reg(&gic_data[gic_nr], i);
> > +		val = readl_relaxed(dist_base + target_reg);
> >  		active_mask = val & cur_target_mask;
> >  		if (active_mask) {
> >  			val &= ~active_mask;
> >  			val |= ror32(active_mask, ror_val);
> > -			writel_relaxed(val, dist_base + GIC_DIST_TARGET + i*4);
> > +			writel_relaxed(val, dist_base + target_reg);
> >  		}
> >  	}
> >  
> > @@ -931,7 +1006,7 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
> >  	irq_hw_number_t hwirq_base;
> >  	struct gic_chip_data *gic;
> >  	int gic_irqs, irq_base, i;
> > -	int nr_routable_irqs;
> > +	int nr_routable_irqs, max_nr_irq;
> >  
> >  	BUG_ON(gic_nr >= MAX_GIC_NR);
> >  
> > @@ -967,12 +1042,22 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
> >  		gic_set_base_accessor(gic, gic_get_common_base);
> >  	}
> >  
> > +	if (of_device_is_compatible(node, "hisilicon,hip04-gic")) {
> > +		/* HiP04 GIC supports 16 CPUs at most */
> > +		gic->nr_cpu_if = 16;
> > +		max_nr_irq = 510;
> > +	} else {
> > +		/* ARM/Qualcomm GIC supports 8 CPUs at most */
> > +		gic->nr_cpu_if = 8;
> > +		max_nr_irq = 1020;
> > +	}
> > +
> >  	/*
> >  	 * Initialize the CPU interface map to all CPUs.
> >  	 * It will be refined as each CPU probes its ID.
> >  	 */
> > -	for (i = 0; i < NR_GIC_CPU_IF; i++)
> > -		gic_cpu_map[i] = 0xff;
> > +	for (i = 0; i < gic->nr_cpu_if; i++)
> > +		gic_cpu_map[i] = (1 << gic->nr_cpu_if) - 1;
> >  
> >  	/*
> >  	 * For primary GICs, skip over SGIs.
> > @@ -988,12 +1073,13 @@ void __init gic_init_bases(unsigned int gic_nr, int irq_start,
> >  
> >  	/*
> >  	 * Find out how many interrupts are supported.
> > -	 * The GIC only supports up to 1020 interrupt sources.
> > +	 * The ARM/Qualcomm GIC only supports up to 1020 interrupt sources.
> > +	 * The HiP04 GIC only supports up to 510 interrupt sources.
> >  	 */
> >  	gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;
> >  	gic_irqs = (gic_irqs + 1) * 32;
> > -	if (gic_irqs > 1020)
> > -		gic_irqs = 1020;
> > +	if (gic_irqs > max_nr_irq)
> > +		gic_irqs = max_nr_irq;
> >  	gic->gic_irqs = gic_irqs;
> >  
> >  	gic_irqs -= hwirq_base; /* calculate # of irqs to allocate */
> > @@ -1069,6 +1155,7 @@ gic_of_init(struct device_node *node, struct device_node *parent)
> >  }
> >  IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);
> >  IRQCHIP_DECLARE(cortex_a9_gic, "arm,cortex-a9-gic", gic_of_init);
> > +IRQCHIP_DECLARE(hip04_gic, "hisilicon,hip04-gic", gic_of_init);
> >  IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
> >  IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);
> 
> thx,
> 
> Jason.
> 

* [PATCH v9 02/14] irq: gic: support hip04 gic
  2014-07-08 22:40     ` Jason Cooper
@ 2014-07-10  1:24       ` Haojian Zhuang
  0 siblings, 0 replies; 36+ messages in thread
From: Haojian Zhuang @ 2014-07-10  1:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 9 July 2014 06:40, Jason Cooper <jason@lakedaemon.net> wrote:
> Haojian,
>
> I've set up a branch for all the changes to the gic driver this time
> around.  Please keep an eye on:
>
>   git://git.infradead.org/users/jcooper/linux.git irqchip/gic
>
> And base your irqchip changes on top.
>
> thx,
>
> Jason.
>

Sure. I'll rebase my irqchip patch on this branch. Thanks for your reminder.

Best Regards
Haojian

Thread overview: 36+ messages
2014-05-20 13:10 [PATCH v9 00/14] enable HiP04 SoC Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 01/14] ARM: debug: add HiP04 debug uart Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 02/14] irq: gic: support hip04 gic Haojian Zhuang
2014-05-21 10:15   ` Marc Zyngier
2014-06-21  1:54   ` Jason Cooper
2014-07-08 22:40     ` Jason Cooper
2014-07-10  1:24       ` Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 03/14] ARM: mcpm: support 4 clusters Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 04/14] ARM: hisi: add ARCH_HISI Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 05/14] ARM: hisi: enable MCPM implementation Haojian Zhuang
2014-05-21  1:29   ` Nicolas Pitre
2014-05-21  1:48     ` Haojian Zhuang
2014-05-21  2:06       ` Nicolas Pitre
2014-05-20 13:10 ` [PATCH v9 06/14] ARM: hisi: enable HiP04 Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 07/14] document: dt: add the binding on HiP04 Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 08/14] document: dt: add the binding on HiP04 clock Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 09/14] ARM: dts: append hip04 dts Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 10/14] ARM: config: append lpae configuration Haojian Zhuang
2014-05-20 13:52   ` Gregory CLEMENT
2014-05-20 14:08   ` Alexandre Belloni
2014-05-20 18:19   ` Olof Johansson
2014-05-20 13:10 ` [PATCH v9 11/14] ARM: config: append hip04_defconfig Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 12/14] ARM: config: select ARCH_HISI in hi3xxx_defconfig Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 13/14] ARM: hisi: enable erratum 798181 of A15 on HiP04 Haojian Zhuang
2014-05-20 13:10 ` [PATCH v9 14/14] virt: arm: support hip04 gic Haojian Zhuang
2014-05-20 13:34   ` Haojian Zhuang
2014-05-20 13:44   ` Christoffer Dall
2014-05-20 13:52     ` Haojian Zhuang
2014-05-20 14:01       ` Christoffer Dall
2014-05-20 14:16         ` Haojian Zhuang
2014-05-20 15:05           ` Christoffer Dall
2014-05-20 15:39             ` Haojian Zhuang
2014-05-21  9:02               ` Christoffer Dall
2014-05-21  9:47                 ` Haojian Zhuang
2014-05-21  9:55                   ` Christoffer Dall
2014-05-21 13:11   ` Marc Zyngier
