* [PATCH 00/17] Initial POWER9 XIVE and PHB4 support
@ 2016-06-27 12:25 Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 01/17] powerpc/powernv: Add XICS emulation APIs Benjamin Herrenschmidt
                   ` (16 more replies)
  0 siblings, 17 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

This provides initial base support for the XIVE interrupt controller
and the new IODA3-compliant PHB4, both found on POWER9. A bug fix or
two is interleaved in the series, as I found problems in sim.

The new OPAL APIs haven't been committed yet so are still subject
to change, thus the patches relying on them shouldn't be merged just
yet, at least not until we freeze those APIs. The individual fixes
are good to go though.

This is very basic support for XIVE by using OPAL APIs that simulate
a XICS. This is meant as a fallback if the interrupt controller
implementation is unknown.

A subsequent series will provide the native support for XIVE (aka
exploitation mode) and this will remain as a fallback for future
chips that might implement a variant that isn't backward compatible.

Similarly, we use an OPAL call for TCE invalidations in IODA3. We
may at some point add native support for PHB4, but this gives us
a fallback in case the HW changes in unsupported ways.


* [PATCH 01/17] powerpc/powernv: Add XICS emulation APIs
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 02/17] powerpc/irq: Add support for HV virtualization interrupts Benjamin Herrenschmidt
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

OPAL provides an emulated XICS interrupt controller to
use as a fallback on newer processors that don't have a
XICS. It's meant as a way to provide backward compatibility
with future processors. Add the corresponding interfaces.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/opal-api.h            | 6 +++++-
 arch/powerpc/include/asm/opal.h                | 5 +++++
 arch/powerpc/platforms/powernv/opal-wrappers.S | 4 ++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 9bb8ddf..4b4b559 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -158,7 +158,11 @@
 #define OPAL_LEDS_SET_INDICATOR			115
 #define OPAL_CEC_REBOOT2			116
 #define OPAL_CONSOLE_FLUSH			117
-#define OPAL_LAST				117
+#define OPAL_INT_GET_XIRR			122
+#define	OPAL_INT_SET_CPPR			123
+#define OPAL_INT_EOI				124
+#define OPAL_INT_SET_MFRR			125
+#define OPAL_LAST				125
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9d86c66..6ccb847 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -210,6 +210,11 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
 int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
 		uint64_t token);
 
+int64_t opal_int_get_xirr(uint32_t *out_xirr, bool just_poll);
+int64_t opal_int_set_cppr(uint8_t cppr);
+int64_t opal_int_eoi(uint32_t xirr);
+int64_t opal_int_set_mfrr(uint32_t cpu, uint8_t mfrr);
+
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
 				   int depth, void *data);
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index e45b88a..3854343 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -302,3 +302,7 @@ OPAL_CALL(opal_prd_msg,				OPAL_PRD_MSG);
 OPAL_CALL(opal_leds_get_ind,			OPAL_LEDS_GET_INDICATOR);
 OPAL_CALL(opal_leds_set_ind,			OPAL_LEDS_SET_INDICATOR);
 OPAL_CALL(opal_console_flush,			OPAL_CONSOLE_FLUSH);
+OPAL_CALL(opal_int_get_xirr,			OPAL_INT_GET_XIRR);
+OPAL_CALL(opal_int_set_cppr,			OPAL_INT_SET_CPPR);
+OPAL_CALL(opal_int_eoi,				OPAL_INT_EOI);
+OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
-- 
2.7.4
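
For readers following the thread: the 32-bit XIRR value exchanged through
opal_int_get_xirr() and opal_int_eoi() is used later in this series with
the priority (CPPR) in the top byte and the interrupt source number in the
low 24 bits. Below is a minimal user-space sketch of that packing. It is
not part of the patch, and the helper names (xirr_pack, xirr_xisr,
xirr_cppr) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Layout as used by the ICP OPAL backend in this series:
 * bits 31..24 = CPPR (priority), bits 23..0 = source number. */
#define XIRR_XISR_MASK 0x00ffffffu

/* Hypothetical helper: build an XIRR value from priority and source. */
static inline uint32_t xirr_pack(uint8_t cppr, uint32_t xisr)
{
	assert((xisr & ~XIRR_XISR_MASK) == 0);	/* source must fit in 24 bits */
	return ((uint32_t)cppr << 24) | xisr;
}

/* Hypothetical helper: extract the source number (low 24 bits). */
static inline uint32_t xirr_xisr(uint32_t xirr)
{
	return xirr & XIRR_XISR_MASK;
}

/* Hypothetical helper: extract the priority (top byte). */
static inline uint8_t xirr_cppr(uint32_t xirr)
{
	return (uint8_t)(xirr >> 24);
}
```

With these, xirr_pack(0x05, 0x123) yields 0x05000123, which is the shape of
the argument an EOI call would receive.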


* [PATCH 02/17] powerpc/irq: Add support for HV virtualization interrupts
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 01/17] powerpc/powernv: Add XICS emulation APIs Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 03/17] powerpc/irq: Add mechanism to force a replay of interrupts Benjamin Herrenschmidt
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

This will be delivering external interrupts from the XIVE to the
Hypervisor. We treat it as a normal external interrupt for the
lazy irq disable code (so it will be replayed as a 0x500) and
route it to do_IRQ.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/exception-64s.h |  2 ++
 arch/powerpc/include/asm/reg.h           |  1 +
 arch/powerpc/kernel/cpu_setup_power.S    |  2 ++
 arch/powerpc/kernel/exceptions-64s.S     | 19 +++++++++++++++++++
 4 files changed, 24 insertions(+)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 93ae809..c7d2773 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -403,6 +403,8 @@ label##_relon_hv:						\
 #define SOFTEN_VALUE_0xe82	PACA_IRQ_DBELL
 #define SOFTEN_VALUE_0xe60	PACA_IRQ_HMI
 #define SOFTEN_VALUE_0xe62	PACA_IRQ_HMI
+#define SOFTEN_VALUE_0xea0	PACA_IRQ_EE
+#define SOFTEN_VALUE_0xea2	PACA_IRQ_EE
 
 #define __SOFTEN_TEST(h, vec)						\
 	lbz	r10,PACASOFTIRQEN(r13);					\
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a0948f4..d5aabe1 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -346,6 +346,7 @@
 #define   LPCR_LPES1   0x00000004      /* LPAR Env selector 1 */
 #define   LPCR_LPES_SH	2
 #define   LPCR_RMI     0x00000002      /* real mode is cache inhibit */
+#define   LPCR_HVICE   0x00000002      /* P9: HV interrupt enable */
 #define   LPCR_HDICE   0x00000001      /* Hyp Decr enable (HV,PR,EE) */
 #define   LPCR_UPRT    0x00400000      /* Use Process Table (ISA 3) */
 #ifndef SPRN_LPID
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 584e119..fd77440 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -94,6 +94,7 @@ _GLOBAL(__setup_cpu_power9)
 	mtspr	SPRN_LPID,r0
 	mfspr	r3,SPRN_LPCR
 	ori	r3, r3, LPCR_PECEDH
+	ori	r3, r3, LPCR_HVICE
 	bl	__init_LPCR
 	bl	__init_HFSCR
 	bl	__init_tlb_power9
@@ -111,6 +112,7 @@ _GLOBAL(__restore_cpu_power9)
 	mtspr	SPRN_LPID,r0
 	mfspr   r3,SPRN_LPCR
 	ori	r3, r3, LPCR_PECEDH
+	ori	r3, r3, LPCR_HVICE
 	bl	__init_LPCR
 	bl	__init_HFSCR
 	bl	__init_tlb_power9
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4c94406..5726d84 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -351,6 +351,12 @@ hv_doorbell_trampoline:
 	EXCEPTION_PROLOG_0(PACA_EXGEN)
 	b	h_doorbell_hv
 
+	. = 0xea0
+hv_virt_irq_trampoline:
+	SET_SCRATCH0(r13)
+	EXCEPTION_PROLOG_0(PACA_EXGEN)
+	b	h_virt_irq_hv
+
 	/* We need to deal with the Altivec unavailable exception
 	 * here which is at 0xf20, thus in the middle of the
 	 * prolog code of the PerformanceMonitor one. A little
@@ -601,6 +607,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	MASKABLE_EXCEPTION_HV_OOL(0xe82, h_doorbell)
 	KVM_HANDLER(PACA_EXGEN, EXC_HV, 0xe82)
 
+	MASKABLE_EXCEPTION_HV_OOL(0xea2, h_virt_irq)
+	KVM_HANDLER(PACA_EXGEN, EXC_HV, 0xea2)
+
 	/* moved from 0xf00 */
 	STD_EXCEPTION_PSERIES_OOL(0xf00, performance_monitor)
 	KVM_HANDLER(PACA_EXGEN, EXC_STD, 0xf00)
@@ -680,6 +689,8 @@ _GLOBAL(__replay_interrupt)
 BEGIN_FTR_SECTION
 	cmpwi	r3,0xe80
 	beq	h_doorbell_common
+	cmpwi	r3,0xea0
+	beq	h_virt_irq_common
 FTR_SECTION_ELSE
 	cmpwi	r3,0xa00
 	beq	doorbell_super_common
@@ -754,6 +765,7 @@ kvmppc_skip_Hinterrupt:
 #else
 	STD_EXCEPTION_COMMON_ASYNC(0xe80, h_doorbell, unknown_exception)
 #endif
+	STD_EXCEPTION_COMMON_ASYNC(0xea0, h_virt_irq, do_IRQ)
 	STD_EXCEPTION_COMMON_ASYNC(0xf00, performance_monitor, performance_monitor_exception)
 	STD_EXCEPTION_COMMON(0x1300, instruction_breakpoint, instruction_breakpoint_exception)
 	STD_EXCEPTION_COMMON(0x1502, denorm, unknown_exception)
@@ -877,6 +889,12 @@ h_doorbell_relon_trampoline:
 	EXCEPTION_PROLOG_0(PACA_EXGEN)
 	b	h_doorbell_relon_hv
 
+	. = 0x4ea0
+h_virt_irq_relon_trampoline:
+	SET_SCRATCH0(r13)
+	EXCEPTION_PROLOG_0(PACA_EXGEN)
+	b	h_virt_irq_relon_hv
+
 	. = 0x4f00
 performance_monitor_relon_pseries_trampoline:
 	SET_SCRATCH0(r13)
@@ -1137,6 +1155,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
 	/* Equivalents to the above handlers for relocation-on interrupt vectors */
 	STD_RELON_EXCEPTION_HV_OOL(0xe40, emulation_assist)
 	MASKABLE_RELON_EXCEPTION_HV_OOL(0xe80, h_doorbell)
+	MASKABLE_RELON_EXCEPTION_HV_OOL(0xea0, h_virt_irq)
 
 	STD_RELON_EXCEPTION_PSERIES_OOL(0xf00, performance_monitor)
 	STD_RELON_EXCEPTION_PSERIES_OOL(0xf20, altivec_unavailable)
-- 
2.7.4
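
To illustrate the "treat it as a normal external interrupt" point from the
changelog: when the 0xea0 vector arrives while interrupts are soft-disabled,
it is recorded under the same pending bit as a regular 0x500 external
interrupt, so the later replay goes out as 0x500. A rough user-space model
follows. The bit values and function names are assumptions made for this
sketch, not the kernel's definitions.

```c
#include <stdint.h>

/* Illustrative pending-bit values; assumptions for this sketch only. */
enum {
	PENDING_DBELL = 0x02,
	PENDING_EE    = 0x04,
};

/* Which pending bit is recorded when a vector fires while soft-disabled
 * (the role played by the SOFTEN_VALUE_* defines in the patch). */
static uint8_t pending_bit_for_vector(unsigned int vec)
{
	switch (vec) {
	case 0x500:	/* regular external interrupt */
	case 0xea0:	/* HV virtualization interrupt, folded into EE */
		return PENDING_EE;
	case 0xe80:	/* HV doorbell */
		return PENDING_DBELL;
	default:
		return 0;	/* not modelled here */
	}
}

/* Which vector is replayed once a pending bit is found on re-enable. */
static unsigned int replay_vector_for_bit(uint8_t bit)
{
	switch (bit) {
	case PENDING_EE:
		return 0x500;
	case PENDING_DBELL:
		return 0xe80;
	default:
		return 0;
	}
}
```

The point of the model is only that 0xea0 and 0x500 collapse to one bit, so
no new pending flag is needed for the new vector.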


* [PATCH 03/17] powerpc/irq: Add mechanism to force a replay of interrupts
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 01/17] powerpc/powernv: Add XICS emulation APIs Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 02/17] powerpc/irq: Add support for HV virtualization interrupts Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 04/17] powerpc/xics: Add ICP OPAL backend Benjamin Herrenschmidt
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

Calling this function with interrupts soft-disabled will cause
a replay of the external interrupt vector when they are re-enabled.

This will be used by the OPAL XICS backend (and later by the native
XIVE code) to handle EOI signaling that there are more interrupts to
fetch from the hardware since the hardware won't issue another HW
interrupt in that case.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/hw_irq.h |  2 ++
 arch/powerpc/kernel/irq.c         | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index b59ac27..c7d82ff 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -130,6 +130,8 @@ static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
 
 extern bool prep_irq_for_idle(void);
 
+extern void force_external_irq_replay(void);
+
 #else /* CONFIG_PPC64 */
 
 #define SET_MSR_EE(x)	mtmsr(x)
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 3cb46a3..604e3dd 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -342,6 +342,20 @@ bool prep_irq_for_idle(void)
 	return true;
 }
 
+/* Force a replay of the external interrupt handler on this
+ * CPU.
+ */
+void force_external_irq_replay(void)
+{
+	/* This must only be called with interrupts soft-disabled,
+	 * the replay will happen when re-enabling
+	 */
+	WARN_ON(!arch_irqs_disabled());
+
+	/* Indicate in the PACA that we have an interrupt to replay */
+	local_paca->irq_happened |= PACA_IRQ_EE;
+}
+
 #endif /* CONFIG_PPC64 */
 
 int arch_show_interrupts(struct seq_file *p, int prec)
-- 
2.7.4
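
As a rough illustration of the mechanism, here is a small user-space model
of the soft-disable state: forcing a replay only sets a pending bit, and the
external vector is actually replayed when interrupts are soft-enabled again.
The struct, the MODEL_IRQ_EE value, and the model_* names are all
assumptions for this sketch and do not mirror the real PACA layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_IRQ_EE 0x04	/* illustrative pending-bit value */

struct cpu_model {
	bool soft_enabled;		/* interrupts soft-enabled? */
	uint8_t irq_happened;		/* pending-replay bitmask */
	unsigned int ee_replays;	/* external-vector replays so far */
};

static void model_irq_disable(struct cpu_model *c)
{
	c->soft_enabled = false;
}

/* Mark an external interrupt as pending. Returns false when called with
 * interrupts soft-enabled, which is where the kernel version WARNs; the
 * bit is set either way, as in the patch. */
static bool model_force_external_irq_replay(struct cpu_model *c)
{
	bool ok = !c->soft_enabled;

	c->irq_happened |= MODEL_IRQ_EE;
	return ok;
}

/* Re-enable: replay the external vector if one was marked pending. */
static void model_irq_enable(struct cpu_model *c)
{
	if (c->irq_happened & MODEL_IRQ_EE) {
		c->irq_happened &= (uint8_t)~MODEL_IRQ_EE;
		c->ee_replays++;
	}
	c->soft_enabled = true;
}
```

A disable / force / enable sequence therefore yields exactly one replay, and
the pending bit is clear afterwards.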


* [PATCH 04/17] powerpc/xics: Add ICP OPAL backend
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (2 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 03/17] powerpc/irq: Add mechanism to force a replay of interrupts Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 05/17] powerpc/powernv: Add IODA3 PHB type Benjamin Herrenschmidt
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

This adds a new XICS backend that uses OPAL calls, which can be
used when we don't have native support for the platform interrupt
controller.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/xics.h        |   8 +-
 arch/powerpc/sysdev/xics/Makefile      |   2 +-
 arch/powerpc/sysdev/xics/icp-opal.c    | 144 +++++++++++++++++++++++++++++++++
 arch/powerpc/sysdev/xics/xics-common.c |   5 +-
 4 files changed, 156 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/sysdev/xics/icp-opal.c

diff --git a/arch/powerpc/include/asm/xics.h b/arch/powerpc/include/asm/xics.h
index 04ef3ae..a30d845 100644
--- a/arch/powerpc/include/asm/xics.h
+++ b/arch/powerpc/include/asm/xics.h
@@ -42,6 +42,12 @@ extern int icp_hv_init(void);
 static inline int icp_hv_init(void) { return -ENODEV; }
 #endif
 
+#ifdef CONFIG_PPC_POWERNV
+extern int icp_opal_init(void);
+#else
+static inline int icp_opal_init(void) { return -ENODEV; }
+#endif
+
 /* ICP ops */
 struct icp_ops {
 	unsigned int (*get_irq)(void);
@@ -135,7 +141,7 @@ static inline void xics_set_base_cppr(unsigned char cppr)
 static inline unsigned char xics_cppr_top(void)
 {
 	struct xics_cppr *os_cppr = this_cpu_ptr(&xics_cppr);
-	
+
 	return os_cppr->stack[os_cppr->index];
 }
 
diff --git a/arch/powerpc/sysdev/xics/Makefile b/arch/powerpc/sysdev/xics/Makefile
index c606aa8..5d7f5a6 100644
--- a/arch/powerpc/sysdev/xics/Makefile
+++ b/arch/powerpc/sysdev/xics/Makefile
@@ -4,4 +4,4 @@ obj-y				+= xics-common.o
 obj-$(CONFIG_PPC_ICP_NATIVE)	+= icp-native.o
 obj-$(CONFIG_PPC_ICP_HV)	+= icp-hv.o
 obj-$(CONFIG_PPC_ICS_RTAS)	+= ics-rtas.o
-obj-$(CONFIG_PPC_POWERNV)	+= ics-opal.o
+obj-$(CONFIG_PPC_POWERNV)	+= ics-opal.o icp-opal.o
diff --git a/arch/powerpc/sysdev/xics/icp-opal.c b/arch/powerpc/sysdev/xics/icp-opal.c
new file mode 100644
index 0000000..eb484e9
--- /dev/null
+++ b/arch/powerpc/sysdev/xics/icp-opal.c
@@ -0,0 +1,144 @@
+/*
+ * Copyright 2011 IBM Corporation.
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ *
+ */
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/irq.h>
+#include <linux/smp.h>
+#include <linux/interrupt.h>
+#include <linux/cpu.h>
+#include <linux/of.h>
+
+#include <asm/smp.h>
+#include <asm/irq.h>
+#include <asm/errno.h>
+#include <asm/xics.h>
+#include <asm/io.h>
+#include <asm/opal.h>
+
+static void icp_opal_teardown_cpu(void)
+{
+	int cpu = smp_processor_id();
+
+	/* Clear any pending IPI */
+	opal_int_set_mfrr(cpu, 0xff);
+}
+
+static void icp_opal_flush_ipi(void)
+{
+	/* We take the ipi irq but never return so we
+	 * need to EOI the IPI, but want to leave our priority 0
+	 *
+	 * should we check all the other interrupts too?
+	 * should we be flagging idle loop instead?
+	 * or creating some task to be scheduled?
+	 */
+
+	opal_int_eoi((0x00 << 24) | XICS_IPI);
+}
+
+static unsigned int icp_opal_get_irq(void)
+{
+	unsigned int xirr;
+	unsigned int vec;
+	unsigned int irq;
+	int64_t rc;
+
+	rc = opal_int_get_xirr(&xirr, false);
+	if (rc < 0)
+		return NO_IRQ;
+	xirr = be32_to_cpu(xirr);
+	vec = xirr & 0x00ffffff;
+	if (vec == XICS_IRQ_SPURIOUS)
+		return NO_IRQ;
+
+	irq = irq_find_mapping(xics_host, vec);
+	if (likely(irq != NO_IRQ)) {
+		xics_push_cppr(vec);
+		return irq;
+	}
+
+	/* We don't have a linux mapping, so have rtas mask it. */
+	xics_mask_unknown_vec(vec);
+
+	/* We might learn about it later, so EOI it */
+	opal_int_eoi(xirr);
+
+	return NO_IRQ;
+}
+
+static void icp_opal_set_cpu_priority(unsigned char cppr)
+{
+	xics_set_base_cppr(cppr);
+	opal_int_set_cppr(cppr);
+	iosync();
+}
+
+static void icp_opal_eoi(struct irq_data *d)
+{
+	unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
+	int64_t rc;
+
+	iosync();
+	rc = opal_int_eoi((xics_pop_cppr() << 24) | hw_irq);
+
+	/* EOI tells us whether there are more interrupts to fetch.
+	 *
+	 * Some HW implementations might not be able to send us another
+	 * external interrupt in that case, so we force a replay.
+	 */
+	if (rc > 0)
+		force_external_irq_replay();
+}
+
+#ifdef CONFIG_SMP
+
+static void icp_opal_cause_ipi(int cpu, unsigned long data)
+{
+	opal_int_set_mfrr(cpu, IPI_PRIORITY);
+}
+
+static irqreturn_t icp_opal_ipi_action(int irq, void *dev_id)
+{
+	int cpu = smp_processor_id();
+
+	opal_int_set_mfrr(cpu, 0xff);
+
+	return smp_ipi_demux();
+}
+
+#endif /* CONFIG_SMP */
+
+static const struct icp_ops icp_opal_ops = {
+	.get_irq	= icp_opal_get_irq,
+	.eoi		= icp_opal_eoi,
+	.set_priority	= icp_opal_set_cpu_priority,
+	.teardown_cpu	= icp_opal_teardown_cpu,
+	.flush_ipi	= icp_opal_flush_ipi,
+#ifdef CONFIG_SMP
+	.ipi_action	= icp_opal_ipi_action,
+	.cause_ipi	= icp_opal_cause_ipi,
+#endif
+};
+
+int icp_opal_init(void)
+{
+	struct device_node *np;
+
+	np = of_find_compatible_node(NULL, NULL, "ibm,opal-intc");
+	if (!np)
+		return -ENODEV;
+
+	icp_ops = &icp_opal_ops;
+
+	printk("XICS: Using OPAL ICP fallbacks\n");
+
+	return 0;
+}
+
diff --git a/arch/powerpc/sysdev/xics/xics-common.c b/arch/powerpc/sysdev/xics/xics-common.c
index 47e43b7..a795a5f 100644
--- a/arch/powerpc/sysdev/xics/xics-common.c
+++ b/arch/powerpc/sysdev/xics/xics-common.c
@@ -404,8 +404,11 @@ void __init xics_init(void)
 	/* Fist locate ICP */
 	if (firmware_has_feature(FW_FEATURE_LPAR))
 		rc = icp_hv_init();
-	if (rc < 0)
+	if (rc < 0) {
 		rc = icp_native_init();
+		if (rc == -ENODEV)
+		    rc = icp_opal_init();
+	}
 	if (rc < 0) {
 		pr_warning("XICS: Cannot find a Presentation Controller !\n");
 		return;
-- 
2.7.4
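
The probe order that the xics_init() hunk ends up with can be modelled in
user space as below. One detail worth noticing is that the OPAL backend is
only tried when the native probe returns exactly -ENODEV, not on any other
error. This is a sketch only: select_icp() and the probe_* stubs are
hypothetical, and the initial value of rc is assumed to be negative.

```c
#include <errno.h>
#include <stdbool.h>

typedef int (*icp_init_fn)(void);

/* Stub probes standing in for icp_hv_init() and friends. */
static int probe_ok(void)     { return 0; }
static int probe_absent(void) { return -ENODEV; }
static int probe_broken(void) { return -EINVAL; }

/* Mirrors the selection logic after the patch: HV first (LPAR only),
 * then native, then the OPAL fallback if native reports -ENODEV. */
static int select_icp(bool is_lpar, icp_init_fn hv,
		      icp_init_fn native, icp_init_fn opal)
{
	int rc = -ENODEV;	/* assumed initial value */

	if (is_lpar)
		rc = hv();
	if (rc < 0) {
		rc = native();
		if (rc == -ENODEV)
			rc = opal();
	}
	return rc;
}
```

For example, select_icp(false, probe_absent, probe_broken, probe_ok) returns
-EINVAL: a native probe that fails for a reason other than "not present"
does not fall through to OPAL.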


* [PATCH 05/17] powerpc/powernv: Add IODA3 PHB type
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (3 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 04/17] powerpc/xics: Add ICP OPAL backend Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 06/17] powerpc/pseries/pci: Remove obsolete SW invalidate Benjamin Herrenschmidt
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

For now, mostly treat it as IODA2.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 15 +++++++++++----
 arch/powerpc/platforms/powernv/pci.c      |  4 ++++
 arch/powerpc/platforms/powernv/pci.h      |  2 ++
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3a5ea82..341a9db 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -408,7 +408,8 @@ static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
 	const u32 *r;
 	u64 pci_addr;
 
-	if (phb->type != PNV_PHB_IODA1 && phb->type != PNV_PHB_IODA2) {
+	if (phb->type != PNV_PHB_IODA1 && phb->type != PNV_PHB_IODA2 &&
+	    phb->type != PNV_PHB_IODA3) {
 		pr_info("  Not support M64 window\n");
 		return;
 	}
@@ -1419,7 +1420,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
 	/* Release VF PEs */
 	pnv_ioda_release_vf_PE(pdev);
 
-	if (phb->type == PNV_PHB_IODA2) {
+	if (phb->type == PNV_PHB_IODA2 || phb->type == PNV_PHB_IODA3) {
 		if (!pdn->m64_single_mode)
 			pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map);
 
@@ -1515,7 +1516,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 	phb = hose->private_data;
 	pdn = pci_get_pdn(pdev);
 
-	if (phb->type == PNV_PHB_IODA2) {
+	if (phb->type == PNV_PHB_IODA2 || phb->type == PNV_PHB_IODA3) {
 		if (!pdn->vfs_expanded) {
 			dev_info(&pdev->dev, "don't support this SRIOV device"
 				" with non 64bit-prefetchable IOV BAR\n");
@@ -2717,7 +2718,8 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 		 */
 		if (phb->type == PNV_PHB_IODA1) {
 			pnv_pci_ioda1_setup_dma_pe(phb, pe);
-		} else if (phb->type == PNV_PHB_IODA2) {
+		} else if (phb->type == PNV_PHB_IODA2 ||
+			   phb->type == PNV_PHB_IODA3) {
 			pe_info(pe, "Assign DMA32 space\n");
 			pnv_pci_ioda2_setup_dma_pe(phb, pe);
 		} else if (phb->type == PNV_PHB_NPU) {
@@ -3621,6 +3623,11 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 		hose->mem_resources[1].flags = 0;
 }
 
+void __init pnv_pci_init_ioda3_phb(struct device_node *np)
+{
+	pnv_pci_init_ioda_phb(np, 0, PNV_PHB_IODA3);
+}
+
 void __init pnv_pci_init_ioda2_phb(struct device_node *np)
 {
 	pnv_pci_init_ioda_phb(np, 0, PNV_PHB_IODA2);
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 1d92bd9..517a789 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -815,6 +815,10 @@ void __init pnv_pci_init(void)
 	for_each_compatible_node(np, NULL, "ibm,ioda2-phb")
 		pnv_pci_init_ioda2_phb(np);
 
+	/* Look for ioda3 built-in PHB4's */
+	for_each_compatible_node(np, NULL, "ibm,ioda3-phb")
+		pnv_pci_init_ioda3_phb(np);
+
 	/* Look for NPU PHBs */
 	for_each_compatible_node(np, NULL, "ibm,ioda2-npu-phb")
 		pnv_pci_init_npu_phb(np);
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 7dee25e..772ad41 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -7,6 +7,7 @@ enum pnv_phb_type {
 	PNV_PHB_IODA1	= 0,
 	PNV_PHB_IODA2	= 1,
 	PNV_PHB_NPU	= 2,
+	PNV_PHB_IODA3	= 3,
 };
 
 /* Precise PHB model for error management */
@@ -202,6 +203,7 @@ extern void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
 				      u64 dma_offset, unsigned page_shift);
 extern void pnv_pci_init_ioda_hub(struct device_node *np);
 extern void pnv_pci_init_ioda2_phb(struct device_node *np);
+extern void pnv_pci_init_ioda3_phb(struct device_node *np);
 extern void pnv_pci_init_npu_phb(struct device_node *np);
 extern void pnv_pci_ioda_tce_invalidate(struct iommu_table *tbl,
 					__be64 *startp, __be64 *endp, bool rm);
-- 
2.7.4
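
Since the patch repeats the "IODA2 or IODA3" test in several places, one
possible way to express "mostly treat it as IODA2" is a small predicate.
The sketch below is not what the patch does (it open-codes each check), and
pnv_phb_is_ioda2_compatible() is a hypothetical name. The enum values are
copied from the pci.h hunk.

```c
#include <stdbool.h>

/* Values as in the pci.h hunk of this patch. */
enum pnv_phb_type {
	PNV_PHB_IODA1	= 0,
	PNV_PHB_IODA2	= 1,
	PNV_PHB_NPU	= 2,
	PNV_PHB_IODA3	= 3,
};

/* Hypothetical helper: true for PHB types that take the IODA2 code paths. */
static inline bool pnv_phb_is_ioda2_compatible(enum pnv_phb_type type)
{
	return type == PNV_PHB_IODA2 || type == PNV_PHB_IODA3;
}
```

A call site such as the SR-IOV enable path could then test
pnv_phb_is_ioda2_compatible(phb->type) instead of comparing against both
enumerators.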


* [PATCH 06/17] powerpc/pseries/pci: Remove obsolete SW invalidate
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (4 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 05/17] powerpc/powernv: Add IODA3 PHB type Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 07/17] powerpc/opal: Add real mode call wrappers Benjamin Herrenschmidt
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

That was used by some old IBM internal bringup tools and is
no longer relevant.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/pseries/iommu.c | 53 +---------------------------------
 1 file changed, 1 insertion(+), 52 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 3e8865b..770a753 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -120,35 +120,6 @@ static void iommu_pseries_free_group(struct iommu_table_group *table_group,
 	kfree(table_group);
 }
 
-static void tce_invalidate_pSeries_sw(struct iommu_table *tbl,
-				      __be64 *startp, __be64 *endp)
-{
-	u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
-	unsigned long start, end, inc;
-
-	start = __pa(startp);
-	end = __pa(endp);
-	inc = L1_CACHE_BYTES; /* invalidate a cacheline of TCEs at a time */
-
-	/* If this is non-zero, change the format.  We shift the
-	 * address and or in the magic from the device tree. */
-	if (tbl->it_busno) {
-		start <<= 12;
-		end <<= 12;
-		inc <<= 12;
-		start |= tbl->it_busno;
-		end |= tbl->it_busno;
-	}
-
-	end |= inc - 1; /* round up end to be different than start */
-
-	mb(); /* Make sure TCEs in memory are written */
-	while (start <= end) {
-		out_be64(invalidate, start);
-		start += inc;
-	}
-}
-
 static int tce_build_pSeries(struct iommu_table *tbl, long index,
 			      long npages, unsigned long uaddr,
 			      enum dma_data_direction direction,
@@ -173,9 +144,6 @@ static int tce_build_pSeries(struct iommu_table *tbl, long index,
 		uaddr += TCE_PAGE_SIZE;
 		tcep++;
 	}
-
-	if (tbl->it_type & TCE_PCI_SWINV_CREATE)
-		tce_invalidate_pSeries_sw(tbl, tces, tcep - 1);
 	return 0;
 }
 
@@ -188,9 +156,6 @@ static void tce_free_pSeries(struct iommu_table *tbl, long index, long npages)
 
 	while (npages--)
 		*(tcep++) = 0;
-
-	if (tbl->it_type & TCE_PCI_SWINV_FREE)
-		tce_invalidate_pSeries_sw(tbl, tces, tcep - 1);
 }
 
 static unsigned long tce_get_pseries(struct iommu_table *tbl, long index)
@@ -537,7 +502,7 @@ static void iommu_table_setparms(struct pci_controller *phb,
 				 struct iommu_table *tbl)
 {
 	struct device_node *node;
-	const unsigned long *basep, *sw_inval;
+	const unsigned long *basep;
 	const u32 *sizep;
 
 	node = phb->dn;
@@ -575,22 +540,6 @@ static void iommu_table_setparms(struct pci_controller *phb,
 	tbl->it_index = 0;
 	tbl->it_blocksize = 16;
 	tbl->it_type = TCE_PCI;
-
-	sw_inval = of_get_property(node, "linux,tce-sw-invalidate-info", NULL);
-	if (sw_inval) {
-		/*
-		 * This property contains information on how to
-		 * invalidate the TCE entry.  The first property is
-		 * the base MMIO address used to invalidate entries.
-		 * The second property tells us the format of the TCE
-		 * invalidate (whether it needs to be shifted) and
-		 * some magic routing info to add to our invalidate
-		 * command.
-		 */
-		tbl->it_index = (unsigned long) ioremap(sw_inval[0], 8);
-		tbl->it_busno = sw_inval[1]; /* overload this with magic */
-		tbl->it_type = TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE;
-	}
 }
 
 /*
-- 
2.7.4


* [PATCH 07/17] powerpc/opal: Add real mode call wrappers
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (5 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 06/17] powerpc/pseries/pci: Remove obsolete SW invalidate Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-28  4:18   ` Michael Neuling
  2016-06-27 12:25 ` [PATCH 08/17] powerpc/powernv/pci: Rename TCE invalidation calls Benjamin Herrenschmidt
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

Replace the old generic opal_call_realmode() with proper per-call
wrappers similar to the normal ones and convert callers.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/opal-api.h            | 10 +++-
 arch/powerpc/include/asm/opal.h                |  6 +++
 arch/powerpc/kernel/idle_power7.S              | 16 ++-----
 arch/powerpc/platforms/powernv/opal-wrappers.S | 63 +++++++++++++-------------
 4 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 4b4b559..957795c 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -162,7 +162,8 @@
 #define	OPAL_INT_SET_CPPR			123
 #define OPAL_INT_EOI				124
 #define OPAL_INT_SET_MFRR			125
-#define OPAL_LAST				125
+#define OPAL_PCI_TCE_KILL			126
+#define OPAL_LAST				126
 
 /* Device tree flags */
 
@@ -895,6 +896,13 @@ enum {
 	OPAL_REBOOT_PLATFORM_ERROR	= 1,
 };
 
+/* Argument to OPAL_PCI_TCE_KILL */
+enum {
+	OPAL_PCI_TCE_KILL_PAGES,
+	OPAL_PCI_TCE_KILL_PE,
+	OPAL_PCI_TCE_KILL_ALL,
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 6ccb847..ec6e0cc 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -214,6 +214,12 @@ int64_t opal_int_get_xirr(uint32_t *out_xirr, bool just_poll);
 int64_t opal_int_set_cppr(uint8_t cppr);
 int64_t opal_int_eoi(uint32_t xirr);
 int64_t opal_int_set_mfrr(uint32_t cpu, uint8_t mfrr);
+int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
+			  uint32_t pe_num, uint32_t tce_size,
+			  uint64_t dma_addr, uint32_t npages);
+int64_t opal_rm_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
+			     uint32_t pe_num, uint32_t tce_size,
+			     uint64_t dma_addr, uint32_t npages);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/kernel/idle_power7.S b/arch/powerpc/kernel/idle_power7.S
index 470ceeb..c93f825 100644
--- a/arch/powerpc/kernel/idle_power7.S
+++ b/arch/powerpc/kernel/idle_power7.S
@@ -196,8 +196,7 @@ fastsleep_workaround_at_entry:
 	/* Fast sleep workaround */
 	li	r3,1
 	li	r4,1
-	li	r0,OPAL_CONFIG_CPU_IDLE_STATE
-	bl	opal_call_realmode
+	bl	opal_rm_config_cpu_idle_state
 
 	/* Clear Lock bit */
 	li	r0,0
@@ -270,8 +269,7 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66);		\
 	ld	r2,PACATOC(r13);					\
 	ld	r1,PACAR1(r13);						\
 	std	r3,ORIG_GPR3(r1);	/* Save original r3 */		\
-	li	r0,OPAL_HANDLE_HMI;	/* Pass opal token argument*/	\
-	bl	opal_call_realmode;					\
+	bl	opal_rm_handle_hmi;					\
 	ld	r3,ORIG_GPR3(r1);	/* Restore original r3 */	\
 20:	nop;
 
@@ -284,7 +282,7 @@ _GLOBAL(power7_wakeup_tb_loss)
 	 * and they are restored before switching to the process context. Hence
 	 * until they are restored, they are free to be used.
 	 *
-	 * Save SRR1 in a NVGPR as it might be clobbered in opal_call_realmode
+	 * Save SRR1 in a NVGPR as it might be clobbered in opal call
 	 * (called in CHECK_HMI_INTERRUPT). SRR1 is required to determine the
 	 * wakeup reason if we branch to kvm_start_guest.
 	 */
@@ -378,10 +376,7 @@ timebase_resync:
 	 * set in exceptions-64s.S */
 	ble	cr3,clear_lock
 	/* Time base re-sync */
-	li	r0,OPAL_RESYNC_TIMEBASE
-	bl	opal_call_realmode;
-	/* TODO: Check r3 for failure */
-
+	bl	opal_rm_resync_timebase;
 	/*
 	 * If waking up from sleep, per core state is not lost, skip to
 	 * clear_lock.
@@ -469,8 +464,7 @@ hypervisor_state_restored:
 fastsleep_workaround_at_exit:
 	li	r3,1
 	li	r4,0
-	li	r0,OPAL_CONFIG_CPU_IDLE_STATE
-	bl	opal_call_realmode
+	bl	opal_rm_config_cpu_idle_state
 	b	timebase_resync
 
 /*
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 3854343..d5f00bb 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -59,7 +59,7 @@ END_FTR_SECTION(0, 1);						\
 #define OPAL_CALL(name, token)		\
  _GLOBAL_TOC(name);			\
 	mflr	r0;			\
-	std	r0,16(r1);		\
+	std	r0,PPC_LR_STKOFF(r1);	\
 	li	r0,token;		\
 	OPAL_BRANCH(opal_tracepoint_entry) \
 	mfcr	r12;			\
@@ -92,7 +92,7 @@ opal_return:
 	FIXUP_ENDIAN
 	ld	r2,PACATOC(r13);
 	lwz	r4,8(r1);
-	ld	r5,16(r1);
+	ld	r5,PPC_LR_STKOFF(r1);
 	ld	r6,PACASAVEDMSR(r13);
 	mtspr	SPRN_SRR0,r5;
 	mtspr	SPRN_SRR1,r6;
@@ -157,43 +157,37 @@ opal_tracepoint_return:
 	blr
 #endif
 
-/*
- * Make opal call in realmode. This is a generic function to be called
- * from realmode. It handles endianness.
- *
- * r13 - paca pointer
- * r1  - stack pointer
- * r0  - opal token
- */
-_GLOBAL(opal_call_realmode)
-	mflr	r12
-	std	r12,PPC_LR_STKOFF(r1)
-	ld	r2,PACATOC(r13)
-	/* Set opal return address */
-	LOAD_REG_ADDR(r12,return_from_opal_call)
-	mtlr	r12
-
-	mfmsr	r12
-#ifdef __LITTLE_ENDIAN__
-	/* Handle endian-ness */
-	li	r11,MSR_LE
-	andc	r12,r12,r11
-#endif
-	mtspr	SPRN_HSRR1,r12
-	LOAD_REG_ADDR(r11,opal)
-	ld	r12,8(r11)
-	ld	r2,0(r11)
-	mtspr	SPRN_HSRR0,r12
+#define OPAL_CALL_REAL(name, token)			\
+ _GLOBAL_TOC(name);					\
+	mflr	r0;					\
+	std	r0,PPC_LR_STKOFF(r1);			\
+	li	r0,token;				\
+	mfcr	r12;					\
+	stw	r12,8(r1);				\
+							\
+	/* Set opal return address */			\
+	LOAD_REG_ADDR(r11, opal_return_realmode);	\
+	mtlr	r11;					\
+	mfmsr	r12;					\
+	li	r11,MSR_LE;				\
+	andc	r12,r12,r11;				\
+	mtspr	SPRN_HSRR1,r12;				\
+	LOAD_REG_ADDR(r11,opal);			\
+	ld	r12,8(r11);				\
+	ld	r2,0(r11);				\
+	mtspr	SPRN_HSRR0,r12;				\
 	hrfid
 
-return_from_opal_call:
-#ifdef __LITTLE_ENDIAN__
+opal_return_realmode:
 	FIXUP_ENDIAN
-#endif
+	ld	r2,PACATOC(r13);
+	lwz	r11,8(r1);
 	ld	r12,PPC_LR_STKOFF(r1)
+	mtcr	r11;
 	mtlr	r12
 	blr
 
+
 OPAL_CALL(opal_invalid_call,			OPAL_INVALID_CALL);
 OPAL_CALL(opal_console_write,			OPAL_CONSOLE_WRITE);
 OPAL_CALL(opal_console_read,			OPAL_CONSOLE_READ);
@@ -271,6 +265,7 @@ OPAL_CALL(opal_validate_flash,			OPAL_FLASH_VALIDATE);
 OPAL_CALL(opal_manage_flash,			OPAL_FLASH_MANAGE);
 OPAL_CALL(opal_update_flash,			OPAL_FLASH_UPDATE);
 OPAL_CALL(opal_resync_timebase,			OPAL_RESYNC_TIMEBASE);
+OPAL_CALL_REAL(opal_rm_resync_timebase,		OPAL_RESYNC_TIMEBASE);
 OPAL_CALL(opal_check_token,			OPAL_CHECK_TOKEN);
 OPAL_CALL(opal_dump_init,			OPAL_DUMP_INIT);
 OPAL_CALL(opal_dump_info,			OPAL_DUMP_INFO);
@@ -285,7 +280,9 @@ OPAL_CALL(opal_sensor_read,			OPAL_SENSOR_READ);
 OPAL_CALL(opal_get_param,			OPAL_GET_PARAM);
 OPAL_CALL(opal_set_param,			OPAL_SET_PARAM);
 OPAL_CALL(opal_handle_hmi,			OPAL_HANDLE_HMI);
+OPAL_CALL_REAL(opal_rm_handle_hmi,		OPAL_HANDLE_HMI);
 OPAL_CALL(opal_config_cpu_idle_state,		OPAL_CONFIG_CPU_IDLE_STATE);
+OPAL_CALL_REAL(opal_rm_config_cpu_idle_state,	OPAL_CONFIG_CPU_IDLE_STATE);
 OPAL_CALL(opal_slw_set_reg,			OPAL_SLW_SET_REG);
 OPAL_CALL(opal_register_dump_region,		OPAL_REGISTER_DUMP_REGION);
 OPAL_CALL(opal_unregister_dump_region,		OPAL_UNREGISTER_DUMP_REGION);
@@ -306,3 +303,5 @@ OPAL_CALL(opal_int_get_xirr,			OPAL_INT_GET_XIRR);
 OPAL_CALL(opal_int_set_cppr,			OPAL_INT_SET_CPPR);
 OPAL_CALL(opal_int_eoi,				OPAL_INT_EOI);
 OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
+OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
+OPAL_CALL_REAL(opal_rm_pci_tce_kill,		OPAL_PCI_TCE_KILL);
-- 
2.7.4

* [PATCH 08/17] powerpc/powernv/pci: Rename TCE invalidation calls
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (6 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 07/17] powerpc/opal: Add real mode call wrappers Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 09/17] powerpc/powernv/pci: Remove SWINV constants and obsolete TCE code Benjamin Herrenschmidt
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

The TCE invalidation functions are fairly implementation specific,
and while the IODA specs more or less describe the register, in practice
various implementation workarounds may be required. So name the
functions after the target PHB.

Note that today, and for the foreseeable future, there is a 1:1
relationship between an IODA version and a PHB implementation. There
exists another variant of IODA1 (Torrent), but we never supported it
with OPAL and never will.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/npu-dma.c  |  8 ++++----
 arch/powerpc/platforms/powernv/pci-ioda.c | 30 +++++++++++++++---------------
 arch/powerpc/platforms/powernv/pci.h      |  4 +---
 3 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index 0459e10..4383a5f 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -180,7 +180,7 @@ long pnv_npu_set_window(struct pnv_ioda_pe *npe, int num,
 		pe_err(npe, "Failed to configure TCE table, err %lld\n", rc);
 		return rc;
 	}
-	pnv_pci_ioda2_tce_invalidate_entire(phb, false);
+	pnv_pci_phb3_tce_invalidate_entire(phb, false);
 
 	/* Add the table to the list so its TCE cache will get invalidated */
 	pnv_pci_link_table_and_group(phb->hose->node, num,
@@ -204,7 +204,7 @@ long pnv_npu_unset_window(struct pnv_ioda_pe *npe, int num)
 		pe_err(npe, "Unmapping failed, ret = %lld\n", rc);
 		return rc;
 	}
-	pnv_pci_ioda2_tce_invalidate_entire(phb, false);
+	pnv_pci_phb3_tce_invalidate_entire(phb, false);
 
 	pnv_pci_unlink_table_and_group(npe->table_group.tables[num],
 			&npe->table_group);
@@ -270,7 +270,7 @@ static int pnv_npu_dma_set_bypass(struct pnv_ioda_pe *npe)
 			0 /* bypass base */, top);
 
 	if (rc == OPAL_SUCCESS)
-		pnv_pci_ioda2_tce_invalidate_entire(phb, false);
+		pnv_pci_phb3_tce_invalidate_entire(phb, false);
 
 	return rc;
 }
@@ -334,7 +334,7 @@ void pnv_npu_take_ownership(struct pnv_ioda_pe *npe)
 		pe_err(npe, "Failed to disable bypass, err %lld\n", rc);
 		return;
 	}
-	pnv_pci_ioda2_tce_invalidate_entire(npe->phb, false);
+	pnv_pci_phb3_tce_invalidate_entire(npe->phb, false);
 }
 
 struct pnv_ioda_pe *pnv_pci_npu_setup_iommu(struct pnv_ioda_pe *npe)
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 341a9db..e759900 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1729,7 +1729,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
 	}
 }
 
-static void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
+static void pnv_pci_p7ioc_tce_invalidate(struct iommu_table *tbl,
 		unsigned long index, unsigned long npages, bool rm)
 {
 	struct iommu_table_group_link *tgl = list_first_entry_or_null(
@@ -1790,7 +1790,7 @@ static int pnv_ioda1_tce_build(struct iommu_table *tbl, long index,
 			attrs);
 
 	if (!ret && (tbl->it_type & TCE_PCI_SWINV_CREATE))
-		pnv_pci_ioda1_tce_invalidate(tbl, index, npages, false);
+		pnv_pci_p7ioc_tce_invalidate(tbl, index, npages, false);
 
 	return ret;
 }
@@ -1803,7 +1803,7 @@ static int pnv_ioda1_tce_xchg(struct iommu_table *tbl, long index,
 
 	if (!ret && (tbl->it_type &
 			(TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE)))
-		pnv_pci_ioda1_tce_invalidate(tbl, index, 1, false);
+		pnv_pci_p7ioc_tce_invalidate(tbl, index, 1, false);
 
 	return ret;
 }
@@ -1815,7 +1815,7 @@ static void pnv_ioda1_tce_free(struct iommu_table *tbl, long index,
 	pnv_tce_free(tbl, index, npages);
 
 	if (tbl->it_type & TCE_PCI_SWINV_FREE)
-		pnv_pci_ioda1_tce_invalidate(tbl, index, npages, false);
+		pnv_pci_p7ioc_tce_invalidate(tbl, index, npages, false);
 }
 
 static struct iommu_table_ops pnv_ioda1_iommu_ops = {
@@ -1827,13 +1827,13 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
 	.get = pnv_tce_get,
 };
 
-#define TCE_KILL_INVAL_ALL  PPC_BIT(0)
-#define TCE_KILL_INVAL_PE   PPC_BIT(1)
-#define TCE_KILL_INVAL_TCE  PPC_BIT(2)
+#define PHB3_TCE_KILL_INVAL_ALL		PPC_BIT(0)
+#define PHB3_TCE_KILL_INVAL_PE		PPC_BIT(1)
+#define PHB3_TCE_KILL_INVAL_ONE		PPC_BIT(2)
 
-void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_phb *phb, bool rm)
+void pnv_pci_phb3_tce_invalidate_entire(struct pnv_phb *phb, bool rm)
 {
-	const unsigned long val = TCE_KILL_INVAL_ALL;
+	const unsigned long val = PHB3_TCE_KILL_INVAL_ALL;
 
 	mb(); /* Ensure previous TCE table stores are visible */
 	if (rm)
@@ -1844,10 +1844,10 @@ void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_phb *phb, bool rm)
 		__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
 }
 
-static inline void pnv_pci_ioda2_tce_invalidate_pe(struct pnv_ioda_pe *pe)
+static inline void pnv_pci_phb3_tce_invalidate_pe(struct pnv_ioda_pe *pe)
 {
 	/* 01xb - invalidate TCEs that match the specified PE# */
-	unsigned long val = TCE_KILL_INVAL_PE | (pe->pe_number & 0xFF);
+	unsigned long val = PHB3_TCE_KILL_INVAL_PE | (pe->pe_number & 0xFF);
 	struct pnv_phb *phb = pe->phb;
 
 	if (!phb->ioda.tce_inval_reg)
@@ -1857,14 +1857,14 @@ static inline void pnv_pci_ioda2_tce_invalidate_pe(struct pnv_ioda_pe *pe)
 	__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
 }
 
-static void pnv_pci_ioda2_do_tce_invalidate(unsigned pe_number, bool rm,
+static void pnv_pci_phb3_tce_invalidate(unsigned pe_number, bool rm,
 		__be64 __iomem *invalidate, unsigned shift,
 		unsigned long index, unsigned long npages)
 {
 	unsigned long start, end, inc;
 
 	/* We'll invalidate DMA address in PE scope */
-	start = TCE_KILL_INVAL_TCE;
+	start = PHB3_TCE_KILL_INVAL_ONE;
 	start |= (pe_number & 0xFF);
 	end = start;
 
@@ -1901,10 +1901,10 @@ static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
 			 * per TCE entry so we have to invalidate
 			 * the entire cache for it.
 			 */
-			pnv_pci_ioda2_tce_invalidate_entire(pe->phb, rm);
+			pnv_pci_phb3_tce_invalidate_entire(pe->phb, rm);
 			continue;
 		}
-		pnv_pci_ioda2_do_tce_invalidate(pe->pe_number, rm,
+		pnv_pci_phb3_tce_invalidate(pe->pe_number, rm,
 			invalidate, tbl->it_page_shift,
 			index, npages);
 	}
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 772ad41..3dfa57b 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -205,8 +205,6 @@ extern void pnv_pci_init_ioda_hub(struct device_node *np);
 extern void pnv_pci_init_ioda2_phb(struct device_node *np);
 extern void pnv_pci_init_ioda3_phb(struct device_node *np);
 extern void pnv_pci_init_npu_phb(struct device_node *np);
-extern void pnv_pci_ioda_tce_invalidate(struct iommu_table *tbl,
-					__be64 *startp, __be64 *endp, bool rm);
 extern void pnv_pci_reset_secondary_bus(struct pci_dev *dev);
 extern int pnv_eeh_phb_reset(struct pci_controller *hose, int option);
 
@@ -226,7 +224,7 @@ extern void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
 
 /* Nvlink functions */
 extern void pnv_npu_try_dma_set_bypass(struct pci_dev *gpdev, bool bypass);
-extern void pnv_pci_ioda2_tce_invalidate_entire(struct pnv_phb *phb, bool rm);
+extern void pnv_pci_phb3_tce_invalidate_entire(struct pnv_phb *phb, bool rm);
 extern struct pnv_ioda_pe *pnv_pci_npu_setup_iommu(struct pnv_ioda_pe *npe);
 extern long pnv_npu_set_window(struct pnv_ioda_pe *npe, int num,
 		struct iommu_table *tbl);
-- 
2.7.4

* [PATCH 09/17] powerpc/powernv/pci: Remove SWINV constants and obsolete TCE code
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (7 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 08/17] powerpc/powernv/pci: Rename TCE invalidation calls Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 10/17] powerpc/powernv/pci: Rework accessing the TCE invalidate register Benjamin Herrenschmidt
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

We have some obsolete code in pnv_pci_p7ioc_tce_invalidate()
to handle some internal lab tools that stopped being useful
a long time ago. Remove it, along with the definitions of and
tests for the TCE_PCI_SWINV_* flags, whose value is basically
always the same.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/tce.h            |  3 --
 arch/powerpc/platforms/powernv/pci-ioda.c | 50 +++++++------------------------
 2 files changed, 10 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/include/asm/tce.h b/arch/powerpc/include/asm/tce.h
index 743f36b..12e3629 100644
--- a/arch/powerpc/include/asm/tce.h
+++ b/arch/powerpc/include/asm/tce.h
@@ -31,9 +31,6 @@
  */
 #define TCE_VB			0
 #define TCE_PCI			1
-#define TCE_PCI_SWINV_CREATE	2
-#define TCE_PCI_SWINV_FREE	4
-#define TCE_PCI_SWINV_PAIR	8
 
 /* TCE page size is 4096 bytes (1 << 12) */
 
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e759900..7a89833 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1741,29 +1741,15 @@ static void pnv_pci_p7ioc_tce_invalidate(struct iommu_table *tbl,
 		(__be64 __iomem *)pe->phb->ioda.tce_inval_reg_phys :
 		pe->phb->ioda.tce_inval_reg;
 	unsigned long start, end, inc;
-	const unsigned shift = tbl->it_page_shift;
 
 	start = __pa(((__be64 *)tbl->it_base) + index - tbl->it_offset);
 	end = __pa(((__be64 *)tbl->it_base) + index - tbl->it_offset +
 			npages - 1);
 
-	/* BML uses this case for p6/p7/galaxy2: Shift addr and put in node */
-	if (tbl->it_busno) {
-		start <<= shift;
-		end <<= shift;
-		inc = 128ull << shift;
-		start |= tbl->it_busno;
-		end |= tbl->it_busno;
-	} else if (tbl->it_type & TCE_PCI_SWINV_PAIR) {
-		/* p7ioc-style invalidation, 2 TCEs per write */
-		start |= (1ull << 63);
-		end |= (1ull << 63);
-		inc = 16;
-        } else {
-		/* Default (older HW) */
-                inc = 128;
-	}
-
+	/* p7ioc-style invalidation, 2 TCEs per write */
+	start |= (1ull << 63);
+	end |= (1ull << 63);
+	inc = 16;
         end |= inc - 1;	/* round up end to be different than start */
 
         mb(); /* Ensure above stores are visible */
@@ -1789,7 +1775,7 @@ static int pnv_ioda1_tce_build(struct iommu_table *tbl, long index,
 	int ret = pnv_tce_build(tbl, index, npages, uaddr, direction,
 			attrs);
 
-	if (!ret && (tbl->it_type & TCE_PCI_SWINV_CREATE))
+	if (!ret)
 		pnv_pci_p7ioc_tce_invalidate(tbl, index, npages, false);
 
 	return ret;
@@ -1801,8 +1787,7 @@ static int pnv_ioda1_tce_xchg(struct iommu_table *tbl, long index,
 {
 	long ret = pnv_tce_xchg(tbl, index, hpa, direction);
 
-	if (!ret && (tbl->it_type &
-			(TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE)))
+	if (!ret)
 		pnv_pci_p7ioc_tce_invalidate(tbl, index, 1, false);
 
 	return ret;
@@ -1814,8 +1799,7 @@ static void pnv_ioda1_tce_free(struct iommu_table *tbl, long index,
 {
 	pnv_tce_free(tbl, index, npages);
 
-	if (tbl->it_type & TCE_PCI_SWINV_FREE)
-		pnv_pci_p7ioc_tce_invalidate(tbl, index, npages, false);
+	pnv_pci_p7ioc_tce_invalidate(tbl, index, npages, false);
 }
 
 static struct iommu_table_ops pnv_ioda1_iommu_ops = {
@@ -1918,7 +1902,7 @@ static int pnv_ioda2_tce_build(struct iommu_table *tbl, long index,
 	int ret = pnv_tce_build(tbl, index, npages, uaddr, direction,
 			attrs);
 
-	if (!ret && (tbl->it_type & TCE_PCI_SWINV_CREATE))
+	if (!ret)
 		pnv_pci_ioda2_tce_invalidate(tbl, index, npages, false);
 
 	return ret;
@@ -1930,8 +1914,7 @@ static int pnv_ioda2_tce_xchg(struct iommu_table *tbl, long index,
 {
 	long ret = pnv_tce_xchg(tbl, index, hpa, direction);
 
-	if (!ret && (tbl->it_type &
-			(TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE)))
+	if (!ret)
 		pnv_pci_ioda2_tce_invalidate(tbl, index, 1, false);
 
 	return ret;
@@ -1943,8 +1926,7 @@ static void pnv_ioda2_tce_free(struct iommu_table *tbl, long index,
 {
 	pnv_tce_free(tbl, index, npages);
 
-	if (tbl->it_type & TCE_PCI_SWINV_FREE)
-		pnv_pci_ioda2_tce_invalidate(tbl, index, npages, false);
+	pnv_pci_ioda2_tce_invalidate(tbl, index, npages, false);
 }
 
 static void pnv_ioda2_table_free(struct iommu_table *tbl)
@@ -2113,12 +2095,6 @@ found:
 				  base * PNV_IODA1_DMA32_SEGSIZE,
 				  IOMMU_PAGE_SHIFT_4K);
 
-	/* OPAL variant of P7IOC SW invalidated TCEs */
-	if (phb->ioda.tce_inval_reg)
-		tbl->it_type |= (TCE_PCI_SWINV_CREATE |
-				 TCE_PCI_SWINV_FREE   |
-				 TCE_PCI_SWINV_PAIR);
-
 	tbl->it_ops = &pnv_ioda1_iommu_ops;
 	pe->table_group.tce32_start = tbl->it_offset << tbl->it_page_shift;
 	pe->table_group.tce32_size = tbl->it_size << tbl->it_page_shift;
@@ -2241,8 +2217,6 @@ static long pnv_pci_ioda2_create_table(struct iommu_table_group *table_group,
 	}
 
 	tbl->it_ops = &pnv_ioda2_iommu_ops;
-	if (pe->phb->ioda.tce_inval_reg)
-		tbl->it_type |= (TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE);
 
 	*ptbl = tbl;
 
@@ -2291,10 +2265,6 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
 	if (!pnv_iommu_bypass_disabled)
 		pnv_pci_ioda2_set_bypass(pe, true);
 
-	/* OPAL variant of PHB3 invalidated TCEs */
-	if (pe->phb->ioda.tce_inval_reg)
-		tbl->it_type |= (TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE);
-
 	/*
 	 * Setting table base here only for carrying iommu_group
 	 * further down to let iommu_add_device() do the job.
-- 
2.7.4

* [PATCH 10/17] powerpc/powernv/pci: Rework accessing the TCE invalidate register
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (8 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 09/17] powerpc/powernv/pci: Remove SWINV constants and obsolete TCE code Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 11/17] powerpc/powernv/pci: Fallback to OPAL for TCE invalidations Benjamin Herrenschmidt
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

The TCE invalidate register is architected and always at a known
offset in the PHB register space, so there is no need to keep a
separate pointer to it. Use the existing "regs" mapping instead,
and complement it with a real mode variant.
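
As a rough illustration of the address selection described above (a
standalone sketch, not the kernel code): `struct phb_regs` and
`tce_kill_reg_addr()` are hypothetical names, and plain integers stand in
for the `__iomem` pointers. Only the 0x210 offset and the regs/regs_phys
split come from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the TCE invalidate register inside the PHB register block,
 * as used by pnv_ioda_get_inval_reg() in this patch. */
#define TCE_KILL_REG_OFFSET 0x210

/* Hypothetical stand-in for the two struct pnv_phb fields involved. */
struct phb_regs {
	uintptr_t regs;      /* virtual base of the mapped registers */
	uintptr_t regs_phys; /* physical base, used in real mode */
};

/* Pick the base matching the current translation mode, then add the
 * architected offset. */
static inline uintptr_t tce_kill_reg_addr(const struct phb_regs *phb,
					  bool real_mode)
{
	uintptr_t base = real_mode ? phb->regs_phys : phb->regs;

	return base + TCE_KILL_REG_OFFSET;
}
```

The point of the design is that one base address per mode is enough;
every architected register can then be derived from it by offset.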

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 68 ++++++++++++-------------------
 arch/powerpc/platforms/powernv/pci.h      |  7 +---
 2 files changed, 28 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 7a89833..8574a9b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1729,6 +1729,13 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
 	}
 }
 
+static inline __be64 __iomem *pnv_ioda_get_inval_reg(struct pnv_phb *phb,
+						     bool real_mode)
+{
+	return real_mode ? (__be64 __iomem *)(phb->regs_phys + 0x210) :
+		(phb->regs + 0x210);
+}
+
 static void pnv_pci_p7ioc_tce_invalidate(struct iommu_table *tbl,
 		unsigned long index, unsigned long npages, bool rm)
 {
@@ -1737,9 +1744,7 @@ static void pnv_pci_p7ioc_tce_invalidate(struct iommu_table *tbl,
 			next);
 	struct pnv_ioda_pe *pe = container_of(tgl->table_group,
 			struct pnv_ioda_pe, table_group);
-	__be64 __iomem *invalidate = rm ?
-		(__be64 __iomem *)pe->phb->ioda.tce_inval_reg_phys :
-		pe->phb->ioda.tce_inval_reg;
+	__be64 __iomem *invalidate = pnv_ioda_get_inval_reg(pe->phb, rm);
 	unsigned long start, end, inc;
 
 	start = __pa(((__be64 *)tbl->it_base) + index - tbl->it_offset);
@@ -1817,39 +1822,36 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
 
 void pnv_pci_phb3_tce_invalidate_entire(struct pnv_phb *phb, bool rm)
 {
+	__be64 __iomem *invalidate = pnv_ioda_get_inval_reg(phb, rm);
 	const unsigned long val = PHB3_TCE_KILL_INVAL_ALL;
 
 	mb(); /* Ensure previous TCE table stores are visible */
 	if (rm)
-		__raw_rm_writeq(cpu_to_be64(val),
-				(__be64 __iomem *)
-				phb->ioda.tce_inval_reg_phys);
+		__raw_rm_writeq(cpu_to_be64(val), invalidate);
 	else
-		__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
+		__raw_writeq(cpu_to_be64(val), invalidate);
 }
 
 static inline void pnv_pci_phb3_tce_invalidate_pe(struct pnv_ioda_pe *pe)
 {
 	/* 01xb - invalidate TCEs that match the specified PE# */
+	__be64 __iomem *invalidate = pnv_ioda_get_inval_reg(pe->phb, false);
 	unsigned long val = PHB3_TCE_KILL_INVAL_PE | (pe->pe_number & 0xFF);
-	struct pnv_phb *phb = pe->phb;
-
-	if (!phb->ioda.tce_inval_reg)
-		return;
 
 	mb(); /* Ensure above stores are visible */
-	__raw_writeq(cpu_to_be64(val), phb->ioda.tce_inval_reg);
+	__raw_writeq(cpu_to_be64(val), invalidate);
 }
 
-static void pnv_pci_phb3_tce_invalidate(unsigned pe_number, bool rm,
-		__be64 __iomem *invalidate, unsigned shift,
-		unsigned long index, unsigned long npages)
+static void pnv_pci_phb3_tce_invalidate(struct pnv_ioda_pe *pe, bool rm,
+					unsigned shift, unsigned long index,
+					unsigned long npages)
 {
+	__be64 __iomem *invalidate = pnv_ioda_get_inval_reg(pe->phb, rm);
 	unsigned long start, end, inc;
 
 	/* We'll invalidate DMA address in PE scope */
 	start = PHB3_TCE_KILL_INVAL_ONE;
-	start |= (pe_number & 0xFF);
+	start |= (pe->pe_number & 0xFF);
 	end = start;
 
 	/* Figure out the start, end and step */
@@ -1875,10 +1877,6 @@ static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
 	list_for_each_entry_rcu(tgl, &tbl->it_group_list, next) {
 		struct pnv_ioda_pe *pe = container_of(tgl->table_group,
 				struct pnv_ioda_pe, table_group);
-		__be64 __iomem *invalidate = rm ?
-			(__be64 __iomem *)pe->phb->ioda.tce_inval_reg_phys :
-			pe->phb->ioda.tce_inval_reg;
-
 		if (pe->phb->type == PNV_PHB_NPU) {
 			/*
 			 * The NVLink hardware does not support TCE kill
@@ -1888,9 +1886,8 @@ static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
 			pnv_pci_phb3_tce_invalidate_entire(pe->phb, rm);
 			continue;
 		}
-		pnv_pci_phb3_tce_invalidate(pe->pe_number, rm,
-			invalidate, tbl->it_page_shift,
-			index, npages);
+		pnv_pci_phb3_tce_invalidate(pe, rm, tbl->it_page_shift,
+					    index, npages);
 	}
 }
 
@@ -2475,19 +2472,6 @@ static void pnv_pci_ioda_setup_iommu_api(void)
 static void pnv_pci_ioda_setup_iommu_api(void) { };
 #endif
 
-static void pnv_pci_ioda_setup_opal_tce_kill(struct pnv_phb *phb)
-{
-	const __be64 *swinvp;
-
-	/* OPAL variant of PHB3 invalidated TCEs */
-	swinvp = of_get_property(phb->hose->dn, "ibm,opal-tce-kill", NULL);
-	if (!swinvp)
-		return;
-
-	phb->ioda.tce_inval_reg_phys = be64_to_cpup(swinvp);
-	phb->ioda.tce_inval_reg = ioremap(phb->ioda.tce_inval_reg_phys, 8);
-}
-
 static __be64 *pnv_pci_ioda2_table_do_alloc_pages(int nid, unsigned shift,
 		unsigned levels, unsigned long limit,
 		unsigned long *current_offset, unsigned long *total_allocated)
@@ -2673,8 +2657,6 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 	pr_info("PCI: Domain %04x has %d available 32-bit DMA segments\n",
 		hose->global_number, phb->ioda.dma32_count);
 
-	pnv_pci_ioda_setup_opal_tce_kill(phb);
-
 	/* Walk our PE list and configure their DMA segments */
 	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
 		weight = pnv_pci_ioda_pe_dma_weight(pe);
@@ -3389,6 +3371,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	struct pnv_phb *phb;
 	unsigned long size, m64map_off, m32map_off, pemap_off;
 	unsigned long iomap_off = 0, dma32map_off = 0;
+	struct resource r;
 	const __be64 *prop64;
 	const __be32 *prop32;
 	int len;
@@ -3448,9 +3431,12 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	pci_process_bridge_OF_ranges(hose, np, !hose->global_number);
 
 	/* Get registers */
-	phb->regs = of_iomap(np, 0);
-	if (phb->regs == NULL)
-		pr_err("  Failed to map registers !\n");
+	if (!of_address_to_resource(np, 0, &r)) {
+		phb->regs_phys = r.start;
+		phb->regs = ioremap(r.start, resource_size(&r));
+		if (phb->regs == NULL)
+			pr_err("  Failed to map registers !\n");
+	}
 
 	/* Initialize more IODA stuff */
 	phb->ioda.total_pe_num = 1;
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 3dfa57b..ce339ae 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -81,6 +81,7 @@ struct pnv_phb {
 	u64			opal_id;
 	int			flags;
 	void __iomem		*regs;
+	u64			regs_phys;
 	int			initialized;
 	spinlock_t		lock;
 
@@ -158,12 +159,6 @@ struct pnv_phb {
 		 * bus { bus, devfn }
 		 */
 		unsigned char		pe_rmap[0x10000];
-
-		/* TCE cache invalidate registers (physical and
-		 * remapped)
-		 */
-		phys_addr_t		tce_inval_reg_phys;
-		__be64 __iomem		*tce_inval_reg;
 	} ioda;
 
 	/* PHB and hub status structure */
-- 
2.7.4

* [PATCH 11/17] powerpc/powernv/pci: Fallback to OPAL for TCE invalidations
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (9 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 10/17] powerpc/powernv/pci: Rework accessing the TCE invalidate register Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 12/17] powerpc/powernv: set power_save func after the idle states are initialized Benjamin Herrenschmidt
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

If we don't find registers for the PHB or don't know the
model-specific invalidation method, use OPAL calls instead.
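
The selection logic can be sketched on its own as follows. This is a
simplified model under stated assumptions, not the kernel code:
`enum tce_kill_path` and `pick_tce_kill_path()` are hypothetical names,
and the NPU special case handled in the diff is left out.

```c
#include <stdbool.h>

/* Hypothetical labels for the three ways a page invalidation can be
 * issued after this patch. */
enum tce_kill_path {
	TCE_KILL_MMIO,    /* direct write to the PHB3 register */
	TCE_KILL_OPAL,    /* regular OPAL call */
	TCE_KILL_OPAL_RM  /* real mode OPAL call wrapper */
};

/* Direct MMIO is used only for a known model with mapped registers;
 * everything else goes through firmware, using the wrapper that
 * matches the current translation mode. */
static enum tce_kill_path pick_tce_kill_path(bool is_phb3, bool has_regs,
					     bool real_mode)
{
	if (is_phb3 && has_regs)
		return TCE_KILL_MMIO;
	return real_mode ? TCE_KILL_OPAL_RM : TCE_KILL_OPAL;
}
```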

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 33 +++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8574a9b..5f08cd5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1869,6 +1869,17 @@ static void pnv_pci_phb3_tce_invalidate(struct pnv_ioda_pe *pe, bool rm,
 	}
 }
 
+static inline void pnv_pci_ioda2_tce_invalidate_pe(struct pnv_ioda_pe *pe)
+{
+	struct pnv_phb *phb = pe->phb;
+
+	if (phb->model == PNV_PHB_MODEL_PHB3 && phb->regs)
+		pnv_pci_phb3_tce_invalidate_pe(pe);
+	else
+		opal_pci_tce_kill(phb->opal_id, OPAL_PCI_TCE_KILL_PE,
+				  pe->pe_number, 0, 0, 0);
+}
+
 static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
 		unsigned long index, unsigned long npages, bool rm)
 {
@@ -1877,17 +1888,31 @@ static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
 	list_for_each_entry_rcu(tgl, &tbl->it_group_list, next) {
 		struct pnv_ioda_pe *pe = container_of(tgl->table_group,
 				struct pnv_ioda_pe, table_group);
-		if (pe->phb->type == PNV_PHB_NPU) {
+		struct pnv_phb *phb = pe->phb;
+		unsigned int shift = tbl->it_page_shift;
+
+		if (phb->type == PNV_PHB_NPU) {
 			/*
 			 * The NVLink hardware does not support TCE kill
 			 * per TCE entry so we have to invalidate
 			 * the entire cache for it.
 			 */
-			pnv_pci_phb3_tce_invalidate_entire(pe->phb, rm);
+			pnv_pci_phb3_tce_invalidate_entire(phb, rm);
 			continue;
 		}
-		pnv_pci_phb3_tce_invalidate(pe, rm, tbl->it_page_shift,
-					    index, npages);
+		if (phb->model == PNV_PHB_MODEL_PHB3 && phb->regs)
+			pnv_pci_phb3_tce_invalidate(pe, rm, shift,
+						    index, npages);
+		else if (rm)
+			opal_rm_pci_tce_kill(phb->opal_id,
+					     OPAL_PCI_TCE_KILL_PAGES,
+					     pe->pe_number, 1u << shift,
+					     index << shift, npages);
+		else
+			opal_pci_tce_kill(phb->opal_id,
+					  OPAL_PCI_TCE_KILL_PAGES,
+					  pe->pe_number, 1u << shift,
+					  index << shift, npages);
 	}
 }
 
-- 
2.7.4

* [PATCH 12/17] powerpc/powernv: set power_save func after the idle states are initialized
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (10 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 11/17] powerpc/powernv/pci: Fallback to OPAL for TCE invalidations Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 13/17] powerpc/powernv/pci: Use the device-tree to get available range of M64's Benjamin Herrenschmidt
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Shreyas B. Prabhu, Benjamin Herrenschmidt

From: "Shreyas B. Prabhu" <shreyas@linux.vnet.ibm.com>

pnv_init_idle_states discovers supported idle states from the
device tree and does the required initialization. Set the power_save
function pointer only after this initialization is done.
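
The ordering can be pictured with a small standalone sketch. All names
here (`struct machine_hooks`, `init_idle()`, `nap_idle()`,
`PM_NAP_ENABLED` and its value) are hypothetical stand-ins for
illustration, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag value; assumed for this sketch only. */
#define PM_NAP_ENABLED 0x00010000u

typedef void (*idle_fn)(void);

/* Hypothetical stand-in for the machine descriptor: the hook starts
 * out NULL and is filled in only once idle states are known. */
struct machine_hooks {
	idle_fn power_save;
};

static void nap_idle(void)
{
	/* Placeholder for the real idle entry routine. */
}

/* Called after idle state discovery: install the hook only if the
 * discovered flags say nap is supported. */
static void init_idle(struct machine_hooks *md, uint32_t supported)
{
	if (supported & PM_NAP_ENABLED)
		md->power_save = nap_idle;
}
```

Leaving the hook NULL until discovery completes means the idle loop
cannot enter a state that was never initialized.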

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/idle.c  | 3 +++
 arch/powerpc/platforms/powernv/setup.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index fcc8b68..fbb09fb 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -285,6 +285,9 @@ static int __init pnv_init_idle_states(void)
 	}
 
 	pnv_alloc_idle_core_states();
+
+	if (supported_cpuidle_states & OPAL_PM_NAP_ENABLED)
+		ppc_md.power_save = power7_idle;
 out_free:
 	kfree(flags);
 out:
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index ee6430b..8492bbb 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -315,7 +315,7 @@ define_machine(powernv) {
 	.get_proc_freq          = pnv_get_proc_freq,
 	.progress		= pnv_progress,
 	.machine_shutdown	= pnv_shutdown,
-	.power_save             = power7_idle,
+	.power_save             = NULL,
 	.calibrate_decr		= generic_calibrate_decr,
 #ifdef CONFIG_KEXEC
 	.kexec_cpu_down		= pnv_kexec_cpu_down,
-- 
2.7.4

* [PATCH 13/17] powerpc/powernv/pci: Use the device-tree to get available range of M64's
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (11 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 12/17] powerpc/powernv: set power_save func after the idle states are initialized Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 14/17] powerpc/powernv/pci: Check status of a PHB before using it Benjamin Herrenschmidt
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

M64's are the configurable 64-bit windows that cover the 64-bit MMIO
space. We used to hard code 16 windows. Newer chips might have a
variable number and might need to reserve some as well (for example
on PHB4/POWER9, M32 and M64 are actually unified and we use M64#0
to map the 32-bit space).

So newer OPALs will provide a property we can use to know what range
of windows is available. The property is named so that it can
eventually support multiple ranges but we only use the first one for
now.
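
One way to picture the handling of that property, as a standalone
sketch rather than the kernel code: it assumes the two cells are
(first window, number of windows), falls back to the 16 windows
mentioned above when the property is absent, and clamps to a 64-bit
allocation bitmap. `struct m64_range`, `m64_range_resolve()` and the
`M64_*` constants are hypothetical names, and the exact checks in the
patch differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define M64_DEFAULT_FIRST 0u
#define M64_DEFAULT_COUNT 16u  /* the previously hard-coded number */
#define M64_ALLOC_BITS    64u  /* allocator is a single 64-bit bitmap */

struct m64_range {
	uint32_t first; /* index of the first usable window */
	uint32_t count; /* number of usable windows */
};

/* prop points at the two property cells, or is NULL when firmware
 * does not provide the property. Returns false if no window is
 * usable, in which case *out is left untouched. */
static bool m64_range_resolve(const uint32_t *prop, struct m64_range *out)
{
	uint32_t first = prop ? prop[0] : M64_DEFAULT_FIRST;
	uint32_t count = prop ? prop[1] : M64_DEFAULT_COUNT;

	if (count == 0 || first >= M64_ALLOC_BITS)
		return false;

	/* Clamp so that first + count never exceeds the bitmap size. */
	if (count > M64_ALLOC_BITS - first)
		count = M64_ALLOC_BITS - first;

	out->first = first;
	out->count = count;
	return true;
}
```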

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 49 +++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5f08cd5..b674a38 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -192,9 +192,6 @@ static int pnv_ioda2_init_m64(struct pnv_phb *phb)
 		goto fail;
 	}
 
-	/* Mark the M64 BAR assigned */
-	set_bit(phb->ioda.m64_bar_idx, &phb->ioda.m64_bar_alloc);
-
 	/*
 	 * Strip off the segment used by the reserved PE, which is
 	 * expected to be 0 or last one of PE capabicity.
@@ -405,6 +402,7 @@ static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
 	struct pci_controller *hose = phb->hose;
 	struct device_node *dn = hose->dn;
 	struct resource *res;
+	u32 m64_range[2], i;
 	const u32 *r;
 	u64 pci_addr;
 
@@ -426,6 +424,29 @@ static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
 		return;
 	}
 
+	/* Find the available M64 BAR range and pick up the last one for
+	 * covering the whole 64-bit space. We support only one range.
+	 */
+	if (of_property_read_u32_array(dn, "ibm,opal-available-m64-ranges",
+				       m64_range, 2)) {
+		/* In absence of the property, assume 0..15 */
+		m64_range[0] = 0;
+		m64_range[1] = 16;
+	}
+	/* We only support 64 bits in our allocator */
+	if (m64_range[1] > 63) {
+		pr_warn("%s: Limiting M64 range to 63 (from %d) on PHB#%x\n",
+			__func__, m64_range[1], phb->hose->global_number);
+		m64_range[1] = 63;
+	}
+	/* Empty range, no m64 */
+	if (m64_range[1] <= m64_range[0]) {
+		pr_warn("%s: M64 empty, disabling M64 usage on PHB#%x\n",
+			__func__, phb->hose->global_number);
+		return;
+	}
+
+	/* Configure M64 information */
 	res = &hose->mem_resources[1];
 	res->name = dn->full_name;
 	res->start = of_translate_address(dn, r + 2);
@@ -438,11 +459,27 @@ static void __init pnv_ioda_parse_m64_window(struct pnv_phb *phb)
 	phb->ioda.m64_segsize = phb->ioda.m64_size / phb->ioda.total_pe_num;
 	phb->ioda.m64_base = pci_addr;
 
-	pr_info(" MEM64 0x%016llx..0x%016llx -> 0x%016llx\n",
-			res->start, res->end, pci_addr);
+	/* This lines up nicely with the display from processing OF ranges */
+	pr_info(" MEM 0x%016llx..0x%016llx -> 0x%016llx (M64 #%d..%d)\n",
+		res->start, res->end, pci_addr, m64_range[0],
+		m64_range[0] + m64_range[1] - 1);
+
+	/* Mark all M64 used up by default */
+	phb->ioda.m64_bar_alloc = (unsigned long)-1;
 
 	/* Use last M64 BAR to cover M64 window */
-	phb->ioda.m64_bar_idx = 15;
+	m64_range[1]--;
+	phb->ioda.m64_bar_idx = m64_range[0] + m64_range[1];
+
+	pr_info(" Using M64 #%d as default window\n", phb->ioda.m64_bar_idx);
+
+	/* Mark remaining ones free */
+	for (i = m64_range[0]; i < m64_range[1]; i++)
+		clear_bit(i, &phb->ioda.m64_bar_alloc);
+
+	/* Setup init functions for M64 based on IODA version, IODA3 uses
+	 * the IODA2 code
+	 */
 	if (phb->type == PNV_PHB_IODA1)
 		phb->init_m64 = pnv_ioda1_init_m64;
 	else
-- 
2.7.4


* [PATCH 14/17] powerpc/powernv/pci: Check status of a PHB before using it
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (12 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 13/17] powerpc/powernv/pci: Use the device-tree to get available range of M64's Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 15/17] powerpc/pci: Don't try to allocate resources that will be reassigned Benjamin Herrenschmidt
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

If the firmware encounters an error (internal or HW) during initialization
of a PHB, it might leave the device-node in the tree but mark it disabled
using the "status" property. We should check it.
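
For reference, the usual device-tree convention is that a node counts
as available when "status" is absent or reads "okay" or "ok". Below is
a minimal standalone sketch of that rule. node_is_available() is a
hypothetical stand-in for the kernel's of_device_is_available(), which
takes a device node rather than a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * status is the value of the node's "status" property, or NULL when
 * the property is absent. Hypothetical helper for illustration only.
 */
bool node_is_available(const char *status)
{
	/* No "status" property at all means the node is usable. */
	if (status == NULL)
		return true;

	/* Anything other than "okay"/"ok" (e.g. "disabled") is unusable. */
	return strcmp(status, "okay") == 0 || strcmp(status, "ok") == 0;
}
```

A PHB that firmware marked "disabled" after a failed init would thus be
skipped before any OPAL call is made on it.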

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index b674a38..9d59eeb 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3442,6 +3442,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	void *aux;
 	long rc;
 
+	if (!of_device_is_available(np))
+		return;
+
 	pr_info("Initializing IODA%d OPAL PHB %s\n", ioda_type, np->full_name);
 
 	prop64 = of_get_property(np, "ibm,opal-phbid", NULL);
-- 
2.7.4


* [PATCH 15/17] powerpc/pci: Don't try to allocate resources that will be reassigned
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (13 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 14/17] powerpc/powernv/pci: Check status of a PHB before using it Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 16/17] powerpc/pci: Reduce log level of PCI I/O space warning Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 17/17] powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs Benjamin Herrenschmidt
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

When we know we will reassign all resources, trying (and failing)
to allocate them initially is fairly pointless and leads to a lot
of scary messages in the kernel log.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/pci-common.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..2a67b16 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1362,8 +1362,10 @@ void __init pcibios_resource_survey(void)
 	/* Allocate and assign resources */
 	list_for_each_entry(b, &pci_root_buses, node)
 		pcibios_allocate_bus_resources(b);
-	pcibios_allocate_resources(0);
-	pcibios_allocate_resources(1);
+	if (!pci_has_flag(PCI_REASSIGN_ALL_RSRC)) {
+		pcibios_allocate_resources(0);
+		pcibios_allocate_resources(1);
+	}
 
 	/* Before we start assigning unassigned resource, we try to reserve
 	 * the low IO area and the VGA memory area if they intersect the
-- 
2.7.4


* [PATCH 16/17] powerpc/pci: Reduce log level of PCI I/O space warning
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (14 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 15/17] powerpc/pci: Don't try to allocate resources that will be reassigned Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  2016-06-27 12:25 ` [PATCH 17/17] powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs Benjamin Herrenschmidt
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev

If a PHB has no I/O space, there's no need to make it look like
something bad happened. A pr_debug() is plenty, since this is the
case on all our modern POWER chips.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/pci-common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 2a67b16..3ab1f7b 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1487,9 +1487,9 @@ static void pcibios_setup_phb_resources(struct pci_controller *hose,
 	res = &hose->io_resource;
 
 	if (!res->flags) {
-		pr_info("PCI: I/O resource not set for host"
-		       " bridge %s (domain %d)\n",
-		       hose->dn->full_name, hose->global_number);
+		pr_debug("PCI: I/O resource not set for host"
+			 " bridge %s (domain %d)\n",
+			 hose->dn->full_name, hose->global_number);
 	} else {
 		offset = pcibios_io_space_offset(hose);
 
-- 
2.7.4


* [PATCH 17/17] powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
  2016-06-27 12:25 [PATCH 00/17] Initial POWER9 XIVE and PHB4 support Benjamin Herrenschmidt
                   ` (15 preceding siblings ...)
  2016-06-27 12:25 ` [PATCH 16/17] powerpc/pci: Reduce log level of PCI I/O space warning Benjamin Herrenschmidt
@ 2016-06-27 12:25 ` Benjamin Herrenschmidt
  16 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-27 12:25 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Benjamin Herrenschmidt, stable

The generic allocation code may sometimes decide to assign a prefetchable
64-bit BAR to the M32 window. In fact it may also decide to allocate
a 64-bit non-prefetchable BAR to the M64 one! So using the resource
flags as a test to decide which window was used for PE allocation is
just wrong and leads to insane PE numbers.

Instead, compare the addresses to figure it out.
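
As a rough standalone illustration of the address test (not the kernel
code): struct window and addr_in_window() are hypothetical names, and
the subtraction form is just one way to write the range check so that
base + size cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an MMIO window: [base, base + size). */
struct window {
	uint64_t base;
	uint64_t size;
};

/*
 * True when start falls inside the window. Only the start address is
 * tested, mirroring the "we only test resource start" simplification.
 */
bool addr_in_window(const struct window *w, uint64_t start)
{
	return start >= w->base && start - w->base < w->size;
}
```

The flags of a BAR say what kind of window it could use. The address
says which window it actually landed in, and that is what the PE
computation depends on.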

CC: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 9d59eeb..4cbd706 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -110,10 +110,16 @@ static int __init iommu_setup(char *str)
 }
 early_param("iommu", iommu_setup);
 
-static inline bool pnv_pci_is_mem_pref_64(unsigned long flags)
+static inline bool pnv_pci_is_mem_pref_64(struct pnv_phb *phb, struct resource *r)
 {
-	return ((flags & (IORESOURCE_MEM_64 | IORESOURCE_PREFETCH)) ==
-		(IORESOURCE_MEM_64 | IORESOURCE_PREFETCH));
+	/* WARNING: We cannot rely on the resource flags. The Linux PCI
+	 * allocation code sometimes decides to put a 64-bit prefetchable
+	 * BAR in the 32-bit window, so we have to compare the addresses.
+	 *
+	 * For simplicity we only test resource start.
+	 */
+	return (r->start >= phb->ioda.m64_base &&
+		r->start < (phb->ioda.m64_base + phb->ioda.m64_size));
 }
 
 static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int pe_no)
@@ -230,7 +236,7 @@ static void pnv_ioda_reserve_dev_m64_pe(struct pci_dev *pdev,
 	sgsz = phb->ioda.m64_segsize;
 	for (i = 0; i <= PCI_ROM_RESOURCE; i++) {
 		r = &pdev->resource[i];
-		if (!r->parent || !pnv_pci_is_mem_pref_64(r->flags))
+		if (!r->parent || !pnv_pci_is_mem_pref_64(phb, r))
 			continue;
 
 		start = _ALIGN_DOWN(r->start - base, sgsz);
@@ -3058,7 +3064,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		res = &pdev->resource[i + PCI_IOV_RESOURCES];
 		if (!res->flags || res->parent)
 			continue;
-		if (!pnv_pci_is_mem_pref_64(res->flags)) {
+		if (!pnv_pci_is_mem_pref_64(phb, res)) {
 			dev_warn(&pdev->dev, "Don't support SR-IOV with"
 					" non M64 VF BAR%d: %pR. \n",
 				 i, res);
@@ -3152,8 +3158,7 @@ static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
 			region.start += phb->ioda.io_segsize;
 			index++;
 		}
-	} else if ((res->flags & IORESOURCE_MEM) &&
-		   !pnv_pci_is_mem_pref_64(res->flags)) {
+	} else if ((res->flags & IORESOURCE_MEM) && !pnv_pci_is_mem_pref_64(phb, res)) {
 		region.start = res->start -
 			       phb->hose->mem_offset[0] -
 			       phb->ioda.m32_pci_base;
@@ -3312,9 +3317,11 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 		bridge = bridge->bus->self;
 	}
 
-	/* We fail back to M32 if M64 isn't supported */
-	if (phb->ioda.m64_segsize &&
-	    pnv_pci_is_mem_pref_64(type))
+	/* We fall back to M32 if M64 isn't supported. We enforce the M64
+	 * alignment for any 64-bit resource, PCIe doesn't care and
+	 * bridges only do 64-bit prefetchable anyway
+	 */
+	if (phb->ioda.m64_segsize && (type & IORESOURCE_MEM_64))
 		return phb->ioda.m64_segsize;
 	if (type & IORESOURCE_MEM)
 		return phb->ioda.m32_segsize;
-- 
2.7.4



* Re: [PATCH 07/17] powerpc/opal: Add real mode call wrappers
  2016-06-27 12:25 ` [PATCH 07/17] powerpc/opal: Add real mode call wrappers Benjamin Herrenschmidt
@ 2016-06-28  4:18   ` Michael Neuling
  2016-06-28 11:37     ` Michael Ellerman
  0 siblings, 1 reply; 21+ messages in thread
From: Michael Neuling @ 2016-06-28  4:18 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, linuxppc-dev, Michael Ellerman; +Cc: shreyas

mpe,

Just flagging this as going to conflict with Shreyas' stop instruction
patch series. It's relatively easy to fix so you can do it manually.

Alternatively you could take this one patch now and get Shreyas to rebase.

Mikey

On Mon, 2016-06-27 at 22:25 +1000, Benjamin Herrenschmidt wrote:
> Replace the old generic opal_call_realmode() with proper per-call
> wrappers similar to the normal ones and convert callers.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  arch/powerpc/include/asm/opal-api.h            | 10 +++-
>  arch/powerpc/include/asm/opal.h                |  6 +++
>  arch/powerpc/kernel/idle_power7.S              | 16 ++-----
>  arch/powerpc/platforms/powernv/opal-wrappers.S | 63 +++++++++++++-------------
>  4 files changed, 51 insertions(+), 44 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/opal-api.h
> b/arch/powerpc/include/asm/opal-api.h
> index 4b4b559..957795c 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -162,7 +162,8 @@
>  #define	OPAL_INT_SET_CPPR			123
>  #define OPAL_INT_EOI				124
>  #define OPAL_INT_SET_MFRR			125
> -#define OPAL_LAST				125
> +#define OPAL_PCI_TCE_KILL			126
> +#define OPAL_LAST				126
>  
>  /* Device tree flags */
>  
> @@ -895,6 +896,13 @@ enum {
>  	OPAL_REBOOT_PLATFORM_ERROR	= 1,
>  };
>  
> +/* Argument to OPAL_PCI_TCE_KILL */
> +enum {
> +	OPAL_PCI_TCE_KILL_PAGES,
> +	OPAL_PCI_TCE_KILL_PE,
> +	OPAL_PCI_TCE_KILL_ALL,
> +};
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* __OPAL_API_H */
> diff --git a/arch/powerpc/include/asm/opal.h
> b/arch/powerpc/include/asm/opal.h
> index 6ccb847..ec6e0cc 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -214,6 +214,12 @@ int64_t opal_int_get_xirr(uint32_t *out_xirr, bool
> just_poll);
>  int64_t opal_int_set_cppr(uint8_t cppr);
>  int64_t opal_int_eoi(uint32_t xirr);
>  int64_t opal_int_set_mfrr(uint32_t cpu, uint8_t mfrr);
> +int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
> +			  uint32_t pe_num, uint32_t tce_size,
> +			  uint64_t dma_addr, uint32_t npages);
> +int64_t opal_rm_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
> +			     uint32_t pe_num, uint32_t tce_size,
> +			     uint64_t dma_addr, uint32_t npages);
>  
>  /* Internal functions */
>  extern int early_init_dt_scan_opal(unsigned long node, const char
> *uname,
> diff --git a/arch/powerpc/kernel/idle_power7.S
> b/arch/powerpc/kernel/idle_power7.S
> index 470ceeb..c93f825 100644
> --- a/arch/powerpc/kernel/idle_power7.S
> +++ b/arch/powerpc/kernel/idle_power7.S
> @@ -196,8 +196,7 @@ fastsleep_workaround_at_entry:
>  	/* Fast sleep workaround */
>  	li	r3,1
>  	li	r4,1
> -	li	r0,OPAL_CONFIG_CPU_IDLE_STATE
> -	bl	opal_call_realmode
> +	bl	opal_rm_config_cpu_idle_state
>  
>  	/* Clear Lock bit */
>  	li	r0,0
> @@ -270,8 +269,7 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S,
> 66);		\
>  	ld	r2,PACATOC(r13);					
> \
>  	ld	r1,PACAR1(r13);					
> 	\
>  	std	r3,ORIG_GPR3(r1);	/* Save original r3 */	
> 	\
> -	li	r0,OPAL_HANDLE_HMI;	/* Pass opal token
> argument*/	\
> -	bl	opal_call_realmode;					
> \
> +	bl	opal_rm_handle_hmi;					
> \
>  	ld	r3,ORIG_GPR3(r1);	/* Restore original r3 */	
> \
>  20:	nop;
>  
> @@ -284,7 +282,7 @@ _GLOBAL(power7_wakeup_tb_loss)
>  	 * and they are restored before switching to the process
> context. Hence
>  	 * until they are restored, they are free to be used.
>  	 *
> -	 * Save SRR1 in a NVGPR as it might be clobbered in
> opal_call_realmode
> +	 * Save SRR1 in a NVGPR as it might be clobbered in opal call
>  	 * (called in CHECK_HMI_INTERRUPT). SRR1 is required to
> determine the
>  	 * wakeup reason if we branch to kvm_start_guest.
>  	 */
> @@ -378,10 +376,7 @@ timebase_resync:
>  	 * set in exceptions-64s.S */
>  	ble	cr3,clear_lock
>  	/* Time base re-sync */
> -	li	r0,OPAL_RESYNC_TIMEBASE
> -	bl	opal_call_realmode;
> -	/* TODO: Check r3 for failure */
> -
> +	bl	opal_rm_resync_timebase;
>  	/*
>  	 * If waking up from sleep, per core state is not lost, skip to
>  	 * clear_lock.
> @@ -469,8 +464,7 @@ hypervisor_state_restored:
>  fastsleep_workaround_at_exit:
>  	li	r3,1
>  	li	r4,0
> -	li	r0,OPAL_CONFIG_CPU_IDLE_STATE
> -	bl	opal_call_realmode
> +	bl	opal_rm_config_cpu_idle_state
>  	b	timebase_resync
>  
>  /*
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S
> b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index 3854343..d5f00bb 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -59,7 +59,7 @@ END_FTR_SECTION(0, 1);					
> 	\
>  #define OPAL_CALL(name, token)		\
>   _GLOBAL_TOC(name);			\
>  	mflr	r0;			\
> -	std	r0,16(r1);		\
> +	std	r0,PPC_LR_STKOFF(r1);	\
>  	li	r0,token;		\
>  	OPAL_BRANCH(opal_tracepoint_entry) \
>  	mfcr	r12;			\
> @@ -92,7 +92,7 @@ opal_return:
>  	FIXUP_ENDIAN
>  	ld	r2,PACATOC(r13);
>  	lwz	r4,8(r1);
> -	ld	r5,16(r1);
> +	ld	r5,PPC_LR_STKOFF(r1);
>  	ld	r6,PACASAVEDMSR(r13);
>  	mtspr	SPRN_SRR0,r5;
>  	mtspr	SPRN_SRR1,r6;
> @@ -157,43 +157,37 @@ opal_tracepoint_return:
>  	blr
>  #endif
>  
> -/*
> - * Make opal call in realmode. This is a generic function to be called
> - * from realmode. It handles endianness.
> - *
> - * r13 - paca pointer
> - * r1  - stack pointer
> - * r0  - opal token
> - */
> -_GLOBAL(opal_call_realmode)
> -	mflr	r12
> -	std	r12,PPC_LR_STKOFF(r1)
> -	ld	r2,PACATOC(r13)
> -	/* Set opal return address */
> -	LOAD_REG_ADDR(r12,return_from_opal_call)
> -	mtlr	r12
> -
> -	mfmsr	r12
> -#ifdef __LITTLE_ENDIAN__
> -	/* Handle endian-ness */
> -	li	r11,MSR_LE
> -	andc	r12,r12,r11
> -#endif
> -	mtspr	SPRN_HSRR1,r12
> -	LOAD_REG_ADDR(r11,opal)
> -	ld	r12,8(r11)
> -	ld	r2,0(r11)
> -	mtspr	SPRN_HSRR0,r12
> +#define OPAL_CALL_REAL(name, token)			\
> + _GLOBAL_TOC(name);					\
> +	mflr	r0;					\
> +	std	r0,PPC_LR_STKOFF(r1);			\
> +	li	r0,token;				\
> +	mfcr	r12;					\
> +	stw	r12,8(r1);				\
> +							\
> +	/* Set opal return address */			\
> +	LOAD_REG_ADDR(r11, opal_return_realmode);	\
> +	mtlr	r11;					\
> +	mfmsr	r12;					\
> +	li	r11,MSR_LE;				\
> +	andc	r12,r12,r11;				\
> +	mtspr	SPRN_HSRR1,r12;				\
> +	LOAD_REG_ADDR(r11,opal);			\
> +	ld	r12,8(r11);				\
> +	ld	r2,0(r11);				\
> +	mtspr	SPRN_HSRR0,r12;				\
>  	hrfid
>  
> -return_from_opal_call:
> -#ifdef __LITTLE_ENDIAN__
> +opal_return_realmode:
>  	FIXUP_ENDIAN
> -#endif
> +	ld	r2,PACATOC(r13);
> +	lwz	r11,8(r1);
>  	ld	r12,PPC_LR_STKOFF(r1)
> +	mtcr	r11;
>  	mtlr	r12
>  	blr
>  
> +
>  OPAL_CALL(opal_invalid_call,			OPAL_INVALID_CALL);
>  OPAL_CALL(opal_console_write,			OPAL_CONSOLE_WRITE)
> ;
>  OPAL_CALL(opal_console_read,			OPAL_CONSOLE_READ);
> @@ -271,6 +265,7 @@ OPAL_CALL(opal_validate_flash,			
> OPAL_FLASH_VALIDATE);
>  OPAL_CALL(opal_manage_flash,			OPAL_FLASH_MANAGE);
>  OPAL_CALL(opal_update_flash,			OPAL_FLASH_UPDATE);
>  OPAL_CALL(opal_resync_timebase,			OPAL_RESYNC_TIMEB
> ASE);
> +OPAL_CALL_REAL(opal_rm_resync_timebase,		OPAL_RESYNC_TIMEB
> ASE);
>  OPAL_CALL(opal_check_token,			OPAL_CHECK_TOKEN);
>  OPAL_CALL(opal_dump_init,			OPAL_DUMP_INIT);
>  OPAL_CALL(opal_dump_info,			OPAL_DUMP_INFO);
> @@ -285,7 +280,9 @@ OPAL_CALL(opal_sensor_read,			OP
> AL_SENSOR_READ);
>  OPAL_CALL(opal_get_param,			OPAL_GET_PARAM);
>  OPAL_CALL(opal_set_param,			OPAL_SET_PARAM);
>  OPAL_CALL(opal_handle_hmi,			OPAL_HANDLE_HMI);
> +OPAL_CALL_REAL(opal_rm_handle_hmi,		OPAL_HANDLE_HMI);
>  OPAL_CALL(opal_config_cpu_idle_state,		OPAL_CONFIG_CPU_IDL
> E_STATE);
> +OPAL_CALL_REAL(opal_rm_config_cpu_idle_state,	OPAL_CONFIG_CPU_IDL
> E_STATE);
>  OPAL_CALL(opal_slw_set_reg,			OPAL_SLW_SET_REG);
>  OPAL_CALL(opal_register_dump_region,		OPAL_REGISTER_DUMP_R
> EGION);
>  OPAL_CALL(opal_unregister_dump_region,		OPAL_UNREGISTER_DU
> MP_REGION);
> @@ -306,3 +303,5 @@ OPAL_CALL(opal_int_get_xirr,			O
> PAL_INT_GET_XIRR);
>  OPAL_CALL(opal_int_set_cppr,			OPAL_INT_SET_CPPR);
>  OPAL_CALL(opal_int_eoi,				OPAL_INT_EOI);
>  OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
> +OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
> +OPAL_CALL_REAL(opal_rm_pci_tce_kill,		OPAL_PCI_TCE_KILL);


* Re: [PATCH 07/17] powerpc/opal: Add real mode call wrappers
  2016-06-28  4:18   ` Michael Neuling
@ 2016-06-28 11:37     ` Michael Ellerman
  2016-06-28 11:56       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 21+ messages in thread
From: Michael Ellerman @ 2016-06-28 11:37 UTC (permalink / raw)
  To: Michael Neuling, Benjamin Herrenschmidt, linuxppc-dev; +Cc: shreyas

On Tue, 2016-06-28 at 14:18 +1000, Michael Neuling wrote:
> mpe,
> 
> Just flagging this as going to conflict with Shreyas' stop instruction
> patch series.  It's relatively easy to fix so you can do it manually.
> 
> Alternatively you could take this one patch now and get Shreyas to rebase.

Yeah Ben already told me.

I don't know which will go in first yet, this has had zero revisions and zero
reviews so it might not go in straight away ;)

I'm happy to fix up whatever merge conflicts and/or get people to rebase, it's
no problem.

cheers


* Re: [PATCH 07/17] powerpc/opal: Add real mode call wrappers
  2016-06-28 11:37     ` Michael Ellerman
@ 2016-06-28 11:56       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-28 11:56 UTC (permalink / raw)
  To: Michael Ellerman, Michael Neuling, linuxppc-dev; +Cc: shreyas

On Tue, 2016-06-28 at 21:37 +1000, Michael Ellerman wrote:
> Yeah Ben already told me.
> 
> I don't know which will go in first yet, this has had zero revisions and zero
> reviews so it might not go in straight away ;)

The point was that patch could go in independently of the rest but I really
don't care either way.

> I'm happy to fix up whatever merge conflicts and/or get people to rebase, it's
> no problem.

Yup and rebasing my patch on top of Shreyas is not a big deal either, I
did it at least twice in internal trees already ;-)

Cheers,
Ben.


