All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/13] powerpc/xive: misc cleanups
@ 2020-12-08 15:11 Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 01/13] KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output Cédric Le Goater
                   ` (13 more replies)
  0 siblings, 14 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

Hello,

The most important change is the removal of support of OPAL flags
required for P9 DD1. It provides a good cleanup of some complex
routines.

The series also includes a change on how the pages donated to the XIVE
IC are allocated in Linux. The flags are changed to make sure that
these pages can not be reclaimed.

Thanks,

C.

Cédric Le Goater (13):
  KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output
  powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag
  powerpc/xive: Introduce XIVE_IPI_HW_IRQ
  powerpc/xive: Use cpu_to_node() instead of ibm,chip-id property
  powerpc/xive: Fix allocation of pages donated to the XIVE controller
  powerpc/xive: Add a name to the IRQ domain
  powerpc/xive: Add a debug_show handler to the XIVE irq_domain
  powerpc: Increase NR_IRQS range to support more KVM guests
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  powerpc/xive: Simplify xive_do_source_eoi()
  powerpc/xive: Improve error reporting of OPAL calls

 arch/powerpc/include/asm/opal-api.h      |   6 +-
 arch/powerpc/include/asm/xive.h          |   8 +-
 arch/powerpc/kvm/book3s_xive.h           |   2 +
 arch/powerpc/sysdev/xive/xive-internal.h |   9 +-
 arch/powerpc/kvm/book3s_xive.c           | 134 +++++++-------
 arch/powerpc/kvm/book3s_xive_native.c    |  24 ++-
 arch/powerpc/kvm/book3s_xive_template.c  |   5 -
 arch/powerpc/sysdev/xive/common.c        | 219 +++++++++++------------
 arch/powerpc/sysdev/xive/native.c        |  48 ++---
 arch/powerpc/sysdev/xive/spapr.c         |   8 +-
 arch/powerpc/Kconfig                     |   2 +-
 11 files changed, 230 insertions(+), 235 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 01/13] KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag Cédric Le Goater
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater, Greg Kurz

This is useful to track allocation of the HW resources on per guest
basis. Making sure IPIs are local to the chip of the vCPUs reduces
rerouting between interrupt controllers and gives better performance
in case of pinning. Checking the distribution of VP structures on the
chips also helps in reducing PowerBUS traffic.

Signed-off-by: Greg Kurz <groug@kaod.org>
[ clg: resurrected show_sources and reworked ouput ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/kvm/book3s_xive.h        |  2 +
 arch/powerpc/kvm/book3s_xive.c        | 76 ++++++++++++++++++++++-----
 arch/powerpc/kvm/book3s_xive_native.c | 21 ++++++--
 3 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
index 382e3a56e789..d5d4fee7ac94 100644
--- a/arch/powerpc/kvm/book3s_xive.h
+++ b/arch/powerpc/kvm/book3s_xive.h
@@ -290,6 +290,8 @@ extern int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr);
  */
 void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu);
 int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu);
+void kvmppc_xive_debug_show_sources(struct seq_file *m,
+				    struct kvmppc_xive_src_block *sb);
 struct kvmppc_xive_src_block *kvmppc_xive_create_src_block(
 	struct kvmppc_xive *xive, int irq);
 void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb);
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index a0ebc29f30b2..18a6b75a3bfd 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -2125,9 +2125,8 @@ int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu)
 		if (!q->qpage && !xc->esc_virq[i])
 			continue;
 
-		seq_printf(m, " [q%d]: ", i);
-
 		if (q->qpage) {
+			seq_printf(m, "    q[%d]: ", i);
 			idx = q->idx;
 			i0 = be32_to_cpup(q->qpage + idx);
 			idx = (idx + 1) & q->msk;
@@ -2141,16 +2140,54 @@ int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu)
 				irq_data_get_irq_handler_data(d);
 			u64 pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
 
-			seq_printf(m, "E:%c%c I(%d:%llx:%llx)",
-				   (pq & XIVE_ESB_VAL_P) ? 'P' : 'p',
-				   (pq & XIVE_ESB_VAL_Q) ? 'Q' : 'q',
-				   xc->esc_virq[i], pq, xd->eoi_page);
+			seq_printf(m, "    ESC %d %c%c EOI @%llx",
+				   xc->esc_virq[i],
+				   (pq & XIVE_ESB_VAL_P) ? 'P' : '-',
+				   (pq & XIVE_ESB_VAL_Q) ? 'Q' : '-',
+				   xd->eoi_page);
 			seq_puts(m, "\n");
 		}
 	}
 	return 0;
 }
 
+void kvmppc_xive_debug_show_sources(struct seq_file *m,
+				    struct kvmppc_xive_src_block *sb)
+{
+	int i;
+
+	seq_puts(m, "    LISN      HW/CHIP   TYPE    PQ      EISN    CPU/PRIO\n");
+	for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
+		struct kvmppc_xive_irq_state *state = &sb->irq_state[i];
+		struct xive_irq_data *xd;
+		u64 pq;
+		u32 hw_num;
+
+		if (!state->valid)
+			continue;
+
+		kvmppc_xive_select_irq(state, &hw_num, &xd);
+
+		pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
+
+		seq_printf(m, "%08x  %08x/%02x", state->number, hw_num,
+			   xd->src_chip);
+		if (state->lsi)
+			seq_printf(m, " %cLSI", state->asserted ? '^' : ' ');
+		else
+			seq_puts(m, "  MSI");
+
+		seq_printf(m, " %s  %c%c  %08x   % 4d/%d",
+			   state->ipi_number == hw_num ? "IPI" : " PT",
+			   pq & XIVE_ESB_VAL_P ? 'P' : '-',
+			   pq & XIVE_ESB_VAL_Q ? 'Q' : '-',
+			   state->eisn, state->act_server,
+			   state->act_priority);
+
+		seq_puts(m, "\n");
+	}
+}
+
 static int xive_debug_show(struct seq_file *m, void *private)
 {
 	struct kvmppc_xive *xive = m->private;
@@ -2171,7 +2208,7 @@ static int xive_debug_show(struct seq_file *m, void *private)
 	if (!kvm)
 		return 0;
 
-	seq_printf(m, "=========\nVCPU state\n=========\n");
+	seq_puts(m, "=========\nVCPU state\n=========\n");
 
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
@@ -2179,11 +2216,12 @@ static int xive_debug_show(struct seq_file *m, void *private)
 		if (!xc)
 			continue;
 
-		seq_printf(m, "cpu server %#x VP:%#x CPPR:%#x HWCPPR:%#x"
-			   " MFRR:%#x PEND:%#x h_xirr: R=%lld V=%lld\n",
-			   xc->server_num, xc->vp_id, xc->cppr, xc->hw_cppr,
-			   xc->mfrr, xc->pending,
-			   xc->stat_rm_h_xirr, xc->stat_vm_h_xirr);
+		seq_printf(m, "VCPU %d: VP:%#x/%02x\n"
+			 "    CPPR:%#x HWCPPR:%#x MFRR:%#x PEND:%#x h_xirr: R=%lld V=%lld\n",
+			 xc->server_num, xc->vp_id, xc->vp_chip_id,
+			 xc->cppr, xc->hw_cppr,
+			 xc->mfrr, xc->pending,
+			 xc->stat_rm_h_xirr, xc->stat_vm_h_xirr);
 
 		kvmppc_xive_debug_show_queues(m, vcpu);
 
@@ -2199,13 +2237,25 @@ static int xive_debug_show(struct seq_file *m, void *private)
 		t_vm_h_ipi += xc->stat_vm_h_ipi;
 	}
 
-	seq_printf(m, "Hcalls totals\n");
+	seq_puts(m, "Hcalls totals\n");
 	seq_printf(m, " H_XIRR  R=%10lld V=%10lld\n", t_rm_h_xirr, t_vm_h_xirr);
 	seq_printf(m, " H_IPOLL R=%10lld V=%10lld\n", t_rm_h_ipoll, t_vm_h_ipoll);
 	seq_printf(m, " H_CPPR  R=%10lld V=%10lld\n", t_rm_h_cppr, t_vm_h_cppr);
 	seq_printf(m, " H_EOI   R=%10lld V=%10lld\n", t_rm_h_eoi, t_vm_h_eoi);
 	seq_printf(m, " H_IPI   R=%10lld V=%10lld\n", t_rm_h_ipi, t_vm_h_ipi);
 
+	seq_puts(m, "=========\nSources\n=========\n");
+
+	for (i = 0; i <= xive->max_sbid; i++) {
+		struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
+
+		if (sb) {
+			arch_spin_lock(&sb->lock);
+			kvmppc_xive_debug_show_sources(m, sb);
+			arch_spin_unlock(&sb->lock);
+		}
+	}
+
 	return 0;
 }
 
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 6aaaa4bedaaf..9b395381179d 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -1257,18 +1257,31 @@ static int xive_native_debug_show(struct seq_file *m, void *private)
 		if (!xc)
 			continue;
 
-		seq_printf(m, "cpu server %#x VP=%#x NSR=%02x CPPR=%02x IBP=%02x PIPR=%02x w01=%016llx w2=%08x\n",
-			   xc->server_num, xc->vp_id,
+		seq_printf(m, "VCPU %d: VP=%#x/%02x\n"
+			   "    NSR=%02x CPPR=%02x IBP=%02x PIPR=%02x w01=%016llx w2=%08x\n",
+			   xc->server_num, xc->vp_id, xc->vp_chip_id,
 			   vcpu->arch.xive_saved_state.nsr,
 			   vcpu->arch.xive_saved_state.cppr,
 			   vcpu->arch.xive_saved_state.ipb,
 			   vcpu->arch.xive_saved_state.pipr,
-			   vcpu->arch.xive_saved_state.w01,
-			   (u32) vcpu->arch.xive_cam_word);
+			   be64_to_cpu(vcpu->arch.xive_saved_state.w01),
+			   be32_to_cpu(vcpu->arch.xive_cam_word));
 
 		kvmppc_xive_debug_show_queues(m, vcpu);
 	}
 
+	seq_puts(m, "=========\nSources\n=========\n");
+
+	for (i = 0; i <= xive->max_sbid; i++) {
+		struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
+
+		if (sb) {
+			arch_spin_lock(&sb->lock);
+			kvmppc_xive_debug_show_sources(m, sb);
+			arch_spin_unlock(&sb->lock);
+		}
+	}
+
 	return 0;
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 01/13] KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 16:59   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ Cédric Le Goater
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

This is a simple cleanup to identify easily all flags of the XIVE
interrupt structure. The interrupts flagged with XIVE_IRQ_FLAG_NO_EOI
are the escalations used to wake up vCPUs in KVM. They are handled
very differently from the rest.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/include/asm/xive.h   | 2 +-
 arch/powerpc/kvm/book3s_xive.c    | 4 ++--
 arch/powerpc/sysdev/xive/common.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index 309b4d65b74f..d332dd9a18de 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -66,7 +66,7 @@ struct xive_irq_data {
 #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
 
 /* Special flag set by KVM for excalation interrupts */
-#define XIVE_IRQ_NO_EOI		0x80
+#define XIVE_IRQ_FLAG_NO_EOI	0x80
 
 #define XIVE_INVALID_CHIP_ID	-1
 
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 18a6b75a3bfd..fae1c2e8da29 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -219,7 +219,7 @@ int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
 	/* In single escalation mode, we grab the ESB MMIO of the
 	 * interrupt and mask it. Also populate the VCPU v/raddr
 	 * of the ESB page for use by asm entry/exit code. Finally
-	 * set the XIVE_IRQ_NO_EOI flag which will prevent the
+	 * set the XIVE_IRQ_FLAG_NO_EOI flag which will prevent the
 	 * core code from performing an EOI on the escalation
 	 * interrupt, thus leaving it effectively masked after
 	 * it fires once.
@@ -231,7 +231,7 @@ int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
 		xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_01);
 		vcpu->arch.xive_esc_raddr = xd->eoi_page;
 		vcpu->arch.xive_esc_vaddr = (__force u64)xd->eoi_mmio;
-		xd->flags |= XIVE_IRQ_NO_EOI;
+		xd->flags |= XIVE_IRQ_FLAG_NO_EOI;
 	}
 
 	return 0;
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index a80440af491a..65af34ac1fa2 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -416,7 +416,7 @@ static void xive_irq_eoi(struct irq_data *d)
 	 * been passed-through to a KVM guest
 	 */
 	if (!irqd_irq_disabled(d) && !irqd_is_forwarded_to_vcpu(d) &&
-	    !(xd->flags & XIVE_IRQ_NO_EOI))
+	    !(xd->flags & XIVE_IRQ_FLAG_NO_EOI))
 		xive_do_source_eoi(irqd_to_hwirq(d), xd);
 	else
 		xd->stale_p = true;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 01/13] KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 17:06   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property Cédric Le Goater
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

The XIVE driver deals with CPU IPIs in a peculiar way. Each CPU has
its own XIVE IPI interrupt allocated at the HW level, for PowerNV, or
at the hypervisor level for pSeries. In practice, these interrupts are
not always used. pSeries/PowerVM prefers local doorbells for local
threads since they are faster. On PowerNV, global doorbells are also
preferred for the same reason.

The mapping in the Linux is reduced to a single interrupt using HW
interrupt number 0 and a custom irq_chip to handle EOI. This can cause
performance issues in some benchmark (ipistorm) on multichip systems.

Clarify the use of the 0 value, it will help in improving multichip
support.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/xive-internal.h |  2 ++
 arch/powerpc/sysdev/xive/common.c        | 10 +++++-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index b7b901da2168..d701af7fb48c 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -5,6 +5,8 @@
 #ifndef __XIVE_INTERNAL_H
 #define __XIVE_INTERNAL_H
 
+#define XIVE_IPI_HW_IRQ		0 /* interrupt source # for IPIs */
+
 /*
  * A "disabled" interrupt should never fire, to catch problems
  * we set its logical number to this
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 65af34ac1fa2..ee375daf8114 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1142,7 +1142,7 @@ static void __init xive_request_ipi(void)
 		return;
 
 	/* Initialize it */
-	virq = irq_create_mapping(xive_irq_domain, 0);
+	virq = irq_create_mapping(xive_irq_domain, XIVE_IPI_HW_IRQ);
 	xive_ipi_irq = virq;
 
 	WARN_ON(request_irq(virq, xive_muxed_ipi_action,
@@ -1242,7 +1242,7 @@ static int xive_irq_domain_map(struct irq_domain *h, unsigned int virq,
 
 #ifdef CONFIG_SMP
 	/* IPIs are special and come up with HW number 0 */
-	if (hw == 0) {
+	if (hw == XIVE_IPI_HW_IRQ) {
 		/*
 		 * IPIs are marked per-cpu. We use separate HW interrupts under
 		 * the hood but associated with the same "linux" interrupt
@@ -1271,7 +1271,7 @@ static void xive_irq_domain_unmap(struct irq_domain *d, unsigned int virq)
 	if (!data)
 		return;
 	hw_irq = (unsigned int)irqd_to_hwirq(data);
-	if (hw_irq)
+	if (hw_irq != XIVE_IPI_HW_IRQ)
 		xive_irq_free_data(virq);
 }
 
@@ -1421,7 +1421,7 @@ static void xive_flush_cpu_queue(unsigned int cpu, struct xive_cpu *xc)
 		 * Ignore anything that isn't a XIVE irq and ignore
 		 * IPIs, so can just be dropped.
 		 */
-		if (d->domain != xive_irq_domain || hw_irq == 0)
+		if (d->domain != xive_irq_domain || hw_irq == XIVE_IPI_HW_IRQ)
 			continue;
 
 		/*
@@ -1655,7 +1655,7 @@ static int xive_core_debug_show(struct seq_file *m, void *private)
 		hw_irq = (unsigned int)irqd_to_hwirq(d);
 
 		/* IPIs are special (HW number 0) */
-		if (hw_irq)
+		if (hw_irq != XIVE_IPI_HW_IRQ)
 			xive_debug_show_irq(m, hw_irq, d);
 	}
 	return 0;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (2 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-09  4:39   ` Aneesh Kumar K.V
  2020-12-08 15:11 ` [PATCH 05/13] powerpc/xive: Fix allocation of pages donated to the XIVE controller Cédric Le Goater
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

The 'chip_id' field of the XIVE CPU structure is used to choose a
target for a source located on the same chip when possible. This field
is assigned on the PowerNV platform using the "ibm,chip-id" property
on pSeries under KVM when NUMA nodes are defined but it is undefined
under PowerVM. The XIVE source structure has a similar field
'src_chip' which is only assigned on the PowerNV platform.

cpu_to_node() returns a compatible value on all platforms, 0 being the
default node. It will also give us the opportunity to set the affinity
of a source on pSeries when we can localize them.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/common.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index ee375daf8114..605238ca65e4 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1342,16 +1342,11 @@ static int xive_prepare_cpu(unsigned int cpu)
 
 	xc = per_cpu(xive_cpu, cpu);
 	if (!xc) {
-		struct device_node *np;
-
 		xc = kzalloc_node(sizeof(struct xive_cpu),
 				  GFP_KERNEL, cpu_to_node(cpu));
 		if (!xc)
 			return -ENOMEM;
-		np = of_get_cpu_node(cpu, NULL);
-		if (np)
-			xc->chip_id = of_get_ibm_chip_id(np);
-		of_node_put(np);
+		xc->chip_id = cpu_to_node(cpu);
 		xc->hw_ipi = XIVE_BAD_IRQ;
 
 		per_cpu(xive_cpu, cpu) = xc;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 05/13] powerpc/xive: Fix allocation of pages donated to the XIVE controller
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (3 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 06/13] powerpc/xive: Add a name to the IRQ domain Cédric Le Goater
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

The XIVE interrupt controller uses a set of internal tables to handle
interrupt routing. The small tables (SBE, EAT) are direct tables and
allocated by OPAL. The bigger ones (NVT, ENDT) are indirect, i.e., the
first table entries point to a page which contains the XIVE structures
used by HW. For these, OPAL only allocates the first page level and
requests pages to be allocated by Linux when a new entry is inserted.

Make sure that these pages can not be reclaimed. The problem can be
seen while stressing a system with 4K KVM guests, 16 vCPUs each.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/xive-internal.h | 2 ++
 arch/powerpc/sysdev/xive/common.c        | 2 +-
 arch/powerpc/sysdev/xive/native.c        | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index d701af7fb48c..1eacc90f4dcf 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -73,6 +73,8 @@ static inline u32 xive_alloc_order(u32 queue_shift)
 	return (queue_shift > PAGE_SHIFT) ? (queue_shift - PAGE_SHIFT) : 0;
 }
 
+#define XIVE_GFP (__GFP_NORETRY | __GFP_NOWARN | __GFP_NOMEMALLOC)
+
 extern bool xive_cmdline_disabled;
 
 #endif /*  __XIVE_INTERNAL_H */
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 605238ca65e4..80fd97d764ab 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1544,7 +1544,7 @@ __be32 *xive_queue_page_alloc(unsigned int cpu, u32 queue_shift)
 	__be32 *qpage;
 
 	alloc_order = xive_alloc_order(queue_shift);
-	pages = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, alloc_order);
+	pages = alloc_pages_node(cpu_to_node(cpu), XIVE_GFP, alloc_order);
 	if (!pages)
 		return ERR_PTR(-ENOMEM);
 	qpage = (__be32 *)page_address(pages);
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index cb58ec7ce77a..6afb44d0d816 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -643,7 +643,7 @@ static bool xive_native_provision_pages(void)
 		 * XXX TODO: Try to make the allocation local to the node where
 		 * the chip resides.
 		 */
-		p = kmem_cache_alloc(xive_provision_cache, GFP_KERNEL);
+		p = kmem_cache_alloc(xive_provision_cache, XIVE_GFP);
 		if (!p) {
 			pr_err("Failed to allocate provisioning page\n");
 			return false;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 06/13] powerpc/xive: Add a name to the IRQ domain
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (4 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 05/13] powerpc/xive: Fix allocation of pages donated to the XIVE controller Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 15:11 ` [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain Cédric Le Goater
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

We hope one day to handle multiple irq_domain in the XIVE driver.
Start simple by setting the name using the DT node.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/xive-internal.h |  4 ++--
 arch/powerpc/sysdev/xive/common.c        | 10 +++++-----
 arch/powerpc/sysdev/xive/native.c        |  2 +-
 arch/powerpc/sysdev/xive/spapr.c         |  2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index 1eacc90f4dcf..066d6fe3dc1d 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -63,8 +63,8 @@ struct xive_ops {
 	const char *name;
 };
 
-bool xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 offset,
-		    u8 max_prio);
+bool xive_core_init(struct device_node *np, const struct xive_ops *ops,
+		    void __iomem *area, u32 offset, u8 max_prio);
 __be32 *xive_queue_page_alloc(unsigned int cpu, u32 queue_shift);
 int xive_core_debug_init(void);
 
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 80fd97d764ab..721617f0f854 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1310,9 +1310,9 @@ static const struct irq_domain_ops xive_irq_domain_ops = {
 	.xlate = xive_irq_domain_xlate,
 };
 
-static void __init xive_init_host(void)
+static void __init xive_init_host(struct device_node *np)
 {
-	xive_irq_domain = irq_domain_add_nomap(NULL, XIVE_MAX_IRQ,
+	xive_irq_domain = irq_domain_add_nomap(np, XIVE_MAX_IRQ,
 					       &xive_irq_domain_ops, NULL);
 	if (WARN_ON(xive_irq_domain == NULL))
 		return;
@@ -1508,8 +1508,8 @@ void xive_shutdown(void)
 	xive_ops->shutdown();
 }
 
-bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 offset,
-			   u8 max_prio)
+bool __init xive_core_init(struct device_node *np, const struct xive_ops *ops,
+			   void __iomem *area, u32 offset, u8 max_prio)
 {
 	xive_tima = area;
 	xive_tima_offset = offset;
@@ -1520,7 +1520,7 @@ bool __init xive_core_init(const struct xive_ops *ops, void __iomem *area, u32 o
 	__xive_enabled = true;
 
 	pr_devel("Initializing host..\n");
-	xive_init_host();
+	xive_init_host(np);
 
 	pr_devel("Initializing boot CPU..\n");
 
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 6afb44d0d816..5f1e5aed8ab4 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -622,7 +622,7 @@ bool __init xive_native_init(void)
 	xive_native_setup_pools();
 
 	/* Initialize XIVE core with our backend */
-	if (!xive_core_init(&xive_native_ops, tima, TM_QW3_HV_PHYS,
+	if (!xive_core_init(np, &xive_native_ops, tima, TM_QW3_HV_PHYS,
 			    max_prio)) {
 		opal_xive_reset(OPAL_XIVE_MODE_EMU);
 		return false;
diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 1e3674d7ea7b..6610e5149d5a 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -857,7 +857,7 @@ bool __init xive_spapr_init(void)
 	}
 
 	/* Initialize XIVE core with our backend */
-	if (!xive_core_init(&xive_spapr_ops, tima, TM_QW1_OS, max_prio))
+	if (!xive_core_init(np, &xive_spapr_ops, tima, TM_QW1_OS, max_prio))
 		return false;
 
 	pr_info("Using %dkB queues\n", 1 << (xive_queue_shift - 10));
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (5 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 06/13] powerpc/xive: Add a name to the IRQ domain Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-09 15:50   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests Cédric Le Goater
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

Full state of the Linux interrupt descriptors can be dumped under
debugfs when compiled with CONFIG_GENERIC_IRQ_DEBUGFS. Add support for
the XIVE interrupt controller.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/common.c | 58 +++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 721617f0f854..411cba12d73b 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1303,11 +1303,69 @@ static int xive_irq_domain_match(struct irq_domain *h, struct device_node *node,
 	return xive_ops->match(node);
 }
 
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+static const char * const esb_names[] = { "RESET", "OFF", "PENDING", "QUEUED" };
+
+static const struct {
+	u64  mask;
+	char *name;
+} xive_irq_flags[] = {
+	{ XIVE_IRQ_FLAG_STORE_EOI, "STORE_EOI" },
+	{ XIVE_IRQ_FLAG_LSI,       "LSI"       },
+	{ XIVE_IRQ_FLAG_SHIFT_BUG, "SHIFT_BUG" },
+	{ XIVE_IRQ_FLAG_MASK_FW,   "MASK_FW"   },
+	{ XIVE_IRQ_FLAG_EOI_FW,    "EOI_FW"    },
+	{ XIVE_IRQ_FLAG_H_INT_ESB, "H_INT_ESB" },
+	{ XIVE_IRQ_FLAG_NO_EOI,    "NO_EOI"    },
+};
+
+static void xive_irq_domain_debug_show(struct seq_file *m, struct irq_domain *d,
+				       struct irq_data *irqd, int ind)
+{
+	struct xive_irq_data *xd;
+	u64 val;
+	int i;
+
+	/* No IRQ domain level information. To be done */
+	if (!irqd)
+		return;
+
+	if (!is_xive_irq(irq_data_get_irq_chip(irqd)))
+		return;
+
+	seq_printf(m, "%*sXIVE:\n", ind, "");
+	ind++;
+
+	xd = irq_data_get_irq_handler_data(irqd);
+	if (!xd) {
+		seq_printf(m, "%*snot assigned\n", ind, "");
+		return;
+	}
+
+	val = xive_esb_read(xd, XIVE_ESB_GET);
+	seq_printf(m, "%*sESB:      %s\n", ind, "", esb_names[val & 0x3]);
+	seq_printf(m, "%*sPstate:   %s %s\n", ind, "", xd->stale_p ? "stale" : "",
+		   xd->saved_p ? "saved" : "");
+	seq_printf(m, "%*sTarget:   %d\n", ind, "", xd->target);
+	seq_printf(m, "%*sChip:     %d\n", ind, "", xd->src_chip);
+	seq_printf(m, "%*sTrigger:  0x%016llx\n", ind, "", xd->trig_page);
+	seq_printf(m, "%*sEOI:      0x%016llx\n", ind, "", xd->eoi_page);
+	seq_printf(m, "%*sFlags:    0x%llx\n", ind, "", xd->flags);
+	for (i = 0; i < ARRAY_SIZE(xive_irq_flags); i++) {
+		if (xd->flags & xive_irq_flags[i].mask)
+			seq_printf(m, "%*s%s\n", ind + 12, "", xive_irq_flags[i].name);
+	}
+}
+#endif
+
 static const struct irq_domain_ops xive_irq_domain_ops = {
 	.match = xive_irq_domain_match,
 	.map = xive_irq_domain_map,
 	.unmap = xive_irq_domain_unmap,
 	.xlate = xive_irq_domain_xlate,
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+	.debug_show = xive_irq_domain_debug_show,
+#endif
 };
 
 static void __init xive_init_host(struct device_node *np)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (6 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 17:23   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG Cédric Le Goater
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

PowerNV systems can handle up to 4K guests and 1M interrupt numbers
per chip. Increase the range of allowed interrupts to support a larger
number of guests.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5181872f9452..c250fbd430d1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -66,7 +66,7 @@ config NEED_PER_CPU_PAGE_FIRST_CHUNK
 
 config NR_IRQS
 	int "Number of virtual interrupt numbers"
-	range 32 32768
+	range 32 1048576
 	default "512"
 	help
 	  This defines the number of virtual interrupt numbers the kernel
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (7 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-08 17:39   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW Cédric Le Goater
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

This flag was used to support the PHB4 LSIs on P9 DD1 and we have
stopped supporting this CPU when DD2 came out. See skiboot commit:

  https://github.com/open-power/skiboot/commit/0b0d15e3c170

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/include/asm/opal-api.h     | 2 +-
 arch/powerpc/include/asm/xive.h         | 2 +-
 arch/powerpc/kvm/book3s_xive_native.c   | 3 ---
 arch/powerpc/kvm/book3s_xive_template.c | 3 ---
 arch/powerpc/sysdev/xive/common.c       | 8 --------
 arch/powerpc/sysdev/xive/native.c       | 2 --
 6 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 1dffa3cb16ba..48ee604ca39a 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -1091,7 +1091,7 @@ enum {
 	OPAL_XIVE_IRQ_TRIGGER_PAGE	= 0x00000001,
 	OPAL_XIVE_IRQ_STORE_EOI		= 0x00000002,
 	OPAL_XIVE_IRQ_LSI		= 0x00000004,
-	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008,
+	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */
 	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010,
 	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
 };
diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index d332dd9a18de..ff805885a028 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -60,7 +60,7 @@ struct xive_irq_data {
 };
 #define XIVE_IRQ_FLAG_STORE_EOI	0x01
 #define XIVE_IRQ_FLAG_LSI	0x02
-#define XIVE_IRQ_FLAG_SHIFT_BUG	0x04
+#define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */
 #define XIVE_IRQ_FLAG_MASK_FW	0x08
 #define XIVE_IRQ_FLAG_EOI_FW	0x10
 #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 9b395381179d..170d1d04e1d1 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -37,9 +37,6 @@ static u8 xive_vm_esb_load(struct xive_irq_data *xd, u32 offset)
 	 * ordering.
 	 */
 
-	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
-		offset |= offset << 4;
-
 	val = in_be64(xd->eoi_mmio + offset);
 	return (u8)val;
 }
diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
index 4ad3c0279458..ece36e024a8f 100644
--- a/arch/powerpc/kvm/book3s_xive_template.c
+++ b/arch/powerpc/kvm/book3s_xive_template.c
@@ -61,9 +61,6 @@ static u8 GLUE(X_PFX,esb_load)(struct xive_irq_data *xd, u32 offset)
 	if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
 		offset |= XIVE_ESB_LD_ST_MO;
 
-	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
-		offset |= offset << 4;
-
 	val =__x_readq(__x_eoi_page(xd) + offset);
 #ifdef __LITTLE_ENDIAN__
 	val >>= 64-8;
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 411cba12d73b..a9259470bf9f 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -200,10 +200,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
 	if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
 		offset |= XIVE_ESB_LD_ST_MO;
 
-	/* Handle HW errata */
-	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
-		offset |= offset << 4;
-
 	if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
 		val = xive_ops->esb_rw(xd->hw_irq, offset, 0, 0);
 	else
@@ -214,10 +210,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
 
 static void xive_esb_write(struct xive_irq_data *xd, u32 offset, u64 data)
 {
-	/* Handle HW errata */
-	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
-		offset |= offset << 4;
-
 	if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
 		xive_ops->esb_rw(xd->hw_irq, offset, data, 1);
 	else
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 5f1e5aed8ab4..0310783241b5 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
 		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
 	if (opal_flags & OPAL_XIVE_IRQ_LSI)
 		data->flags |= XIVE_IRQ_FLAG_LSI;
-	if (opal_flags & OPAL_XIVE_IRQ_SHIFT_BUG)
-		data->flags |= XIVE_IRQ_FLAG_SHIFT_BUG;
 	if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
 		data->flags |= XIVE_IRQ_FLAG_MASK_FW;
 	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (8 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-09 15:21   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW Cédric Le Goater
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

This flag was used to support the PHB4 LSIs on P9 DD1 and we have
stopped supporting this CPU when DD2 came out. See skiboot commit:

  https://github.com/open-power/skiboot/commit/0b0d15e3c170

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/include/asm/opal-api.h |  2 +-
 arch/powerpc/include/asm/xive.h     |  2 +-
 arch/powerpc/kvm/book3s_xive.c      | 54 +++++------------------------
 arch/powerpc/sysdev/xive/common.c   | 39 +--------------------
 arch/powerpc/sysdev/xive/native.c   |  2 --
 5 files changed, 11 insertions(+), 88 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 48ee604ca39a..0455b679c050 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -1092,7 +1092,7 @@ enum {
 	OPAL_XIVE_IRQ_STORE_EOI		= 0x00000002,
 	OPAL_XIVE_IRQ_LSI		= 0x00000004,
 	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */
-	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010,
+	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010, /* P9 DD1.0 workaround */
 	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
 };
 
diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index ff805885a028..d62368d0ba91 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -61,7 +61,7 @@ struct xive_irq_data {
 #define XIVE_IRQ_FLAG_STORE_EOI	0x01
 #define XIVE_IRQ_FLAG_LSI	0x02
 #define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */
-#define XIVE_IRQ_FLAG_MASK_FW	0x08
+#define XIVE_IRQ_FLAG_MASK_FW	0x08 /* P9 DD1.0 workaround */
 #define XIVE_IRQ_FLAG_EOI_FW	0x10
 #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
 
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index fae1c2e8da29..59a986ae640b 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -419,37 +419,16 @@ static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
 	/* Get the right irq */
 	kvmppc_xive_select_irq(state, &hw_num, &xd);
 
-	/*
-	 * If the interrupt is marked as needing masking via
-	 * firmware, we do it here. Firmware masking however
-	 * is "lossy", it won't return the old p and q bits
-	 * and won't set the interrupt to a state where it will
-	 * record queued ones. If this is an issue we should do
-	 * lazy masking instead.
-	 *
-	 * For now, we work around this in unmask by forcing
-	 * an interrupt whenever we unmask a non-LSI via FW
-	 * (if ever).
-	 */
-	if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
-		xive_native_configure_irq(hw_num,
-				kvmppc_xive_vp(xive, state->act_server),
-				MASKED, state->number);
-		/* set old_p so we can track if an H_EOI was done */
-		state->old_p = true;
-		state->old_q = false;
-	} else {
-		/* Set PQ to 10, return old P and old Q and remember them */
-		val = xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_10);
-		state->old_p = !!(val & 2);
-		state->old_q = !!(val & 1);
+	/* Set PQ to 10, return old P and old Q and remember them */
+	val = xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_10);
+	state->old_p = !!(val & 2);
+	state->old_q = !!(val & 1);
 
-		/*
-		 * Synchronize hardware to sensure the queues are updated
-		 * when masking
+	/*
+	 * Synchronize hardware to sensure the queues are updated
+	 * when masking
 		 */
-		xive_native_sync_source(hw_num);
-	}
+	xive_native_sync_source(hw_num);
 
 	return old_prio;
 }
@@ -483,23 +462,6 @@ static void xive_finish_unmask(struct kvmppc_xive *xive,
 	/* Get the right irq */
 	kvmppc_xive_select_irq(state, &hw_num, &xd);
 
-	/*
-	 * See comment in xive_lock_and_mask() concerning masking
-	 * via firmware.
-	 */
-	if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
-		xive_native_configure_irq(hw_num,
-				kvmppc_xive_vp(xive, state->act_server),
-				state->act_priority, state->number);
-		/* If an EOI is needed, do it here */
-		if (!state->old_p)
-			xive_vm_source_eoi(hw_num, xd);
-		/* If this is not an LSI, force a trigger */
-		if (!(xd->flags & OPAL_XIVE_IRQ_LSI))
-			xive_irq_trigger(xd);
-		goto bail;
-	}
-
 	/* Old Q set, set PQ to 11 */
 	if (state->old_q)
 		xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_11);
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index a9259470bf9f..a71412fefb65 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -424,9 +424,7 @@ static void xive_irq_eoi(struct irq_data *d)
 }
 
 /*
- * Helper used to mask and unmask an interrupt source. This
- * is only called for normal interrupts that do not require
- * masking/unmasking via firmware.
+ * Helper used to mask and unmask an interrupt source.
  */
 static void xive_do_source_set_mask(struct xive_irq_data *xd,
 				    bool mask)
@@ -673,20 +671,6 @@ static void xive_irq_unmask(struct irq_data *d)
 
 	pr_devel("xive_irq_unmask: irq %d data @%p\n", d->irq, xd);
 
-	/*
-	 * This is a workaround for PCI LSI problems on P9, for
-	 * these, we call FW to set the mask. The problems might
-	 * be fixed by P9 DD2.0, if that is the case, firmware
-	 * will no longer set that flag.
-	 */
-	if (xd->flags & XIVE_IRQ_FLAG_MASK_FW) {
-		unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
-		xive_ops->configure_irq(hw_irq,
-					get_hard_smp_processor_id(xd->target),
-					xive_irq_priority, d->irq);
-		return;
-	}
-
 	xive_do_source_set_mask(xd, false);
 }
 
@@ -696,20 +680,6 @@ static void xive_irq_mask(struct irq_data *d)
 
 	pr_devel("xive_irq_mask: irq %d data @%p\n", d->irq, xd);
 
-	/*
-	 * This is a workaround for PCI LSI problems on P9, for
-	 * these, we call OPAL to set the mask. The problems might
-	 * be fixed by P9 DD2.0, if that is the case, firmware
-	 * will no longer set that flag.
-	 */
-	if (xd->flags & XIVE_IRQ_FLAG_MASK_FW) {
-		unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
-		xive_ops->configure_irq(hw_irq,
-					get_hard_smp_processor_id(xd->target),
-					0xff, d->irq);
-		return;
-	}
-
 	xive_do_source_set_mask(xd, true);
 }
 
@@ -852,13 +822,6 @@ static int xive_irq_set_vcpu_affinity(struct irq_data *d, void *state)
 	int rc;
 	u8 pq;
 
-	/*
-	 * We only support this on interrupts that do not require
-	 * firmware calls for masking and unmasking
-	 */
-	if (xd->flags & XIVE_IRQ_FLAG_MASK_FW)
-		return -EIO;
-
 	/*
 	 * This is called by KVM with state non-NULL for enabling
 	 * pass-through or NULL for disabling it
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 0310783241b5..deb97ad25d62 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
 		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
 	if (opal_flags & OPAL_XIVE_IRQ_LSI)
 		data->flags |= XIVE_IRQ_FLAG_LSI;
-	if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
-		data->flags |= XIVE_IRQ_FLAG_MASK_FW;
 	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
 		data->flags |= XIVE_IRQ_FLAG_EOI_FW;
 	data->eoi_page = be64_to_cpu(eoi_page);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (9 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-09 15:24   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 12/13] powerpc/xive: Simplify xive_do_source_eoi() Cédric Le Goater
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

This flag was used to support the P9 DD1 and we have stopped
supporting this CPU when DD2 came out. See skiboot commit:

  https://github.com/open-power/skiboot/commit/0b0d15e3c170

Also, remove eoi handler which is now unused.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/include/asm/opal-api.h      |  2 +-
 arch/powerpc/include/asm/xive.h          |  2 +-
 arch/powerpc/sysdev/xive/xive-internal.h |  1 -
 arch/powerpc/kvm/book3s_xive_template.c  |  2 --
 arch/powerpc/sysdev/xive/common.c        | 13 +------------
 arch/powerpc/sysdev/xive/native.c        | 12 ------------
 arch/powerpc/sysdev/xive/spapr.c         |  6 ------
 7 files changed, 3 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0455b679c050..0b63ba7d5917 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -1093,7 +1093,7 @@ enum {
 	OPAL_XIVE_IRQ_LSI		= 0x00000004,
 	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */
 	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010, /* P9 DD1.0 workaround */
-	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
+	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020, /* P9 DD1.0 workaround */
 };
 
 /* Flags for OPAL_XIVE_GET/SET_QUEUE_INFO */
diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index d62368d0ba91..f6150d7a757a 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -62,7 +62,7 @@ struct xive_irq_data {
 #define XIVE_IRQ_FLAG_LSI	0x02
 #define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */
 #define XIVE_IRQ_FLAG_MASK_FW	0x08 /* P9 DD1.0 workaround */
-#define XIVE_IRQ_FLAG_EOI_FW	0x10
+#define XIVE_IRQ_FLAG_EOI_FW	0x10 /* P9 DD1.0 workaround */
 #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
 
 /* Special flag set by KVM for excalation interrupts */
diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
index 066d6fe3dc1d..3b7dd2cba9db 100644
--- a/arch/powerpc/sysdev/xive/xive-internal.h
+++ b/arch/powerpc/sysdev/xive/xive-internal.h
@@ -52,7 +52,6 @@ struct xive_ops {
 	void	(*shutdown)(void);
 
 	void	(*update_pending)(struct xive_cpu *xc);
-	void	(*eoi)(u32 hw_irq);
 	void	(*sync_source)(u32 hw_irq);
 	u64	(*esb_rw)(u32 hw_irq, u32 offset, u64 data, bool write);
 #ifdef CONFIG_SMP
diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
index ece36e024a8f..b0015e05d99a 100644
--- a/arch/powerpc/kvm/book3s_xive_template.c
+++ b/arch/powerpc/kvm/book3s_xive_template.c
@@ -74,8 +74,6 @@ static void GLUE(X_PFX,source_eoi)(u32 hw_irq, struct xive_irq_data *xd)
 	/* If the XIVE supports the new "store EOI facility, use it */
 	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
 		__x_writeq(0, __x_eoi_page(xd) + XIVE_ESB_STORE_EOI);
-	else if (hw_irq && xd->flags & XIVE_IRQ_FLAG_EOI_FW)
-		opal_int_eoi(hw_irq);
 	else if (xd->flags & XIVE_IRQ_FLAG_LSI) {
 		/*
 		 * For LSIs the HW EOI cycle is used rather than PQ bits,
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index a71412fefb65..fe6229dd3241 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -354,18 +354,7 @@ static void xive_do_source_eoi(u32 hw_irq, struct xive_irq_data *xd)
 	/* If the XIVE supports the new "store EOI facility, use it */
 	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
 		xive_esb_write(xd, XIVE_ESB_STORE_EOI, 0);
-	else if (hw_irq && xd->flags & XIVE_IRQ_FLAG_EOI_FW) {
-		/*
-		 * The FW told us to call it. This happens for some
-		 * interrupt sources that need additional HW whacking
-		 * beyond the ESB manipulation. For example LPC interrupts
-		 * on P9 DD1.0 needed a latch to be clared in the LPC bridge
-		 * itself. The Firmware will take care of it.
-		 */
-		if (WARN_ON_ONCE(!xive_ops->eoi))
-			return;
-		xive_ops->eoi(hw_irq);
-	} else {
+	else {
 		u8 eoi_val;
 
 		/*
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index deb97ad25d62..4902d05ebbd1 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
 		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
 	if (opal_flags & OPAL_XIVE_IRQ_LSI)
 		data->flags |= XIVE_IRQ_FLAG_LSI;
-	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
-		data->flags |= XIVE_IRQ_FLAG_EOI_FW;
 	data->eoi_page = be64_to_cpu(eoi_page);
 	data->trig_page = be64_to_cpu(trig_page);
 	data->esb_shift = be32_to_cpu(esb_shift);
@@ -380,15 +378,6 @@ static void xive_native_update_pending(struct xive_cpu *xc)
 	}
 }
 
-static void xive_native_eoi(u32 hw_irq)
-{
-	/*
-	 * Not normally used except if specific interrupts need
-	 * a workaround on EOI.
-	 */
-	opal_int_eoi(hw_irq);
-}
-
 static void xive_native_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
 {
 	s64 rc;
@@ -471,7 +460,6 @@ static const struct xive_ops xive_native_ops = {
 	.match			= xive_native_match,
 	.shutdown		= xive_native_shutdown,
 	.update_pending		= xive_native_update_pending,
-	.eoi			= xive_native_eoi,
 	.setup_cpu		= xive_native_setup_cpu,
 	.teardown_cpu		= xive_native_teardown_cpu,
 	.sync_source		= xive_native_sync_source,
diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 6610e5149d5a..01ccc0786ada 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -628,11 +628,6 @@ static void xive_spapr_update_pending(struct xive_cpu *xc)
 	}
 }
 
-static void xive_spapr_eoi(u32 hw_irq)
-{
-	/* Not used */;
-}
-
 static void xive_spapr_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
 {
 	/* Only some debug on the TIMA settings */
@@ -677,7 +672,6 @@ static const struct xive_ops xive_spapr_ops = {
 	.match			= xive_spapr_match,
 	.shutdown		= xive_spapr_shutdown,
 	.update_pending		= xive_spapr_update_pending,
-	.eoi			= xive_spapr_eoi,
 	.setup_cpu		= xive_spapr_setup_cpu,
 	.teardown_cpu		= xive_spapr_teardown_cpu,
 	.sync_source		= xive_spapr_sync_source,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 12/13] powerpc/xive: Simplify xive_do_source_eoi()
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (10 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-09 15:28   ` Greg Kurz
  2020-12-08 15:11 ` [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls Cédric Le Goater
  2020-12-10  9:59 ` [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

Previous patches removed the need of the first argument which was a
hack for Firwmware EOI. Remove it and flatten the routine which has
became simpler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/common.c | 72 ++++++++++++++-----------------
 1 file changed, 33 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index fe6229dd3241..fb438203d5ee 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -348,39 +348,40 @@ static void xive_do_queue_eoi(struct xive_cpu *xc)
  * EOI an interrupt at the source. There are several methods
  * to do this depending on the HW version and source type
  */
-static void xive_do_source_eoi(u32 hw_irq, struct xive_irq_data *xd)
+static void xive_do_source_eoi(struct xive_irq_data *xd)
 {
+	u8 eoi_val;
+
 	xd->stale_p = false;
+
 	/* If the XIVE supports the new "store EOI facility, use it */
-	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
+	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI) {
 		xive_esb_write(xd, XIVE_ESB_STORE_EOI, 0);
-	else {
-		u8 eoi_val;
+		return;
+	}
 
-		/*
-		 * Otherwise for EOI, we use the special MMIO that does
-		 * a clear of both P and Q and returns the old Q,
-		 * except for LSIs where we use the "EOI cycle" special
-		 * load.
-		 *
-		 * This allows us to then do a re-trigger if Q was set
-		 * rather than synthesizing an interrupt in software
-		 *
-		 * For LSIs the HW EOI cycle is used rather than PQ bits,
-		 * as they are automatically re-triggred in HW when still
-		 * pending.
-		 */
-		if (xd->flags & XIVE_IRQ_FLAG_LSI)
-			xive_esb_read(xd, XIVE_ESB_LOAD_EOI);
-		else {
-			eoi_val = xive_esb_read(xd, XIVE_ESB_SET_PQ_00);
-			DBG_VERBOSE("eoi_val=%x\n", eoi_val);
-
-			/* Re-trigger if needed */
-			if ((eoi_val & XIVE_ESB_VAL_Q) && xd->trig_mmio)
-				out_be64(xd->trig_mmio, 0);
-		}
+	/*
+	 * For LSIs, we use the "EOI cycle" special load rather than
+	 * PQ bits, as they are automatically re-triggered in HW when
+	 * still pending.
+	 */
+	if (xd->flags & XIVE_IRQ_FLAG_LSI) {
+		xive_esb_read(xd, XIVE_ESB_LOAD_EOI);
+		return;
 	}
+
+	/*
+	 * Otherwise, we use the special MMIO that does a clear of
+	 * both P and Q and returns the old Q. This allows us to then
+	 * do a re-trigger if Q was set rather than synthesizing an
+	 * interrupt in software
+	 */
+	eoi_val = xive_esb_read(xd, XIVE_ESB_SET_PQ_00);
+	DBG_VERBOSE("eoi_val=%x\n", eoi_val);
+
+	/* Re-trigger if needed */
+	if ((eoi_val & XIVE_ESB_VAL_Q) && xd->trig_mmio)
+		out_be64(xd->trig_mmio, 0);
 }
 
 /* irq_chip eoi callback, called with irq descriptor lock held */
@@ -398,7 +399,7 @@ static void xive_irq_eoi(struct irq_data *d)
 	 */
 	if (!irqd_irq_disabled(d) && !irqd_is_forwarded_to_vcpu(d) &&
 	    !(xd->flags & XIVE_IRQ_FLAG_NO_EOI))
-		xive_do_source_eoi(irqd_to_hwirq(d), xd);
+		xive_do_source_eoi(xd);
 	else
 		xd->stale_p = true;
 
@@ -788,14 +789,7 @@ static int xive_irq_retrigger(struct irq_data *d)
 	 * 11, then perform an EOI.
 	 */
 	xive_esb_read(xd, XIVE_ESB_SET_PQ_11);
-
-	/*
-	 * Note: We pass "0" to the hw_irq argument in order to
-	 * avoid calling into the backend EOI code which we don't
-	 * want to do in the case of a re-trigger. Backends typically
-	 * only do EOI for LSIs anyway.
-	 */
-	xive_do_source_eoi(0, xd);
+	xive_do_source_eoi(xd);
 
 	return 1;
 }
@@ -910,7 +904,7 @@ static int xive_irq_set_vcpu_affinity(struct irq_data *d, void *state)
 		 * while masked, the generic code will re-mask it anyway.
 		 */
 		if (!xd->saved_p)
-			xive_do_source_eoi(hw_irq, xd);
+			xive_do_source_eoi(xd);
 
 	}
 	return 0;
@@ -1054,7 +1048,7 @@ static void xive_ipi_eoi(struct irq_data *d)
 	DBG_VERBOSE("IPI eoi: irq=%d [0x%lx] (HW IRQ 0x%x) pending=%02x\n",
 		    d->irq, irqd_to_hwirq(d), xc->hw_ipi, xc->pending_prio);
 
-	xive_do_source_eoi(xc->hw_ipi, &xc->ipi_data);
+	xive_do_source_eoi(&xc->ipi_data);
 	xive_do_queue_eoi(xc);
 }
 
@@ -1443,7 +1437,7 @@ static void xive_flush_cpu_queue(unsigned int cpu, struct xive_cpu *xc)
 		 * still asserted. Otherwise do an MSI retrigger.
 		 */
 		if (xd->flags & XIVE_IRQ_FLAG_LSI)
-			xive_do_source_eoi(irqd_to_hwirq(d), xd);
+			xive_do_source_eoi(xd);
 		else
 			xive_irq_retrigger(d);
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (11 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 12/13] powerpc/xive: Simplify xive_do_source_eoi() Cédric Le Goater
@ 2020-12-08 15:11 ` Cédric Le Goater
  2020-12-09 15:30   ` Greg Kurz
  2020-12-10  9:59 ` [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
  13 siblings, 1 reply; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-08 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Cédric Le Goater

Introduce a vp_err() macro to standardize error reporting.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/native.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 4902d05ebbd1..42297a131a6e 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -122,6 +122,8 @@ static int xive_native_get_irq_config(u32 hw_irq, u32 *target, u8 *prio,
 	return rc == 0 ? 0 : -ENXIO;
 }
 
+#define vp_err(vp, fmt, ...) pr_err("VP[0x%x]: " fmt, vp, ##__VA_ARGS__)
+
 /* This can be called multiple time to change a queue configuration */
 int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
 				__be32 *qpage, u32 order, bool can_escalate)
@@ -149,7 +151,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
 				      &esc_irq_be,
 				      NULL);
 	if (rc) {
-		pr_err("Error %lld getting queue info prio %d\n", rc, prio);
+		vp_err(vp_id, "Failed to get queue %d info : %lld\n", prio, rc);
 		rc = -EIO;
 		goto fail;
 	}
@@ -172,7 +174,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
 		msleep(OPAL_BUSY_DELAY_MS);
 	}
 	if (rc) {
-		pr_err("Error %lld setting queue for prio %d\n", rc, prio);
+		vp_err(vp_id, "Failed to set queue %d info: %lld\n", prio, rc);
 		rc = -EIO;
 	} else {
 		/*
@@ -199,7 +201,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
 		msleep(OPAL_BUSY_DELAY_MS);
 	}
 	if (rc)
-		pr_err("Error %lld disabling queue for prio %d\n", rc, prio);
+		vp_err(vp_id, "Failed to disable queue %d : %lld\n", prio, rc);
 }
 
 void xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
@@ -698,6 +700,8 @@ int xive_native_enable_vp(u32 vp_id, bool single_escalation)
 			break;
 		msleep(OPAL_BUSY_DELAY_MS);
 	}
+	if (rc)
+		vp_err(vp_id, "Failed to enable VP : %lld\n", rc);
 	return rc ? -EIO : 0;
 }
 EXPORT_SYMBOL_GPL(xive_native_enable_vp);
@@ -712,6 +716,8 @@ int xive_native_disable_vp(u32 vp_id)
 			break;
 		msleep(OPAL_BUSY_DELAY_MS);
 	}
+	if (rc)
+		vp_err(vp_id, "Failed to disable VP : %lld\n", rc);
 	return rc ? -EIO : 0;
 }
 EXPORT_SYMBOL_GPL(xive_native_disable_vp);
@@ -723,8 +729,10 @@ int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id)
 	s64 rc;
 
 	rc = opal_xive_get_vp_info(vp_id, NULL, &vp_cam_be, NULL, &vp_chip_id_be);
-	if (rc)
+	if (rc) {
+		vp_err(vp_id, "Failed to get VP info : %lld\n", rc);
 		return -EIO;
+	}
 	*out_cam_id = be64_to_cpu(vp_cam_be) & 0xffffffffu;
 	*out_chip_id = be32_to_cpu(vp_chip_id_be);
 
@@ -755,8 +763,7 @@ int xive_native_get_queue_info(u32 vp_id, u32 prio,
 	rc = opal_xive_get_queue_info(vp_id, prio, &qpage, &qsize,
 				      &qeoi_page, &escalate_irq, &qflags);
 	if (rc) {
-		pr_err("OPAL failed to get queue info for VCPU %d/%d : %lld\n",
-		       vp_id, prio, rc);
+		vp_err(vp_id, "failed to get queue %d info : %lld\n", prio, rc);
 		return -EIO;
 	}
 
@@ -784,8 +791,7 @@ int xive_native_get_queue_state(u32 vp_id, u32 prio, u32 *qtoggle, u32 *qindex)
 	rc = opal_xive_get_queue_state(vp_id, prio, &opal_qtoggle,
 				       &opal_qindex);
 	if (rc) {
-		pr_err("OPAL failed to get queue state for VCPU %d/%d : %lld\n",
-		       vp_id, prio, rc);
+		vp_err(vp_id, "failed to get queue %d state : %lld\n", prio, rc);
 		return -EIO;
 	}
 
@@ -804,8 +810,7 @@ int xive_native_set_queue_state(u32 vp_id, u32 prio, u32 qtoggle, u32 qindex)
 
 	rc = opal_xive_set_queue_state(vp_id, prio, qtoggle, qindex);
 	if (rc) {
-		pr_err("OPAL failed to set queue state for VCPU %d/%d : %lld\n",
-		       vp_id, prio, rc);
+		vp_err(vp_id, "failed to set queue %d state : %lld\n", prio, rc);
 		return -EIO;
 	}
 
@@ -827,8 +832,7 @@ int xive_native_get_vp_state(u32 vp_id, u64 *out_state)
 
 	rc = opal_xive_get_vp_state(vp_id, &state);
 	if (rc) {
-		pr_err("OPAL failed to get vp state for VCPU %d : %lld\n",
-		       vp_id, rc);
+		vp_err(vp_id, "failed to get vp state : %lld\n", rc);
 		return -EIO;
 	}
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag
  2020-12-08 15:11 ` [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag Cédric Le Goater
@ 2020-12-08 16:59   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-08 16:59 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:13 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> This is a simple cleanup to identify easily all flags of the XIVE
> interrupt structure. The interrupts flagged with XIVE_IRQ_FLAG_NO_EOI
> are the escalations used to wake up vCPUs in KVM. They are handled
> very differently from the rest.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  arch/powerpc/include/asm/xive.h   | 2 +-
>  arch/powerpc/kvm/book3s_xive.c    | 4 ++--
>  arch/powerpc/sysdev/xive/common.c | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index 309b4d65b74f..d332dd9a18de 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -66,7 +66,7 @@ struct xive_irq_data {
>  #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
>  
>  /* Special flag set by KVM for excalation interrupts */
> -#define XIVE_IRQ_NO_EOI		0x80
> +#define XIVE_IRQ_FLAG_NO_EOI	0x80
>  
>  #define XIVE_INVALID_CHIP_ID	-1
>  
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index 18a6b75a3bfd..fae1c2e8da29 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -219,7 +219,7 @@ int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
>  	/* In single escalation mode, we grab the ESB MMIO of the
>  	 * interrupt and mask it. Also populate the VCPU v/raddr
>  	 * of the ESB page for use by asm entry/exit code. Finally
> -	 * set the XIVE_IRQ_NO_EOI flag which will prevent the
> +	 * set the XIVE_IRQ_FLAG_NO_EOI flag which will prevent the
>  	 * core code from performing an EOI on the escalation
>  	 * interrupt, thus leaving it effectively masked after
>  	 * it fires once.
> @@ -231,7 +231,7 @@ int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
>  		xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_01);
>  		vcpu->arch.xive_esc_raddr = xd->eoi_page;
>  		vcpu->arch.xive_esc_vaddr = (__force u64)xd->eoi_mmio;
> -		xd->flags |= XIVE_IRQ_NO_EOI;
> +		xd->flags |= XIVE_IRQ_FLAG_NO_EOI;
>  	}
>  
>  	return 0;
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index a80440af491a..65af34ac1fa2 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -416,7 +416,7 @@ static void xive_irq_eoi(struct irq_data *d)
>  	 * been passed-through to a KVM guest
>  	 */
>  	if (!irqd_irq_disabled(d) && !irqd_is_forwarded_to_vcpu(d) &&
> -	    !(xd->flags & XIVE_IRQ_NO_EOI))
> +	    !(xd->flags & XIVE_IRQ_FLAG_NO_EOI))
>  		xive_do_source_eoi(irqd_to_hwirq(d), xd);
>  	else
>  		xd->stale_p = true;


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ
  2020-12-08 15:11 ` [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ Cédric Le Goater
@ 2020-12-08 17:06   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-08 17:06 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:14 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> The XIVE driver deals with CPU IPIs in a peculiar way. Each CPU has
> its own XIVE IPI interrupt allocated at the HW level, for PowerNV, or
> at the hypervisor level for pSeries. In practice, these interrupts are
> not always used. pSeries/PowerVM prefers local doorbells for local
> threads since they are faster. On PowerNV, global doorbells are also
> preferred for the same reason.
> 
> The mapping in the Linux is reduced to a single interrupt using HW
> interrupt number 0 and a custom irq_chip to handle EOI. This can cause
> performance issues in some benchmark (ipistorm) on multichip systems.
> 
> Clarify the use of the 0 value, it will help in improving multichip
> support.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  arch/powerpc/sysdev/xive/xive-internal.h |  2 ++
>  arch/powerpc/sysdev/xive/common.c        | 10 +++++-----
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
> index b7b901da2168..d701af7fb48c 100644
> --- a/arch/powerpc/sysdev/xive/xive-internal.h
> +++ b/arch/powerpc/sysdev/xive/xive-internal.h
> @@ -5,6 +5,8 @@
>  #ifndef __XIVE_INTERNAL_H
>  #define __XIVE_INTERNAL_H
>  
> +#define XIVE_IPI_HW_IRQ		0 /* interrupt source # for IPIs */
> +
>  /*
>   * A "disabled" interrupt should never fire, to catch problems
>   * we set its logical number to this
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 65af34ac1fa2..ee375daf8114 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -1142,7 +1142,7 @@ static void __init xive_request_ipi(void)
>  		return;
>  
>  	/* Initialize it */
> -	virq = irq_create_mapping(xive_irq_domain, 0);
> +	virq = irq_create_mapping(xive_irq_domain, XIVE_IPI_HW_IRQ);
>  	xive_ipi_irq = virq;
>  
>  	WARN_ON(request_irq(virq, xive_muxed_ipi_action,
> @@ -1242,7 +1242,7 @@ static int xive_irq_domain_map(struct irq_domain *h, unsigned int virq,
>  
>  #ifdef CONFIG_SMP
>  	/* IPIs are special and come up with HW number 0 */
> -	if (hw == 0) {
> +	if (hw == XIVE_IPI_HW_IRQ) {
>  		/*
>  		 * IPIs are marked per-cpu. We use separate HW interrupts under
>  		 * the hood but associated with the same "linux" interrupt
> @@ -1271,7 +1271,7 @@ static void xive_irq_domain_unmap(struct irq_domain *d, unsigned int virq)
>  	if (!data)
>  		return;
>  	hw_irq = (unsigned int)irqd_to_hwirq(data);
> -	if (hw_irq)
> +	if (hw_irq != XIVE_IPI_HW_IRQ)
>  		xive_irq_free_data(virq);
>  }
>  
> @@ -1421,7 +1421,7 @@ static void xive_flush_cpu_queue(unsigned int cpu, struct xive_cpu *xc)
>  		 * Ignore anything that isn't a XIVE irq and ignore
>  		 * IPIs, so can just be dropped.
>  		 */
> -		if (d->domain != xive_irq_domain || hw_irq == 0)
> +		if (d->domain != xive_irq_domain || hw_irq == XIVE_IPI_HW_IRQ)
>  			continue;
>  
>  		/*
> @@ -1655,7 +1655,7 @@ static int xive_core_debug_show(struct seq_file *m, void *private)
>  		hw_irq = (unsigned int)irqd_to_hwirq(d);
>  
>  		/* IPIs are special (HW number 0) */
> -		if (hw_irq)
> +		if (hw_irq != XIVE_IPI_HW_IRQ)
>  			xive_debug_show_irq(m, hw_irq, d);
>  	}
>  	return 0;


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests
  2020-12-08 15:11 ` [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests Cédric Le Goater
@ 2020-12-08 17:23   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-08 17:23 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:19 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> PowerNV systems can handle up to 4K guests and 1M interrupt numbers
> per chip. Increase the range of allowed interrupts to support a larger
> number of guests.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  arch/powerpc/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 5181872f9452..c250fbd430d1 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -66,7 +66,7 @@ config NEED_PER_CPU_PAGE_FIRST_CHUNK
>  
>  config NR_IRQS
>  	int "Number of virtual interrupt numbers"
> -	range 32 32768
> +	range 32 1048576
>  	default "512"
>  	help
>  	  This defines the number of virtual interrupt numbers the kernel


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  2020-12-08 15:11 ` [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG Cédric Le Goater
@ 2020-12-08 17:39   ` Greg Kurz
  2020-12-10 13:18     ` Cédric Le Goater
  0 siblings, 1 reply; 28+ messages in thread
From: Greg Kurz @ 2020-12-08 17:39 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:20 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> This flag was used to support the PHB4 LSIs on P9 DD1 and we have
> stopped supporting this CPU when DD2 came out. See skiboot commit:
> 
>   https://github.com/open-power/skiboot/commit/0b0d15e3c170
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

Just a minor suggestion in case you need to post a v2. See below.

>  arch/powerpc/include/asm/opal-api.h     | 2 +-
>  arch/powerpc/include/asm/xive.h         | 2 +-
>  arch/powerpc/kvm/book3s_xive_native.c   | 3 ---
>  arch/powerpc/kvm/book3s_xive_template.c | 3 ---
>  arch/powerpc/sysdev/xive/common.c       | 8 --------
>  arch/powerpc/sysdev/xive/native.c       | 2 --
>  6 files changed, 2 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 1dffa3cb16ba..48ee604ca39a 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -1091,7 +1091,7 @@ enum {
>  	OPAL_XIVE_IRQ_TRIGGER_PAGE	= 0x00000001,
>  	OPAL_XIVE_IRQ_STORE_EOI		= 0x00000002,
>  	OPAL_XIVE_IRQ_LSI		= 0x00000004,
> -	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008,
> +	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */

Maybe you can even comment the entire line so that any future
tentative to use that flag breaks build ?

>  	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010,
>  	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
>  };
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index d332dd9a18de..ff805885a028 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -60,7 +60,7 @@ struct xive_irq_data {
>  };
>  #define XIVE_IRQ_FLAG_STORE_EOI	0x01
>  #define XIVE_IRQ_FLAG_LSI	0x02
> -#define XIVE_IRQ_FLAG_SHIFT_BUG	0x04
> +#define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */

Same here, with an extra cleanup to stop using it when initializing 
xive_irq_flags[] in common.c.

>  #define XIVE_IRQ_FLAG_MASK_FW	0x08
>  #define XIVE_IRQ_FLAG_EOI_FW	0x10
>  #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 9b395381179d..170d1d04e1d1 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -37,9 +37,6 @@ static u8 xive_vm_esb_load(struct xive_irq_data *xd, u32 offset)
>  	 * ordering.
>  	 */
>  
> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> -		offset |= offset << 4;
> -
>  	val = in_be64(xd->eoi_mmio + offset);
>  	return (u8)val;
>  }
> diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
> index 4ad3c0279458..ece36e024a8f 100644
> --- a/arch/powerpc/kvm/book3s_xive_template.c
> +++ b/arch/powerpc/kvm/book3s_xive_template.c
> @@ -61,9 +61,6 @@ static u8 GLUE(X_PFX,esb_load)(struct xive_irq_data *xd, u32 offset)
>  	if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
>  		offset |= XIVE_ESB_LD_ST_MO;
>  
> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> -		offset |= offset << 4;
> -
>  	val =__x_readq(__x_eoi_page(xd) + offset);
>  #ifdef __LITTLE_ENDIAN__
>  	val >>= 64-8;
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 411cba12d73b..a9259470bf9f 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -200,10 +200,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
>  	if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
>  		offset |= XIVE_ESB_LD_ST_MO;
>  
> -	/* Handle HW errata */
> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> -		offset |= offset << 4;
> -
>  	if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
>  		val = xive_ops->esb_rw(xd->hw_irq, offset, 0, 0);
>  	else
> @@ -214,10 +210,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
>  
>  static void xive_esb_write(struct xive_irq_data *xd, u32 offset, u64 data)
>  {
> -	/* Handle HW errata */
> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> -		offset |= offset << 4;
> -
>  	if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
>  		xive_ops->esb_rw(xd->hw_irq, offset, data, 1);
>  	else
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index 5f1e5aed8ab4..0310783241b5 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
>  		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
>  	if (opal_flags & OPAL_XIVE_IRQ_LSI)
>  		data->flags |= XIVE_IRQ_FLAG_LSI;
> -	if (opal_flags & OPAL_XIVE_IRQ_SHIFT_BUG)
> -		data->flags |= XIVE_IRQ_FLAG_SHIFT_BUG;
>  	if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
>  		data->flags |= XIVE_IRQ_FLAG_MASK_FW;
>  	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property
  2020-12-08 15:11 ` [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property Cédric Le Goater
@ 2020-12-09  4:39   ` Aneesh Kumar K.V
  2020-12-09 15:50     ` Cédric Le Goater
  0 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2020-12-09  4:39 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev; +Cc: Cédric Le Goater

Cédric Le Goater <clg@kaod.org> writes:

> The 'chip_id' field of the XIVE CPU structure is used to choose a
> target for a source located on the same chip when possible. This field
> is assigned on the PowerNV platform using the "ibm,chip-id" property
> on pSeries under KVM when NUMA nodes are defined but it is undefined
> under PowerVM. The XIVE source structure has a similar field
> 'src_chip' which is only assigned on the PowerNV platform.
>
> cpu_to_node() returns a compatible value on all platforms, 0 being the
> default node. It will also give us the opportunity to set the affinity
> of a source on pSeries when we can localize them.

But we should avoid assuming that linux numa node number is equivalent
to chip id [1]. What do we expect this value represents on virtualized
platforms like PowerVM and KVM? Is this used for a hcall?


[1] https://lore.kernel.org/linuxppc-dev/20200817103238.158133-1-aneesh.kumar@linux.ibm.com

>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  arch/powerpc/sysdev/xive/common.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index ee375daf8114..605238ca65e4 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -1342,16 +1342,11 @@ static int xive_prepare_cpu(unsigned int cpu)
>  
>  	xc = per_cpu(xive_cpu, cpu);
>  	if (!xc) {
> -		struct device_node *np;
> -
>  		xc = kzalloc_node(sizeof(struct xive_cpu),
>  				  GFP_KERNEL, cpu_to_node(cpu));
>  		if (!xc)
>  			return -ENOMEM;
> -		np = of_get_cpu_node(cpu, NULL);
> -		if (np)
> -			xc->chip_id = of_get_ibm_chip_id(np);
> -		of_node_put(np);
> +		xc->chip_id = cpu_to_node(cpu);
>  		xc->hw_ipi = XIVE_BAD_IRQ;
>  
>  		per_cpu(xive_cpu, cpu) = xc;
> -- 
> 2.26.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  2020-12-08 15:11 ` [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW Cédric Le Goater
@ 2020-12-09 15:21   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-09 15:21 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:21 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> This flag was used to support the PHB4 LSIs on P9 DD1 and we have
> stopped supporting this CPU when DD2 came out. See skiboot commit:
> 
>   https://github.com/open-power/skiboot/commit/0b0d15e3c170
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

In case a v2 is required, same suggestion to comment out the removed
items entirely, plus fix an indent nit....

>  arch/powerpc/include/asm/opal-api.h |  2 +-
>  arch/powerpc/include/asm/xive.h     |  2 +-
>  arch/powerpc/kvm/book3s_xive.c      | 54 +++++------------------------
>  arch/powerpc/sysdev/xive/common.c   | 39 +--------------------
>  arch/powerpc/sysdev/xive/native.c   |  2 --
>  5 files changed, 11 insertions(+), 88 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 48ee604ca39a..0455b679c050 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -1092,7 +1092,7 @@ enum {
>  	OPAL_XIVE_IRQ_STORE_EOI		= 0x00000002,
>  	OPAL_XIVE_IRQ_LSI		= 0x00000004,
>  	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */
> -	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010,
> +	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010, /* P9 DD1.0 workaround */
>  	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
>  };
>  
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index ff805885a028..d62368d0ba91 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -61,7 +61,7 @@ struct xive_irq_data {
>  #define XIVE_IRQ_FLAG_STORE_EOI	0x01
>  #define XIVE_IRQ_FLAG_LSI	0x02
>  #define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */
> -#define XIVE_IRQ_FLAG_MASK_FW	0x08
> +#define XIVE_IRQ_FLAG_MASK_FW	0x08 /* P9 DD1.0 workaround */
>  #define XIVE_IRQ_FLAG_EOI_FW	0x10
>  #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
>  
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index fae1c2e8da29..59a986ae640b 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -419,37 +419,16 @@ static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
>  	/* Get the right irq */
>  	kvmppc_xive_select_irq(state, &hw_num, &xd);
>  
> -	/*
> -	 * If the interrupt is marked as needing masking via
> -	 * firmware, we do it here. Firmware masking however
> -	 * is "lossy", it won't return the old p and q bits
> -	 * and won't set the interrupt to a state where it will
> -	 * record queued ones. If this is an issue we should do
> -	 * lazy masking instead.
> -	 *
> -	 * For now, we work around this in unmask by forcing
> -	 * an interrupt whenever we unmask a non-LSI via FW
> -	 * (if ever).
> -	 */
> -	if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
> -		xive_native_configure_irq(hw_num,
> -				kvmppc_xive_vp(xive, state->act_server),
> -				MASKED, state->number);
> -		/* set old_p so we can track if an H_EOI was done */
> -		state->old_p = true;
> -		state->old_q = false;
> -	} else {
> -		/* Set PQ to 10, return old P and old Q and remember them */
> -		val = xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_10);
> -		state->old_p = !!(val & 2);
> -		state->old_q = !!(val & 1);
> +	/* Set PQ to 10, return old P and old Q and remember them */
> +	val = xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_10);
> +	state->old_p = !!(val & 2);
> +	state->old_q = !!(val & 1);
>  
> -		/*
> -		 * Synchronize hardware to sensure the queues are updated
> -		 * when masking
> +	/*
> +	 * Synchronize hardware to sensure the queues are updated
> +	 * when masking
>  		 */

... here ^^

> -		xive_native_sync_source(hw_num);
> -	}
> +	xive_native_sync_source(hw_num);
>  
>  	return old_prio;
>  }
> @@ -483,23 +462,6 @@ static void xive_finish_unmask(struct kvmppc_xive *xive,
>  	/* Get the right irq */
>  	kvmppc_xive_select_irq(state, &hw_num, &xd);
>  
> -	/*
> -	 * See comment in xive_lock_and_mask() concerning masking
> -	 * via firmware.
> -	 */
> -	if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
> -		xive_native_configure_irq(hw_num,
> -				kvmppc_xive_vp(xive, state->act_server),
> -				state->act_priority, state->number);
> -		/* If an EOI is needed, do it here */
> -		if (!state->old_p)
> -			xive_vm_source_eoi(hw_num, xd);
> -		/* If this is not an LSI, force a trigger */
> -		if (!(xd->flags & OPAL_XIVE_IRQ_LSI))
> -			xive_irq_trigger(xd);
> -		goto bail;
> -	}
> -
>  	/* Old Q set, set PQ to 11 */
>  	if (state->old_q)
>  		xive_vm_esb_load(xd, XIVE_ESB_SET_PQ_11);
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index a9259470bf9f..a71412fefb65 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -424,9 +424,7 @@ static void xive_irq_eoi(struct irq_data *d)
>  }
>  
>  /*
> - * Helper used to mask and unmask an interrupt source. This
> - * is only called for normal interrupts that do not require
> - * masking/unmasking via firmware.
> + * Helper used to mask and unmask an interrupt source.
>   */
>  static void xive_do_source_set_mask(struct xive_irq_data *xd,
>  				    bool mask)
> @@ -673,20 +671,6 @@ static void xive_irq_unmask(struct irq_data *d)
>  
>  	pr_devel("xive_irq_unmask: irq %d data @%p\n", d->irq, xd);
>  
> -	/*
> -	 * This is a workaround for PCI LSI problems on P9, for
> -	 * these, we call FW to set the mask. The problems might
> -	 * be fixed by P9 DD2.0, if that is the case, firmware
> -	 * will no longer set that flag.
> -	 */
> -	if (xd->flags & XIVE_IRQ_FLAG_MASK_FW) {
> -		unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
> -		xive_ops->configure_irq(hw_irq,
> -					get_hard_smp_processor_id(xd->target),
> -					xive_irq_priority, d->irq);
> -		return;
> -	}
> -
>  	xive_do_source_set_mask(xd, false);
>  }
>  
> @@ -696,20 +680,6 @@ static void xive_irq_mask(struct irq_data *d)
>  
>  	pr_devel("xive_irq_mask: irq %d data @%p\n", d->irq, xd);
>  
> -	/*
> -	 * This is a workaround for PCI LSI problems on P9, for
> -	 * these, we call OPAL to set the mask. The problems might
> -	 * be fixed by P9 DD2.0, if that is the case, firmware
> -	 * will no longer set that flag.
> -	 */
> -	if (xd->flags & XIVE_IRQ_FLAG_MASK_FW) {
> -		unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
> -		xive_ops->configure_irq(hw_irq,
> -					get_hard_smp_processor_id(xd->target),
> -					0xff, d->irq);
> -		return;
> -	}
> -
>  	xive_do_source_set_mask(xd, true);
>  }
>  
> @@ -852,13 +822,6 @@ static int xive_irq_set_vcpu_affinity(struct irq_data *d, void *state)
>  	int rc;
>  	u8 pq;
>  
> -	/*
> -	 * We only support this on interrupts that do not require
> -	 * firmware calls for masking and unmasking
> -	 */
> -	if (xd->flags & XIVE_IRQ_FLAG_MASK_FW)
> -		return -EIO;
> -
>  	/*
>  	 * This is called by KVM with state non-NULL for enabling
>  	 * pass-through or NULL for disabling it
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index 0310783241b5..deb97ad25d62 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
>  		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
>  	if (opal_flags & OPAL_XIVE_IRQ_LSI)
>  		data->flags |= XIVE_IRQ_FLAG_LSI;
> -	if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
> -		data->flags |= XIVE_IRQ_FLAG_MASK_FW;
>  	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
>  		data->flags |= XIVE_IRQ_FLAG_EOI_FW;
>  	data->eoi_page = be64_to_cpu(eoi_page);


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  2020-12-08 15:11 ` [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW Cédric Le Goater
@ 2020-12-09 15:24   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-09 15:24 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:22 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> This flag was used to support the P9 DD1 and we have stopped
> supporting this CPU when DD2 came out. See skiboot commit:
> 
>   https://github.com/open-power/skiboot/commit/0b0d15e3c170
> 
> Also, remove eoi handler which is now unused.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

Same suggestion as with previous patch.

>  arch/powerpc/include/asm/opal-api.h      |  2 +-
>  arch/powerpc/include/asm/xive.h          |  2 +-
>  arch/powerpc/sysdev/xive/xive-internal.h |  1 -
>  arch/powerpc/kvm/book3s_xive_template.c  |  2 --
>  arch/powerpc/sysdev/xive/common.c        | 13 +------------
>  arch/powerpc/sysdev/xive/native.c        | 12 ------------
>  arch/powerpc/sysdev/xive/spapr.c         |  6 ------
>  7 files changed, 3 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 0455b679c050..0b63ba7d5917 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -1093,7 +1093,7 @@ enum {
>  	OPAL_XIVE_IRQ_LSI		= 0x00000004,
>  	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */
>  	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010, /* P9 DD1.0 workaround */
> -	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
> +	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020, /* P9 DD1.0 workaround */
>  };
>  
>  /* Flags for OPAL_XIVE_GET/SET_QUEUE_INFO */
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index d62368d0ba91..f6150d7a757a 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -62,7 +62,7 @@ struct xive_irq_data {
>  #define XIVE_IRQ_FLAG_LSI	0x02
>  #define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */
>  #define XIVE_IRQ_FLAG_MASK_FW	0x08 /* P9 DD1.0 workaround */
> -#define XIVE_IRQ_FLAG_EOI_FW	0x10
> +#define XIVE_IRQ_FLAG_EOI_FW	0x10 /* P9 DD1.0 workaround */
>  #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
>  
>  /* Special flag set by KVM for excalation interrupts */
> diff --git a/arch/powerpc/sysdev/xive/xive-internal.h b/arch/powerpc/sysdev/xive/xive-internal.h
> index 066d6fe3dc1d..3b7dd2cba9db 100644
> --- a/arch/powerpc/sysdev/xive/xive-internal.h
> +++ b/arch/powerpc/sysdev/xive/xive-internal.h
> @@ -52,7 +52,6 @@ struct xive_ops {
>  	void	(*shutdown)(void);
>  
>  	void	(*update_pending)(struct xive_cpu *xc);
> -	void	(*eoi)(u32 hw_irq);
>  	void	(*sync_source)(u32 hw_irq);
>  	u64	(*esb_rw)(u32 hw_irq, u32 offset, u64 data, bool write);
>  #ifdef CONFIG_SMP
> diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
> index ece36e024a8f..b0015e05d99a 100644
> --- a/arch/powerpc/kvm/book3s_xive_template.c
> +++ b/arch/powerpc/kvm/book3s_xive_template.c
> @@ -74,8 +74,6 @@ static void GLUE(X_PFX,source_eoi)(u32 hw_irq, struct xive_irq_data *xd)
>  	/* If the XIVE supports the new "store EOI facility, use it */
>  	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
>  		__x_writeq(0, __x_eoi_page(xd) + XIVE_ESB_STORE_EOI);
> -	else if (hw_irq && xd->flags & XIVE_IRQ_FLAG_EOI_FW)
> -		opal_int_eoi(hw_irq);
>  	else if (xd->flags & XIVE_IRQ_FLAG_LSI) {
>  		/*
>  		 * For LSIs the HW EOI cycle is used rather than PQ bits,
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index a71412fefb65..fe6229dd3241 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -354,18 +354,7 @@ static void xive_do_source_eoi(u32 hw_irq, struct xive_irq_data *xd)
>  	/* If the XIVE supports the new "store EOI facility, use it */
>  	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
>  		xive_esb_write(xd, XIVE_ESB_STORE_EOI, 0);
> -	else if (hw_irq && xd->flags & XIVE_IRQ_FLAG_EOI_FW) {
> -		/*
> -		 * The FW told us to call it. This happens for some
> -		 * interrupt sources that need additional HW whacking
> -		 * beyond the ESB manipulation. For example LPC interrupts
> -		 * on P9 DD1.0 needed a latch to be clared in the LPC bridge
> -		 * itself. The Firmware will take care of it.
> -		 */
> -		if (WARN_ON_ONCE(!xive_ops->eoi))
> -			return;
> -		xive_ops->eoi(hw_irq);
> -	} else {
> +	else {
>  		u8 eoi_val;
>  
>  		/*
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index deb97ad25d62..4902d05ebbd1 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
>  		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
>  	if (opal_flags & OPAL_XIVE_IRQ_LSI)
>  		data->flags |= XIVE_IRQ_FLAG_LSI;
> -	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
> -		data->flags |= XIVE_IRQ_FLAG_EOI_FW;
>  	data->eoi_page = be64_to_cpu(eoi_page);
>  	data->trig_page = be64_to_cpu(trig_page);
>  	data->esb_shift = be32_to_cpu(esb_shift);
> @@ -380,15 +378,6 @@ static void xive_native_update_pending(struct xive_cpu *xc)
>  	}
>  }
>  
> -static void xive_native_eoi(u32 hw_irq)
> -{
> -	/*
> -	 * Not normally used except if specific interrupts need
> -	 * a workaround on EOI.
> -	 */
> -	opal_int_eoi(hw_irq);
> -}
> -
>  static void xive_native_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
>  {
>  	s64 rc;
> @@ -471,7 +460,6 @@ static const struct xive_ops xive_native_ops = {
>  	.match			= xive_native_match,
>  	.shutdown		= xive_native_shutdown,
>  	.update_pending		= xive_native_update_pending,
> -	.eoi			= xive_native_eoi,
>  	.setup_cpu		= xive_native_setup_cpu,
>  	.teardown_cpu		= xive_native_teardown_cpu,
>  	.sync_source		= xive_native_sync_source,
> diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
> index 6610e5149d5a..01ccc0786ada 100644
> --- a/arch/powerpc/sysdev/xive/spapr.c
> +++ b/arch/powerpc/sysdev/xive/spapr.c
> @@ -628,11 +628,6 @@ static void xive_spapr_update_pending(struct xive_cpu *xc)
>  	}
>  }
>  
> -static void xive_spapr_eoi(u32 hw_irq)
> -{
> -	/* Not used */;
> -}
> -
>  static void xive_spapr_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
>  {
>  	/* Only some debug on the TIMA settings */
> @@ -677,7 +672,6 @@ static const struct xive_ops xive_spapr_ops = {
>  	.match			= xive_spapr_match,
>  	.shutdown		= xive_spapr_shutdown,
>  	.update_pending		= xive_spapr_update_pending,
> -	.eoi			= xive_spapr_eoi,
>  	.setup_cpu		= xive_spapr_setup_cpu,
>  	.teardown_cpu		= xive_spapr_teardown_cpu,
>  	.sync_source		= xive_spapr_sync_source,


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 12/13] powerpc/xive: Simplify xive_do_source_eoi()
  2020-12-08 15:11 ` [PATCH 12/13] powerpc/xive: Simplify xive_do_source_eoi() Cédric Le Goater
@ 2020-12-09 15:28   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-09 15:28 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:23 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> Previous patches removed the need of the first argument which was a
> hack for Firwmware EOI. Remove it and flatten the routine which has
> became simpler.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Much nicer indeed.

Reviewed-by: Greg Kurz <groug@kaod.org>

>  arch/powerpc/sysdev/xive/common.c | 72 ++++++++++++++-----------------
>  1 file changed, 33 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index fe6229dd3241..fb438203d5ee 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -348,39 +348,40 @@ static void xive_do_queue_eoi(struct xive_cpu *xc)
>   * EOI an interrupt at the source. There are several methods
>   * to do this depending on the HW version and source type
>   */
> -static void xive_do_source_eoi(u32 hw_irq, struct xive_irq_data *xd)
> +static void xive_do_source_eoi(struct xive_irq_data *xd)
>  {
> +	u8 eoi_val;
> +
>  	xd->stale_p = false;
> +
>  	/* If the XIVE supports the new "store EOI facility, use it */
> -	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
> +	if (xd->flags & XIVE_IRQ_FLAG_STORE_EOI) {
>  		xive_esb_write(xd, XIVE_ESB_STORE_EOI, 0);
> -	else {
> -		u8 eoi_val;
> +		return;
> +	}
>  
> -		/*
> -		 * Otherwise for EOI, we use the special MMIO that does
> -		 * a clear of both P and Q and returns the old Q,
> -		 * except for LSIs where we use the "EOI cycle" special
> -		 * load.
> -		 *
> -		 * This allows us to then do a re-trigger if Q was set
> -		 * rather than synthesizing an interrupt in software
> -		 *
> -		 * For LSIs the HW EOI cycle is used rather than PQ bits,
> -		 * as they are automatically re-triggred in HW when still
> -		 * pending.
> -		 */
> -		if (xd->flags & XIVE_IRQ_FLAG_LSI)
> -			xive_esb_read(xd, XIVE_ESB_LOAD_EOI);
> -		else {
> -			eoi_val = xive_esb_read(xd, XIVE_ESB_SET_PQ_00);
> -			DBG_VERBOSE("eoi_val=%x\n", eoi_val);
> -
> -			/* Re-trigger if needed */
> -			if ((eoi_val & XIVE_ESB_VAL_Q) && xd->trig_mmio)
> -				out_be64(xd->trig_mmio, 0);
> -		}
> +	/*
> +	 * For LSIs, we use the "EOI cycle" special load rather than
> +	 * PQ bits, as they are automatically re-triggered in HW when
> +	 * still pending.
> +	 */
> +	if (xd->flags & XIVE_IRQ_FLAG_LSI) {
> +		xive_esb_read(xd, XIVE_ESB_LOAD_EOI);
> +		return;
>  	}
> +
> +	/*
> +	 * Otherwise, we use the special MMIO that does a clear of
> +	 * both P and Q and returns the old Q. This allows us to then
> +	 * do a re-trigger if Q was set rather than synthesizing an
> +	 * interrupt in software
> +	 */
> +	eoi_val = xive_esb_read(xd, XIVE_ESB_SET_PQ_00);
> +	DBG_VERBOSE("eoi_val=%x\n", eoi_val);
> +
> +	/* Re-trigger if needed */
> +	if ((eoi_val & XIVE_ESB_VAL_Q) && xd->trig_mmio)
> +		out_be64(xd->trig_mmio, 0);
>  }
>  
>  /* irq_chip eoi callback, called with irq descriptor lock held */
> @@ -398,7 +399,7 @@ static void xive_irq_eoi(struct irq_data *d)
>  	 */
>  	if (!irqd_irq_disabled(d) && !irqd_is_forwarded_to_vcpu(d) &&
>  	    !(xd->flags & XIVE_IRQ_FLAG_NO_EOI))
> -		xive_do_source_eoi(irqd_to_hwirq(d), xd);
> +		xive_do_source_eoi(xd);
>  	else
>  		xd->stale_p = true;
>  
> @@ -788,14 +789,7 @@ static int xive_irq_retrigger(struct irq_data *d)
>  	 * 11, then perform an EOI.
>  	 */
>  	xive_esb_read(xd, XIVE_ESB_SET_PQ_11);
> -
> -	/*
> -	 * Note: We pass "0" to the hw_irq argument in order to
> -	 * avoid calling into the backend EOI code which we don't
> -	 * want to do in the case of a re-trigger. Backends typically
> -	 * only do EOI for LSIs anyway.
> -	 */
> -	xive_do_source_eoi(0, xd);
> +	xive_do_source_eoi(xd);
>  
>  	return 1;
>  }
> @@ -910,7 +904,7 @@ static int xive_irq_set_vcpu_affinity(struct irq_data *d, void *state)
>  		 * while masked, the generic code will re-mask it anyway.
>  		 */
>  		if (!xd->saved_p)
> -			xive_do_source_eoi(hw_irq, xd);
> +			xive_do_source_eoi(xd);
>  
>  	}
>  	return 0;
> @@ -1054,7 +1048,7 @@ static void xive_ipi_eoi(struct irq_data *d)
>  	DBG_VERBOSE("IPI eoi: irq=%d [0x%lx] (HW IRQ 0x%x) pending=%02x\n",
>  		    d->irq, irqd_to_hwirq(d), xc->hw_ipi, xc->pending_prio);
>  
> -	xive_do_source_eoi(xc->hw_ipi, &xc->ipi_data);
> +	xive_do_source_eoi(&xc->ipi_data);
>  	xive_do_queue_eoi(xc);
>  }
>  
> @@ -1443,7 +1437,7 @@ static void xive_flush_cpu_queue(unsigned int cpu, struct xive_cpu *xc)
>  		 * still asserted. Otherwise do an MSI retrigger.
>  		 */
>  		if (xd->flags & XIVE_IRQ_FLAG_LSI)
> -			xive_do_source_eoi(irqd_to_hwirq(d), xd);
> +			xive_do_source_eoi(xd);
>  		else
>  			xive_irq_retrigger(d);
>  


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls
  2020-12-08 15:11 ` [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls Cédric Le Goater
@ 2020-12-09 15:30   ` Greg Kurz
  0 siblings, 0 replies; 28+ messages in thread
From: Greg Kurz @ 2020-12-09 15:30 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:24 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> Introduce a vp_err() macro to standardize error reporting.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  arch/powerpc/sysdev/xive/native.c | 28 ++++++++++++++++------------
>  1 file changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index 4902d05ebbd1..42297a131a6e 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -122,6 +122,8 @@ static int xive_native_get_irq_config(u32 hw_irq, u32 *target, u8 *prio,
>  	return rc == 0 ? 0 : -ENXIO;
>  }
>  
> +#define vp_err(vp, fmt, ...) pr_err("VP[0x%x]: " fmt, vp, ##__VA_ARGS__)
> +
>  /* This can be called multiple time to change a queue configuration */
>  int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
>  				__be32 *qpage, u32 order, bool can_escalate)
> @@ -149,7 +151,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
>  				      &esc_irq_be,
>  				      NULL);
>  	if (rc) {
> -		pr_err("Error %lld getting queue info prio %d\n", rc, prio);
> +		vp_err(vp_id, "Failed to get queue %d info : %lld\n", prio, rc);
>  		rc = -EIO;
>  		goto fail;
>  	}
> @@ -172,7 +174,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
>  		msleep(OPAL_BUSY_DELAY_MS);
>  	}
>  	if (rc) {
> -		pr_err("Error %lld setting queue for prio %d\n", rc, prio);
> +		vp_err(vp_id, "Failed to set queue %d info: %lld\n", prio, rc);
>  		rc = -EIO;
>  	} else {
>  		/*
> @@ -199,7 +201,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
>  		msleep(OPAL_BUSY_DELAY_MS);
>  	}
>  	if (rc)
> -		pr_err("Error %lld disabling queue for prio %d\n", rc, prio);
> +		vp_err(vp_id, "Failed to disable queue %d : %lld\n", prio, rc);
>  }
>  
>  void xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
> @@ -698,6 +700,8 @@ int xive_native_enable_vp(u32 vp_id, bool single_escalation)
>  			break;
>  		msleep(OPAL_BUSY_DELAY_MS);
>  	}
> +	if (rc)
> +		vp_err(vp_id, "Failed to enable VP : %lld\n", rc);
>  	return rc ? -EIO : 0;
>  }
>  EXPORT_SYMBOL_GPL(xive_native_enable_vp);
> @@ -712,6 +716,8 @@ int xive_native_disable_vp(u32 vp_id)
>  			break;
>  		msleep(OPAL_BUSY_DELAY_MS);
>  	}
> +	if (rc)
> +		vp_err(vp_id, "Failed to disable VP : %lld\n", rc);
>  	return rc ? -EIO : 0;
>  }
>  EXPORT_SYMBOL_GPL(xive_native_disable_vp);
> @@ -723,8 +729,10 @@ int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id)
>  	s64 rc;
>  
>  	rc = opal_xive_get_vp_info(vp_id, NULL, &vp_cam_be, NULL, &vp_chip_id_be);
> -	if (rc)
> +	if (rc) {
> +		vp_err(vp_id, "Failed to get VP info : %lld\n", rc);
>  		return -EIO;
> +	}
>  	*out_cam_id = be64_to_cpu(vp_cam_be) & 0xffffffffu;
>  	*out_chip_id = be32_to_cpu(vp_chip_id_be);
>  
> @@ -755,8 +763,7 @@ int xive_native_get_queue_info(u32 vp_id, u32 prio,
>  	rc = opal_xive_get_queue_info(vp_id, prio, &qpage, &qsize,
>  				      &qeoi_page, &escalate_irq, &qflags);
>  	if (rc) {
> -		pr_err("OPAL failed to get queue info for VCPU %d/%d : %lld\n",
> -		       vp_id, prio, rc);
> +		vp_err(vp_id, "failed to get queue %d info : %lld\n", prio, rc);
>  		return -EIO;
>  	}
>  
> @@ -784,8 +791,7 @@ int xive_native_get_queue_state(u32 vp_id, u32 prio, u32 *qtoggle, u32 *qindex)
>  	rc = opal_xive_get_queue_state(vp_id, prio, &opal_qtoggle,
>  				       &opal_qindex);
>  	if (rc) {
> -		pr_err("OPAL failed to get queue state for VCPU %d/%d : %lld\n",
> -		       vp_id, prio, rc);
> +		vp_err(vp_id, "failed to get queue %d state : %lld\n", prio, rc);
>  		return -EIO;
>  	}
>  
> @@ -804,8 +810,7 @@ int xive_native_set_queue_state(u32 vp_id, u32 prio, u32 qtoggle, u32 qindex)
>  
>  	rc = opal_xive_set_queue_state(vp_id, prio, qtoggle, qindex);
>  	if (rc) {
> -		pr_err("OPAL failed to set queue state for VCPU %d/%d : %lld\n",
> -		       vp_id, prio, rc);
> +		vp_err(vp_id, "failed to set queue %d state : %lld\n", prio, rc);
>  		return -EIO;
>  	}
>  
> @@ -827,8 +832,7 @@ int xive_native_get_vp_state(u32 vp_id, u64 *out_state)
>  
>  	rc = opal_xive_get_vp_state(vp_id, &state);
>  	if (rc) {
> -		pr_err("OPAL failed to get vp state for VCPU %d : %lld\n",
> -		       vp_id, rc);
> +		vp_err(vp_id, "failed to get vp state : %lld\n", rc);
>  		return -EIO;
>  	}
>  


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property
  2020-12-09  4:39   ` Aneesh Kumar K.V
@ 2020-12-09 15:50     ` Cédric Le Goater
  0 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-09 15:50 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev

On 12/9/20 5:39 AM, Aneesh Kumar K.V wrote:
> Cédric Le Goater <clg@kaod.org> writes:
> 
>> The 'chip_id' field of the XIVE CPU structure is used to choose a
>> target for a source located on the same chip when possible. This field
>> is assigned on the PowerNV platform using the "ibm,chip-id" property
>> on pSeries under KVM when NUMA nodes are defined but it is undefined
>> under PowerVM. The XIVE source structure has a similar field
>> 'src_chip' which is only assigned on the PowerNV platform.
>>
>> cpu_to_node() returns a compatible value on all platforms, 0 being the
>> default node. It will also give us the opportunity to set the affinity
>> of a source on pSeries when we can localize them.
> 
> But we should avoid assuming that linux numa node number is equivalent
> to chip id [1]. 

Yes. the common layers should not make the (wrong) assumption on what a 
'node_id' represents and we should provide converting routines between 
both spaces, HW and logical. like we already have for CPUs, IRQ and other 
entities. 

'chip_id' needs a rename in some places under xive. This is the first 
patch in that direction.

cpu_to_node() gives the correct value on all platforms that need it today. 
Meaning : I don't know how to define nodes on PowerVM but it's not used 
in the XIVE layer under pseries.

> What do we expect this value represents on virtualized platforms like 
> PowerVM and KVM? 

on KVM, it's relatively simple to define a machine with numa nodes and 
devices (PHBs) attached to a specific node. The topology is correctly 
detected by Linux. If, under the hood, the HW topology matches the logical 
topology exposed to the VM, we should be able to tune affinity correctly
directly from the VM and have good results. 

But our drivers are not always doing the correct thing. IRQ and PCI MSIs 
will require some massaging.

I can not tell for PowerVM. I am still learning in that space.

> Is this used for a hcall?

Source configuration hcalls do not have any and this is one of the missing 
bits for interrupt affinity on pSeries. We have no direct ways to link a 
source with a node and to choose a suitable target. 

For PHB MSIs, we could deduce the node from the PHB. For other interrupts, 
we need hcall support, probably an extension of H_INT_GET_SOURCE_INFO
with a new flag to return a node id.



The XIVE interrupt controller adds extra complexity at the HW level because
interrupt routing depends on a set of tables which are allocated by the 
hypervisor. For best performance, these XIVE resources should be local to 
the chip which means that allocation for virtual CPUs and and virtual 
interrupts  should be defined carefully. that's the hypervisor business 
to have a good allocation strategy when defining virtual topologies.

 
> [1] https://lore.kernel.org/linuxppc-dev/20200817103238.158133-1-aneesh.kumar@linux.ibm.com

Isn't this trying to solve an issue when node 0 is missing ? 

Because on a 4 nodes system (with a node 0), sparse numbering is not a 
problem : 

# numactl -H 
available: 4 nodes (0,2,4,6)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107
node 0 size: 32269 MB
node 0 free: 29294 MB
node 2 cpus: 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
node 2 size: 32447 MB
node 2 free: 31982 MB
node 4 cpus: 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267
node 4 size: 32489 MB
node 4 free: 31949 MB
node 6 cpus: 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351
node 6 size: 32427 MB
node 6 free: 32024 MB
node distances:
node   0   2   4   6 
  0:  10  40  40  40 
  2:  40  10  40  40 
  4:  40  40  10  40 
  6:  40  40  40  10 


>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  arch/powerpc/sysdev/xive/common.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>> index ee375daf8114..605238ca65e4 100644
>> --- a/arch/powerpc/sysdev/xive/common.c
>> +++ b/arch/powerpc/sysdev/xive/common.c
>> @@ -1342,16 +1342,11 @@ static int xive_prepare_cpu(unsigned int cpu)
>>  
>>  	xc = per_cpu(xive_cpu, cpu);
>>  	if (!xc) {
>> -		struct device_node *np;
>> -
>>  		xc = kzalloc_node(sizeof(struct xive_cpu),
>>  				  GFP_KERNEL, cpu_to_node(cpu));
>>  		if (!xc)
>>  			return -ENOMEM;
>> -		np = of_get_cpu_node(cpu, NULL);
>> -		if (np)
>> -			xc->chip_id = of_get_ibm_chip_id(np);
>> -		of_node_put(np);
>> +		xc->chip_id = cpu_to_node(cpu);
>>  		xc->hw_ipi = XIVE_BAD_IRQ;
>>  
>>  		per_cpu(xive_cpu, cpu) = xc;
>> -- 
>> 2.26.2


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain
  2020-12-08 15:11 ` [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain Cédric Le Goater
@ 2020-12-09 15:50   ` Greg Kurz
  2020-12-09 16:04     ` Cédric Le Goater
  0 siblings, 1 reply; 28+ messages in thread
From: Greg Kurz @ 2020-12-09 15:50 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: linuxppc-dev

On Tue, 8 Dec 2020 16:11:18 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> Full state of the Linux interrupt descriptors can be dumped under
> debugfs when compiled with CONFIG_GENERIC_IRQ_DEBUGFS. Add support for
> the XIVE interrupt controller.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  arch/powerpc/sysdev/xive/common.c | 58 +++++++++++++++++++++++++++++++
>  1 file changed, 58 insertions(+)
> 
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 721617f0f854..411cba12d73b 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -1303,11 +1303,69 @@ static int xive_irq_domain_match(struct irq_domain *h, struct device_node *node,
>  	return xive_ops->match(node);
>  }
>  
> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> +static const char * const esb_names[] = { "RESET", "OFF", "PENDING", "QUEUED" };
> +
> +static const struct {
> +	u64  mask;
> +	char *name;
> +} xive_irq_flags[] = {
> +	{ XIVE_IRQ_FLAG_STORE_EOI, "STORE_EOI" },
> +	{ XIVE_IRQ_FLAG_LSI,       "LSI"       },

> +	{ XIVE_IRQ_FLAG_SHIFT_BUG, "SHIFT_BUG" },
> +	{ XIVE_IRQ_FLAG_MASK_FW,   "MASK_FW"   },
> +	{ XIVE_IRQ_FLAG_EOI_FW,    "EOI_FW"    },

If seems that you don't even need these ^^ if you move this patch after
patch 11 actually.

> +	{ XIVE_IRQ_FLAG_H_INT_ESB, "H_INT_ESB" },
> +	{ XIVE_IRQ_FLAG_NO_EOI,    "NO_EOI"    },
> +};
> +
> +static void xive_irq_domain_debug_show(struct seq_file *m, struct irq_domain *d,
> +				       struct irq_data *irqd, int ind)
> +{
> +	struct xive_irq_data *xd;
> +	u64 val;
> +	int i;
> +
> +	/* No IRQ domain level information. To be done */
> +	if (!irqd)
> +		return;
> +
> +	if (!is_xive_irq(irq_data_get_irq_chip(irqd)))

Wouldn't it be a bug to get anything else but the XIVE irqchip here ?

WARN_ON_ONCE() ?

> +		return;
> +
> +	seq_printf(m, "%*sXIVE:\n", ind, "");
> +	ind++;
> +
> +	xd = irq_data_get_irq_handler_data(irqd);
> +	if (!xd) {
> +		seq_printf(m, "%*snot assigned\n", ind, "");
> +		return;
> +	}
> +
> +	val = xive_esb_read(xd, XIVE_ESB_GET);
> +	seq_printf(m, "%*sESB:      %s\n", ind, "", esb_names[val & 0x3]);
> +	seq_printf(m, "%*sPstate:   %s %s\n", ind, "", xd->stale_p ? "stale" : "",
> +		   xd->saved_p ? "saved" : "");
> +	seq_printf(m, "%*sTarget:   %d\n", ind, "", xd->target);
> +	seq_printf(m, "%*sChip:     %d\n", ind, "", xd->src_chip);
> +	seq_printf(m, "%*sTrigger:  0x%016llx\n", ind, "", xd->trig_page);
> +	seq_printf(m, "%*sEOI:      0x%016llx\n", ind, "", xd->eoi_page);
> +	seq_printf(m, "%*sFlags:    0x%llx\n", ind, "", xd->flags);
> +	for (i = 0; i < ARRAY_SIZE(xive_irq_flags); i++) {
> +		if (xd->flags & xive_irq_flags[i].mask)
> +			seq_printf(m, "%*s%s\n", ind + 12, "", xive_irq_flags[i].name);
> +	}
> +}
> +#endif
> +
>  static const struct irq_domain_ops xive_irq_domain_ops = {
>  	.match = xive_irq_domain_match,
>  	.map = xive_irq_domain_map,
>  	.unmap = xive_irq_domain_unmap,
>  	.xlate = xive_irq_domain_xlate,
> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
> +	.debug_show = xive_irq_domain_debug_show,
> +#endif
>  };
>  
>  static void __init xive_init_host(struct device_node *np)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain
  2020-12-09 15:50   ` Greg Kurz
@ 2020-12-09 16:04     ` Cédric Le Goater
  0 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-09 16:04 UTC (permalink / raw)
  To: Greg Kurz; +Cc: linuxppc-dev

On 12/9/20 4:50 PM, Greg Kurz wrote:
> On Tue, 8 Dec 2020 16:11:18 +0100
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> Full state of the Linux interrupt descriptors can be dumped under
>> debugfs when compiled with CONFIG_GENERIC_IRQ_DEBUGFS. Add support for
>> the XIVE interrupt controller.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  arch/powerpc/sysdev/xive/common.c | 58 +++++++++++++++++++++++++++++++
>>  1 file changed, 58 insertions(+)
>>
>> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>> index 721617f0f854..411cba12d73b 100644
>> --- a/arch/powerpc/sysdev/xive/common.c
>> +++ b/arch/powerpc/sysdev/xive/common.c
>> @@ -1303,11 +1303,69 @@ static int xive_irq_domain_match(struct irq_domain *h, struct device_node *node,
>>  	return xive_ops->match(node);
>>  }
>>  
>> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
>> +static const char * const esb_names[] = { "RESET", "OFF", "PENDING", "QUEUED" };
>> +
>> +static const struct {
>> +	u64  mask;
>> +	char *name;
>> +} xive_irq_flags[] = {
>> +	{ XIVE_IRQ_FLAG_STORE_EOI, "STORE_EOI" },
>> +	{ XIVE_IRQ_FLAG_LSI,       "LSI"       },
> 
>> +	{ XIVE_IRQ_FLAG_SHIFT_BUG, "SHIFT_BUG" },
>> +	{ XIVE_IRQ_FLAG_MASK_FW,   "MASK_FW"   },
>> +	{ XIVE_IRQ_FLAG_EOI_FW,    "EOI_FW"    },
> 
> If seems that you don't even need these ^^ if you move this patch after
> patch 11 actually.>
> 
>> +	{ XIVE_IRQ_FLAG_H_INT_ESB, "H_INT_ESB" },
>> +	{ XIVE_IRQ_FLAG_NO_EOI,    "NO_EOI"    },
>> +};
>> +
>> +static void xive_irq_domain_debug_show(struct seq_file *m, struct irq_domain *d,
>> +				       struct irq_data *irqd, int ind)
>> +{
>> +	struct xive_irq_data *xd;
>> +	u64 val;
>> +	int i;
>> +
>> +	/* No IRQ domain level information. To be done */
>> +	if (!irqd)
>> +		return;
>> +
>> +	if (!is_xive_irq(irq_data_get_irq_chip(irqd)))
> 
> Wouldn't it be a bug to get anything else but the XIVE irqchip here ?

no.

> WARN_ON_ONCE() ?

XIVE IPIs are in the same domain but have a different chip. 

C.
 
>> +		return;
>> +
>> +	seq_printf(m, "%*sXIVE:\n", ind, "");
>> +	ind++;
>> +
>> +	xd = irq_data_get_irq_handler_data(irqd);
>> +	if (!xd) {
>> +		seq_printf(m, "%*snot assigned\n", ind, "");
>> +		return;
>> +	}
>> +
>> +	val = xive_esb_read(xd, XIVE_ESB_GET);
>> +	seq_printf(m, "%*sESB:      %s\n", ind, "", esb_names[val & 0x3]);
>> +	seq_printf(m, "%*sPstate:   %s %s\n", ind, "", xd->stale_p ? "stale" : "",
>> +		   xd->saved_p ? "saved" : "");
>> +	seq_printf(m, "%*sTarget:   %d\n", ind, "", xd->target);
>> +	seq_printf(m, "%*sChip:     %d\n", ind, "", xd->src_chip);
>> +	seq_printf(m, "%*sTrigger:  0x%016llx\n", ind, "", xd->trig_page);
>> +	seq_printf(m, "%*sEOI:      0x%016llx\n", ind, "", xd->eoi_page);
>> +	seq_printf(m, "%*sFlags:    0x%llx\n", ind, "", xd->flags);
>> +	for (i = 0; i < ARRAY_SIZE(xive_irq_flags); i++) {
>> +		if (xd->flags & xive_irq_flags[i].mask)
>> +			seq_printf(m, "%*s%s\n", ind + 12, "", xive_irq_flags[i].name);
>> +	}
>> +}
>> +#endif
>> +
>>  static const struct irq_domain_ops xive_irq_domain_ops = {
>>  	.match = xive_irq_domain_match,
>>  	.map = xive_irq_domain_map,
>>  	.unmap = xive_irq_domain_unmap,
>>  	.xlate = xive_irq_domain_xlate,
>> +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
>> +	.debug_show = xive_irq_domain_debug_show,
>> +#endif
>>  };
>>  
>>  static void __init xive_init_host(struct device_node *np)
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 00/13] powerpc/xive: misc cleanups
  2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
                   ` (12 preceding siblings ...)
  2020-12-08 15:11 ` [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls Cédric Le Goater
@ 2020-12-10  9:59 ` Cédric Le Goater
  13 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-10  9:59 UTC (permalink / raw)
  To: linuxppc-dev

On 12/8/20 4:11 PM, Cédric Le Goater wrote:
> Hello,
> 
> The most important change is the removal of support of OPAL flags
> required for P9 DD1. It provides a good cleanup of some complex
> routines.
> 
> The series also includes a change on how the pages donated to the XIVE
> IC are allocated in Linux. The flags are changed to make sure that
> these pages can not be reclaimed.

This issue (checkstop) only occurred a once or twice on a P9 Mihawk
with 1TB of RAM and 1TB of swap. It's hard to reproduce. It seems
we have a fix but I misunderstood some parts of the kernel page
allocation scheme and, so, I am not entirely the root issue is
well analyzed. This patch can wait until I grab a larger system
with 2TB.

I will send a v2. Greg had some valuable comments on extra 
cleanups.

Thanks,

C.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  2020-12-08 17:39   ` Greg Kurz
@ 2020-12-10 13:18     ` Cédric Le Goater
  0 siblings, 0 replies; 28+ messages in thread
From: Cédric Le Goater @ 2020-12-10 13:18 UTC (permalink / raw)
  To: Greg Kurz; +Cc: linuxppc-dev

On 12/8/20 6:39 PM, Greg Kurz wrote:
> On Tue, 8 Dec 2020 16:11:20 +0100
> Cédric Le Goater <clg@kaod.org> wrote:
> 
>> This flag was used to support the PHB4 LSIs on P9 DD1 and we have
>> stopped supporting this CPU when DD2 came out. See skiboot commit:
>>
>>   https://github.com/open-power/skiboot/commit/0b0d15e3c170
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
> 
> Reviewed-by: Greg Kurz <groug@kaod.org>
> 
> Just a minor suggestion in case you need to post a v2. See below.
> 
>>  arch/powerpc/include/asm/opal-api.h     | 2 +-
>>  arch/powerpc/include/asm/xive.h         | 2 +-
>>  arch/powerpc/kvm/book3s_xive_native.c   | 3 ---
>>  arch/powerpc/kvm/book3s_xive_template.c | 3 ---
>>  arch/powerpc/sysdev/xive/common.c       | 8 --------
>>  arch/powerpc/sysdev/xive/native.c       | 2 --
>>  6 files changed, 2 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
>> index 1dffa3cb16ba..48ee604ca39a 100644
>> --- a/arch/powerpc/include/asm/opal-api.h
>> +++ b/arch/powerpc/include/asm/opal-api.h
>> @@ -1091,7 +1091,7 @@ enum {
>>  	OPAL_XIVE_IRQ_TRIGGER_PAGE	= 0x00000001,
>>  	OPAL_XIVE_IRQ_STORE_EOI		= 0x00000002,
>>  	OPAL_XIVE_IRQ_LSI		= 0x00000004,
>> -	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008,
>> +	OPAL_XIVE_IRQ_SHIFT_BUG		= 0x00000008, /* P9 DD1.0 workaround */
> 
> Maybe you can even comment the entire line so that any future
> tentative to use that flag breaks build ?

This file is "copied" from OPAL. I think it is best to keep them
in sync.

> 
>>  	OPAL_XIVE_IRQ_MASK_VIA_FW	= 0x00000010,
>>  	OPAL_XIVE_IRQ_EOI_VIA_FW	= 0x00000020,
>>  };
>> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
>> index d332dd9a18de..ff805885a028 100644
>> --- a/arch/powerpc/include/asm/xive.h
>> +++ b/arch/powerpc/include/asm/xive.h
>> @@ -60,7 +60,7 @@ struct xive_irq_data {
>>  };
>>  #define XIVE_IRQ_FLAG_STORE_EOI	0x01
>>  #define XIVE_IRQ_FLAG_LSI	0x02
>> -#define XIVE_IRQ_FLAG_SHIFT_BUG	0x04
>> +#define XIVE_IRQ_FLAG_SHIFT_BUG	0x04 /* P9 DD1.0 workaround */
> 
> Same here, with an extra cleanup to stop using it when initializing 
> xive_irq_flags[] in common.c.

Yes. Since this is an internal flag, we can simply remove it.

Thanks,

C. 

> 
>>  #define XIVE_IRQ_FLAG_MASK_FW	0x08
>>  #define XIVE_IRQ_FLAG_EOI_FW	0x10
>>  #define XIVE_IRQ_FLAG_H_INT_ESB	0x20
>> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
>> index 9b395381179d..170d1d04e1d1 100644
>> --- a/arch/powerpc/kvm/book3s_xive_native.c
>> +++ b/arch/powerpc/kvm/book3s_xive_native.c
>> @@ -37,9 +37,6 @@ static u8 xive_vm_esb_load(struct xive_irq_data *xd, u32 offset)
>>  	 * ordering.
>>  	 */
>>  
>> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
>> -		offset |= offset << 4;
>> -
>>  	val = in_be64(xd->eoi_mmio + offset);
>>  	return (u8)val;
>>  }
>> diff --git a/arch/powerpc/kvm/book3s_xive_template.c b/arch/powerpc/kvm/book3s_xive_template.c
>> index 4ad3c0279458..ece36e024a8f 100644
>> --- a/arch/powerpc/kvm/book3s_xive_template.c
>> +++ b/arch/powerpc/kvm/book3s_xive_template.c
>> @@ -61,9 +61,6 @@ static u8 GLUE(X_PFX,esb_load)(struct xive_irq_data *xd, u32 offset)
>>  	if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
>>  		offset |= XIVE_ESB_LD_ST_MO;
>>  
>> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
>> -		offset |= offset << 4;
>> -
>>  	val =__x_readq(__x_eoi_page(xd) + offset);
>>  #ifdef __LITTLE_ENDIAN__
>>  	val >>= 64-8;
>> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>> index 411cba12d73b..a9259470bf9f 100644
>> --- a/arch/powerpc/sysdev/xive/common.c
>> +++ b/arch/powerpc/sysdev/xive/common.c
>> @@ -200,10 +200,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
>>  	if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
>>  		offset |= XIVE_ESB_LD_ST_MO;
>>  
>> -	/* Handle HW errata */
>> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
>> -		offset |= offset << 4;
>> -
>>  	if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
>>  		val = xive_ops->esb_rw(xd->hw_irq, offset, 0, 0);
>>  	else
>> @@ -214,10 +210,6 @@ static notrace u8 xive_esb_read(struct xive_irq_data *xd, u32 offset)
>>  
>>  static void xive_esb_write(struct xive_irq_data *xd, u32 offset, u64 data)
>>  {
>> -	/* Handle HW errata */
>> -	if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
>> -		offset |= offset << 4;
>> -
>>  	if ((xd->flags & XIVE_IRQ_FLAG_H_INT_ESB) && xive_ops->esb_rw)
>>  		xive_ops->esb_rw(xd->hw_irq, offset, data, 1);
>>  	else
>> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
>> index 5f1e5aed8ab4..0310783241b5 100644
>> --- a/arch/powerpc/sysdev/xive/native.c
>> +++ b/arch/powerpc/sysdev/xive/native.c
>> @@ -64,8 +64,6 @@ int xive_native_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
>>  		data->flags |= XIVE_IRQ_FLAG_STORE_EOI;
>>  	if (opal_flags & OPAL_XIVE_IRQ_LSI)
>>  		data->flags |= XIVE_IRQ_FLAG_LSI;
>> -	if (opal_flags & OPAL_XIVE_IRQ_SHIFT_BUG)
>> -		data->flags |= XIVE_IRQ_FLAG_SHIFT_BUG;
>>  	if (opal_flags & OPAL_XIVE_IRQ_MASK_VIA_FW)
>>  		data->flags |= XIVE_IRQ_FLAG_MASK_FW;
>>  	if (opal_flags & OPAL_XIVE_IRQ_EOI_VIA_FW)
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-12-10 15:37 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-08 15:11 [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater
2020-12-08 15:11 ` [PATCH 01/13] KVM: PPC: Book3S HV: XIVE: Show detailed configuration in debug output Cédric Le Goater
2020-12-08 15:11 ` [PATCH 02/13] powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flag Cédric Le Goater
2020-12-08 16:59   ` Greg Kurz
2020-12-08 15:11 ` [PATCH 03/13] powerpc/xive: Introduce XIVE_IPI_HW_IRQ Cédric Le Goater
2020-12-08 17:06   ` Greg Kurz
2020-12-08 15:11 ` [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property Cédric Le Goater
2020-12-09  4:39   ` Aneesh Kumar K.V
2020-12-09 15:50     ` Cédric Le Goater
2020-12-08 15:11 ` [PATCH 05/13] powerpc/xive: Fix allocation of pages donated to the XIVE controller Cédric Le Goater
2020-12-08 15:11 ` [PATCH 06/13] powerpc/xive: Add a name to the IRQ domain Cédric Le Goater
2020-12-08 15:11 ` [PATCH 07/13] powerpc/xive: Add a debug_show handler to the XIVE irq_domain Cédric Le Goater
2020-12-09 15:50   ` Greg Kurz
2020-12-09 16:04     ` Cédric Le Goater
2020-12-08 15:11 ` [PATCH 08/13] powerpc: Increase NR_IRQS range to support more KVM guests Cédric Le Goater
2020-12-08 17:23   ` Greg Kurz
2020-12-08 15:11 ` [PATCH 09/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG Cédric Le Goater
2020-12-08 17:39   ` Greg Kurz
2020-12-10 13:18     ` Cédric Le Goater
2020-12-08 15:11 ` [PATCH 10/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW Cédric Le Goater
2020-12-09 15:21   ` Greg Kurz
2020-12-08 15:11 ` [PATCH 11/13] powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW Cédric Le Goater
2020-12-09 15:24   ` Greg Kurz
2020-12-08 15:11 ` [PATCH 12/13] powerpc/xive: Simplify xive_do_source_eoi() Cédric Le Goater
2020-12-09 15:28   ` Greg Kurz
2020-12-08 15:11 ` [PATCH 13/13] powerpc/xive: Improve error reporting of OPAL calls Cédric Le Goater
2020-12-09 15:30   ` Greg Kurz
2020-12-10  9:59 ` [PATCH 00/13] powerpc/xive: misc cleanups Cédric Le Goater

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.