* [patch 00/52] x86: Rework the vector management
@ 2017-09-13 21:29 Thomas Gleixner
  2017-09-13 21:29 ` [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed() Thomas Gleixner
                   ` (53 more replies)
  0 siblings, 54 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

Sorry for the large CC list, but this is major surgery.

The vector management in x86 including the surrounding code is a
conglomerate of ancient bits and pieces which have been subject to
'modernization' and featuritis over the years. The most obscure parts are
the vector allocation mechanics, the cleanup vector handling and the cpu
hotplug machinery. Replacing these pieces of art was on my todo list for a
long time.

Recent attempts to 'solve' CPU offline / hibernation issues which are
partially caused by the current vector management implementation made me
look at it for real. Further information in this thread:

    http://lkml.kernel.org/r/cover.1504235838.git.yu.c.chen@intel.com

Aside from drivers allocating gazillions of interrupts, there are quite some
things which can be addressed in the x86 vector management and in the core
code.

  - Multi CPU affinities:

    A dubious property which is not available on all machines and causes
    major complexity both in the allocator and the cleanup/hotplug
    management. See:

       http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos

  - Priority level spreading:

    An obscure and undocumented property which I think is sufficiently
    argued to be not required in:

       http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos

  - Allocation of vectors when interrupt descriptors are allocated.

    This is a historical implementation detail, which is not really
    required when the vector allocation is delayed up to the point where
    request_irq() is invoked. This might make request_irq() fail when the
    vector space is exhausted, but drivers should handle request_irq()
    failures anyway (see the sketch after this list).

    The upside of changing this is that the active vector space becomes
    smaller, especially on hibernation/CPU offline, when drivers shut down
    the queue interrupts of outgoing CPUs.

    Some of this is already addressed with the managed interrupt facility,
    but that was bolted on top of the existing vector management because
    proper integration was not possible at that point. I take the blame
    for this, but the tradeoff of not doing it would have been even more
    broken driver boilerplate code all over the place. So I went for the
    lesser of two evils.

  - Allocation of vectors in the wrong place

    Even for managed interrupts the vector allocation at descriptor
    allocation time happens in the wrong place and gets fixed after the
    fact with a call to set_affinity(). In the case of non-remapped
    interrupts this results in at least one interrupt on the wrong CPU
    before it is migrated to the desired target.

  - Lack of instrumentation
 
    All of this is a black box which allows no insight into the actual
    vector usage.
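
As a minimal illustration of the request_irq() failure handling mentioned
above (hypothetical driver code, not part of this series; my_handler,
"my_dev" and md are made-up names):

	ret = request_irq(irq, my_handler, 0, "my_dev", md);
	if (ret) {
		/* e.g. vector space exhausted: back out gracefully */
		pr_err("my_dev: request_irq(%d) failed: %d\n", irq, ret);
		return ret;
	}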

The series addresses these points and converts the x86 vector management to
a bitmap based allocator which provides proper reservation management for
'managed interrupts' and best effort reservation for regular interrupts.
The latter allows overcommitment, which 'fixes' some of the
hotplug/hibernation problems in a clean way. It can't fix all of them;
that depends on the driver involved.

This rework is no excuse for driver writers to do exhaustive vector
allocations instead of utilizing the managed interrupt infrastructure, but
it addresses long standing issues in this code with the side effect of
mitigating some of the driver oddities. The proper solution for multi queue
management is 'managed interrupts', which have proven themselves in the
block-mq work, as they solve issues which are worked around in other
drivers in creative ways, with lots of copied code and often enough broken
attempts to handle interrupt affinity and CPU hotplug problems.

The new bitmap allocator and the x86 vector management code are
instrumented with tracepoints and the irq domain debugfs files allow deep
insight into the vector allocation and reservations.

The patches work on machines with and without interrupt remapping and
inside KVM guests of various flavours, though I have no idea what I broke
on the way with other hypervisors, posted interrupts etc. So I kindly ask
for your support in testing and review.

The series applies on top of Linus' tree and is available as a git branch:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/apic

Note that this branch is Linus' tree plus scheduler and x86 fixes which I
needed for proper testing. They have outstanding pull requests and might
be merged already by the time you read this.

Thanks,

	tglx
---
 arch/x86/include/asm/x2apic.h              |   49 -
 b/arch/x86/Kconfig                         |    1 
 b/arch/x86/include/asm/apic.h              |  255 +-----
 b/arch/x86/include/asm/desc.h              |    2 
 b/arch/x86/include/asm/hw_irq.h            |    6 
 b/arch/x86/include/asm/io_apic.h           |    2 
 b/arch/x86/include/asm/irq.h               |    4 
 b/arch/x86/include/asm/irq_vectors.h       |    8 
 b/arch/x86/include/asm/irqdomain.h         |    5 
 b/arch/x86/include/asm/kvm_host.h          |    2 
 b/arch/x86/include/asm/trace/irq_vectors.h |  244 ++++++
 b/arch/x86/kernel/apic/Makefile            |    2 
 b/arch/x86/kernel/apic/apic.c              |   38 -
 b/arch/x86/kernel/apic/apic_common.c       |   46 +
 b/arch/x86/kernel/apic/apic_flat_64.c      |   10 
 b/arch/x86/kernel/apic/apic_noop.c         |   25 
 b/arch/x86/kernel/apic/apic_numachip.c     |   12 
 b/arch/x86/kernel/apic/bigsmp_32.c         |    8 
 b/arch/x86/kernel/apic/htirq.c             |    5 
 b/arch/x86/kernel/apic/io_apic.c           |   94 --
 b/arch/x86/kernel/apic/msi.c               |    5 
 b/arch/x86/kernel/apic/probe_32.c          |   29 
 b/arch/x86/kernel/apic/vector.c            | 1090 +++++++++++++++++------------
 b/arch/x86/kernel/apic/x2apic.h            |    9 
 b/arch/x86/kernel/apic/x2apic_cluster.c    |  196 +----
 b/arch/x86/kernel/apic/x2apic_phys.c       |   44 +
 b/arch/x86/kernel/apic/x2apic_uv_x.c       |   17 
 b/arch/x86/kernel/i8259.c                  |    1 
 b/arch/x86/kernel/idt.c                    |   12 
 b/arch/x86/kernel/irq.c                    |  101 --
 b/arch/x86/kernel/irqinit.c                |    1 
 b/arch/x86/kernel/setup.c                  |   12 
 b/arch/x86/kernel/smpboot.c                |   14 
 b/arch/x86/kernel/traps.c                  |    2 
 b/arch/x86/kernel/vsmp_64.c                |   19 
 b/arch/x86/platform/uv/uv_irq.c            |    5 
 b/arch/x86/xen/apic.c                      |    6 
 b/drivers/gpio/gpio-xgene-sb.c             |    7 
 b/drivers/iommu/amd_iommu.c                |   44 -
 b/drivers/iommu/intel_irq_remapping.c      |   43 -
 b/drivers/irqchip/irq-gic-v3-its.c         |    5 
 b/drivers/pinctrl/stm32/pinctrl-stm32.c    |    5 
 b/include/linux/irq.h                      |   22 
 b/include/linux/irqdesc.h                  |    1 
 b/include/linux/irqdomain.h                |   14 
 b/include/linux/msi.h                      |    5 
 b/include/trace/events/irq_matrix.h        |  201 +++++
 b/kernel/irq/Kconfig                       |    3 
 b/kernel/irq/Makefile                      |    1 
 b/kernel/irq/autoprobe.c                   |    2 
 b/kernel/irq/chip.c                        |   37 
 b/kernel/irq/debugfs.c                     |   12 
 b/kernel/irq/internals.h                   |   19 
 b/kernel/irq/irqdesc.c                     |    3 
 b/kernel/irq/irqdomain.c                   |   43 -
 b/kernel/irq/manage.c                      |   18 
 b/kernel/irq/matrix.c                      |  443 +++++++++++
 b/kernel/irq/msi.c                         |   32 
 58 files changed, 2133 insertions(+), 1208 deletions(-)


* [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-16 18:24   ` [tip:irq/urgent] " tip-bot for Thomas Gleixner
  2017-09-13 21:29 ` [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors Thomas Gleixner
                   ` (52 subsequent siblings)
  53 siblings, 1 reply; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven,
	stable

[-- Attachment #1: genirq--Fix-cpumask-check.patch --]
[-- Type: text/plain, Size: 739 bytes --]

The result of cpumask_any_and() is invalid when it is greater than or equal
to nr_cpu_ids. The current check tests only for greater than. Fix it.

Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
 kernel/irq/chip.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -202,7 +202,7 @@ static int
 
 	irqd_clr_managed_shutdown(d);
 
-	if (cpumask_any_and(aff, cpu_online_mask) > nr_cpu_ids) {
+	if (cpumask_any_and(aff, cpu_online_mask) >= nr_cpu_ids) {
 		/*
 		 * Catch code which fiddles with enable_irq() on a managed
 		 * and potentially shutdown IRQ. Chained interrupt


* [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
  2017-09-13 21:29 ` [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed() Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 03/52] genirq/msi: Capture device name for debugfs Thomas Gleixner
                   ` (51 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-debugfs--Show-debug-information-for-all-irq-descriptors.patch --]
[-- Type: text/plain, Size: 1132 bytes --]

Currently the debugfs shows only information about actively used interrupts,
like /proc/irq/ does. That's fine for most cases, but not helpful when the
internals of allocated but unused interrupt descriptors have to be
debugged. It's also useful to provide information about all descriptors so
leaks can be debugged in a simpler way.

Move the debugfs registration to the descriptor allocation code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/irqdesc.c |    1 +
 kernel/irq/manage.c  |    1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -462,6 +462,7 @@ static int alloc_descs(unsigned int star
 			goto err;
 		irq_insert_desc(start + i, desc);
 		irq_sysfs_add(start + i, desc);
+		irq_add_debugfs_entry(start + i, desc);
 	}
 	bitmap_set(allocated_irqs, start, cnt);
 	return start;
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1400,7 +1400,6 @@ static int
 		wake_up_process(new->secondary->thread);
 
 	register_irq_proc(irq, desc);
-	irq_add_debugfs_entry(irq, desc);
 	new->dir = NULL;
 	register_handler_proc(irq, new);
 	return 0;


* [patch 03/52] genirq/msi: Capture device name for debugfs
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
  2017-09-13 21:29 ` [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed() Thomas Gleixner
  2017-09-13 21:29 ` [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 04/52] irqdomain/debugfs: Provide domain specific debug callback Thomas Gleixner
                   ` (50 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-msi--Capture-device-name-for-debugfs.patch --]
[-- Type: text/plain, Size: 2950 bytes --]

For debugging the allocation of unused or potentially leaked interrupt
descriptors it's helpful to have some information about the site which
allocated them. In the case of MSI this is simple because the caller hands
the device struct pointer into the domain allocation function.

Duplicate the device name and show it in the debugfs entry of the interrupt
descriptor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irqdesc.h |    1 +
 kernel/irq/debugfs.c    |   10 ++++++++++
 kernel/irq/internals.h  |    5 +++++
 kernel/irq/msi.c        |    6 +++++-
 4 files changed, 21 insertions(+), 1 deletion(-)

--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -93,6 +93,7 @@ struct irq_desc {
 #endif
 #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
 	struct dentry		*debugfs_file;
+	const char		*dev_name;
 #endif
 #ifdef CONFIG_SPARSE_IRQ
 	struct rcu_head		rcu;
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -149,6 +149,7 @@ static int irq_debug_show(struct seq_fil
 	raw_spin_lock_irq(&desc->lock);
 	data = irq_desc_get_irq_data(desc);
 	seq_printf(m, "handler:  %pf\n", desc->handle_irq);
+	seq_printf(m, "device:   %s\n", desc->dev_name);
 	seq_printf(m, "status:   0x%08x\n", desc->status_use_accessors);
 	irq_debug_show_bits(m, 0, desc->status_use_accessors, irqdesc_states,
 			    ARRAY_SIZE(irqdesc_states));
@@ -226,6 +227,15 @@ static const struct file_operations dfs_
 	.release	= single_release,
 };
 
+void irq_debugfs_copy_devname(int irq, struct device *dev)
+{
+	struct irq_desc *desc = irq_to_desc(irq);
+	const char *name = dev_name(dev);
+
+	if (name)
+		desc->dev_name = kstrdup(name, GFP_KERNEL);
+}
+
 void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc)
 {
 	char name [10];
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -443,7 +443,9 @@ void irq_add_debugfs_entry(unsigned int
 static inline void irq_remove_debugfs_entry(struct irq_desc *desc)
 {
 	debugfs_remove(desc->debugfs_file);
+	kfree(desc->dev_name);
 }
+void irq_debugfs_copy_devname(int irq, struct device *dev);
 # ifdef CONFIG_IRQ_DOMAIN
 void irq_domain_debugfs_init(struct dentry *root);
 # else
@@ -458,4 +460,7 @@ static inline void irq_add_debugfs_entry
 static inline void irq_remove_debugfs_entry(struct irq_desc *d)
 {
 }
+static inline void irq_debugfs_copy_devname(int irq, struct device *dev)
+{
+}
 #endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -16,6 +16,8 @@
 #include <linux/msi.h>
 #include <linux/slab.h>
 
+#include "internals.h"
+
 /**
  * alloc_msi_entry - Allocate an initialize msi_entry
  * @dev:	Pointer to the device for which this is allocated
@@ -373,8 +375,10 @@ int msi_domain_alloc_irqs(struct irq_dom
 			return ret;
 		}
 
-		for (i = 0; i < desc->nvec_used; i++)
+		for (i = 0; i < desc->nvec_used; i++) {
 			irq_set_msi_desc_off(virq, i, desc);
+			irq_debugfs_copy_devname(virq + i, dev);
+		}
 	}
 
 	if (ops->msi_finish)


* [patch 04/52] irqdomain/debugfs: Provide domain specific debug callback
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (2 preceding siblings ...)
  2017-09-13 21:29 ` [patch 03/52] genirq/msi: Capture device name for debugfs Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 05/52] genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY Thomas Gleixner
                   ` (49 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: irqdomain-debugfs--Provide-domain-specific-debug-callback.patch --]
[-- Type: text/plain, Size: 2755 bytes --]

Some interrupt domains, like the x86 vector domain, have special
requirements for debugging, like showing the vector usage on the CPUs.

Add a callback to the irqdomain ops which can be filled in by domains which
require it, and add conditional invocations to the irqdomain and the
per-irq debug files.
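
As an illustration, a domain could fill in the new callback like this
(hedged sketch with made-up names; my_domain_alloc/my_domain_free are
placeholders and the real x86 vector domain implementation comes later in
this series):

#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
static void my_domain_debug_show(struct seq_file *m, struct irq_domain *d,
				 struct irq_data *irqd, int ind)
{
	/* Called with d set for the domain file, irqd set for per-irq files */
	if (d)
		seq_printf(m, "%*sdomain wide state goes here\n", ind, "");
	if (irqd)
		seq_printf(m, "%*sper interrupt state goes here\n", ind, "");
}
#endif

static const struct irq_domain_ops my_domain_ops = {
	.alloc		= my_domain_alloc,
	.free		= my_domain_free,
#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
	.debug_show	= my_domain_debug_show,
#endif
};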

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irqdomain.h |    6 +++++-
 kernel/irq/debugfs.c      |    2 ++
 kernel/irq/irqdomain.c    |    2 ++
 3 files changed, 9 insertions(+), 1 deletion(-)

Index: b/include/linux/irqdomain.h
===================================================================
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -40,6 +40,7 @@ struct of_device_id;
 struct irq_chip;
 struct irq_data;
 struct cpumask;
+struct seq_file;
 
 /* Number of irqs reserved for a legacy isa controller */
 #define NUM_ISA_INTERRUPTS	16
@@ -104,7 +105,6 @@ struct irq_domain_ops {
 	int (*xlate)(struct irq_domain *d, struct device_node *node,
 		     const u32 *intspec, unsigned int intsize,
 		     unsigned long *out_hwirq, unsigned int *out_type);
-
 #ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
 	/* extended V2 interfaces to support hierarchy irq_domains */
 	int (*alloc)(struct irq_domain *d, unsigned int virq,
@@ -116,6 +116,10 @@ struct irq_domain_ops {
 	int (*translate)(struct irq_domain *d, struct irq_fwspec *fwspec,
 			 unsigned long *out_hwirq, unsigned int *out_type);
 #endif
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+	void (*debug_show)(struct seq_file *m, struct irq_domain *d,
+			   struct irq_data *irqd, int ind);
+#endif
 };
 
 extern struct irq_domain_ops irq_generic_chip_ops;
Index: b/kernel/irq/debugfs.c
===================================================================
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -81,6 +81,8 @@ irq_debug_show_data(struct seq_file *m,
 		   data->domain ? data->domain->name : "");
 	seq_printf(m, "%*shwirq:   0x%lx\n", ind + 1, "", data->hwirq);
 	irq_debug_show_chip(m, data, ind + 1);
+	if (data->domain && data->domain->ops && data->domain->ops->debug_show)
+		data->domain->ops->debug_show(m, NULL, data, ind + 1);
 #ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
 	if (!data->parent_data)
 		return;
Index: b/kernel/irq/irqdomain.c
===================================================================
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1810,6 +1810,8 @@ irq_domain_debug_show_one(struct seq_fil
 		   d->revmap_size + d->revmap_direct_max_irq);
 	seq_printf(m, "%*smapped: %u\n", ind + 1, "", d->mapcount);
 	seq_printf(m, "%*sflags:  0x%08x\n", ind +1 , "", d->flags);
+	if (d->ops && d->ops->debug_show)
+		d->ops->debug_show(m, d, NULL, ind + 1);
 #ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
 	if (!d->parent)
 		return;


* [patch 05/52] genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (3 preceding siblings ...)
  2017-09-13 21:29 ` [patch 04/52] irqdomain/debugfs: Provide domain specific debug callback Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 06/52] genirq: Set managed shut down flag at init Thomas Gleixner
                   ` (48 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq--Make-state-consistent-for-!IRQ_DOMAIN_HIERARCHY.patch --]
[-- Type: text/plain, Size: 1995 bytes --]

In the !IRQ_DOMAIN_HIERARCHY case the activation stubs are not
setting/clearing the activation status bits. This is not a problem at the
moment, but upcoming changes require a correct status.

Add the set/clear invocations to the stub functions and move them to the
core internal header to avoid duplication and visibility outside the core.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irqdomain.h |    4 ----
 kernel/irq/internals.h    |   11 +++++++++++
 2 files changed, 11 insertions(+), 4 deletions(-)

--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -511,8 +511,6 @@ static inline bool irq_domain_is_msi_rem
 extern bool irq_domain_hierarchical_is_msi_remap(struct irq_domain *domain);
 
 #else	/* CONFIG_IRQ_DOMAIN_HIERARCHY */
-static inline void irq_domain_activate_irq(struct irq_data *data) { }
-static inline void irq_domain_deactivate_irq(struct irq_data *data) { }
 static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
 			unsigned int nr_irqs, int node, void *arg)
 {
@@ -561,8 +559,6 @@ irq_domain_hierarchical_is_msi_remap(str
 
 #else /* CONFIG_IRQ_DOMAIN */
 static inline void irq_dispose_mapping(unsigned int virq) { }
-static inline void irq_domain_activate_irq(struct irq_data *data) { }
-static inline void irq_domain_deactivate_irq(struct irq_data *data) { }
 static inline struct irq_domain *irq_find_matching_fwnode(
 	struct fwnode_handle *fwnode, enum irq_domain_bus_token bus_token)
 {
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -436,6 +436,17 @@ static inline bool irq_fixup_move_pendin
 }
 #endif /* !CONFIG_GENERIC_PENDING_IRQ */
 
+#if !defined(CONFIG_IRQ_DOMAIN) || !defined(CONFIG_IRQ_DOMAIN_HIERARCHY)
+static inline void irq_domain_activate_irq(struct irq_data *data)
+{
+	irqd_set_activated(data);
+}
+static inline void irq_domain_deactivate_irq(struct irq_data *data)
+{
+	irqd_clr_activated(data);
+}
+#endif
+
 #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
 #include <linux/debugfs.h>
 


* [patch 06/52] genirq: Set managed shut down flag at init
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (4 preceding siblings ...)
  2017-09-13 21:29 ` [patch 05/52] genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 07/52] genirq: Separate activation and startup Thomas Gleixner
                   ` (47 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq--Set-managed-shut-down-flag-at-init.patch --]
[-- Type: text/plain, Size: 647 bytes --]

Managed interrupts should start up in managed shutdown mode. Set the status
flag when initialising the irq descriptor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/irqdesc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/kernel/irq/irqdesc.c
===================================================================
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -448,7 +448,7 @@ static int alloc_descs(unsigned int star
 		}
 	}
 
-	flags = affinity ? IRQD_AFFINITY_MANAGED : 0;
+	flags = affinity ? IRQD_AFFINITY_MANAGED | IRQD_MANAGED_SHUTDOWN : 0;
 	mask = NULL;
 
 	for (i = 0; i < cnt; i++) {


* [patch 07/52] genirq: Separate activation and startup
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (5 preceding siblings ...)
  2017-09-13 21:29 ` [patch 06/52] genirq: Set managed shut down flag at init Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 08/52] genirq/irqdomain: Update irq_domain_ops.activate() signature Thomas Gleixner
                   ` (46 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq--Separate-activation-and-startup.patch --]
[-- Type: text/plain, Size: 5039 bytes --]

Activation of an interrupt and startup are currently combined
functionality. That has worked so far, but upcoming changes require a
strict separation because the activation can fail in the future.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/autoprobe.c |    2 +-
 kernel/irq/chip.c      |   30 ++++++++++++++++++++++++------
 kernel/irq/internals.h |    2 ++
 kernel/irq/manage.c    |   17 ++++++++++++++++-
 4 files changed, 43 insertions(+), 8 deletions(-)

Index: b/kernel/irq/autoprobe.c
===================================================================
--- a/kernel/irq/autoprobe.c
+++ b/kernel/irq/autoprobe.c
@@ -53,7 +53,7 @@ unsigned long probe_irq_on(void)
 			if (desc->irq_data.chip->irq_set_type)
 				desc->irq_data.chip->irq_set_type(&desc->irq_data,
 							 IRQ_TYPE_PROBE);
-			irq_startup(desc, IRQ_NORESEND, IRQ_START_FORCE);
+			irq_activate_and_startup(desc, IRQ_NORESEND);
 		}
 		raw_spin_unlock_irq(&desc->lock);
 	}
Index: b/kernel/irq/chip.c
===================================================================
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -207,20 +207,19 @@ static int
 		 * Catch code which fiddles with enable_irq() on a managed
 		 * and potentially shutdown IRQ. Chained interrupt
 		 * installment or irq auto probing should not happen on
-		 * managed irqs either. Emit a warning, break the affinity
-		 * and start it up as a normal interrupt.
+		 * managed irqs either.
 		 */
 		if (WARN_ON_ONCE(force))
-			return IRQ_STARTUP_NORMAL;
+			return IRQ_STARTUP_ABORT;
 		/*
 		 * The interrupt was requested, but there is no online CPU
 		 * in it's affinity mask. Put it into managed shutdown
 		 * state and let the cpu hotplug mechanism start it up once
 		 * a CPU in the mask becomes available.
 		 */
-		irqd_set_managed_shutdown(d);
 		return IRQ_STARTUP_ABORT;
 	}
+	irq_domain_activate_irq(d);
 	return IRQ_STARTUP_MANAGED;
 }
 #else
@@ -236,7 +235,9 @@ static int __irq_startup(struct irq_desc
 	struct irq_data *d = irq_desc_get_irq_data(desc);
 	int ret = 0;
 
-	irq_domain_activate_irq(d);
+	/* Warn if this interrupt is not activated but try nevertheless */
+	WARN_ON_ONCE(!irqd_is_activated(d));
+
 	if (d->chip->irq_startup) {
 		ret = d->chip->irq_startup(d);
 		irq_state_clr_disabled(desc);
@@ -269,6 +270,7 @@ int irq_startup(struct irq_desc *desc, b
 			irq_set_affinity_locked(d, aff, false);
 			break;
 		case IRQ_STARTUP_ABORT:
+			irqd_set_managed_shutdown(d);
 			return 0;
 		}
 	}
@@ -278,6 +280,22 @@ int irq_startup(struct irq_desc *desc, b
 	return ret;
 }
 
+int irq_activate(struct irq_desc *desc)
+{
+	struct irq_data *d = irq_desc_get_irq_data(desc);
+
+	if (!irqd_affinity_is_managed(d))
+		irq_domain_activate_irq(d);
+	return 0;
+}
+
+void irq_activate_and_startup(struct irq_desc *desc, bool resend)
+{
+	if (WARN_ON(irq_activate(desc)))
+		return;
+	irq_startup(desc, resend, IRQ_START_FORCE);
+}
+
 static void __irq_disable(struct irq_desc *desc, bool mask);
 
 void irq_shutdown(struct irq_desc *desc)
@@ -953,7 +971,7 @@ static void
 		irq_settings_set_norequest(desc);
 		irq_settings_set_nothread(desc);
 		desc->action = &chained_action;
-		irq_startup(desc, IRQ_RESEND, IRQ_START_FORCE);
+		irq_activate_and_startup(desc, IRQ_RESEND);
 	}
 }
 
Index: b/kernel/irq/internals.h
===================================================================
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -74,6 +74,8 @@ extern void __enable_irq(struct irq_desc
 #define IRQ_START_FORCE	true
 #define IRQ_START_COND	false
 
+extern int irq_activate(struct irq_desc *desc);
+extern void irq_activate_and_startup(struct irq_desc *desc, bool resend);
 extern int irq_startup(struct irq_desc *desc, bool resend, bool force);
 
 extern void irq_shutdown(struct irq_desc *desc);
Index: b/kernel/irq/manage.c
===================================================================
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -519,7 +519,7 @@ void __enable_irq(struct irq_desc *desc)
 		 * time. If it was already started up, then irq_startup()
 		 * will invoke irq_enable() under the hood.
 		 */
-		irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
+		irq_startup(desc, IRQ_RESEND, IRQ_START_FORCE);
 		break;
 	}
 	default:
@@ -1325,6 +1325,21 @@ static int
 				goto out_unlock;
 		}
 
+		/*
+		 * Activate the interrupt. That activation must happen
+		 * independently of IRQ_NOAUTOEN. request_irq() can fail
+		 * and the callers are supposed to handle
+		 * that. enable_irq() of an interrupt requested with
+		 * IRQ_NOAUTOEN is not supposed to fail. The activation
+		 * keeps it in shutdown mode, it merily associates
+		 * resources if necessary and if that's not possible it
+		 * fails. Interrupts which are in managed shutdown mode
+		 * will simply ignore that activation request.
+		 */
+		ret = irq_activate(desc);
+		if (ret)
+			goto out_unlock;
+
 		desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \
 				  IRQS_ONESHOT | IRQS_WAITING);
 		irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);


* [patch 08/52] genirq/irqdomain: Update irq_domain_ops.activate() signature
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (6 preceding siblings ...)
  2017-09-13 21:29 ` [patch 07/52] genirq: Separate activation and startup Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 09/52] genirq/irqdomain: Allow irq_domain_activate_irq() to fail Thomas Gleixner
                   ` (45 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-irqdomain--Give-irq_domain_ops.activate-a-return-value.patch --]
[-- Type: text/plain, Size: 9457 bytes --]

The irq_domain_ops.activate() callback has no return value and no way to
tell the function that the activation is early.

The upcoming changes to support a reservation scheme, which allows
interrupt vectors on x86 to be assigned only when the interrupt is actually
requested, require:

  - A return value, so activation can fail at request_irq() time
  
  - Information that the activate invocation is early, i.e. before
    request_irq().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/irqdomain.h      |    4 ++--
 arch/x86/kernel/apic/htirq.c          |    5 +++--
 arch/x86/kernel/apic/io_apic.c        |    5 +++--
 arch/x86/platform/uv/uv_irq.c         |    5 +++--
 drivers/gpio/gpio-xgene-sb.c          |    8 +++++---
 drivers/iommu/amd_iommu.c             |    5 +++--
 drivers/iommu/intel_irq_remapping.c   |    5 +++--
 drivers/irqchip/irq-gic-v3-its.c      |   10 ++++++----
 drivers/pinctrl/stm32/pinctrl-stm32.c |    5 +++--
 include/linux/irqdomain.h             |    2 +-
 kernel/irq/irqdomain.c                |    2 +-
 kernel/irq/msi.c                      |    5 +++--
 12 files changed, 36 insertions(+), 25 deletions(-)

--- a/arch/x86/include/asm/irqdomain.h
+++ b/arch/x86/include/asm/irqdomain.h
@@ -41,8 +41,8 @@ extern int mp_irqdomain_alloc(struct irq
 			      unsigned int nr_irqs, void *arg);
 extern void mp_irqdomain_free(struct irq_domain *domain, unsigned int virq,
 			      unsigned int nr_irqs);
-extern void mp_irqdomain_activate(struct irq_domain *domain,
-				  struct irq_data *irq_data);
+extern int mp_irqdomain_activate(struct irq_domain *domain,
+				 struct irq_data *irq_data, bool early);
 extern void mp_irqdomain_deactivate(struct irq_domain *domain,
 				    struct irq_data *irq_data);
 extern int mp_irqdomain_ioapic_idx(struct irq_domain *domain);
--- a/arch/x86/kernel/apic/htirq.c
+++ b/arch/x86/kernel/apic/htirq.c
@@ -112,8 +112,8 @@ static void htirq_domain_free(struct irq
 	irq_domain_free_irqs_top(domain, virq, nr_irqs);
 }
 
-static void htirq_domain_activate(struct irq_domain *domain,
-				  struct irq_data *irq_data)
+static int htirq_domain_activate(struct irq_domain *domain,
+				 struct irq_data *irq_data, bool early)
 {
 	struct ht_irq_msg msg;
 	struct irq_cfg *cfg = irqd_cfg(irq_data);
@@ -132,6 +132,7 @@ static void htirq_domain_activate(struct
 			HT_IRQ_LOW_MT_ARBITRATED) |
 		HT_IRQ_LOW_IRQ_MASKED;
 	write_ht_irq_msg(irq_data->irq, &msg);
+	return 0;
 }
 
 static void htirq_domain_deactivate(struct irq_domain *domain,
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2977,8 +2977,8 @@ void mp_irqdomain_free(struct irq_domain
 	irq_domain_free_irqs_top(domain, virq, nr_irqs);
 }
 
-void mp_irqdomain_activate(struct irq_domain *domain,
-			   struct irq_data *irq_data)
+int mp_irqdomain_activate(struct irq_domain *domain,
+			  struct irq_data *irq_data, bool early)
 {
 	unsigned long flags;
 	struct irq_pin_list *entry;
@@ -2988,6 +2988,7 @@ void mp_irqdomain_activate(struct irq_do
 	for_each_irq_pin(entry, data->irq_2_pin)
 		__ioapic_write_entry(entry->apic, entry->pin, data->entry);
 	raw_spin_unlock_irqrestore(&ioapic_lock, flags);
+	return 0;
 }
 
 void mp_irqdomain_deactivate(struct irq_domain *domain,
--- a/arch/x86/platform/uv/uv_irq.c
+++ b/arch/x86/platform/uv/uv_irq.c
@@ -127,10 +127,11 @@ static void uv_domain_free(struct irq_do
  * Re-target the irq to the specified CPU and enable the specified MMR located
  * on the specified blade to allow the sending of MSIs to the specified CPU.
  */
-static void uv_domain_activate(struct irq_domain *domain,
-			       struct irq_data *irq_data)
+static int uv_domain_activate(struct irq_domain *domain,
+			      struct irq_data *irq_data, bool early)
 {
 	uv_program_mmr(irqd_cfg(irq_data), irq_data->chip_data);
+	return 0;
 }
 
 /*
--- a/drivers/gpio/gpio-xgene-sb.c
+++ b/drivers/gpio/gpio-xgene-sb.c
@@ -140,8 +140,9 @@ static int xgene_gpio_sb_to_irq(struct g
 	return irq_create_fwspec_mapping(&fwspec);
 }
 
-static void xgene_gpio_sb_domain_activate(struct irq_domain *d,
-		struct irq_data *irq_data)
+static int xgene_gpio_sb_domain_activate(struct irq_domain *d,
+					 struct irq_data *irq_data,
+					 bool early)
 {
 	struct xgene_gpio_sb *priv = d->host_data;
 	u32 gpio = HWIRQ_TO_GPIO(priv, irq_data->hwirq);
@@ -150,11 +151,12 @@ static void xgene_gpio_sb_domain_activat
 		dev_err(priv->gc.parent,
 		"Unable to configure XGene GPIO standby pin %d as IRQ\n",
 				gpio);
-		return;
+		return -ENOSPC;
 	}
 
 	xgene_gpio_set_bit(&priv->gc, priv->regs + MPA_GPIO_SEL_LO,
 			gpio * 2, 1);
+	return 0;
 }
 
 static void xgene_gpio_sb_domain_deactivate(struct irq_domain *d,
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4170,8 +4170,8 @@ static void irq_remapping_free(struct ir
 	irq_domain_free_irqs_common(domain, virq, nr_irqs);
 }
 
-static void irq_remapping_activate(struct irq_domain *domain,
-				   struct irq_data *irq_data)
+static int irq_remapping_activate(struct irq_domain *domain,
+				  struct irq_data *irq_data, bool early)
 {
 	struct amd_ir_data *data = irq_data->chip_data;
 	struct irq_2_irte *irte_info = &data->irq_2_irte;
@@ -4180,6 +4180,7 @@ static void irq_remapping_activate(struc
 	if (iommu)
 		iommu->irte_ops->activate(data->entry, irte_info->devid,
 					  irte_info->index);
+	return 0;
 }
 
 static void irq_remapping_deactivate(struct irq_domain *domain,
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1389,12 +1389,13 @@ static void intel_irq_remapping_free(str
 	irq_domain_free_irqs_common(domain, virq, nr_irqs);
 }
 
-static void intel_irq_remapping_activate(struct irq_domain *domain,
-					 struct irq_data *irq_data)
+static int intel_irq_remapping_activate(struct irq_domain *domain,
+					struct irq_data *irq_data, bool early)
 {
 	struct intel_ir_data *data = irq_data->chip_data;
 
 	modify_irte(&data->irq_2_iommu, &data->irte_entry);
+	return 0;
 }
 
 static void intel_irq_remapping_deactivate(struct irq_domain *domain,
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2186,8 +2186,8 @@ static int its_irq_domain_alloc(struct i
 	return 0;
 }
 
-static void its_irq_domain_activate(struct irq_domain *domain,
-				    struct irq_data *d)
+static int its_irq_domain_activate(struct irq_domain *domain,
+				   struct irq_data *d, bool early)
 {
 	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
 	u32 event = its_get_event_id(d);
@@ -2205,6 +2205,7 @@ static void its_irq_domain_activate(stru
 
 	/* Map the GIC IRQ and event to the device */
 	its_send_mapti(its_dev, d->hwirq, event);
+	return 0;
 }
 
 static void its_irq_domain_deactivate(struct irq_domain *domain,
@@ -2678,8 +2679,8 @@ static int its_vpe_irq_domain_alloc(stru
 	return err;
 }
 
-static void its_vpe_irq_domain_activate(struct irq_domain *domain,
-					struct irq_data *d)
+static int its_vpe_irq_domain_activate(struct irq_domain *domain,
+				       struct irq_data *d, bool early)
 {
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
 
@@ -2687,6 +2688,7 @@ static void its_vpe_irq_domain_activate(
 	vpe->col_idx = cpumask_first(cpu_online_mask);
 	its_send_vmapp(vpe, true);
 	its_send_vinvall(vpe);
+	return 0;
 }
 
 static void its_vpe_irq_domain_deactivate(struct irq_domain *domain,
--- a/drivers/pinctrl/stm32/pinctrl-stm32.c
+++ b/drivers/pinctrl/stm32/pinctrl-stm32.c
@@ -289,13 +289,14 @@ static int stm32_gpio_domain_translate(s
 	return 0;
 }
 
-static void stm32_gpio_domain_activate(struct irq_domain *d,
-				       struct irq_data *irq_data)
+static int stm32_gpio_domain_activate(struct irq_domain *d,
+				      struct irq_data *irq_data, bool early)
 {
 	struct stm32_gpio_bank *bank = d->host_data;
 	struct stm32_pinctrl *pctl = dev_get_drvdata(bank->gpio_chip.parent);
 
 	regmap_field_write(pctl->irqmux[irq_data->hwirq], bank->bank_nr);
+	return 0;
 }
 
 static int stm32_gpio_domain_alloc(struct irq_domain *d,
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -111,7 +111,7 @@ struct irq_domain_ops {
 		     unsigned int nr_irqs, void *arg);
 	void (*free)(struct irq_domain *d, unsigned int virq,
 		     unsigned int nr_irqs);
-	void (*activate)(struct irq_domain *d, struct irq_data *irq_data);
+	int (*activate)(struct irq_domain *d, struct irq_data *irqd, bool early);
 	void (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
 	int (*translate)(struct irq_domain *d, struct irq_fwspec *fwspec,
 			 unsigned long *out_hwirq, unsigned int *out_type);
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1690,7 +1690,7 @@ static void __irq_domain_activate_irq(st
 		if (irq_data->parent_data)
 			__irq_domain_activate_irq(irq_data->parent_data);
 		if (domain->ops->activate)
-			domain->ops->activate(domain, irq_data);
+			domain->ops->activate(domain, irq_data, false);
 	}
 }
 
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -102,13 +102,14 @@ int msi_domain_set_affinity(struct irq_d
 	return ret;
 }
 
-static void msi_domain_activate(struct irq_domain *domain,
-				struct irq_data *irq_data)
+static int msi_domain_activate(struct irq_domain *domain,
+			       struct irq_data *irq_data, bool early)
 {
 	struct msi_msg msg;
 
 	BUG_ON(irq_chip_compose_msi_msg(irq_data, &msg));
 	irq_chip_write_msi_msg(irq_data, &msg);
+	return 0;
 }
 
 static void msi_domain_deactivate(struct irq_domain *domain,


* [patch 09/52] genirq/irqdomain: Allow irq_domain_activate_irq() to fail
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (7 preceding siblings ...)
  2017-09-13 21:29 ` [patch 08/52] genirq/irqdomain: Update irq_domain_ops.activate() signature Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 10/52] genirq/irqdomain: Propagate early activation Thomas Gleixner
                   ` (44 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-irqdomain--Allow-irq_domain_activate_irq---to-fail.patch --]
[-- Type: text/plain, Size: 4869 bytes --]

Allow irq_domain_activate_irq() to fail. This is required to support a
reservation and late vector assignment scheme.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irqdomain.h |    2 +-
 kernel/irq/chip.c         |    9 +++++++--
 kernel/irq/internals.h    |    3 ++-
 kernel/irq/irqdomain.c    |   40 +++++++++++++++++++++++++---------------
 kernel/irq/msi.c          |   19 +++++++++++++++++--
 5 files changed, 52 insertions(+), 21 deletions(-)

--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -441,7 +441,7 @@ extern int __irq_domain_alloc_irqs(struc
 				   unsigned int nr_irqs, int node, void *arg,
 				   bool realloc, const struct cpumask *affinity);
 extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
-extern void irq_domain_activate_irq(struct irq_data *irq_data);
+extern int irq_domain_activate_irq(struct irq_data *irq_data);
 extern void irq_domain_deactivate_irq(struct irq_data *irq_data);
 
 static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -219,7 +219,12 @@ static int
 		 */
 		return IRQ_STARTUP_ABORT;
 	}
-	irq_domain_activate_irq(d);
+	/*
+	 * Managed interrupts have reserved resources, so this should not
+	 * happen.
+	 */
+	if (WARN_ON(irq_domain_activate_irq(d)))
+		return IRQ_STARTUP_ABORT;
 	return IRQ_STARTUP_MANAGED;
 }
 #else
@@ -285,7 +290,7 @@ int irq_activate(struct irq_desc *desc)
 	struct irq_data *d = irq_desc_get_irq_data(desc);
 
 	if (!irqd_affinity_is_managed(d))
-		irq_domain_activate_irq(d);
+		return irq_domain_activate_irq(d);
 	return 0;
 }
 
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -439,9 +439,10 @@ static inline bool irq_fixup_move_pendin
 #endif /* !CONFIG_GENERIC_PENDING_IRQ */
 
 #if !defined(CONFIG_IRQ_DOMAIN) || !defined(CONFIG_IRQ_DOMAIN_HIERARCHY)
-static inline void irq_domain_activate_irq(struct irq_data *data)
+static inline int irq_domain_activate_irq(struct irq_data *data)
 {
 	irqd_set_activated(data);
+	return 0;
 }
 static inline void irq_domain_deactivate_irq(struct irq_data *data)
 {
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1682,28 +1682,35 @@ void irq_domain_free_irqs_parent(struct
 }
 EXPORT_SYMBOL_GPL(irq_domain_free_irqs_parent);
 
-static void __irq_domain_activate_irq(struct irq_data *irq_data)
+static void __irq_domain_deactivate_irq(struct irq_data *irq_data)
 {
 	if (irq_data && irq_data->domain) {
 		struct irq_domain *domain = irq_data->domain;
 
+		if (domain->ops->deactivate)
+			domain->ops->deactivate(domain, irq_data);
 		if (irq_data->parent_data)
-			__irq_domain_activate_irq(irq_data->parent_data);
-		if (domain->ops->activate)
-			domain->ops->activate(domain, irq_data, false);
+			__irq_domain_deactivate_irq(irq_data->parent_data);
 	}
 }
 
-static void __irq_domain_deactivate_irq(struct irq_data *irq_data)
+static int __irq_domain_activate_irq(struct irq_data *irqd)
 {
-	if (irq_data && irq_data->domain) {
-		struct irq_domain *domain = irq_data->domain;
+	int ret = 0;
 
-		if (domain->ops->deactivate)
-			domain->ops->deactivate(domain, irq_data);
-		if (irq_data->parent_data)
-			__irq_domain_deactivate_irq(irq_data->parent_data);
+	if (irqd && irqd->domain) {
+		struct irq_domain *domain = irqd->domain;
+
+		if (irqd->parent_data)
+			ret = __irq_domain_activate_irq(irqd->parent_data);
+		if (!ret && domain->ops->activate) {
+			ret = domain->ops->activate(domain, irqd, false);
+			/* Rollback in case of error */
+			if (ret && irqd->parent_data)
+				__irq_domain_deactivate_irq(irqd->parent_data);
+		}
 	}
+	return ret;
 }
 
 /**
@@ -1714,12 +1721,15 @@ static void __irq_domain_deactivate_irq(
  * This is the second step to call domain_ops->activate to program interrupt
  * controllers, so the interrupt could actually get delivered.
  */
-void irq_domain_activate_irq(struct irq_data *irq_data)
+int irq_domain_activate_irq(struct irq_data *irq_data)
 {
-	if (!irqd_is_activated(irq_data)) {
-		__irq_domain_activate_irq(irq_data);
+	int ret = 0;
+
+	if (!irqd_is_activated(irq_data))
+		ret = __irq_domain_activate_irq(irq_data);
+	if (!ret)
 		irqd_set_activated(irq_data);
-	}
+	return ret;
 }
 
 /**
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -400,11 +400,26 @@ int msi_domain_alloc_irqs(struct irq_dom
 			struct irq_data *irq_data;
 
 			irq_data = irq_domain_get_irq_data(domain, desc->irq);
-			irq_domain_activate_irq(irq_data);
+			ret = irq_domain_activate_irq(irq_data);
+			if (ret)
+				goto cleanup;
 		}
 	}
-
 	return 0;
+
+cleanup:
+	for_each_msi_entry(desc, dev) {
+		struct irq_data *irqd;
+
+		if (desc->irq == virq)
+			break;
+
+		irqd = irq_domain_get_irq_data(domain, desc->irq);
+		if (irqd_is_activated(irqd))
+			irq_domain_deactivate_irq(irqd);
+	}
+	msi_domain_free_irqs(domain, dev);
+	return ret;
 }
 
 /**


* [patch 10/52] genirq/irqdomain: Propagate early activation
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (8 preceding siblings ...)
  2017-09-13 21:29 ` [patch 09/52] genirq/irqdomain: Allow irq_domain_activate_irq() to fail Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 11/52] genirq/irqdomain: Add force reactivation flag to irq domains Thomas Gleixner
                   ` (43 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-irqdomain--Propagate-early-activation.patch --]
[-- Type: text/plain, Size: 4639 bytes --]

Propagate the early activation mode to the irqdomain activate()
callbacks. This is required for the upcoming reservation and late vector
assignment scheme, so that the early activation call can act accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/io_apic.c |    4 ++--
 include/linux/irqdomain.h      |    2 +-
 kernel/irq/chip.c              |    4 ++--
 kernel/irq/internals.h         |    2 +-
 kernel/irq/irqdomain.c         |   11 ++++++-----
 kernel/irq/msi.c               |    2 +-
 6 files changed, 13 insertions(+), 12 deletions(-)

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2096,7 +2096,7 @@ static inline void __init check_timer(vo
 				unmask_ioapic_irq(irq_get_irq_data(0));
 		}
 		irq_domain_deactivate_irq(irq_data);
-		irq_domain_activate_irq(irq_data);
+		irq_domain_activate_irq(irq_data, false);
 		if (timer_irq_works()) {
 			if (disable_timer_pin_1 > 0)
 				clear_IO_APIC_pin(0, pin1);
@@ -2118,7 +2118,7 @@ static inline void __init check_timer(vo
 		 */
 		replace_pin_at_irq_node(data, node, apic1, pin1, apic2, pin2);
 		irq_domain_deactivate_irq(irq_data);
-		irq_domain_activate_irq(irq_data);
+		irq_domain_activate_irq(irq_data, false);
 		legacy_pic->unmask(0);
 		if (timer_irq_works()) {
 			apic_printk(APIC_QUIET, KERN_INFO "....... works.\n");
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -441,7 +441,7 @@ extern int __irq_domain_alloc_irqs(struc
 				   unsigned int nr_irqs, int node, void *arg,
 				   bool realloc, const struct cpumask *affinity);
 extern void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs);
-extern int irq_domain_activate_irq(struct irq_data *irq_data);
+extern int irq_domain_activate_irq(struct irq_data *irq_data, bool early);
 extern void irq_domain_deactivate_irq(struct irq_data *irq_data);
 
 static inline int irq_domain_alloc_irqs(struct irq_domain *domain,
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -223,7 +223,7 @@ static int
 	 * Managed interrupts have reserved resources, so this should not
 	 * happen.
 	 */
-	if (WARN_ON(irq_domain_activate_irq(d)))
+	if (WARN_ON(irq_domain_activate_irq(d, false)))
 		return IRQ_STARTUP_ABORT;
 	return IRQ_STARTUP_MANAGED;
 }
@@ -290,7 +290,7 @@ int irq_activate(struct irq_desc *desc)
 	struct irq_data *d = irq_desc_get_irq_data(desc);
 
 	if (!irqd_affinity_is_managed(d))
-		return irq_domain_activate_irq(d);
+		return irq_domain_activate_irq(d, false);
 	return 0;
 }
 
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -439,7 +439,7 @@ static inline bool irq_fixup_move_pendin
 #endif /* !CONFIG_GENERIC_PENDING_IRQ */
 
 #if !defined(CONFIG_IRQ_DOMAIN) || !defined(CONFIG_IRQ_DOMAIN_HIERARCHY)
-static inline int irq_domain_activate_irq(struct irq_data *data)
+static inline int irq_domain_activate_irq(struct irq_data *data, bool early)
 {
 	irqd_set_activated(data);
 	return 0;
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1694,7 +1694,7 @@ static void __irq_domain_deactivate_irq(
 	}
 }
 
-static int __irq_domain_activate_irq(struct irq_data *irqd)
+static int __irq_domain_activate_irq(struct irq_data *irqd, bool early)
 {
 	int ret = 0;
 
@@ -1702,9 +1702,10 @@ static int __irq_domain_activate_irq(str
 		struct irq_domain *domain = irqd->domain;
 
 		if (irqd->parent_data)
-			ret = __irq_domain_activate_irq(irqd->parent_data);
+			ret = __irq_domain_activate_irq(irqd->parent_data,
+							early);
 		if (!ret && domain->ops->activate) {
-			ret = domain->ops->activate(domain, irqd, false);
+			ret = domain->ops->activate(domain, irqd, early);
 			/* Rollback in case of error */
 			if (ret && irqd->parent_data)
 				__irq_domain_deactivate_irq(irqd->parent_data);
@@ -1721,12 +1722,12 @@ static int __irq_domain_activate_irq(str
  * This is the second step to call domain_ops->activate to program interrupt
  * controllers, so the interrupt could actually get delivered.
  */
-int irq_domain_activate_irq(struct irq_data *irq_data)
+int irq_domain_activate_irq(struct irq_data *irq_data, bool early)
 {
 	int ret = 0;
 
 	if (!irqd_is_activated(irq_data))
-		ret = __irq_domain_activate_irq(irq_data);
+		ret = __irq_domain_activate_irq(irq_data, early);
 	if (!ret)
 		irqd_set_activated(irq_data);
 	return ret;
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -401,7 +401,7 @@ int msi_domain_alloc_irqs(struct irq_dom
 			struct irq_data *irq_data;
 
 			irq_data = irq_domain_get_irq_data(domain, desc->irq);
-			ret = irq_domain_activate_irq(irq_data);
+			ret = irq_domain_activate_irq(irq_data, true);
 			if (ret)
 				goto cleanup;
 		}


* [patch 11/52] genirq/irqdomain: Add force reactivation flag to irq domains
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (9 preceding siblings ...)
  2017-09-13 21:29 ` [patch 10/52] genirq/irqdomain: Propagate early activation Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 12/52] genirq: Implement bitmap matrix allocator Thomas Gleixner
                   ` (42 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-irqdomain--Add-force-reactivation-flag-to-irq-domains.patch --]
[-- Type: text/plain, Size: 1313 bytes --]

Allow irqdomains to tell the core code that, after early activation, the
interrupt needs to be reactivated at request_irq() time.

This allows reservation of vectors at early activation time and actual
vector assignment at request_irq() time.
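
A domain wanting this behaviour would set the new flag alongside
MSI_FLAG_ACTIVATE_EARLY in its msi_domain_info (illustrative sketch;
my_msi_chip is a made-up name, the actual PCI/MSI wiring follows later in
the series):

static struct msi_domain_info my_msi_domain_info = {
	.flags	= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
		  MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_MUST_REACTIVATE,
	.chip	= &my_msi_chip,
};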

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/msi.h |    5 +++++
 kernel/irq/msi.c    |    2 ++
 2 files changed, 7 insertions(+)

Index: b/include/linux/msi.h
===================================================================
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -283,6 +283,11 @@ enum {
 	MSI_FLAG_PCI_MSIX		= (1 << 3),
 	/* Needs early activate, required for PCI */
 	MSI_FLAG_ACTIVATE_EARLY		= (1 << 4),
+	/*
+	 * Must reactivate when irq is started even when
+	 * MSI_FLAG_ACTIVATE_EARLY has been set.
+	 */
+	MSI_FLAG_MUST_REACTIVATE	= (1 << 5),
 };
 
 int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
Index: b/kernel/irq/msi.c
===================================================================
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -403,6 +403,8 @@ int msi_domain_alloc_irqs(struct irq_dom
 			ret = irq_domain_activate_irq(irq_data, true);
 			if (ret)
 				goto cleanup;
+			if (info->flags & MSI_FLAG_MUST_REACTIVATE)
+				irqd_clr_activated(irq_data);
 		}
 	}
 	return 0;


* [patch 12/52] genirq: Implement bitmap matrix allocator
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (10 preceding siblings ...)
  2017-09-13 21:29 ` [patch 11/52] genirq/irqdomain: Add force reactivation flag to irq domains Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 13/52] genirq/matrix: Add tracepoints Thomas Gleixner
                   ` (41 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven,
	Chris Metcalf

[-- Attachment #1: genirq--Implement-bitmap-matrix-allocator.patch --]
[-- Type: text/plain, Size: 16323 bytes --]

Implement the infrastructure for a simple bitmap-based allocator, which
will replace the x86 vector allocator. It's in the core code as other
architectures might be able to reuse or extend it. For now it only
implements allocations for single CPUs, but it's simple to add multi-CPU
allocation support if required.

The concept is rather simple:

 Global information:
 	system_vector bitmap
	global accounting

 PerCPU information:
 	allocation bitmap
	managed allocation bitmap
	local accounting

The system vector bitmap is used to exclude vectors system-wide from the
allocation space.

The allocation bitmap is used to keep track of the vectors in use on each
CPU.

The managed allocation bitmap is used to reserve vectors for managed
interrupts.

When a regular (non-managed) interrupt allocation happens, the following
rule applies:

      tmpmap = system_map | alloc_map | managed_map
      find_zero_bit(tmpmap)

ORing the bitmaps together gives the real available space. The same rule
applies for reserving a managed interrupt vector. But contrary to the
regular interrupts, the reservation only marks the bit in the managed map
and thereby excludes it from the regular allocations. The managed map is
only cleaned out when a managed interrupt is completely released, and it
stays alive across CPU offline/online operations.

For managed interrupt allocations the rule is:

      tmpmap = managed_map & ~alloc_map
      find_first_bit(tmpmap)

This returns the first bit which is in the managed map, but not yet
allocated in the allocation map. The allocation marks it in the allocation
map and hands it back to the caller for use.
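
A minimal sketch of how the two search rules map onto the kernel's bitmap
helpers (the function names are illustrative, not part of the patch; the
real implementations are matrix_alloc_area() and irq_matrix_alloc_managed()
below):

	static unsigned int find_regular_slot(struct irq_matrix *m, struct cpumap *cm)
	{
		/* Exclude system vectors and already allocated or managed bits */
		bitmap_or(m->scratch_map, cm->managed_map, m->system_map, m->alloc_end);
		bitmap_or(m->scratch_map, m->scratch_map, cm->alloc_map, m->alloc_end);
		return find_next_zero_bit(m->scratch_map, m->alloc_end, m->alloc_start);
	}

	static unsigned int find_managed_slot(struct irq_matrix *m, struct cpumap *cm)
	{
		/* Managed bits which are reserved, but not yet allocated */
		bitmap_andnot(m->scratch_map, cm->managed_map, cm->alloc_map, m->alloc_end);
		return find_first_bit(m->scratch_map, m->alloc_end);
	}

Both searches return an index >= m->alloc_end when nothing is free, which
the callers translate into -ENOSPC.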

The rest of the code consists of helper functions which handle the various
requirements and the accounting necessary to replace the x86 vector
allocation code. The result is a single patch, as the evolution of this
infrastructure cannot be represented in bits and pieces.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
---
 include/linux/irq.h |   22 ++
 kernel/irq/Kconfig  |    3 
 kernel/irq/Makefile |    1 
 kernel/irq/matrix.c |  428 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 454 insertions(+)

--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -1116,6 +1116,28 @@ static inline u32 irq_reg_readl(struct i
 		return readl(gc->reg_base + reg_offset);
 }
 
+struct irq_matrix;
+struct irq_matrix *irq_alloc_matrix(unsigned int matrix_bits,
+				    unsigned int alloc_start,
+				    unsigned int alloc_end);
+void irq_matrix_online(struct irq_matrix *m);
+void irq_matrix_offline(struct irq_matrix *m);
+void irq_matrix_assign_system(struct irq_matrix *m, unsigned int bit, bool replace);
+int irq_matrix_reserve_managed(struct irq_matrix *m, const struct cpumask *msk);
+void irq_matrix_remove_managed(struct irq_matrix *m, const struct cpumask *msk);
+int irq_matrix_alloc_managed(struct irq_matrix *m, unsigned int cpu);
+void irq_matrix_reserve(struct irq_matrix *m);
+void irq_matrix_remove_reserved(struct irq_matrix *m);
+int irq_matrix_alloc(struct irq_matrix *m, const struct cpumask *msk,
+		     bool reserved, unsigned int *mapped_cpu);
+void irq_matrix_free(struct irq_matrix *m, unsigned int cpu,
+		     unsigned int bit, bool managed);
+void irq_matrix_assign(struct irq_matrix *m, unsigned int bit);
+unsigned int irq_matrix_available(struct irq_matrix *m, bool cpudown);
+unsigned int irq_matrix_allocated(struct irq_matrix *m);
+unsigned int irq_matrix_reserved(struct irq_matrix *m);
+void irq_matrix_debug_show(struct seq_file *sf, struct irq_matrix *m, int ind);
+
 /* Contrary to Linux irqs, for hardware irqs the irq number 0 is valid */
 #define INVALID_HWIRQ	(~0UL)
 irq_hw_number_t ipi_get_hwirq(unsigned int irq, unsigned int cpu);
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -97,6 +97,9 @@ config HANDLE_DOMAIN_IRQ
 config IRQ_TIMINGS
 	bool
 
+config GENERIC_IRQ_MATRIX_ALLOCATOR
+	bool
+
 config IRQ_DOMAIN_DEBUG
 	bool "Expose hardware/virtual IRQ mapping via debugfs"
 	depends on IRQ_DOMAIN && DEBUG_FS
--- a/kernel/irq/Makefile
+++ b/kernel/irq/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o
 obj-$(CONFIG_GENERIC_IRQ_IPI) += ipi.o
 obj-$(CONFIG_SMP) += affinity.o
 obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o
+obj-$(CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR) += matrix.o
--- /dev/null
+++ b/kernel/irq/matrix.c
@@ -0,0 +1,428 @@
+/*
+ * Copyright (C) 2017 Thomas Gleixner <tglx@linutronix.de>
+ *
+ * SPDX-License-Identifier: GPL-2.0
+ */
+#include <linux/spinlock.h>
+#include <linux/seq_file.h>
+#include <linux/bitmap.h>
+#include <linux/percpu.h>
+#include <linux/cpu.h>
+#include <linux/irq.h>
+
+#define IRQ_MATRIX_SIZE	(BITS_TO_LONGS(IRQ_MATRIX_BITS) * sizeof(unsigned long))
+
+struct cpumap {
+	unsigned int		available;
+	unsigned int		allocated;
+	unsigned int		managed;
+	bool			online;
+	unsigned long		alloc_map[IRQ_MATRIX_SIZE];
+	unsigned long		managed_map[IRQ_MATRIX_SIZE];
+};
+
+struct irq_matrix {
+	unsigned int		matrix_bits;
+	unsigned int		alloc_start;
+	unsigned int		alloc_end;
+	unsigned int		alloc_size;
+	unsigned int		global_available;
+	unsigned int		global_reserved;
+	unsigned int		systembits_inalloc;
+	unsigned int		total_allocated;
+	unsigned int		online_maps;
+	struct cpumap __percpu	*maps;
+	unsigned long		scratch_map[IRQ_MATRIX_SIZE];
+	unsigned long		system_map[IRQ_MATRIX_SIZE];
+};
+
+/**
+ * irq_alloc_matrix - Allocate an irq_matrix structure and initialize it
+ * @matrix_bits:	Number of matrix bits must be <= IRQ_MATRIX_BITS
+ * @alloc_start:	From which bit the allocation search starts
+ * @alloc_end:		At which bit the allocation search ends, i.e first
+ *			invalid bit
+ */
+__init struct irq_matrix *irq_alloc_matrix(unsigned int matrix_bits,
+					   unsigned int alloc_start,
+					   unsigned int alloc_end)
+{
+	struct irq_matrix *m;
+
+	if (matrix_bits > IRQ_MATRIX_BITS)
+		return NULL;
+
+	m = kzalloc(sizeof(*m), GFP_KERNEL);
+	if (!m)
+		return NULL;
+
+	m->matrix_bits = matrix_bits;
+	m->alloc_start = alloc_start;
+	m->alloc_end = alloc_end;
+	m->alloc_size = alloc_end - alloc_start;
+	m->maps = alloc_percpu(*m->maps);
+	if (!m->maps) {
+		kfree(m);
+		return NULL;
+	}
+	return m;
+}
+
+/**
+ * irq_matrix_online - Bring the local CPU matrix online
+ * @m:		Matrix pointer
+ */
+void irq_matrix_online(struct irq_matrix *m)
+{
+	struct cpumap *cm = this_cpu_ptr(m->maps);
+
+	BUG_ON(cm->online);
+
+	bitmap_zero(cm->alloc_map, m->matrix_bits);
+	cm->available = m->alloc_size - (cm->managed + m->systembits_inalloc);
+	cm->allocated = 0;
+	m->global_available += cm->available;
+	cm->online = true;
+	m->online_maps++;
+}
+
+/**
+ * irq_matrix_offline - Bring the local CPU matrix offline
+ * @m:		Matrix pointer
+ */
+void irq_matrix_offline(struct irq_matrix *m)
+{
+	struct cpumap *cm = this_cpu_ptr(m->maps);
+
+	/* Update the global available size */
+	m->global_available -= cm->available;
+	cm->online = false;
+	m->online_maps--;
+}
+
+static unsigned int matrix_alloc_area(struct irq_matrix *m, struct cpumap *cm,
+				      unsigned int num, bool managed)
+{
+	unsigned int area, start = m->alloc_start;
+	unsigned int end = m->alloc_end;
+
+	bitmap_or(m->scratch_map, cm->managed_map, m->system_map, end);
+	bitmap_or(m->scratch_map, m->scratch_map, cm->alloc_map, end);
+	area = bitmap_find_next_zero_area(m->scratch_map, end, start, num, 0);
+	if (area >= end)
+		return area;
+	if (managed)
+		bitmap_set(cm->managed_map, area, num);
+	else
+		bitmap_set(cm->alloc_map, area, num);
+	return area;
+}
+
+/**
+ * irq_matrix_assign_system - Assign system wide entry in the matrix
+ * @m:		Matrix pointer
+ * @bit:	Which bit to reserve
+ * @replace:	Replace an already allocated vector with a system
+ *		vector at the same bit position.
+ *
+ * The BUG_ON()s below are on purpose. If this goes wrong in the
+ * early boot process, then the chance to survive is about zero.
+ * If this happens when the system is live, it's not much better.
+ */
+void irq_matrix_assign_system(struct irq_matrix *m, unsigned int bit,
+			      bool replace)
+{
+	struct cpumap *cm = this_cpu_ptr(m->maps);
+
+	BUG_ON(bit > m->matrix_bits);
+	BUG_ON(m->online_maps > 1 || (m->online_maps && !replace));
+
+	set_bit(bit, m->system_map);
+	if (replace) {
+		BUG_ON(!test_and_clear_bit(bit, cm->alloc_map));
+		cm->allocated--;
+		m->total_allocated--;
+	}
+	if (bit >= m->alloc_start && bit < m->alloc_end)
+		m->systembits_inalloc++;
+}
+
+/**
+ * irq_matrix_reserve_managed - Reserve a managed interrupt in a CPU map
+ * @m:		Matrix pointer
+ * @msk:	On which CPUs the bits should be reserved.
+ *
+ * Can be called for offline CPUs. Note that this reserves one bit on
+ * each CPU in @msk, but the bits are not guaranteed to be at the same
+ * offset on all CPUs.
+ */
+int irq_matrix_reserve_managed(struct irq_matrix *m, const struct cpumask *msk)
+{
+	unsigned int cpu, failed_cpu;
+
+	for_each_cpu(cpu, msk) {
+		struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+		unsigned int bit;
+
+		bit = matrix_alloc_area(m, cm, 1, true);
+		if (bit >= m->alloc_end)
+			goto cleanup;
+		cm->managed++;
+		if (cm->online) {
+			cm->available--;
+			m->global_available--;
+		}
+	}
+	return 0;
+cleanup:
+	failed_cpu = cpu;
+	for_each_cpu(cpu, msk) {
+		if (cpu == failed_cpu)
+			break;
+		irq_matrix_remove_managed(m, cpumask_of(cpu));
+	}
+	return -ENOSPC;
+}
+
+/**
+ * irq_matrix_remove_managed - Remove managed interrupts in a CPU map
+ * @m:		Matrix pointer
+ * @msk:	On which CPUs the bits should be removed
+ *
+ * Can be called for offline CPUs
+ *
+ * This removes not yet allocated managed interrupts from the map. It
+ * does not matter which one, because managed interrupts free their
+ * allocation when they shut down. If not, the accounting is screwed,
+ * but all that can be done at this point is to warn about it.
+ */
+void irq_matrix_remove_managed(struct irq_matrix *m, const struct cpumask *msk)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, msk) {
+		struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+		unsigned int bit, end = m->alloc_end;
+
+		if (WARN_ON_ONCE(!cm->managed))
+			continue;
+
+		/* Get the managed bits which are not allocated */
+		bitmap_andnot(m->scratch_map, cm->managed_map, cm->alloc_map, end);
+
+		bit = find_first_bit(m->scratch_map, end);
+		if (WARN_ON_ONCE(bit >= end))
+			continue;
+
+		clear_bit(bit, cm->managed_map);
+
+		cm->managed--;
+		if (cm->online) {
+			cm->available++;
+			m->global_available++;
+		}
+	}
+}
+
+/**
+ * irq_matrix_alloc_managed - Allocate a managed interrupt in a CPU map
+ * @m:		Matrix pointer
+ * @cpu:	On which CPU the interrupt should be allocated
+ */
+int irq_matrix_alloc_managed(struct irq_matrix *m, unsigned int cpu)
+{
+	struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+	unsigned int bit, end = m->alloc_end;
+
+	/* Get the managed bits which are not allocated */
+	bitmap_andnot(m->scratch_map, cm->managed_map, cm->alloc_map, end);
+	bit = find_first_bit(m->scratch_map, end);
+	if (bit >= end)
+		return -ENOSPC;
+	set_bit(bit, cm->alloc_map);
+	cm->allocated++;
+	m->total_allocated++;
+	return bit;
+}
+
+/**
+ * irq_matrix_assign - Assign a preallocated interrupt in the local CPU map
+ * @m:		Matrix pointer
+ * @bit:	Which bit to mark
+ *
+ * This should only be used to mark preallocated vectors
+ */
+void irq_matrix_assign(struct irq_matrix *m, unsigned int bit)
+{
+	struct cpumap *cm = this_cpu_ptr(m->maps);
+
+	if (WARN_ON_ONCE(bit < m->alloc_start || bit >= m->alloc_end))
+		return;
+	if (WARN_ON_ONCE(test_and_set_bit(bit, cm->alloc_map)))
+		return;
+	cm->allocated++;
+	m->total_allocated++;
+	cm->available--;
+	m->global_available--;
+}
+
+/**
+ * irq_matrix_reserve - Reserve interrupts
+ * @m:		Matrix pointer
+ *
+ * This is merely a bookkeeping call. It increments the number of globally
+ * reserved interrupt bits w/o actually allocating them. This allows
+ * interrupt descriptors to be set up w/o assigning low level resources
+ * to them. The actual allocation happens when the interrupt gets activated.
+ */
+void irq_matrix_reserve(struct irq_matrix *m)
+{
+	if (m->global_reserved <= m->global_available &&
+	    m->global_reserved + 1 > m->global_available)
+		pr_warn("Interrupt reservation exceeds available resources\n");
+
+	m->global_reserved++;
+}
+
+/**
+ * irq_matrix_remove_reserved - Remove interrupt reservation
+ * @m:		Matrix pointer
+ *
+ * This is merely a bookkeeping call. It decrements the number of globally
+ * reserved interrupt bits. This is used to undo irq_matrix_reserve() when
+ * the interrupt was never in use, i.e. no real vector was ever allocated
+ * (which would have undone the reservation already).
+ */
+void irq_matrix_remove_reserved(struct irq_matrix *m)
+{
+	m->global_reserved--;
+}
+
+/**
+ * irq_matrix_alloc - Allocate a regular interrupt in a CPU map
+ * @m:		Matrix pointer
+ * @msk:	Which CPUs to search in
+ * @reserved:	Allocate previously reserved interrupts
+ * @mapped_cpu: Pointer to store the CPU for which the irq was allocated
+ */
+int irq_matrix_alloc(struct irq_matrix *m, const struct cpumask *msk,
+		     bool reserved, unsigned int *mapped_cpu)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, msk) {
+		struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+		unsigned int bit;
+
+		if (!cm->online)
+			continue;
+
+		bit = matrix_alloc_area(m, cm, 1, false);
+		if (bit < m->alloc_end) {
+			cm->allocated++;
+			cm->available--;
+			m->total_allocated++;
+			m->global_available--;
+			if (reserved)
+				m->global_reserved--;
+			*mapped_cpu = cpu;
+			return bit;
+		}
+	}
+	return -ENOSPC;
+}
+
+/**
+ * irq_matrix_free - Free allocated interrupt in the matrix
+ * @m:		Matrix pointer
+ * @cpu:	Which CPU map needs be updated
+ * @bit:	The bit to remove
+ * @managed:	If true, the interrupt is managed and not accounted
+ *		as available.
+ */
+void irq_matrix_free(struct irq_matrix *m, unsigned int cpu,
+		     unsigned int bit, bool managed)
+{
+	struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+
+	if (WARN_ON_ONCE(bit < m->alloc_start || bit >= m->alloc_end))
+		return;
+
+	if (cm->online) {
+		clear_bit(bit, cm->alloc_map);
+		cm->allocated--;
+		m->total_allocated--;
+		if (!managed) {
+			cm->available++;
+			m->global_available++;
+		}
+	}
+}
+
+/**
+ * irq_matrix_available - Get the number of globally available irqs
+ * @m:		Pointer to the matrix to query
+ * @cpudown:	If true, the local CPU is about to go down, adjust
+ *		the number of available irqs accordingly
+ */
+unsigned int irq_matrix_available(struct irq_matrix *m, bool cpudown)
+{
+	struct cpumap *cm = this_cpu_ptr(m->maps);
+
+	return m->global_available - (cpudown ? cm->available : 0);
+}
+
+/**
+ * irq_matrix_reserved - Get the number of globally reserved irqs
+ * @m:		Pointer to the matrix to query
+ */
+unsigned int irq_matrix_reserved(struct irq_matrix *m)
+{
+	return m->global_reserved;
+}
+
+/**
+ * irq_matrix_allocated - Get the number of allocated irqs on the local cpu
+ * @m:		Pointer to the matrix to search
+ *
+ * This returns the number of allocated irqs.
+ */
+unsigned int irq_matrix_allocated(struct irq_matrix *m)
+{
+	struct cpumap *cm = this_cpu_ptr(m->maps);
+
+	return cm->allocated;
+}
+
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+/**
+ * irq_matrix_debug_show - Show detailed allocation information
+ * @sf:		Pointer to the seq_file to print to
+ * @m:		Pointer to the matrix allocator
+ * @ind:	Indentation for the print format
+ *
+ * Note, this is a lockless snapshot.
+ */
+void irq_matrix_debug_show(struct seq_file *sf, struct irq_matrix *m, int ind)
+{
+	unsigned int nsys = bitmap_weight(m->system_map, m->matrix_bits);
+	int cpu;
+
+	seq_printf(sf, "Online bitmaps:   %6u\n", m->online_maps);
+	seq_printf(sf, "Global available: %6u\n", m->global_available);
+	seq_printf(sf, "Global reserved:  %6u\n", m->global_reserved);
+	seq_printf(sf, "Total allocated:  %6u\n", m->total_allocated);
+	seq_printf(sf, "System: %u: %*pbl\n", nsys, m->matrix_bits,
+		   m->system_map);
+	seq_printf(sf, "%*s| CPU | avl | man | act | vectors\n", ind, " ");
+	cpus_read_lock();
+	for_each_online_cpu(cpu) {
+		struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
+
+		seq_printf(sf, "%*s %4d  %4u  %4u  %4u  %*pbl\n", ind, " ",
+			   cpu, cm->available, cm->managed, cm->allocated,
+			   m->matrix_bits, cm->alloc_map);
+	}
+	cpus_read_unlock();
+}
+#endif
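
For reference, a rough sketch of how an architecture might drive the
allocator (arch_vector_init(), arch_alloc_vector() and the size constants
are illustrative and assume IRQ_MATRIX_BITS >= 256; the actual x86 wiring
follows later in this series):

	static struct irq_matrix *vector_matrix;

	static void __init arch_vector_init(void)
	{
		/* 256 vectors total, vectors 32..255 are allocatable */
		vector_matrix = irq_alloc_matrix(256, 32, 256);
		/* Claim a system vector before any CPU map goes online */
		irq_matrix_assign_system(vector_matrix, 2, false);
		/* Bring the boot CPU's map online */
		irq_matrix_online(vector_matrix);
	}

	static int arch_alloc_vector(const struct cpumask *msk)
	{
		unsigned int cpu;
		int vec = irq_matrix_alloc(vector_matrix, msk, false, &cpu);

		if (vec < 0)
			return vec;	/* -ENOSPC */
		/* ... program the hardware for (cpu, vec) here ... */
		return vec;
	}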


* [patch 13/52] genirq/matrix: Add tracepoints
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (11 preceding siblings ...)
  2017-09-13 21:29 ` [patch 12/52] genirq: Implement bitmap matrix allocator Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 14/52] x86/apic: Deinline x2apic functions Thomas Gleixner
                   ` (40 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: genirq-matrix--Add-tracepoints.patch --]
[-- Type: text/plain, Size: 8142 bytes --]

Add tracepoints for the irq bitmap matrix allocator.
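
Like any other tracepoints, these can be enabled at runtime through the
events/irq_matrix/ directory in tracefs.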

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 include/trace/events/irq_matrix.h |  201 ++++++++++++++++++++++++++++++++++++++
 kernel/irq/matrix.c               |   15 ++
 2 files changed, 216 insertions(+)

--- /dev/null
+++ b/include/trace/events/irq_matrix.h
@@ -0,0 +1,201 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM irq_matrix
+
+#if !defined(_TRACE_IRQ_MATRIX_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_IRQ_MATRIX_H
+
+#include <linux/tracepoint.h>
+
+struct irq_matrix;
+struct cpumap;
+
+DECLARE_EVENT_CLASS(irq_matrix_global,
+
+	TP_PROTO(struct irq_matrix *matrix),
+
+	TP_ARGS(matrix),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	online_maps		)
+		__field(	unsigned int,	global_available	)
+		__field(	unsigned int,	global_reserved		)
+		__field(	unsigned int,	total_allocated		)
+	),
+
+	TP_fast_assign(
+		__entry->online_maps		= matrix->online_maps;
+		__entry->global_available	= matrix->global_available;
+		__entry->global_reserved	= matrix->global_reserved;
+		__entry->total_allocated	= matrix->total_allocated;
+	),
+
+	TP_printk("online_maps=%d global_avl=%u, global_rsvd=%u, total_alloc=%u",
+		  __entry->online_maps, __entry->global_available,
+		  __entry->global_reserved, __entry->total_allocated)
+);
+
+DECLARE_EVENT_CLASS(irq_matrix_global_update,
+
+	TP_PROTO(int bit, struct irq_matrix *matrix),
+
+	TP_ARGS(bit, matrix),
+
+	TP_STRUCT__entry(
+		__field(	int,		bit			)
+		__field(	unsigned int,	online_maps		)
+		__field(	unsigned int,	global_available	)
+		__field(	unsigned int,	global_reserved		)
+		__field(	unsigned int,	total_allocated		)
+	),
+
+	TP_fast_assign(
+		__entry->bit			= bit;
+		__entry->online_maps		= matrix->online_maps;
+		__entry->global_available	= matrix->global_available;
+		__entry->global_reserved	= matrix->global_reserved;
+		__entry->total_allocated	= matrix->total_allocated;
+	),
+
+	TP_printk("bit=%d online_maps=%d global_avl=%u, global_rsvd=%u, total_alloc=%u",
+		  __entry->bit, __entry->online_maps,
+		  __entry->global_available, __entry->global_reserved,
+		  __entry->total_allocated)
+);
+
+DECLARE_EVENT_CLASS(irq_matrix_cpu,
+
+	TP_PROTO(int bit, unsigned int cpu, struct irq_matrix *matrix,
+		 struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap),
+
+	TP_STRUCT__entry(
+		__field(	int,		bit			)
+		__field(	unsigned int,	cpu			)
+		__field(	bool,		online			)
+		__field(	unsigned int,	available		)
+		__field(	unsigned int,	allocated		)
+		__field(	unsigned int,	managed			)
+		__field(	unsigned int,	online_maps		)
+		__field(	unsigned int,	global_available	)
+		__field(	unsigned int,	global_reserved		)
+		__field(	unsigned int,	total_allocated		)
+	),
+
+	TP_fast_assign(
+		__entry->bit			= bit;
+		__entry->cpu			= cpu;
+		__entry->online			= cmap->online;
+		__entry->available		= cmap->available;
+		__entry->allocated		= cmap->allocated;
+		__entry->managed		= cmap->managed;
+		__entry->online_maps		= matrix->online_maps;
+		__entry->global_available	= matrix->global_available;
+		__entry->global_reserved	= matrix->global_reserved;
+		__entry->total_allocated	= matrix->total_allocated;
+	),
+
+	TP_printk("bit=%d cpu=%u online=%d avl=%u alloc=%u managed=%u online_maps=%u global_avl=%u, global_rsvd=%u, total_alloc=%u",
+		  __entry->bit, __entry->cpu, __entry->online,
+		  __entry->available, __entry->allocated,
+		  __entry->managed, __entry->online_maps,
+		  __entry->global_available, __entry->global_reserved,
+		  __entry->total_allocated)
+);
+
+DEFINE_EVENT(irq_matrix_global, irq_matrix_online,
+
+	TP_PROTO(struct irq_matrix *matrix),
+
+	TP_ARGS(matrix)
+);
+
+DEFINE_EVENT(irq_matrix_global, irq_matrix_offline,
+
+	TP_PROTO(struct irq_matrix *matrix),
+
+	TP_ARGS(matrix)
+);
+
+DEFINE_EVENT(irq_matrix_global, irq_matrix_reserve,
+
+	TP_PROTO(struct irq_matrix *matrix),
+
+	TP_ARGS(matrix)
+);
+
+DEFINE_EVENT(irq_matrix_global, irq_matrix_remove_reserved,
+
+	TP_PROTO(struct irq_matrix *matrix),
+
+	TP_ARGS(matrix)
+);
+
+DEFINE_EVENT(irq_matrix_global_update, irq_matrix_assign_system,
+
+	TP_PROTO(int bit, struct irq_matrix *matrix),
+
+	TP_ARGS(bit, matrix)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_alloc_reserved,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_reserve_managed,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_remove_managed,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_alloc_managed,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_assign,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_alloc,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+DEFINE_EVENT(irq_matrix_cpu, irq_matrix_free,
+
+	TP_PROTO(int bit, unsigned int cpu,
+		 struct irq_matrix *matrix, struct cpumap *cmap),
+
+	TP_ARGS(bit, cpu, matrix, cmap)
+);
+
+
+#endif /* _TRACE_IRQ_MATRIX_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -36,6 +36,9 @@ struct irq_matrix {
 	unsigned long		system_map[IRQ_MATRIX_SIZE];
 };
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/irq_matrix.h>
+
 /**
 * irq_alloc_matrix - Allocate an irq_matrix structure and initialize it
  * @matrix_bits:	Number of matrix bits must be <= IRQ_MATRIX_BITS
@@ -84,6 +87,7 @@ void irq_matrix_online(struct irq_matrix
 	m->global_available += cm->available;
 	cm->online = true;
 	m->online_maps++;
+	trace_irq_matrix_online(m);
 }
 
 /**
@@ -98,6 +102,7 @@ void irq_matrix_offline(struct irq_matri
 	m->global_available -= cm->available;
 	cm->online = false;
 	m->online_maps--;
+	trace_irq_matrix_offline(m);
 }
 
 static unsigned int matrix_alloc_area(struct irq_matrix *m, struct cpumap *cm,
@@ -145,6 +150,8 @@ void irq_matrix_assign_system(struct irq
 	}
 	if (bit >= m->alloc_start && bit < m->alloc_end)
 		m->systembits_inalloc++;
+
+	trace_irq_matrix_assign_system(bit, m);
 }
 
 /**
@@ -172,6 +179,7 @@ int irq_matrix_reserve_managed(struct ir
 			cm->available--;
 			m->global_available--;
 		}
+		trace_irq_matrix_reserve_managed(bit, cpu, m, cm);
 	}
 	return 0;
 cleanup:
@@ -221,6 +229,7 @@ void irq_matrix_remove_managed(struct ir
 			cm->available++;
 			m->global_available++;
 		}
+		trace_irq_matrix_remove_managed(bit, cpu, m, cm);
 	}
 }
 
@@ -242,6 +251,7 @@ int irq_matrix_alloc_managed(struct irq_
 	set_bit(bit, cm->alloc_map);
 	cm->allocated++;
 	m->total_allocated++;
+	trace_irq_matrix_alloc_managed(bit, cpu, m, cm);
 	return bit;
 }
 
@@ -264,6 +274,7 @@ void irq_matrix_assign(struct irq_matrix
 	m->total_allocated++;
 	cm->available--;
 	m->global_available--;
+	trace_irq_matrix_assign(bit, smp_processor_id(), m, cm);
 }
 
 /**
@@ -282,6 +293,7 @@ void irq_matrix_reserve(struct irq_matri
 		pr_warn("Interrupt reservation exceeds available resources\n");
 
 	m->global_reserved++;
+	trace_irq_matrix_reserve(m);
 }
 
 /**
@@ -296,6 +308,7 @@ void irq_matrix_reserve(struct irq_matri
 void irq_matrix_remove_reserved(struct irq_matrix *m)
 {
 	m->global_reserved--;
+	trace_irq_matrix_remove_reserved(m);
 }
 
 /**
@@ -326,6 +339,7 @@ int irq_matrix_alloc(struct irq_matrix *
 			if (reserved)
 				m->global_reserved--;
 			*mapped_cpu = cpu;
+			trace_irq_matrix_alloc(bit, cpu, m, cm);
 			return bit;
 		}
 	}
@@ -357,6 +371,7 @@ void irq_matrix_free(struct irq_matrix *
 			m->global_available++;
 		}
 	}
+	trace_irq_matrix_free(bit, cpu, m, cm);
 }
 
 /**


* [patch 14/52] x86/apic: Deinline x2apic functions
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (12 preceding siblings ...)
  2017-09-13 21:29 ` [patch 13/52] genirq/matrix: Add tracepoints Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 15/52] x86/apic: Sanitize return value of apic.set_apic_id() Thomas Gleixner
                   ` (39 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Deinline-x2apic-functions.patch --]
[-- Type: text/plain, Size: 3476 bytes --]

These inline functions are used in both the cluster and the physical x2apic
code to fill in the function pointers of the apic structure. That means the
code is generated twice for no reason.

Move them into a C file and reuse them from both places.
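
For illustration (a hypothetical header, not from the tree): a static
function defined in a shared header is emitted separately in every
compilation unit which uses it, unless the compiler inlines all calls:

	/* shared.h: each .c file including this gets its own copy */
	static int shared_helper(int x)
	{
		return x + 1;
	}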

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/x2apic.h         |   49 ----------------------------------
 arch/x86/kernel/apic/x2apic.h         |    9 ++++++
 arch/x86/kernel/apic/x2apic_cluster.c |    2 -
 arch/x86/kernel/apic/x2apic_phys.c    |   40 +++++++++++++++++++++++++++
 4 files changed, 49 insertions(+), 51 deletions(-)

--- a/arch/x86/include/asm/x2apic.h
+++ /dev/null
@@ -1,49 +0,0 @@
-/*
- * Common bits for X2APIC cluster/physical modes.
- */
-
-#ifndef _ASM_X86_X2APIC_H
-#define _ASM_X86_X2APIC_H
-
-#include <asm/apic.h>
-#include <asm/ipi.h>
-#include <linux/cpumask.h>
-
-static int x2apic_apic_id_valid(int apicid)
-{
-	return 1;
-}
-
-static int x2apic_apic_id_registered(void)
-{
-	return 1;
-}
-
-static void
-__x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest)
-{
-	unsigned long cfg = __prepare_ICR(0, vector, dest);
-	native_x2apic_icr_write(cfg, apicid);
-}
-
-static unsigned int x2apic_get_apic_id(unsigned long id)
-{
-	return id;
-}
-
-static unsigned long x2apic_set_apic_id(unsigned int id)
-{
-	return id;
-}
-
-static int x2apic_phys_pkg_id(int initial_apicid, int index_msb)
-{
-	return initial_apicid >> index_msb;
-}
-
-static void x2apic_send_IPI_self(int vector)
-{
-	apic_write(APIC_SELF_IPI, vector);
-}
-
-#endif /* _ASM_X86_X2APIC_H */
--- /dev/null
+++ b/arch/x86/kernel/apic/x2apic.h
@@ -0,0 +1,9 @@
+/* Common bits for X2APIC cluster/physical modes. */
+
+int x2apic_apic_id_valid(int apicid);
+int x2apic_apic_id_registered(void);
+void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
+unsigned int x2apic_get_apic_id(unsigned long id);
+unsigned long x2apic_set_apic_id(unsigned int id);
+int x2apic_phys_pkg_id(int initial_apicid, int index_msb);
+void x2apic_send_IPI_self(int vector);
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -8,7 +8,7 @@
 #include <linux/cpu.h>
 
 #include <asm/smp.h>
-#include <asm/x2apic.h>
+#include "x2apic.h"
 
 static DEFINE_PER_CPU(u32, x86_cpu_to_logical_apicid);
 static DEFINE_PER_CPU(cpumask_var_t, cpus_in_cluster);
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -6,7 +6,8 @@
 #include <linux/dmar.h>
 
 #include <asm/smp.h>
-#include <asm/x2apic.h>
+#include <asm/ipi.h>
+#include "x2apic.h"
 
 int x2apic_phys;
 
@@ -98,6 +99,43 @@ static int x2apic_phys_probe(void)
 	return apic == &apic_x2apic_phys;
 }
 
+/* Common x2apic functions, also used by x2apic_cluster */
+int x2apic_apic_id_valid(int apicid)
+{
+	return 1;
+}
+
+int x2apic_apic_id_registered(void)
+{
+	return 1;
+}
+
+void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest)
+{
+	unsigned long cfg = __prepare_ICR(0, vector, dest);
+	native_x2apic_icr_write(cfg, apicid);
+}
+
+unsigned int x2apic_get_apic_id(unsigned long id)
+{
+	return id;
+}
+
+unsigned long x2apic_set_apic_id(unsigned int id)
+{
+	return id;
+}
+
+int x2apic_phys_pkg_id(int initial_apicid, int index_msb)
+{
+	return initial_apicid >> index_msb;
+}
+
+void x2apic_send_IPI_self(int vector)
+{
+	apic_write(APIC_SELF_IPI, vector);
+}
+
 static struct apic apic_x2apic_phys __ro_after_init = {
 
 	.name				= "physical x2apic",


* [patch 15/52] x86/apic: Sanitize return value of apic.set_apic_id()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (13 preceding siblings ...)
  2017-09-13 21:29 ` [patch 14/52] x86/apic: Deinline x2apic functions Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 16/52] x86/apic: Sanitize return value of check_apicid_used() Thomas Gleixner
                   ` (38 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Sanitize-return-value-of-.set_apic_id--.patch --]
[-- Type: text/plain, Size: 3206 bytes --]

The set_apic_id() callback returns an unsigned long value which is handed
to apic_write() as the value argument, which is a u32.

Adjust the return value so it returns u32 right away.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h          |    2 +-
 arch/x86/kernel/apic/apic_flat_64.c  |    2 +-
 arch/x86/kernel/apic/apic_numachip.c |    4 ++--
 arch/x86/kernel/apic/x2apic.h        |    2 +-
 arch/x86/kernel/apic/x2apic_phys.c   |    2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c   |    2 +-
 arch/x86/xen/apic.c                  |    2 +-
 7 files changed, 8 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -296,7 +296,7 @@ struct apic {
 
 	unsigned int (*get_apic_id)(unsigned long x);
 	/* Can't be NULL on 64-bit */
-	unsigned long (*set_apic_id)(unsigned int id);
+	u32 (*set_apic_id)(unsigned int id);
 
 	int (*cpu_mask_to_apicid)(const struct cpumask *cpumask,
 				  struct irq_data *irqdata,
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -119,7 +119,7 @@ static unsigned int flat_get_apic_id(uns
 	return (x >> 24) & 0xFF;
 }
 
-static unsigned long set_apic_id(unsigned int id)
+static u32 set_apic_id(unsigned int id)
 {
 	return (id & 0xFF) << 24;
 }
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -38,7 +38,7 @@ static unsigned int numachip1_get_apic_i
 	return id;
 }
 
-static unsigned long numachip1_set_apic_id(unsigned int id)
+static u32 numachip1_set_apic_id(unsigned int id)
 {
 	return (id & 0xff) << 24;
 }
@@ -51,7 +51,7 @@ static unsigned int numachip2_get_apic_i
 	return ((mcfg >> (28 - 8)) & 0xfff00) | (x >> 24);
 }
 
-static unsigned long numachip2_set_apic_id(unsigned int id)
+static u32 numachip2_set_apic_id(unsigned int id)
 {
 	return id << 24;
 }
--- a/arch/x86/kernel/apic/x2apic.h
+++ b/arch/x86/kernel/apic/x2apic.h
@@ -4,6 +4,6 @@ int x2apic_apic_id_valid(int apicid);
 int x2apic_apic_id_registered(void);
 void __x2apic_send_IPI_dest(unsigned int apicid, int vector, unsigned int dest);
 unsigned int x2apic_get_apic_id(unsigned long id);
-unsigned long x2apic_set_apic_id(unsigned int id);
+u32 x2apic_set_apic_id(unsigned int id);
 int x2apic_phys_pkg_id(int initial_apicid, int index_msb);
 void x2apic_send_IPI_self(int vector);
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -121,7 +121,7 @@ unsigned int x2apic_get_apic_id(unsigned
 	return id;
 }
 
-unsigned long x2apic_set_apic_id(unsigned int id)
+u32 x2apic_set_apic_id(unsigned int id)
 {
 	return id;
 }
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -547,7 +547,7 @@ static unsigned int x2apic_get_apic_id(u
 	return id;
 }
 
-static unsigned long set_apic_id(unsigned int id)
+static u32 set_apic_id(unsigned int id)
 {
 	/* CHECKME: Do we need to mask out the xapic extra bits? */
 	return id;
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -30,7 +30,7 @@ static unsigned int xen_io_apic_read(uns
 	return 0xfd;
 }
 
-static unsigned long xen_set_apic_id(unsigned int x)
+static u32 xen_set_apic_id(unsigned int x)
 {
 	WARN_ON(1);
 	return x;


* [patch 16/52] x86/apic: Sanitize return value of check_apicid_used()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (14 preceding siblings ...)
  2017-09-13 21:29 ` [patch 15/52] x86/apic: Sanitize return value of apic.set_apic_id() Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 17/52] x86/apic: Move probe32 specific APIC functions Thomas Gleixner
                   ` (37 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Sanitize-return-value-of-check_apicid_used--.patch --]
[-- Type: text/plain, Size: 1360 bytes --]

The check is boolean, but the function returns unsigned long for no reason.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h      |    4 ++--
 arch/x86/kernel/apic/bigsmp_32.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -280,7 +280,7 @@ struct apic {
 	int disable_esr;
 
 	int dest_logical;
-	unsigned long (*check_apicid_used)(physid_mask_t *map, int apicid);
+	bool (*check_apicid_used)(physid_mask_t *map, int apicid);
 
 	void (*vector_allocation_domain)(int cpu, struct cpumask *retmask,
 					 const struct cpumask *mask);
@@ -572,7 +572,7 @@ default_vector_allocation_domain(int cpu
 	cpumask_copy(retmask, cpumask_of(cpu));
 }
 
-static inline unsigned long default_check_apicid_used(physid_mask_t *map, int apicid)
+static inline bool default_check_apicid_used(physid_mask_t *map, int apicid)
 {
 	return physid_isset(apicid, *map);
 }
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -26,9 +26,9 @@ static int bigsmp_apic_id_registered(voi
 	return 1;
 }
 
-static unsigned long bigsmp_check_apicid_used(physid_mask_t *map, int apicid)
+static bool bigsmp_check_apicid_used(physid_mask_t *map, int apicid)
 {
-	return 0;
+	return false;
 }
 
 static int bigsmp_early_logical_apicid(int cpu)


* [patch 17/52] x86/apic: Move probe32 specific APIC functions
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (15 preceding siblings ...)
  2017-09-13 21:29 ` [patch 16/52] x86/apic: Sanitize return value of check_apicid_used() Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 18/52] x86/apic: Move APIC noop specific functions Thomas Gleixner
                   ` (36 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Move-probe32-specific-APIC-functions.patch --]
[-- Type: text/plain, Size: 2734 bytes --]

The apic functions which are used in probe_32.c are implemented as inlines
or in apic.c. There is no reason to have them in random places.

Move them to the actual usage site and make them static.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h     |   21 ---------------------
 arch/x86/kernel/apic/apic.c     |   10 ----------
 arch/x86/kernel/apic/probe_32.c |   25 +++++++++++++++++++++++++
 3 files changed, 25 insertions(+), 31 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -515,31 +515,10 @@ extern void default_setup_apic_routing(v
 extern struct apic apic_noop;
 
 #ifdef CONFIG_X86_32
-
 static inline int noop_x86_32_early_logical_apicid(int cpu)
 {
 	return BAD_APICID;
 }
-
-/*
- * Set up the logical destination ID.
- *
- * Intel recommends to set DFR, LDR and TPR before enabling
- * an APIC.  See e.g. "AP-388 82489DX User's Manual" (Intel
- * document number 292116).  So here it goes...
- */
-extern void default_init_apic_ldr(void);
-
-static inline int default_apic_id_registered(void)
-{
-	return physid_isset(read_apic_id(), phys_cpu_present_map);
-}
-
-static inline int default_phys_pkg_id(int cpuid_apic, int index_msb)
-{
-	return cpuid_apic >> index_msb;
-}
-
 #endif
 
 extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask,
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2231,16 +2231,6 @@ int hard_smp_processor_id(void)
 	return read_apic_id();
 }
 
-void default_init_apic_ldr(void)
-{
-	unsigned long val;
-
-	apic_write(APIC_DFR, APIC_DFR_VALUE);
-	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
-	val |= SET_APIC_LOGICAL_ID(1UL << smp_processor_id());
-	apic_write(APIC_LDR, val);
-}
-
 int default_cpu_mask_to_apicid(const struct cpumask *mask,
 			       struct irq_data *irqdata,
 			       unsigned int *apicid)
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -66,6 +66,31 @@ static void setup_apic_flat_routing(void
 #endif
 }
 
+static int default_apic_id_registered(void)
+{
+	return physid_isset(read_apic_id(), phys_cpu_present_map);
+}
+
+/*
+ * Set up the logical destination ID.  Intel recommends to set DFR, LDR and
+ * TPR before enabling an APIC.  See e.g. "AP-388 82489DX User's Manual"
+ * (Intel document number 292116).
+ */
+static void default_init_apic_ldr(void)
+{
+	unsigned long val;
+
+	apic_write(APIC_DFR, APIC_DFR_VALUE);
+	val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
+	val |= SET_APIC_LOGICAL_ID(1UL << smp_processor_id());
+	apic_write(APIC_LDR, val);
+}
+
+static int default_phys_pkg_id(int cpuid_apic, int index_msb)
+{
+	return cpuid_apic >> index_msb;
+}
+
 /* should be called last. */
 static int probe_default(void)
 {


* [patch 18/52] x86/apic: Move APIC noop specific functions
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (16 preceding siblings ...)
  2017-09-13 21:29 ` [patch 17/52] x86/apic: Move probe32 specific APIC functions Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 19/52] x86/apic: Sanitize 32/64bit APIC callbacks Thomas Gleixner
                   ` (35 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Move-APIC-noop-specific-functions.patch --]
[-- Type: text/plain, Size: 1086 bytes --]

Move more inlines to the place where they belong.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h      |    7 -------
 arch/x86/kernel/apic/apic_noop.c |    7 +++++++
 2 files changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -514,13 +514,6 @@ extern void default_setup_apic_routing(v
 
 extern struct apic apic_noop;
 
-#ifdef CONFIG_X86_32
-static inline int noop_x86_32_early_logical_apicid(int cpu)
-{
-	return BAD_APICID;
-}
-#endif
-
 extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask,
 				   struct irq_data *irqdata,
 				   unsigned int *apicid);
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -108,6 +108,13 @@ static void noop_apic_write(u32 reg, u32
 	WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_APIC) && !disable_apic);
 }
 
+#ifdef CONFIG_X86_32
+static int noop_x86_32_early_logical_apicid(int cpu)
+{
+	return BAD_APICID;
+}
+#endif
+
 struct apic apic_noop __ro_after_init = {
 	.name				= "noop",
 	.probe				= noop_probe,


* [patch 19/52] x86/apic: Sanitize 32/64bit APIC callbacks
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (17 preceding siblings ...)
  2017-09-13 21:29 ` [patch 18/52] x86/apic: Move APIC noop specific functions Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 20/52] x86/apic: Move common " Thomas Gleixner
                   ` (34 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--consolidate-32-64bit.patch --]
[-- Type: text/plain, Size: 3805 bytes --]

The 32bit and the 64bit implementations of default_cpu_present_to_apicid()
and default_check_phys_apicid_present() are exactly the same, but
implemented and located differently.

Move them to common apic code and get rid of the pointless difference.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h        |   30 ------------------------------
 arch/x86/include/asm/kvm_host.h    |    2 +-
 arch/x86/kernel/apic/Makefile      |    2 +-
 arch/x86/kernel/apic/apic_common.c |   20 ++++++++++++++++++++
 arch/x86/kernel/setup.c            |   12 ------------
 5 files changed, 22 insertions(+), 44 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -463,9 +463,6 @@ static inline unsigned default_get_apic_
 extern void apic_send_IPI_self(int vector);
 
 DECLARE_PER_CPU(int, x2apic_extra_bits);
-
-extern int default_cpu_present_to_apicid(int mps_cpu);
-extern int default_check_phys_apicid_present(int phys_apicid);
 #endif
 
 extern void generic_bigsmp_probe(void);
@@ -554,35 +551,8 @@ static inline void default_ioapic_phys_i
 	*retmap = *phys_map;
 }
 
-static inline int __default_cpu_present_to_apicid(int mps_cpu)
-{
-	if (mps_cpu < nr_cpu_ids && cpu_present(mps_cpu))
-		return (int)per_cpu(x86_bios_cpu_apicid, mps_cpu);
-	else
-		return BAD_APICID;
-}
-
-static inline int
-__default_check_phys_apicid_present(int phys_apicid)
-{
-	return physid_isset(phys_apicid, phys_cpu_present_map);
-}
-
-#ifdef CONFIG_X86_32
-static inline int default_cpu_present_to_apicid(int mps_cpu)
-{
-	return __default_cpu_present_to_apicid(mps_cpu);
-}
-
-static inline int
-default_check_phys_apicid_present(int phys_apicid)
-{
-	return __default_check_phys_apicid_present(phys_apicid);
-}
-#else
 extern int default_cpu_present_to_apicid(int mps_cpu);
 extern int default_check_phys_apicid_present(int phys_apicid);
-#endif
 
 #endif /* CONFIG_X86_LOCAL_APIC */
 extern void irq_enter(void);
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1430,7 +1430,7 @@ static inline void kvm_arch_vcpu_block_f
 static inline int kvm_cpu_get_apicid(int mps_cpu)
 {
 #ifdef CONFIG_X86_LOCAL_APIC
-	return __default_cpu_present_to_apicid(mps_cpu);
+	return default_cpu_present_to_apicid(mps_cpu);
 #else
 	WARN_ON_ONCE(1);
 	return BAD_APICID;
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -6,7 +6,7 @@
 # In particualr, smp_apic_timer_interrupt() is called in random places.
 KCOV_INSTRUMENT		:= n
 
-obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_noop.o ipi.o vector.o
+obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_common.o apic_noop.o ipi.o vector.o
 obj-y				+= hw_nmi.o
 
 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
--- /dev/null
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -0,0 +1,20 @@
+/*
+ * Common functions shared between the various APIC flavours
+ *
+ * SPDX-License-Identifier: GPL-2.0
+ */
+#include <linux/irq.h>
+#include <asm/apic.h>
+
+int default_cpu_present_to_apicid(int mps_cpu)
+{
+	if (mps_cpu < nr_cpu_ids && cpu_present(mps_cpu))
+		return (int)per_cpu(x86_bios_cpu_apicid, mps_cpu);
+	else
+		return BAD_APICID;
+}
+
+int default_check_phys_apicid_present(int phys_apicid)
+{
+	return physid_isset(phys_apicid, phys_cpu_present_map);
+}
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -136,18 +136,6 @@ RESERVE_BRK(dmi_alloc, 65536);
 static __initdata unsigned long _brk_start = (unsigned long)__brk_base;
 unsigned long _brk_end = (unsigned long)__brk_base;
 
-#ifdef CONFIG_X86_64
-int default_cpu_present_to_apicid(int mps_cpu)
-{
-	return __default_cpu_present_to_apicid(mps_cpu);
-}
-
-int default_check_phys_apicid_present(int phys_apicid)
-{
-	return __default_check_phys_apicid_present(phys_apicid);
-}
-#endif
-
 struct boot_params boot_params;
 
 /*


* [patch 20/52] x86/apic: Move common APIC callbacks
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (18 preceding siblings ...)
  2017-09-13 21:29 ` [patch 19/52] x86/apic: Sanitize 32/64bit APIC callbacks Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 21/52] x86/apic: Reorganize struct apic Thomas Gleixner
                   ` (33 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Move-common-APIC-callbacks.patch --]
[-- Type: text/plain, Size: 7003 bytes --]

Move more apic struct specific functions out of the header and the apic
management code into the common source file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h        |   73 +++++-----------------------------
 arch/x86/kernel/apic/apic.c        |   28 -------------
 arch/x86/kernel/apic/apic_common.c |   78 +++++++++++++++++++++++++++++++++++++
 3 files changed, 90 insertions(+), 89 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -467,94 +467,45 @@ DECLARE_PER_CPU(int, x2apic_extra_bits);
 
 extern void generic_bigsmp_probe(void);
 
-
 #ifdef CONFIG_X86_LOCAL_APIC
 
 #include <asm/smp.h>
 
 #define APIC_DFR_VALUE	(APIC_DFR_FLAT)
 
-static inline const struct cpumask *default_target_cpus(void)
-{
-#ifdef CONFIG_SMP
-	return cpu_online_mask;
-#else
-	return cpumask_of(0);
-#endif
-}
-
-static inline const struct cpumask *online_target_cpus(void)
-{
-	return cpu_online_mask;
-}
-
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_bios_cpu_apicid);
 
+extern struct apic apic_noop;
 
 static inline unsigned int read_apic_id(void)
 {
-	unsigned int reg;
-
-	reg = apic_read(APIC_ID);
+	unsigned int reg = apic_read(APIC_ID);
 
 	return apic->get_apic_id(reg);
 }
 
-static inline int default_apic_id_valid(int apicid)
-{
-	return (apicid < 255);
-}
-
+extern const struct cpumask *default_target_cpus(void);
+extern const struct cpumask *online_target_cpus(void);
+extern int default_apic_id_valid(int apicid);
 extern int default_acpi_madt_oem_check(char *, char *);
-
 extern void default_setup_apic_routing(void);
-
-extern struct apic apic_noop;
-
 extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask,
 				   struct irq_data *irqdata,
 				   unsigned int *apicid);
 extern int default_cpu_mask_to_apicid(const struct cpumask *cpumask,
 				      struct irq_data *irqdata,
 				      unsigned int *apicid);
-
-static inline void
-flat_vector_allocation_domain(int cpu, struct cpumask *retmask,
-			      const struct cpumask *mask)
-{
-	/* Careful. Some cpus do not strictly honor the set of cpus
-	 * specified in the interrupt destination when using lowest
-	 * priority interrupt delivery mode.
-	 *
-	 * In particular there was a hyperthreading cpu observed to
-	 * deliver interrupts to the wrong hyperthread when only one
-	 * hyperthread was specified in the interrupt desitination.
-	 */
-	cpumask_clear(retmask);
-	cpumask_bits(retmask)[0] = APIC_ALL_CPUS;
-}
-
-static inline void
-default_vector_allocation_domain(int cpu, struct cpumask *retmask,
-				 const struct cpumask *mask)
-{
-	cpumask_copy(retmask, cpumask_of(cpu));
-}
-
-static inline bool default_check_apicid_used(physid_mask_t *map, int apicid)
-{
-	return physid_isset(apicid, *map);
-}
-
-static inline void default_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
-{
-	*retmap = *phys_map;
-}
-
+extern bool default_check_apicid_used(physid_mask_t *map, int apicid);
+extern void flat_vector_allocation_domain(int cpu, struct cpumask *retmask,
+				   const struct cpumask *mask);
+extern void default_vector_allocation_domain(int cpu, struct cpumask *retmask,
+				      const struct cpumask *mask);
+extern void default_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap);
 extern int default_cpu_present_to_apicid(int mps_cpu);
 extern int default_check_phys_apicid_present(int phys_apicid);
 
 #endif /* CONFIG_X86_LOCAL_APIC */
+
 extern void irq_enter(void);
 extern void irq_exit(void);
 
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2231,34 +2231,6 @@ int hard_smp_processor_id(void)
 	return read_apic_id();
 }
 
-int default_cpu_mask_to_apicid(const struct cpumask *mask,
-			       struct irq_data *irqdata,
-			       unsigned int *apicid)
-{
-	unsigned int cpu = cpumask_first(mask);
-
-	if (cpu >= nr_cpu_ids)
-		return -EINVAL;
-	*apicid = per_cpu(x86_cpu_to_apicid, cpu);
-	irq_data_update_effective_affinity(irqdata, cpumask_of(cpu));
-	return 0;
-}
-
-int flat_cpu_mask_to_apicid(const struct cpumask *mask,
-			    struct irq_data *irqdata,
-			    unsigned int *apicid)
-
-{
-	struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata);
-	unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS;
-
-	if (!cpu_mask)
-		return -EINVAL;
-	*apicid = (unsigned int)cpu_mask;
-	cpumask_bits(effmsk)[0] = cpu_mask;
-	return 0;
-}
-
 /*
  * Override the generic EOI implementation with an optimized version.
  * Only called during early boot when only one CPU is active and with
--- a/arch/x86/kernel/apic/apic_common.c
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -6,6 +6,64 @@
 #include <linux/irq.h>
 #include <asm/apic.h>
 
+int default_cpu_mask_to_apicid(const struct cpumask *msk, struct irq_data *irqd,
+			       unsigned int *apicid)
+{
+	unsigned int cpu = cpumask_first(msk);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+	*apicid = per_cpu(x86_cpu_to_apicid, cpu);
+	irq_data_update_effective_affinity(irqd, cpumask_of(cpu));
+	return 0;
+}
+
+int flat_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqd,
+			    unsigned int *apicid)
+
+{
+	struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqd);
+	unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS;
+
+	if (!cpu_mask)
+		return -EINVAL;
+	*apicid = (unsigned int)cpu_mask;
+	cpumask_bits(effmsk)[0] = cpu_mask;
+	return 0;
+}
+
+bool default_check_apicid_used(physid_mask_t *map, int apicid)
+{
+	return physid_isset(apicid, *map);
+}
+
+void flat_vector_allocation_domain(int cpu, struct cpumask *retmask,
+				   const struct cpumask *mask)
+{
+	/*
+	 * Careful. Some cpus do not strictly honor the set of cpus
+	 * specified in the interrupt destination when using lowest
+	 * priority interrupt delivery mode.
+	 *
+	 * In particular there was a hyperthreading cpu observed to
+	 * deliver interrupts to the wrong hyperthread when only one
+	 * hyperthread was specified in the interrupt destination.
+	 */
+	cpumask_clear(retmask);
+	cpumask_bits(retmask)[0] = APIC_ALL_CPUS;
+}
+
+void default_vector_allocation_domain(int cpu, struct cpumask *retmask,
+				      const struct cpumask *mask)
+{
+	cpumask_copy(retmask, cpumask_of(cpu));
+}
+
+void default_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
+{
+	*retmap = *phys_map;
+}
+
 int default_cpu_present_to_apicid(int mps_cpu)
 {
 	if (mps_cpu < nr_cpu_ids && cpu_present(mps_cpu))
@@ -13,8 +71,28 @@ int default_cpu_present_to_apicid(int mp
 	else
 		return BAD_APICID;
 }
+EXPORT_SYMBOL_GPL(default_cpu_present_to_apicid);
 
 int default_check_phys_apicid_present(int phys_apicid)
 {
 	return physid_isset(phys_apicid, phys_cpu_present_map);
 }
+
+const struct cpumask *default_target_cpus(void)
+{
+#ifdef CONFIG_SMP
+	return cpu_online_mask;
+#else
+	return cpumask_of(0);
+#endif
+}
+
+const struct cpumask *online_target_cpus(void)
+{
+	return cpu_online_mask;
+}
+
+int default_apic_id_valid(int apicid)
+{
+	return (apicid < 255);
+}


* [patch 21/52] x86/apic: Reorganize struct apic
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (19 preceding siblings ...)
  2017-09-13 21:29 ` [patch 20/52] x86/apic: Move common " Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 22/52] x86/apic/x2apic: Simplify cluster management Thomas Gleixner
                   ` (32 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Reorganize-struct-apic.patch --]
[-- Type: text/plain, Size: 4629 bytes --]

struct apic has just grown over time, with function pointers added in random
places. Reorganize it so that it becomes more cache line friendly.
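
For reviewers who want to check the effect, the resulting field layout and
the cache line boundaries can be inspected with pahole; a sketch, assuming
a vmlinux built with debug info (the excerpt below is illustrative, not
actual output):

    $ pahole -C apic vmlinux
    struct apic {
            void (*eoi_write)(u32, u32);            /*     0     8 */
            void (*native_eoi_write)(u32, u32);     /*     8     8 */
            void (*write)(u32, u32);                /*    16     8 */
            u32  (*read)(u32);                      /*    24     8 */
            ...
            /* --- cacheline 1 boundary (64 bytes) --- */
            ...
    };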

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h |  113 ++++++++++++++++++++------------------------
 1 file changed, 52 insertions(+), 61 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -265,73 +265,63 @@ struct irq_data;
  * James Cleverdon.
  */
 struct apic {
-	char *name;
-
-	int (*probe)(void);
-	int (*acpi_madt_oem_check)(char *oem_id, char *oem_table_id);
-	int (*apic_id_valid)(int apicid);
-	int (*apic_id_registered)(void);
-
-	u32 irq_delivery_mode;
-	u32 irq_dest_mode;
+	/* Hotpath functions first */
+	void	(*eoi_write)(u32 reg, u32 v);
+	void	(*native_eoi_write)(u32 reg, u32 v);
+	void	(*write)(u32 reg, u32 v);
+	u32	(*read)(u32 reg);
+
+	/* IPI related functions */
+	void	(*wait_icr_idle)(void);
+	u32	(*safe_wait_icr_idle)(void);
+
+	void	(*send_IPI)(int cpu, int vector);
+	void	(*send_IPI_mask)(const struct cpumask *mask, int vector);
+	void	(*send_IPI_mask_allbutself)(const struct cpumask *msk, int vec);
+	void	(*send_IPI_allbutself)(int vector);
+	void	(*send_IPI_all)(int vector);
+	void	(*send_IPI_self)(int vector);
+
+	/* dest_logical is used by the IPI functions */
+	u32	dest_logical;
+	u32	disable_esr;
+	u32	irq_delivery_mode;
+	u32	irq_dest_mode;
 
+	/* Functions and data related to vector allocation */
 	const struct cpumask *(*target_cpus)(void);
+	void	(*vector_allocation_domain)(int cpu, struct cpumask *retmask,
+					    const struct cpumask *mask);
+	int	(*cpu_mask_to_apicid)(const struct cpumask *cpumask,
+				      struct irq_data *irqdata,
+				      unsigned int *apicid);
+
+	/* ICR related functions */
+	u64	(*icr_read)(void);
+	void	(*icr_write)(u32 low, u32 high);
+
+	/* Probe, setup and smpboot functions */
+	int	(*probe)(void);
+	int	(*acpi_madt_oem_check)(char *oem_id, char *oem_table_id);
+	int	(*apic_id_valid)(int apicid);
+	int	(*apic_id_registered)(void);
+
+	bool	(*check_apicid_used)(physid_mask_t *map, int apicid);
+	void	(*init_apic_ldr)(void);
+	void	(*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
+	void	(*setup_apic_routing)(void);
+	int	(*cpu_present_to_apicid)(int mps_cpu);
+	void	(*apicid_to_cpu_present)(int phys_apicid, physid_mask_t *retmap);
+	int	(*check_phys_apicid_present)(int phys_apicid);
+	int	(*phys_pkg_id)(int cpuid_apic, int index_msb);
 
-	int disable_esr;
-
-	int dest_logical;
-	bool (*check_apicid_used)(physid_mask_t *map, int apicid);
-
-	void (*vector_allocation_domain)(int cpu, struct cpumask *retmask,
-					 const struct cpumask *mask);
-	void (*init_apic_ldr)(void);
-
-	void (*ioapic_phys_id_map)(physid_mask_t *phys_map, physid_mask_t *retmap);
-
-	void (*setup_apic_routing)(void);
-	int (*cpu_present_to_apicid)(int mps_cpu);
-	void (*apicid_to_cpu_present)(int phys_apicid, physid_mask_t *retmap);
-	int (*check_phys_apicid_present)(int phys_apicid);
-	int (*phys_pkg_id)(int cpuid_apic, int index_msb);
-
-	unsigned int (*get_apic_id)(unsigned long x);
-	/* Can't be NULL on 64-bit */
-	u32 (*set_apic_id)(unsigned int id);
-
-	int (*cpu_mask_to_apicid)(const struct cpumask *cpumask,
-				  struct irq_data *irqdata,
-				  unsigned int *apicid);
-
-	/* ipi */
-	void (*send_IPI)(int cpu, int vector);
-	void (*send_IPI_mask)(const struct cpumask *mask, int vector);
-	void (*send_IPI_mask_allbutself)(const struct cpumask *mask,
-					 int vector);
-	void (*send_IPI_allbutself)(int vector);
-	void (*send_IPI_all)(int vector);
-	void (*send_IPI_self)(int vector);
+	u32	(*get_apic_id)(unsigned long x);
+	u32	(*set_apic_id)(unsigned int id);
 
 	/* wakeup_secondary_cpu */
-	int (*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
+	int	(*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
 
-	void (*inquire_remote_apic)(int apicid);
-
-	/* apic ops */
-	u32 (*read)(u32 reg);
-	void (*write)(u32 reg, u32 v);
-	/*
-	 * ->eoi_write() has the same signature as ->write().
-	 *
-	 * Drivers can support both ->eoi_write() and ->write() by passing the same
-	 * callback value. Kernel can override ->eoi_write() and fall back
-	 * on write for EOI.
-	 */
-	void (*eoi_write)(u32 reg, u32 v);
-	void (*native_eoi_write)(u32 reg, u32 v);
-	u64 (*icr_read)(void);
-	void (*icr_write)(u32 low, u32 high);
-	void (*wait_icr_idle)(void);
-	u32 (*safe_wait_icr_idle)(void);
+	void	(*inquire_remote_apic)(int apicid);
 
 #ifdef CONFIG_X86_32
 	/*
@@ -346,6 +336,7 @@ struct apic {
 	 */
 	int (*x86_32_early_logical_apicid)(int cpu);
 #endif
+	char	*name;
 };
 
 /*

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 22/52] x86/apic/x2apic: Simplify cluster management
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (20 preceding siblings ...)
  2017-09-13 21:29 ` [patch 21/52] x86/apic: Reorganize struct apic Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 23/52] x86/apic: Get rid of apic->target_cpus Thomas Gleixner
                   ` (31 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic-x2apic--Simplify-cluster-management.patch --]
[-- Type: text/plain, Size: 8306 bytes --]

The cluster management code creates a cluster mask per CPU, which requires
that all cluster masks are iterated and updated on every CPU online/offline
event. Other information about the cluster is kept in separate per-CPU
variables.

Create a data structure which holds all information about a cluster and
fill it in when the first CPU of a cluster comes online. If another CPU of
a cluster comes online, it just finds the pointer to the existing cluster
structure and reuses it.

That simplifies all usage sites and gets rid of quite a few pointless
iterations over the online CPUs to find the CPUs which belong to the
cluster.
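
The lookup logic, reduced to a minimal sketch (find_cluster() is a
hypothetical helper name; the patch below open codes this in
init_x2apic_ldr()):

    static struct cluster_mask *find_cluster(u32 cluster)
    {
            unsigned int cpu;

            /*
             * If any online CPU has already registered this cluster id,
             * reuse its cluster_mask instead of allocating a new one.
             */
            for_each_online_cpu(cpu) {
                    struct cluster_mask *cmsk = per_cpu(cluster_masks, cpu);

                    if (cmsk && cmsk->clusterid == cluster)
                            return cmsk;
            }
            /* First CPU of this cluster: the caller installs a fresh mask */
            return NULL;
    }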

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/x2apic_cluster.c |  154 ++++++++++++++++------------------
 1 file changed, 76 insertions(+), 78 deletions(-)

--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -10,20 +10,22 @@
 #include <asm/smp.h>
 #include "x2apic.h"
 
+struct cluster_mask {
+	unsigned int	clusterid;
+	int		node;
+	struct cpumask	mask;
+};
+
 static DEFINE_PER_CPU(u32, x86_cpu_to_logical_apicid);
-static DEFINE_PER_CPU(cpumask_var_t, cpus_in_cluster);
 static DEFINE_PER_CPU(cpumask_var_t, ipi_mask);
+static DEFINE_PER_CPU(struct cluster_mask *, cluster_masks);
+static struct cluster_mask *cluster_hotplug_mask;
 
 static int x2apic_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 {
 	return x2apic_enabled();
 }
 
-static inline u32 x2apic_cluster(int cpu)
-{
-	return per_cpu(x86_cpu_to_logical_apicid, cpu) >> 16;
-}
-
 static void x2apic_send_IPI(int cpu, int vector)
 {
 	u32 dest = per_cpu(x86_cpu_to_logical_apicid, cpu);
@@ -35,49 +37,34 @@ static void x2apic_send_IPI(int cpu, int
 static void
 __x2apic_send_IPI_mask(const struct cpumask *mask, int vector, int apic_dest)
 {
-	struct cpumask *cpus_in_cluster_ptr;
-	struct cpumask *ipi_mask_ptr;
-	unsigned int cpu, this_cpu;
+	unsigned int cpu, clustercpu;
+	struct cpumask *tmpmsk;
 	unsigned long flags;
 	u32 dest;
 
 	x2apic_wrmsr_fence();
-
 	local_irq_save(flags);
 
-	this_cpu = smp_processor_id();
+	tmpmsk = this_cpu_cpumask_var_ptr(ipi_mask);
+	cpumask_copy(tmpmsk, mask);
+	/* If IPI should not be sent to self, clear current CPU */
+	if (apic_dest != APIC_DEST_ALLINC)
+		cpumask_clear_cpu(smp_processor_id(), tmpmsk);
+
+	/* Collapse cpus in a cluster so a single IPI per cluster is sent */
+	for_each_cpu(cpu, tmpmsk) {
+		struct cluster_mask *cmsk = per_cpu(cluster_masks, cpu);
 
-	/*
-	 * We are to modify mask, so we need an own copy
-	 * and be sure it's manipulated with irq off.
-	 */
-	ipi_mask_ptr = this_cpu_cpumask_var_ptr(ipi_mask);
-	cpumask_copy(ipi_mask_ptr, mask);
-
-	/*
-	 * The idea is to send one IPI per cluster.
-	 */
-	for_each_cpu(cpu, ipi_mask_ptr) {
-		unsigned long i;
-
-		cpus_in_cluster_ptr = per_cpu(cpus_in_cluster, cpu);
 		dest = 0;
-
-		/* Collect cpus in cluster. */
-		for_each_cpu_and(i, ipi_mask_ptr, cpus_in_cluster_ptr) {
-			if (apic_dest == APIC_DEST_ALLINC || i != this_cpu)
-				dest |= per_cpu(x86_cpu_to_logical_apicid, i);
-		}
+		for_each_cpu_and(clustercpu, tmpmsk, &cmsk->mask)
+			dest |= per_cpu(x86_cpu_to_logical_apicid, clustercpu);
 
 		if (!dest)
 			continue;
 
 		__x2apic_send_IPI_dest(dest, vector, apic->dest_logical);
-		/*
-		 * Cluster sibling cpus should be discared now so
-		 * we would not send IPI them second time.
-		 */
-		cpumask_andnot(ipi_mask_ptr, ipi_mask_ptr, cpus_in_cluster_ptr);
+		/* Remove cluster CPUs from tmpmask */
+		cpumask_andnot(tmpmsk, tmpmsk, &cmsk->mask);
 	}
 
 	local_irq_restore(flags);
@@ -109,91 +96,100 @@ x2apic_cpu_mask_to_apicid(const struct c
 			  unsigned int *apicid)
 {
 	struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata);
+	struct cluster_mask *cmsk;
 	unsigned int cpu;
 	u32 dest = 0;
-	u16 cluster;
 
 	cpu = cpumask_first(mask);
 	if (cpu >= nr_cpu_ids)
 		return -EINVAL;
 
-	dest = per_cpu(x86_cpu_to_logical_apicid, cpu);
-	cluster = x2apic_cluster(cpu);
-
+	cmsk = per_cpu(cluster_masks, cpu);
 	cpumask_clear(effmsk);
-	for_each_cpu(cpu, mask) {
-		if (cluster != x2apic_cluster(cpu))
-			continue;
+	for_each_cpu_and(cpu, &cmsk->mask, mask) {
 		dest |= per_cpu(x86_cpu_to_logical_apicid, cpu);
 		cpumask_set_cpu(cpu, effmsk);
 	}
-
 	*apicid = dest;
 	return 0;
 }
 
 static void init_x2apic_ldr(void)
 {
-	unsigned int this_cpu = smp_processor_id();
+	struct cluster_mask *cmsk = this_cpu_read(cluster_masks);
+	u32 cluster, apicid = apic_read(APIC_LDR);
 	unsigned int cpu;
 
-	per_cpu(x86_cpu_to_logical_apicid, this_cpu) = apic_read(APIC_LDR);
+	this_cpu_write(x86_cpu_to_logical_apicid, apicid);
 
-	cpumask_set_cpu(this_cpu, per_cpu(cpus_in_cluster, this_cpu));
+	if (cmsk)
+		goto update;
+
+	cluster = apicid >> 16;
 	for_each_online_cpu(cpu) {
-		if (x2apic_cluster(this_cpu) != x2apic_cluster(cpu))
-			continue;
-		cpumask_set_cpu(this_cpu, per_cpu(cpus_in_cluster, cpu));
-		cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, this_cpu));
+		cmsk = per_cpu(cluster_masks, cpu);
+		/* Matching cluster found. Link and update it. */
+		if (cmsk && cmsk->clusterid == cluster)
+			goto update;
 	}
+	cmsk = cluster_hotplug_mask;
+	cluster_hotplug_mask = NULL;
+update:
+	this_cpu_write(cluster_masks, cmsk);
+	cpumask_set_cpu(smp_processor_id(), &cmsk->mask);
 }
 
-/*
- * At CPU state changes, update the x2apic cluster sibling info.
- */
-static int x2apic_prepare_cpu(unsigned int cpu)
+static int alloc_clustermask(unsigned int cpu, int node)
 {
-	if (!zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL))
-		return -ENOMEM;
+	if (per_cpu(cluster_masks, cpu))
+		return 0;
+	/*
+	 * If a hotplug spare mask exists, check whether it's on the right
+	 * node. If not, free it and allocate a new one.
+	 */
+	if (cluster_hotplug_mask) {
+		if (cluster_hotplug_mask->node == node)
+			return 0;
+		kfree(cluster_hotplug_mask);
+	}
 
-	if (!zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL)) {
-		free_cpumask_var(per_cpu(cpus_in_cluster, cpu));
+	cluster_hotplug_mask = kzalloc_node(sizeof(*cluster_hotplug_mask),
+					    GFP_KERNEL, node);
+	if (!cluster_hotplug_mask)
 		return -ENOMEM;
-	}
+	cluster_hotplug_mask->node = node;
+	return 0;
+}
 
+static int x2apic_prepare_cpu(unsigned int cpu)
+{
+	if (alloc_clustermask(cpu, cpu_to_node(cpu)) < 0)
+		return -ENOMEM;
+	if (!zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL))
+		return -ENOMEM;
 	return 0;
 }
 
-static int x2apic_dead_cpu(unsigned int this_cpu)
+static int x2apic_dead_cpu(unsigned int dead_cpu)
 {
-	int cpu;
+	struct cluster_mask *cmsk = per_cpu(cluster_masks, dead_cpu);
 
-	for_each_online_cpu(cpu) {
-		if (x2apic_cluster(this_cpu) != x2apic_cluster(cpu))
-			continue;
-		cpumask_clear_cpu(this_cpu, per_cpu(cpus_in_cluster, cpu));
-		cpumask_clear_cpu(cpu, per_cpu(cpus_in_cluster, this_cpu));
-	}
-	free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu));
-	free_cpumask_var(per_cpu(ipi_mask, this_cpu));
+	cpumask_clear_cpu(smp_processor_id(), &cmsk->mask);
+	free_cpumask_var(per_cpu(ipi_mask, dead_cpu));
 	return 0;
 }
 
 static int x2apic_cluster_probe(void)
 {
-	int cpu = smp_processor_id();
-	int ret;
-
 	if (!x2apic_mode)
 		return 0;
 
-	ret = cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "x86/x2apic:prepare",
-				x2apic_prepare_cpu, x2apic_dead_cpu);
-	if (ret < 0) {
+	if (cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "x86/x2apic:prepare",
+			      x2apic_prepare_cpu, x2apic_dead_cpu) < 0) {
 		pr_err("Failed to register X2APIC_PREPARE\n");
 		return 0;
 	}
-	cpumask_set_cpu(cpu, per_cpu(cpus_in_cluster, cpu));
+	init_x2apic_ldr();
 	return 1;
 }
 
@@ -208,6 +204,8 @@ static const struct cpumask *x2apic_clus
 static void cluster_vector_allocation_domain(int cpu, struct cpumask *retmask,
 					     const struct cpumask *mask)
 {
+	struct cluster_mask *cmsk = per_cpu(cluster_masks, cpu);
+
 	/*
 	 * To minimize vector pressure, default case of boot, device bringup
 	 * etc will use a single cpu for the interrupt destination.
@@ -220,7 +218,7 @@ static void cluster_vector_allocation_do
 	if (mask == x2apic_cluster_target_cpus())
 		cpumask_copy(retmask, cpumask_of(cpu));
 	else
-		cpumask_and(retmask, mask, per_cpu(cpus_in_cluster, cpu));
+		cpumask_and(retmask, mask, &cmsk->mask);
 }
 
 static struct apic apic_x2apic_cluster __ro_after_init = {

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 23/52] x86/apic: Get rid of apic->target_cpus
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (21 preceding siblings ...)
  2017-09-13 21:29 ` [patch 22/52] x86/apic/x2apic: Simplify cluster management Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 24/52] x86/vector: Rename used_vectors to system_vectors Thomas Gleixner
                   ` (30 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Get-rid-of-apic->target_cpus.patch --]
[-- Type: text/plain, Size: 8418 bytes --]

The target_cpus() callback of the apic struct is not really useful. Some
APICs return cpu_online_mask and others cpu_all_mask. The latter is bogus
as it does not take holes in the cpu_possible_mask into account.

Replace it with cpu_online_mask, which makes the most sense, and remove
the callback.

The usage sites will be removed in a later step anyway, so get rid of it
now to keep the changes incremental.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h           |    3 ---
 arch/x86/kernel/apic/apic_common.c    |   14 --------------
 arch/x86/kernel/apic/apic_flat_64.c   |    2 --
 arch/x86/kernel/apic/apic_noop.c      |    7 -------
 arch/x86/kernel/apic/apic_numachip.c  |    2 --
 arch/x86/kernel/apic/bigsmp_32.c      |    1 -
 arch/x86/kernel/apic/io_apic.c        |    7 +++----
 arch/x86/kernel/apic/probe_32.c       |    1 -
 arch/x86/kernel/apic/vector.c         |    2 +-
 arch/x86/kernel/apic/x2apic_cluster.c |    8 +-------
 arch/x86/kernel/apic/x2apic_phys.c    |    1 -
 arch/x86/kernel/apic/x2apic_uv_x.c    |    1 -
 arch/x86/xen/apic.c                   |    1 -
 13 files changed, 5 insertions(+), 45 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -289,7 +289,6 @@ struct apic {
 	u32	irq_dest_mode;
 
 	/* Functions and data related to vector allocation */
-	const struct cpumask *(*target_cpus)(void);
 	void	(*vector_allocation_domain)(int cpu, struct cpumask *retmask,
 					    const struct cpumask *mask);
 	int	(*cpu_mask_to_apicid)(const struct cpumask *cpumask,
@@ -475,8 +474,6 @@ static inline unsigned int read_apic_id(
 	return apic->get_apic_id(reg);
 }
 
-extern const struct cpumask *default_target_cpus(void);
-extern const struct cpumask *online_target_cpus(void);
 extern int default_apic_id_valid(int apicid);
 extern int default_acpi_madt_oem_check(char *, char *);
 extern void default_setup_apic_routing(void);
--- a/arch/x86/kernel/apic/apic_common.c
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -78,20 +78,6 @@ int default_check_phys_apicid_present(in
 	return physid_isset(phys_apicid, phys_cpu_present_map);
 }
 
-const struct cpumask *default_target_cpus(void)
-{
-#ifdef CONFIG_SMP
-	return cpu_online_mask;
-#else
-	return cpumask_of(0);
-#endif
-}
-
-const struct cpumask *online_target_cpus(void)
-{
-	return cpu_online_mask;
-}
-
 int default_apic_id_valid(int apicid)
 {
 	return (apicid < 255);
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -154,7 +154,6 @@ static struct apic apic_flat __ro_after_
 	.irq_delivery_mode		= dest_LowestPrio,
 	.irq_dest_mode			= 1, /* logical */
 
-	.target_cpus			= online_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= NULL,
@@ -249,7 +248,6 @@ static struct apic apic_physflat __ro_af
 	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 0, /* physical */
 
-	.target_cpus			= online_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -83,12 +83,6 @@ static int noop_apic_id_registered(void)
 	return physid_isset(0, phys_cpu_present_map);
 }
 
-static const struct cpumask *noop_target_cpus(void)
-{
-	/* only BSP here */
-	return cpumask_of(0);
-}
-
 static void noop_vector_allocation_domain(int cpu, struct cpumask *retmask,
 					  const struct cpumask *mask)
 {
@@ -127,7 +121,6 @@ struct apic apic_noop __ro_after_init =
 	/* logical delivery broadcast to all CPUs: */
 	.irq_dest_mode			= 1,
 
-	.target_cpus			= noop_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= default_check_apicid_used,
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -249,7 +249,6 @@ static const struct apic apic_numachip1
 	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 0, /* physical */
 
-	.target_cpus			= online_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
@@ -300,7 +299,6 @@ static const struct apic apic_numachip2
 	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 0, /* physical */
 
-	.target_cpus			= online_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -154,7 +154,6 @@ static struct apic apic_bigsmp __ro_afte
 	/* phys delivery to target CPU: */
 	.irq_dest_mode			= 0,
 
-	.target_cpus			= default_target_cpus,
 	.disable_esr			= 1,
 	.dest_logical			= 0,
 	.check_apicid_used		= bigsmp_check_apicid_used,
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2512,9 +2512,8 @@ int acpi_get_override_irq(u32 gsi, int *
 }
 
 /*
- * This function currently is only a helper for the i386 smp boot process where
- * we need to reprogram the ioredtbls to cater for the cpus which have come online
- * so mask in all cases should simply be apic->target_cpus()
+ * This function updates target affinity of IOAPIC interrupts to include
+ * the CPUs which came online during SMP bringup.
  */
 #ifdef CONFIG_SMP
 void __init setup_ioapic_dest(void)
@@ -2547,7 +2546,7 @@ void __init setup_ioapic_dest(void)
 		if (!irqd_can_balance(idata) || irqd_affinity_was_set(idata))
 			mask = irq_data_get_affinity_mask(idata);
 		else
-			mask = apic->target_cpus();
+			mask = irq_default_affinity;
 
 		chip = irq_data_get_irq_chip(idata);
 		/* Might be lapic_chip for irq 0 */
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -109,7 +109,6 @@ static struct apic apic_default __ro_aft
 	/* logical delivery broadcast to all CPUs: */
 	.irq_dest_mode			= 1,
 
-	.target_cpus			= default_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= default_check_apicid_used,
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -255,7 +255,7 @@ static int assign_irq_vector_policy(int
 	if (node != NUMA_NO_NODE &&
 	    assign_irq_vector(irq, data, cpumask_of_node(node), irqdata) == 0)
 		return 0;
-	return assign_irq_vector(irq, data, apic->target_cpus(), irqdata);
+	return assign_irq_vector(irq, data, cpu_online_mask, irqdata);
 }
 
 static void clear_irq_vector(int irq, struct apic_chip_data *data)
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -193,11 +193,6 @@ static int x2apic_cluster_probe(void)
 	return 1;
 }
 
-static const struct cpumask *x2apic_cluster_target_cpus(void)
-{
-	return cpu_all_mask;
-}
-
 /*
  * Each x2apic cluster is an allocation domain.
  */
@@ -215,7 +210,7 @@ static void cluster_vector_allocation_do
 	 * derived from the first cpu in the mask) members specified
 	 * in the mask.
 	 */
-	if (mask == x2apic_cluster_target_cpus())
+	if (cpumask_equal(mask, cpu_online_mask))
 		cpumask_copy(retmask, cpumask_of(cpu));
 	else
 		cpumask_and(retmask, mask, &cmsk->mask);
@@ -232,7 +227,6 @@ static struct apic apic_x2apic_cluster _
 	.irq_delivery_mode		= dest_LowestPrio,
 	.irq_dest_mode			= 1, /* logical */
 
-	.target_cpus			= x2apic_cluster_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= NULL,
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -147,7 +147,6 @@ static struct apic apic_x2apic_phys __ro
 	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 0, /* physical */
 
-	.target_cpus			= online_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -584,7 +584,6 @@ static struct apic apic_x2apic_uv_x __ro
 	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 0, /* Physical */
 
-	.target_cpus			= online_target_cpus,
 	.disable_esr			= 0,
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= NULL,
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -160,7 +160,6 @@ static struct apic xen_pv_apic = {
 	/* .irq_delivery_mode - used in native_compose_msi_msg only */
 	/* .irq_dest_mode     - used in native_compose_msi_msg only */
 
-	.target_cpus			= default_target_cpus,
 	.disable_esr			= 0,
 	/* .dest_logical      -  default_send_IPI_ use it but we use our own. */
 	.check_apicid_used		= default_check_apicid_used, /* Used on 32-bit */

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 24/52] x86/vector: Rename used_vectors to system_vectors
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (22 preceding siblings ...)
  2017-09-13 21:29 ` [patch 23/52] x86/apic: Get rid of apic->target_cpus Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 25/52] x86/apic: Get rid of multi CPU affinity Thomas Gleixner
                   ` (29 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Rename-used_vectors-to-system_vectors.patch --]
[-- Type: text/plain, Size: 3814 bytes --]

used_vectors is a misnomer, as it only has the system vectors marked,
i.e. those which are excluded from the regular vector allocation. It is
not, as the name suggests, storage for the actually used vectors.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/desc.h   |    2 +-
 arch/x86/kernel/apic/vector.c |    2 +-
 arch/x86/kernel/idt.c         |   12 ++++++------
 arch/x86/kernel/irq.c         |    4 ++--
 arch/x86/kernel/traps.c       |    2 +-
 5 files changed, 11 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -393,7 +393,7 @@ static inline void set_desc_limit(struct
 void update_intr_gate(unsigned int n, const void *addr);
 void alloc_intr_gate(unsigned int n, const void *addr);
 
-extern unsigned long used_vectors[];
+extern unsigned long system_vectors[];
 
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u32, debug_idt_ctr);
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -175,7 +175,7 @@ static int __assign_irq_vector(int irq,
 		if (unlikely(current_vector == vector))
 			goto next_cpu;
 
-		if (test_bit(vector, used_vectors))
+		if (test_bit(vector, system_vectors))
 			goto next;
 
 		for_each_cpu(new_cpu, vector_searchmask) {
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -225,7 +225,7 @@ idt_setup_from_table(gate_desc *idt, con
 		idt_init_desc(&desc, t);
 		write_idt_entry(idt, t->vector, &desc);
 		if (sys)
-			set_bit(t->vector, used_vectors);
+			set_bit(t->vector, system_vectors);
 	}
 }
 
@@ -313,14 +313,14 @@ void __init idt_setup_apic_and_irq_gates
 
 	idt_setup_from_table(idt_table, apic_idts, ARRAY_SIZE(apic_idts), true);
 
-	for_each_clear_bit_from(i, used_vectors, FIRST_SYSTEM_VECTOR) {
+	for_each_clear_bit_from(i, system_vectors, FIRST_SYSTEM_VECTOR) {
 		entry = irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR);
 		set_intr_gate(i, entry);
 	}
 
-	for_each_clear_bit_from(i, used_vectors, NR_VECTORS) {
+	for_each_clear_bit_from(i, system_vectors, NR_VECTORS) {
 #ifdef CONFIG_X86_LOCAL_APIC
-		set_bit(i, used_vectors);
+		set_bit(i, system_vectors);
 		set_intr_gate(i, spurious_interrupt);
 #else
 		entry = irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR);
@@ -358,7 +358,7 @@ void idt_invalidate(void *addr)
 
 void __init update_intr_gate(unsigned int n, const void *addr)
 {
-	if (WARN_ON_ONCE(!test_bit(n, used_vectors)))
+	if (WARN_ON_ONCE(!test_bit(n, system_vectors)))
 		return;
 	set_intr_gate(n, addr);
 }
@@ -366,6 +366,6 @@ void __init update_intr_gate(unsigned in
 void alloc_intr_gate(unsigned int n, const void *addr)
 {
 	BUG_ON(n < FIRST_SYSTEM_VECTOR);
-	if (!test_and_set_bit(n, used_vectors))
+	if (!test_and_set_bit(n, system_vectors))
 		set_intr_gate(n, addr);
 }
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -134,7 +134,7 @@ int arch_show_interrupts(struct seq_file
 	seq_puts(p, "  Machine check polls\n");
 #endif
 #if IS_ENABLED(CONFIG_HYPERV) || defined(CONFIG_XEN)
-	if (test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors)) {
+	if (test_bit(HYPERVISOR_CALLBACK_VECTOR, system_vectors)) {
 		seq_printf(p, "%*s: ", prec, "HYP");
 		for_each_online_cpu(j)
 			seq_printf(p, "%10u ",
@@ -416,7 +416,7 @@ int check_irq_vectors_for_cpu_disable(vo
 		 */
 		for (vector = FIRST_EXTERNAL_VECTOR;
 		     vector < FIRST_SYSTEM_VECTOR; vector++) {
-			if (!test_bit(vector, used_vectors) &&
+			if (!test_bit(vector, system_vectors) &&
 			    IS_ERR_OR_NULL(per_cpu(vector_irq, cpu)[vector])) {
 				if (++count == this_count)
 					return 0;
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -71,7 +71,7 @@
 #include <asm/proto.h>
 #endif
 
-DECLARE_BITMAP(used_vectors, NR_VECTORS);
+DECLARE_BITMAP(system_vectors, NR_VECTORS);
 
 static inline void cond_local_irq_enable(struct pt_regs *regs)
 {

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 25/52] x86/apic: Get rid of multi CPU affinity
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (23 preceding siblings ...)
  2017-09-13 21:29 ` [patch 24/52] x86/vector: Rename used_vectors to system_vectors Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 26/52] x86/ioapic: Remove obsolete post hotplug update Thomas Gleixner
                   ` (28 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Get-rid-of-multi-CPU-affinity.patch --]
[-- Type: text/plain, Size: 2487 bytes --]

Setting the affinity of a single interrupt to multiple CPUs is of dubious
value.

 1) This only works on machines where the APIC uses logical destination
    mode. If the APIC uses physical destination mode, then it is already
    restricted to a single CPU.

 2) Experiments have shown that the benefit of multi CPU affinity is
    close to zero, and in some tests it is even worse than setting the
    affinity to a single CPU.

    The reason for this is that the delivery targets the APIC with the
    lowest ID first, and only if that APIC is busy (servicing an
    interrupt, i.e. its ISR is not empty) is it handed over to the next
    APIC. In the conducted tests the vast majority of interrupts ended up
    on the APIC with the lowest ID anyway, so no natural spreading of the
    interrupts is possible.

Supporting multi CPU affinities adds a lot of complexity to the code, which
can turn the allocation search into a worst case of

    nr_vectors * nr_online_cpus * nr_bits_in_target_mask
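
To put rough, made up but plausible numbers on it: with ~200 freely
allocatable vectors, 64 online CPUs and a 64 bit wide target mask, the
worst case is on the order of

    200 * 64 * 64 = 819200

search steps for a single allocation.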

As a first step disable it by restricting the vector search to a single
CPU.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -136,8 +136,7 @@ static int __assign_irq_vector(int irq,
 	while (cpu < nr_cpu_ids) {
 		int new_cpu, offset;
 
-		/* Get the possible target cpus for @mask/@cpu from the apic */
-		apic->vector_allocation_domain(cpu, vector_cpumask, mask);
+		cpumask_copy(vector_cpumask, cpumask_of(cpu));
 
 		/*
 		 * Clear the offline cpus from @vector_cpumask for searching
@@ -367,17 +366,11 @@ static int x86_vector_alloc_irqs(struct
 		irq_data->chip = &lapic_controller;
 		irq_data->chip_data = data;
 		irq_data->hwirq = virq + i;
+		irqd_set_single_target(irq_data);
 		err = assign_irq_vector_policy(virq + i, node, data, info,
 					       irq_data);
 		if (err)
 			goto error;
-		/*
-		 * If the apic destination mode is physical, then the
-		 * effective affinity is restricted to a single target
-		 * CPU. Mark the interrupt accordingly.
-		 */
-		if (!apic->irq_dest_mode)
-			irqd_set_single_target(irq_data);
 	}
 
 	return 0;
@@ -434,7 +427,7 @@ static void __init init_legacy_irqs(void
 		BUG_ON(!data);
 
 		data->cfg.vector = ISA_IRQ_VECTOR(i);
-		cpumask_setall(data->domain);
+		cpumask_copy(data->domain, cpumask_of(0));
 		irq_set_chip_data(i, data);
 	}
 }

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 26/52] x86/ioapic: Remove obsolete post hotplug update
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (24 preceding siblings ...)
  2017-09-13 21:29 ` [patch 25/52] x86/apic: Get rid of multi CPU affinity Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 27/52] x86/vector: Simplify the CPU hotplug vector update Thomas Gleixner
                   ` (27 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-ioapic--Remove-obsolete-post-hotplug-update.patch --]
[-- Type: text/plain, Size: 2640 bytes --]

With single CPU affinities the post SMP boot vector update is pointless as
it will just leave the affinities on the same vectors and the same CPUs.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/io_apic.h |    2 -
 arch/x86/kernel/apic/io_apic.c |   42 -----------------------------------------
 arch/x86/kernel/smpboot.c      |    1 
 3 files changed, 45 deletions(-)

--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -192,7 +192,6 @@ static inline unsigned int io_apic_read(
 extern void setup_IO_APIC(void);
 extern void enable_IO_APIC(void);
 extern void disable_IO_APIC(void);
-extern void setup_ioapic_dest(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
 extern void print_IO_APICs(void);
 #else  /* !CONFIG_X86_IO_APIC */
@@ -232,7 +231,6 @@ static inline void io_apic_init_mappings
 
 static inline void setup_IO_APIC(void) { }
 static inline void enable_IO_APIC(void) { }
-static inline void setup_ioapic_dest(void) { }
 
 #endif
 
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2515,48 +2515,6 @@ int acpi_get_override_irq(u32 gsi, int *
  * This function updates target affinity of IOAPIC interrupts to include
  * the CPUs which came online during SMP bringup.
  */
-#ifdef CONFIG_SMP
-void __init setup_ioapic_dest(void)
-{
-	int pin, ioapic, irq, irq_entry;
-	const struct cpumask *mask;
-	struct irq_desc *desc;
-	struct irq_data *idata;
-	struct irq_chip *chip;
-
-	if (skip_ioapic_setup == 1)
-		return;
-
-	for_each_ioapic_pin(ioapic, pin) {
-		irq_entry = find_irq_entry(ioapic, pin, mp_INT);
-		if (irq_entry == -1)
-			continue;
-
-		irq = pin_2_irq(irq_entry, ioapic, pin, 0);
-		if (irq < 0 || !mp_init_irq_at_boot(ioapic, irq))
-			continue;
-
-		desc = irq_to_desc(irq);
-		raw_spin_lock_irq(&desc->lock);
-		idata = irq_desc_get_irq_data(desc);
-
-		/*
-		 * Honour affinities which have been set in early boot
-		 */
-		if (!irqd_can_balance(idata) || irqd_affinity_was_set(idata))
-			mask = irq_data_get_affinity_mask(idata);
-		else
-			mask = irq_default_affinity;
-
-		chip = irq_data_get_irq_chip(idata);
-		/* Might be lapic_chip for irq 0 */
-		if (chip->irq_set_affinity)
-			chip->irq_set_affinity(idata, mask, false);
-		raw_spin_unlock_irq(&desc->lock);
-	}
-}
-#endif
-
 #define IOAPIC_RESOURCE_NAME_SIZE 11
 
 static struct resource *ioapic_resources;
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1392,7 +1392,6 @@ void __init native_smp_cpus_done(unsigne
 
 	nmi_selftest();
 	impress_friends();
-	setup_ioapic_dest();
 	mtrr_aps_init();
 }
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 27/52] x86/vector: Simplify the CPU hotplug vector update
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (25 preceding siblings ...)
  2017-09-13 21:29 ` [patch 26/52] x86/ioapic: Remove obsolete post hotplug update Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 28/52] x86/vector: Cleanup variable names Thomas Gleixner
                   ` (26 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Simplify-the-CPU-hotplug-vector-update.patch --]
[-- Type: text/plain, Size: 3693 bytes --]

With single CPU affinities it is no longer required to scan all interrupts
for potential destination masks which contain the newly booting CPU.

Reduce it to installing the active legacy PIC vectors on the newly booting
CPU, as those cannot be affinity controlled by the kernel and can
potentially end up at any CPU in the system.
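
For reference, the per-vector decision on the upcoming CPU, restated as a
commented sketch of the new __setup_vector_irq() in the patch below
(ISA_IRQ_VECTOR(0) is the base vector of the statically mapped legacy
range):

    static struct irq_desc *__setup_vector_irq(int vector)
    {
            /* Map the vector back to a legacy irq number, if any */
            int isairq = vector - ISA_IRQ_VECTOR(0);

            /* Outside the legacy range: nothing to install */
            if (isairq < 0 || isairq >= nr_legacy_irqs())
                    return VECTOR_UNUSED;
            /* Handled by the IOAPIC: regular affinity logic applies */
            if (test_bit(isairq, &io_apic_irqs))
                    return VECTOR_UNUSED;
            /* Active legacy PIC irq: install on every CPU */
            return irq_to_desc(isairq);
    }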

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   66 ++++++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 30 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -459,54 +459,60 @@ int __init arch_early_irq_init(void)
 	return arch_early_ioapic_init();
 }
 
-/* Initialize vector_irq on a new cpu */
-static void __setup_vector_irq(int cpu)
+/* Temporary hack to keep things working */
+static void vector_update_shutdown_irqs(void)
 {
-	struct apic_chip_data *data;
 	struct irq_desc *desc;
-	int irq, vector;
+	int irq;
 
-	/* Mark the inuse vectors */
 	for_each_irq_desc(irq, desc) {
-		struct irq_data *idata = irq_desc_get_irq_data(desc);
+		struct irq_data *irqd = irq_desc_get_irq_data(desc);
+		struct apic_chip_data *ad = apic_chip_data(irqd);
 
-		data = apic_chip_data(idata);
-		if (!data || !cpumask_test_cpu(cpu, data->domain))
-			continue;
-		vector = data->cfg.vector;
-		per_cpu(vector_irq, cpu)[vector] = desc;
-	}
-	/* Mark the free vectors */
-	for (vector = 0; vector < NR_VECTORS; ++vector) {
-		desc = per_cpu(vector_irq, cpu)[vector];
-		if (IS_ERR_OR_NULL(desc))
-			continue;
-
-		data = apic_chip_data(irq_desc_get_irq_data(desc));
-		if (!cpumask_test_cpu(cpu, data->domain))
-			per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
+		if (ad && cpumask_test_cpu(cpu, ad->domain) && ad->cfg.vector)
+			this_cpu_write(vector_irq[ad->cfg.vector], desc);
 	}
 }
 
+static struct irq_desc *__setup_vector_irq(int vector)
+{
+	int isairq = vector - ISA_IRQ_VECTOR(0);
+
+	/* Check whether the irq is in the legacy space */
+	if (isairq < 0 || isairq >= nr_legacy_irqs())
+		return VECTOR_UNUSED;
+	/* Check whether the irq is handled by the IOAPIC */
+	if (test_bit(isairq, &io_apic_irqs))
+		return VECTOR_UNUSED;
+	return irq_to_desc(isairq);
+}
+
 /*
  * Setup the vector to irq mappings. Must be called with vector_lock held.
  */
 void setup_vector_irq(int cpu)
 {
-	int irq;
+	unsigned int vector;
 
 	lockdep_assert_held(&vector_lock);
 	/*
-	 * On most of the platforms, legacy PIC delivers the interrupts on the
-	 * boot cpu. But there are certain platforms where PIC interrupts are
-	 * delivered to multiple cpu's. If the legacy IRQ is handled by the
-	 * legacy PIC, for the new cpu that is coming online, setup the static
-	 * legacy vector to irq mapping:
+	 * The interrupt affinity logic never targets interrupts to offline
+	 * CPUs. The exceptions are the legacy PIC interrupts. In general
+	 * they are only targeted to CPU0, but depending on the platform
+	 * they can be distributed to any online CPU in hardware. The
+	 * kernel has no influence on that. So all active legacy vectors
+	 * must be installed on all CPUs. All non legacy interrupts can be
+	 * cleared.
 	 */
-	for (irq = 0; irq < nr_legacy_irqs(); irq++)
-		per_cpu(vector_irq, cpu)[ISA_IRQ_VECTOR(irq)] = irq_to_desc(irq);
+	for (vector = 0; vector < NR_VECTORS; vector++)
+		this_cpu_write(vector_irq[vector], __setup_vector_irq(vector));
 
-	__setup_vector_irq(cpu);
+	/*
+	 * Until the rewrite of the managed interrupt management is in
+	 * place it's necessary to walk the irq descriptors and check for
+	 * interrupts which are targeted at this CPU.
+	 */
+	vector_update_shutdown_irqs();
 }
 
 static int apic_retrigger_irq(struct irq_data *irq_data)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 28/52] x86/vector: Cleanup variable names
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (26 preceding siblings ...)
  2017-09-13 21:29 ` [patch 27/52] x86/vector: Simplify the CPU hotplug vector update Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 29/52] x86/vector: Store the single CPU targets in apic data Thomas Gleixner
                   ` (25 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Cleanup-variable-names.patch --]
[-- Type: text/plain, Size: 16441 bytes --]

The naming convention of variables with the types irq_data and
apic_chip_data is inconsistent and confusing.

Before reworking the whole vector management, make them consistent so
that irq_data pointers are named 'irqd' and apic_chip_data pointers are
named 'apicd' all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |  228 +++++++++++++++++++++---------------------
 1 file changed, 114 insertions(+), 114 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -50,22 +50,22 @@ void unlock_vector_lock(void)
 	raw_spin_unlock(&vector_lock);
 }
 
-static struct apic_chip_data *apic_chip_data(struct irq_data *irq_data)
+static struct apic_chip_data *apic_chip_data(struct irq_data *irqd)
 {
-	if (!irq_data)
+	if (!irqd)
 		return NULL;
 
-	while (irq_data->parent_data)
-		irq_data = irq_data->parent_data;
+	while (irqd->parent_data)
+		irqd = irqd->parent_data;
 
-	return irq_data->chip_data;
+	return irqd->chip_data;
 }
 
-struct irq_cfg *irqd_cfg(struct irq_data *irq_data)
+struct irq_cfg *irqd_cfg(struct irq_data *irqd)
 {
-	struct apic_chip_data *data = apic_chip_data(irq_data);
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
 
-	return data ? &data->cfg : NULL;
+	return apicd ? &apicd->cfg : NULL;
 }
 EXPORT_SYMBOL_GPL(irqd_cfg);
 
@@ -76,35 +76,35 @@ struct irq_cfg *irq_cfg(unsigned int irq
 
 static struct apic_chip_data *alloc_apic_chip_data(int node)
 {
-	struct apic_chip_data *data;
+	struct apic_chip_data *apicd;
 
-	data = kzalloc_node(sizeof(*data), GFP_KERNEL, node);
-	if (!data)
+	apicd = kzalloc_node(sizeof(*apicd), GFP_KERNEL, node);
+	if (!apicd)
 		return NULL;
-	if (!zalloc_cpumask_var_node(&data->domain, GFP_KERNEL, node))
+	if (!zalloc_cpumask_var_node(&apicd->domain, GFP_KERNEL, node))
 		goto out_data;
-	if (!zalloc_cpumask_var_node(&data->old_domain, GFP_KERNEL, node))
+	if (!zalloc_cpumask_var_node(&apicd->old_domain, GFP_KERNEL, node))
 		goto out_domain;
-	return data;
+	return apicd;
 out_domain:
-	free_cpumask_var(data->domain);
+	free_cpumask_var(apicd->domain);
 out_data:
-	kfree(data);
+	kfree(apicd);
 	return NULL;
 }
 
-static void free_apic_chip_data(struct apic_chip_data *data)
+static void free_apic_chip_data(struct apic_chip_data *apicd)
 {
-	if (data) {
-		free_cpumask_var(data->domain);
-		free_cpumask_var(data->old_domain);
-		kfree(data);
+	if (apicd) {
+		free_cpumask_var(apicd->domain);
+		free_cpumask_var(apicd->old_domain);
+		kfree(apicd);
 	}
 }
 
 static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 			       const struct cpumask *mask,
-			       struct irq_data *irqdata)
+			       struct irq_data *irqd)
 {
 	/*
 	 * NOTE! The local APIC isn't very good at handling
@@ -226,62 +226,62 @@ static int __assign_irq_vector(int irq,
 	 * cpus masked out.
 	 */
 	cpumask_and(vector_searchmask, vector_searchmask, mask);
-	BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, irqdata,
+	BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, irqd,
 					&d->cfg.dest_apicid));
 	return 0;
 }
 
-static int assign_irq_vector(int irq, struct apic_chip_data *data,
+static int assign_irq_vector(int irq, struct apic_chip_data *apicd,
 			     const struct cpumask *mask,
-			     struct irq_data *irqdata)
+			     struct irq_data *irqd)
 {
 	int err;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
-	err = __assign_irq_vector(irq, data, mask, irqdata);
+	err = __assign_irq_vector(irq, apicd, mask, irqd);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 	return err;
 }
 
 static int assign_irq_vector_policy(int irq, int node,
-				    struct apic_chip_data *data,
+				    struct apic_chip_data *apicd,
 				    struct irq_alloc_info *info,
-				    struct irq_data *irqdata)
+				    struct irq_data *irqd)
 {
 	if (info && info->mask)
-		return assign_irq_vector(irq, data, info->mask, irqdata);
+		return assign_irq_vector(irq, apicd, info->mask, irqd);
 	if (node != NUMA_NO_NODE &&
-	    assign_irq_vector(irq, data, cpumask_of_node(node), irqdata) == 0)
+	    assign_irq_vector(irq, apicd, cpumask_of_node(node), irqd) == 0)
 		return 0;
-	return assign_irq_vector(irq, data, cpu_online_mask, irqdata);
+	return assign_irq_vector(irq, apicd, cpu_online_mask, irqd);
 }
 
-static void clear_irq_vector(int irq, struct apic_chip_data *data)
+static void clear_irq_vector(int irq, struct apic_chip_data *apicd)
 {
 	struct irq_desc *desc;
 	int cpu, vector;
 
-	if (!data->cfg.vector)
+	if (!apicd->cfg.vector)
 		return;
 
-	vector = data->cfg.vector;
-	for_each_cpu_and(cpu, data->domain, cpu_online_mask)
+	vector = apicd->cfg.vector;
+	for_each_cpu_and(cpu, apicd->domain, cpu_online_mask)
 		per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
 
-	data->cfg.vector = 0;
-	cpumask_clear(data->domain);
+	apicd->cfg.vector = 0;
+	cpumask_clear(apicd->domain);
 
 	/*
 	 * If move is in progress or the old_domain mask is not empty,
 	 * i.e. the cleanup IPI has not been processed yet, we need to remove
 	 * the old references to desc from all cpus vector tables.
 	 */
-	if (!data->move_in_progress && cpumask_empty(data->old_domain))
+	if (!apicd->move_in_progress && cpumask_empty(apicd->old_domain))
 		return;
 
 	desc = irq_to_desc(irq);
-	for_each_cpu_and(cpu, data->old_domain, cpu_online_mask) {
+	for_each_cpu_and(cpu, apicd->old_domain, cpu_online_mask) {
 		for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS;
 		     vector++) {
 			if (per_cpu(vector_irq, cpu)[vector] != desc)
@@ -290,7 +290,7 @@ static void clear_irq_vector(int irq, st
 			break;
 		}
 	}
-	data->move_in_progress = 0;
+	apicd->move_in_progress = 0;
 }
 
 void init_irq_alloc_info(struct irq_alloc_info *info,
@@ -311,20 +311,20 @@ void copy_irq_alloc_info(struct irq_allo
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {
-	struct apic_chip_data *apic_data;
-	struct irq_data *irq_data;
+	struct apic_chip_data *apicd;
+	struct irq_data *irqd;
 	unsigned long flags;
 	int i;
 
 	for (i = 0; i < nr_irqs; i++) {
-		irq_data = irq_domain_get_irq_data(x86_vector_domain, virq + i);
-		if (irq_data && irq_data->chip_data) {
+		irqd = irq_domain_get_irq_data(x86_vector_domain, virq + i);
+		if (irqd && irqd->chip_data) {
 			raw_spin_lock_irqsave(&vector_lock, flags);
-			clear_irq_vector(virq + i, irq_data->chip_data);
-			apic_data = irq_data->chip_data;
-			irq_domain_reset_irq_data(irq_data);
+			clear_irq_vector(virq + i, irqd->chip_data);
+			apicd = irqd->chip_data;
+			irq_domain_reset_irq_data(irqd);
 			raw_spin_unlock_irqrestore(&vector_lock, flags);
-			free_apic_chip_data(apic_data);
+			free_apic_chip_data(apicd);
 #ifdef	CONFIG_X86_IO_APIC
 			if (virq + i < nr_legacy_irqs())
 				legacy_irq_data[virq + i] = NULL;
@@ -337,8 +337,8 @@ static int x86_vector_alloc_irqs(struct
 				 unsigned int nr_irqs, void *arg)
 {
 	struct irq_alloc_info *info = arg;
-	struct apic_chip_data *data;
-	struct irq_data *irq_data;
+	struct apic_chip_data *apicd;
+	struct irq_data *irqd;
 	int i, err, node;
 
 	if (disable_apic)
@@ -349,26 +349,26 @@ static int x86_vector_alloc_irqs(struct
 		return -ENOSYS;
 
 	for (i = 0; i < nr_irqs; i++) {
-		irq_data = irq_domain_get_irq_data(domain, virq + i);
-		BUG_ON(!irq_data);
-		node = irq_data_get_node(irq_data);
+		irqd = irq_domain_get_irq_data(domain, virq + i);
+		BUG_ON(!irqd);
+		node = irq_data_get_node(irqd);
 #ifdef	CONFIG_X86_IO_APIC
 		if (virq + i < nr_legacy_irqs() && legacy_irq_data[virq + i])
-			data = legacy_irq_data[virq + i];
+			apicd = legacy_irq_data[virq + i];
 		else
 #endif
-			data = alloc_apic_chip_data(node);
-		if (!data) {
+			apicd = alloc_apic_chip_data(node);
+		if (!apicd) {
 			err = -ENOMEM;
 			goto error;
 		}
 
-		irq_data->chip = &lapic_controller;
-		irq_data->chip_data = data;
-		irq_data->hwirq = virq + i;
-		irqd_set_single_target(irq_data);
-		err = assign_irq_vector_policy(virq + i, node, data, info,
-					       irq_data);
+		irqd->chip = &lapic_controller;
+		irqd->chip_data = apicd;
+		irqd->hwirq = virq + i;
+		irqd_set_single_target(irqd);
+		err = assign_irq_vector_policy(virq + i, node, apicd, info,
+					       irqd);
 		if (err)
 			goto error;
 	}
@@ -416,19 +416,19 @@ int __init arch_probe_nr_irqs(void)
 static void __init init_legacy_irqs(void)
 {
 	int i, node = cpu_to_node(0);
-	struct apic_chip_data *data;
+	struct apic_chip_data *apicd;
 
 	/*
 	 * For legacy IRQ's, start with assigning irq0 to irq15 to
 	 * ISA_IRQ_VECTOR(i) for all cpu's.
 	 */
 	for (i = 0; i < nr_legacy_irqs(); i++) {
-		data = legacy_irq_data[i] = alloc_apic_chip_data(node);
-		BUG_ON(!data);
+		apicd = legacy_irq_data[i] = alloc_apic_chip_data(node);
+		BUG_ON(!apicd);
 
-		data->cfg.vector = ISA_IRQ_VECTOR(i);
-		cpumask_copy(data->domain, cpumask_of(0));
-		irq_set_chip_data(i, data);
+		apicd->cfg.vector = ISA_IRQ_VECTOR(i);
+		cpumask_copy(apicd->domain, cpumask_of(0));
+		irq_set_chip_data(i, apicd);
 	}
 }
 #else
@@ -515,32 +515,32 @@ void setup_vector_irq(int cpu)
 	vector_update_shutdown_irqs();
 }
 
-static int apic_retrigger_irq(struct irq_data *irq_data)
+static int apic_retrigger_irq(struct irq_data *irqd)
 {
-	struct apic_chip_data *data = apic_chip_data(irq_data);
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
 	unsigned long flags;
 	int cpu;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
-	cpu = cpumask_first_and(data->domain, cpu_online_mask);
-	apic->send_IPI_mask(cpumask_of(cpu), data->cfg.vector);
+	cpu = cpumask_first_and(apicd->domain, cpu_online_mask);
+	apic->send_IPI_mask(cpumask_of(cpu), apicd->cfg.vector);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 
 	return 1;
 }
 
-void apic_ack_edge(struct irq_data *data)
+void apic_ack_edge(struct irq_data *irqd)
 {
-	irq_complete_move(irqd_cfg(data));
-	irq_move_irq(data);
+	irq_complete_move(irqd_cfg(irqd));
+	irq_move_irq(irqd);
 	ack_APIC_irq();
 }
 
-static int apic_set_affinity(struct irq_data *irq_data,
+static int apic_set_affinity(struct irq_data *irqd,
 			     const struct cpumask *dest, bool force)
 {
-	struct apic_chip_data *data = irq_data->chip_data;
-	int err, irq = irq_data->irq;
+	struct apic_chip_data *apicd = irqd->chip_data;
+	int err, irq = irqd->irq;
 
 	if (!IS_ENABLED(CONFIG_SMP))
 		return -EPERM;
@@ -548,7 +548,7 @@ static int apic_set_affinity(struct irq_
 	if (!cpumask_intersects(dest, cpu_online_mask))
 		return -EINVAL;
 
-	err = assign_irq_vector(irq, data, dest, irq_data);
+	err = assign_irq_vector(irq, apicd, dest, irqd);
 	return err ? err : IRQ_SET_MASK_OK;
 }
 
@@ -560,23 +560,23 @@ static struct irq_chip lapic_controller
 };
 
 #ifdef CONFIG_SMP
-static void __send_cleanup_vector(struct apic_chip_data *data)
+static void __send_cleanup_vector(struct apic_chip_data *apicd)
 {
 	raw_spin_lock(&vector_lock);
-	cpumask_and(data->old_domain, data->old_domain, cpu_online_mask);
-	data->move_in_progress = 0;
-	if (!cpumask_empty(data->old_domain))
-		apic->send_IPI_mask(data->old_domain, IRQ_MOVE_CLEANUP_VECTOR);
+	cpumask_and(apicd->old_domain, apicd->old_domain, cpu_online_mask);
+	apicd->move_in_progress = 0;
+	if (!cpumask_empty(apicd->old_domain))
+		apic->send_IPI_mask(apicd->old_domain, IRQ_MOVE_CLEANUP_VECTOR);
 	raw_spin_unlock(&vector_lock);
 }
 
 void send_cleanup_vector(struct irq_cfg *cfg)
 {
-	struct apic_chip_data *data;
+	struct apic_chip_data *apicd;
 
-	data = container_of(cfg, struct apic_chip_data, cfg);
-	if (data->move_in_progress)
-		__send_cleanup_vector(data);
+	apicd = container_of(cfg, struct apic_chip_data, cfg);
+	if (apicd->move_in_progress)
+		__send_cleanup_vector(apicd);
 }
 
 asmlinkage __visible void __irq_entry smp_irq_move_cleanup_interrupt(void)
@@ -590,7 +590,7 @@ asmlinkage __visible void __irq_entry sm
 
 	me = smp_processor_id();
 	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
-		struct apic_chip_data *data;
+		struct apic_chip_data *apicd;
 		struct irq_desc *desc;
 		unsigned int irr;
 
@@ -606,16 +606,16 @@ asmlinkage __visible void __irq_entry sm
 			goto retry;
 		}
 
-		data = apic_chip_data(irq_desc_get_irq_data(desc));
-		if (!data)
+		apicd = apic_chip_data(irq_desc_get_irq_data(desc));
+		if (!apicd)
 			goto unlock;
 
 		/*
 		 * Nothing to cleanup if irq migration is in progress
 		 * or this cpu is not set in the cleanup mask.
 		 */
-		if (data->move_in_progress ||
-		    !cpumask_test_cpu(me, data->old_domain))
+		if (apicd->move_in_progress ||
+		    !cpumask_test_cpu(me, apicd->old_domain))
 			goto unlock;
 
 		/*
@@ -630,8 +630,8 @@ asmlinkage __visible void __irq_entry sm
 		 * this cpu is part of the target mask. We better leave that
 		 * one alone.
 		 */
-		if (vector == data->cfg.vector &&
-		    cpumask_test_cpu(me, data->domain))
+		if (vector == apicd->cfg.vector &&
+		    cpumask_test_cpu(me, apicd->domain))
 			goto unlock;
 
 		irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
@@ -647,7 +647,7 @@ asmlinkage __visible void __irq_entry sm
 			goto unlock;
 		}
 		__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
-		cpumask_clear_cpu(me, data->old_domain);
+		cpumask_clear_cpu(me, apicd->old_domain);
 unlock:
 		raw_spin_unlock(&desc->lock);
 	}
@@ -660,15 +660,15 @@ asmlinkage __visible void __irq_entry sm
 static void __irq_complete_move(struct irq_cfg *cfg, unsigned vector)
 {
 	unsigned me;
-	struct apic_chip_data *data;
+	struct apic_chip_data *apicd;
 
-	data = container_of(cfg, struct apic_chip_data, cfg);
-	if (likely(!data->move_in_progress))
+	apicd = container_of(cfg, struct apic_chip_data, cfg);
+	if (likely(!apicd->move_in_progress))
 		return;
 
 	me = smp_processor_id();
-	if (vector == data->cfg.vector && cpumask_test_cpu(me, data->domain))
-		__send_cleanup_vector(data);
+	if (vector == apicd->cfg.vector && cpumask_test_cpu(me, apicd->domain))
+		__send_cleanup_vector(apicd);
 }
 
 void irq_complete_move(struct irq_cfg *cfg)
@@ -681,8 +681,8 @@ void irq_complete_move(struct irq_cfg *c
  */
 void irq_force_complete_move(struct irq_desc *desc)
 {
-	struct irq_data *irqdata;
-	struct apic_chip_data *data;
+	struct irq_data *irqd;
+	struct apic_chip_data *apicd;
 	struct irq_cfg *cfg;
 	unsigned int cpu;
 
@@ -695,13 +695,13 @@ void irq_force_complete_move(struct irq_
 	 * Check first that the chip_data is what we expect
 	 * (apic_chip_data) before touching it any further.
 	 */
-	irqdata = irq_domain_get_irq_data(x86_vector_domain,
+	irqd = irq_domain_get_irq_data(x86_vector_domain,
 					  irq_desc_get_irq(desc));
-	if (!irqdata)
+	if (!irqd)
 		return;
 
-	data = apic_chip_data(irqdata);
-	cfg = data ? &data->cfg : NULL;
+	apicd = apic_chip_data(irqd);
+	cfg = apicd ? &apicd->cfg : NULL;
 
 	if (!cfg)
 		return;
@@ -719,14 +719,14 @@ void irq_force_complete_move(struct irq_
 	 * Clean out all offline cpus (including the outgoing one) from the
 	 * old_domain mask.
 	 */
-	cpumask_and(data->old_domain, data->old_domain, cpu_online_mask);
+	cpumask_and(apicd->old_domain, apicd->old_domain, cpu_online_mask);
 
 	/*
 	 * If move_in_progress is cleared and the old_domain mask is empty,
 	 * then there is nothing to cleanup. fixup_irqs() will take care of
 	 * the stale vectors on the outgoing cpu.
 	 */
-	if (!data->move_in_progress && cpumask_empty(data->old_domain)) {
+	if (!apicd->move_in_progress && cpumask_empty(apicd->old_domain)) {
 		raw_spin_unlock(&vector_lock);
 		return;
 	}
@@ -739,7 +739,7 @@ void irq_force_complete_move(struct irq_
 	 * 2) The interrupt has fired on the new vector, but the cleanup IPIs
 	 *    have not been processed yet.
 	 */
-	if (data->move_in_progress) {
+	if (apicd->move_in_progress) {
 		/*
 		 * In theory there is a race:
 		 *
@@ -773,18 +773,18 @@ void irq_force_complete_move(struct irq_
 		 * area arises.
 		 */
 		pr_warn("IRQ fixup: irq %d move in progress, old vector %d\n",
-			irqdata->irq, cfg->old_vector);
+			irqd->irq, cfg->old_vector);
 	}
 	/*
 	 * If old_domain is not empty, then other cpus still have the irq
 	 * descriptor set in their vector array. Clean it up.
 	 */
-	for_each_cpu(cpu, data->old_domain)
+	for_each_cpu(cpu, apicd->old_domain)
 		per_cpu(vector_irq, cpu)[cfg->old_vector] = VECTOR_UNUSED;
 
 	/* Cleanup the left overs of the (half finished) move */
-	cpumask_clear(data->old_domain);
-	data->move_in_progress = 0;
+	cpumask_clear(apicd->old_domain);
+	apicd->move_in_progress = 0;
 	raw_spin_unlock(&vector_lock);
 }
 #endif

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 29/52] x86/vector: Store the single CPU targets in apic data
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (27 preceding siblings ...)
  2017-09-13 21:29 ` [patch 28/52] x86/vector: Cleanup variable names Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 30/52] x86/vector: Simplify vector move cleanup Thomas Gleixner
                   ` (24 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Store-the-single-CPU-targets-in-apic-data.patch --]
[-- Type: text/plain, Size: 1501 bytes --]

Now that the interrupt affinities are targeted at single CPUs, storing them
in a cpumask is overkill. Store them in a dedicated variable.

This does not yet remove the domain cpumasks because the current allocator
relies on them. Preparatory change for the allocator rework.
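
As an illustration, the kind of simplification this enables (taken from the
retrigger path, which is converted in the next patch of this series):

	/* Before: derive the single target from the domain cpumask */
	cpu = cpumask_first_and(apicd->domain, cpu_online_mask);
	apic->send_IPI_mask(cpumask_of(cpu), apicd->cfg.vector);

	/* After: the target CPU is stored directly in the apic data */
	apic->send_IPI(apicd->cpu, apicd->cfg.vector);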

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -23,6 +23,8 @@
 
 struct apic_chip_data {
 	struct irq_cfg		cfg;
+	unsigned int		cpu;
+	unsigned int		prev_cpu;
 	cpumask_var_t		domain;
 	cpumask_var_t		old_domain;
 	u8			move_in_progress : 1;
@@ -214,6 +216,7 @@ static int __assign_irq_vector(int irq,
 	cpumask_and(d->old_domain, d->old_domain, cpu_online_mask);
 	d->move_in_progress = !cpumask_empty(d->old_domain);
 	d->cfg.old_vector = d->move_in_progress ? d->cfg.vector : 0;
+	d->prev_cpu = d->cpu;
 	d->cfg.vector = vector;
 	cpumask_copy(d->domain, vector_cpumask);
 success:
@@ -228,6 +231,7 @@ static int __assign_irq_vector(int irq,
 	cpumask_and(vector_searchmask, vector_searchmask, mask);
 	BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, irqd,
 					&d->cfg.dest_apicid));
+	d->cpu = cpumask_first(vector_searchmask);
 	return 0;
 }
 
@@ -428,6 +432,7 @@ static void __init init_legacy_irqs(void
 
 		apicd->cfg.vector = ISA_IRQ_VECTOR(i);
 		cpumask_copy(apicd->domain, cpumask_of(0));
+		apicd->cpu = 0;
 		irq_set_chip_data(i, apicd);
 	}
 }

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 30/52] x86/vector: Simplify vector move cleanup
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (28 preceding siblings ...)
  2017-09-13 21:29 ` [patch 29/52] x86/vector: Store the single CPU targets in apic data Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 31/52] x86/ioapic: Mark legacy vectors at reallocation time Thomas Gleixner
                   ` (23 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Simplify-vector-move-cleanup.patch --]
[-- Type: text/plain, Size: 11525 bytes --]

The vector move cleanup needs to walk the vector space and do a lot of
sanity checks to find a vector to clean up.

With single CPU affinities this can be simplified and made more robust by
queueing the vector configuration which needs to be cleaned up in a hlist
on the CPU which was the previous target.

That removes all the race conditions because the cleanup either finds a
valid list entry or not. The latter happens when the interrupt was torn
down before the cleanup handler was able to run.
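
For orientation, a condensed sketch of the queueing scheme the patch below
implements. queue_cleanup() is a hypothetical name standing in for
__send_cleanup_vector(); locking, the offline CPU case and the IRR paranoia
check are omitted:

	static DEFINE_PER_CPU(struct hlist_head, cleanup_list);

	/* Producer: hand the stale vector to its previous target CPU */
	static void queue_cleanup(struct apic_chip_data *apicd)
	{
		unsigned int cpu = apicd->prev_cpu;

		hlist_add_head(&apicd->clist, per_cpu_ptr(&cleanup_list, cpu));
		apic->send_IPI(cpu, IRQ_MOVE_CLEANUP_VECTOR);
	}

	/* Consumer: the cleanup IPI handler walks only its own CPU's list */
	hlist_for_each_entry_safe(apicd, tmp, this_cpu_ptr(&cleanup_list), clist) {
		__this_cpu_write(vector_irq[apicd->cfg.old_vector], VECTOR_UNUSED);
		hlist_del_init(&apicd->clist);
		apicd->cfg.old_vector = 0;
	}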

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |  221 ++++++++++++++----------------------------
 1 file changed, 77 insertions(+), 144 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -25,6 +25,7 @@ struct apic_chip_data {
 	struct irq_cfg		cfg;
 	unsigned int		cpu;
 	unsigned int		prev_cpu;
+	struct hlist_node	clist;
 	cpumask_var_t		domain;
 	cpumask_var_t		old_domain;
 	u8			move_in_progress : 1;
@@ -38,6 +39,9 @@ static struct irq_chip lapic_controller;
 #ifdef	CONFIG_X86_IO_APIC
 static struct apic_chip_data *legacy_irq_data[NR_IRQS_LEGACY];
 #endif
+#ifdef CONFIG_SMP
+static DEFINE_PER_CPU(struct hlist_head, cleanup_list);
+#endif
 
 void lock_vector_lock(void)
 {
@@ -87,6 +91,7 @@ static struct apic_chip_data *alloc_apic
 		goto out_data;
 	if (!zalloc_cpumask_var_node(&apicd->old_domain, GFP_KERNEL, node))
 		goto out_domain;
+	INIT_HLIST_NODE(&apicd->clist);
 	return apicd;
 out_domain:
 	free_cpumask_var(apicd->domain);
@@ -127,8 +132,7 @@ static int __assign_irq_vector(int irq,
 	 * If there is still a move in progress or the previous move has not
 	 * been cleaned up completely, tell the caller to come back later.
 	 */
-	if (d->move_in_progress ||
-	    cpumask_intersects(d->old_domain, cpu_online_mask))
+	if (d->cfg.old_vector)
 		return -EBUSY;
 
 	/* Only try and allocate irqs on cpus that are present */
@@ -263,38 +267,22 @@ static int assign_irq_vector_policy(int
 
 static void clear_irq_vector(int irq, struct apic_chip_data *apicd)
 {
-	struct irq_desc *desc;
-	int cpu, vector;
+	unsigned int vector = apicd->cfg.vector;
 
-	if (!apicd->cfg.vector)
+	if (!vector)
 		return;
 
-	vector = apicd->cfg.vector;
-	for_each_cpu_and(cpu, apicd->domain, cpu_online_mask)
-		per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
-
+	per_cpu(vector_irq, apicd->cpu)[vector] = VECTOR_UNUSED;
 	apicd->cfg.vector = 0;
-	cpumask_clear(apicd->domain);
 
-	/*
-	 * If move is in progress or the old_domain mask is not empty,
-	 * i.e. the cleanup IPI has not been processed yet, we need to remove
-	 * the old references to desc from all cpus vector tables.
-	 */
-	if (!apicd->move_in_progress && cpumask_empty(apicd->old_domain))
+	/* Clean up move in progress */
+	vector = apicd->cfg.old_vector;
+	if (!vector)
 		return;
 
-	desc = irq_to_desc(irq);
-	for_each_cpu_and(cpu, apicd->old_domain, cpu_online_mask) {
-		for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS;
-		     vector++) {
-			if (per_cpu(vector_irq, cpu)[vector] != desc)
-				continue;
-			per_cpu(vector_irq, cpu)[vector] = VECTOR_UNUSED;
-			break;
-		}
-	}
+	per_cpu(vector_irq, apicd->prev_cpu)[vector] = VECTOR_UNUSED;
 	apicd->move_in_progress = 0;
+	hlist_del_init(&apicd->clist);
 }
 
 void init_irq_alloc_info(struct irq_alloc_info *info,
@@ -474,7 +462,7 @@ static void vector_update_shutdown_irqs(
 		struct irq_data *irqd = irq_desc_get_irq_data(desc);
 		struct apic_chip_data *ad = apic_chip_data(irqd);
 
-		if (ad && cpumask_test_cpu(cpu, ad->domain) && ad->cfg.vector)
+		if (ad && ad->cfg.vector && ad->cpu == smp_processor_id())
 			this_cpu_write(vector_irq[ad->cfg.vector], desc);
 	}
 }
@@ -524,11 +512,9 @@ static int apic_retrigger_irq(struct irq
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
 	unsigned long flags;
-	int cpu;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
-	cpu = cpumask_first_and(apicd->domain, cpu_online_mask);
-	apic->send_IPI_mask(cpumask_of(cpu), apicd->cfg.vector);
+	apic->send_IPI(apicd->cpu, apicd->cfg.vector);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 
 	return 1;
@@ -565,114 +551,77 @@ static struct irq_chip lapic_controller
 };
 
 #ifdef CONFIG_SMP
-static void __send_cleanup_vector(struct apic_chip_data *apicd)
-{
-	raw_spin_lock(&vector_lock);
-	cpumask_and(apicd->old_domain, apicd->old_domain, cpu_online_mask);
-	apicd->move_in_progress = 0;
-	if (!cpumask_empty(apicd->old_domain))
-		apic->send_IPI_mask(apicd->old_domain, IRQ_MOVE_CLEANUP_VECTOR);
-	raw_spin_unlock(&vector_lock);
-}
-
-void send_cleanup_vector(struct irq_cfg *cfg)
-{
-	struct apic_chip_data *apicd;
-
-	apicd = container_of(cfg, struct apic_chip_data, cfg);
-	if (apicd->move_in_progress)
-		__send_cleanup_vector(apicd);
-}
 
 asmlinkage __visible void __irq_entry smp_irq_move_cleanup_interrupt(void)
 {
-	unsigned vector, me;
+	struct hlist_head *clhead = this_cpu_ptr(&cleanup_list);
+	struct apic_chip_data *apicd;
+	struct hlist_node *tmp;
 
 	entering_ack_irq();
-
 	/* Prevent vectors vanishing under us */
 	raw_spin_lock(&vector_lock);
 
-	me = smp_processor_id();
-	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
-		struct apic_chip_data *apicd;
-		struct irq_desc *desc;
-		unsigned int irr;
-
-	retry:
-		desc = __this_cpu_read(vector_irq[vector]);
-		if (IS_ERR_OR_NULL(desc))
-			continue;
-
-		if (!raw_spin_trylock(&desc->lock)) {
-			raw_spin_unlock(&vector_lock);
-			cpu_relax();
-			raw_spin_lock(&vector_lock);
-			goto retry;
-		}
-
-		apicd = apic_chip_data(irq_desc_get_irq_data(desc));
-		if (!apicd)
-			goto unlock;
-
-		/*
-		 * Nothing to cleanup if irq migration is in progress
-		 * or this cpu is not set in the cleanup mask.
-		 */
-		if (apicd->move_in_progress ||
-		    !cpumask_test_cpu(me, apicd->old_domain))
-			goto unlock;
+	hlist_for_each_entry_safe(apicd, tmp, clhead, clist) {
+		unsigned int irr, vector = apicd->cfg.old_vector;
 
 		/*
-		 * We have two cases to handle here:
-		 * 1) vector is unchanged but the target mask got reduced
-		 * 2) vector and the target mask has changed
-		 *
-		 * #1 is obvious, but in #2 we have two vectors with the same
-		 * irq descriptor: the old and the new vector. So we need to
-		 * make sure that we only cleanup the old vector. The new
-		 * vector has the current @vector number in the config and
-		 * this cpu is part of the target mask. We better leave that
-		 * one alone.
-		 */
-		if (vector == apicd->cfg.vector &&
-		    cpumask_test_cpu(me, apicd->domain))
-			goto unlock;
-
-		irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
-		/*
-		 * Check if the vector that needs to be cleanedup is
-		 * registered at the cpu's IRR. If so, then this is not
-		 * the best time to clean it up. Lets clean it up in the
+		 * Paranoia: Check if the vector that needs to be cleaned
+		 * up is registered at the APICs IRR. If so, then this is
+		 * not the best time to clean it up. Clean it up in the
 		 * next attempt by sending another IRQ_MOVE_CLEANUP_VECTOR
-		 * to myself.
+		 * to this CPU. IRQ_MOVE_CLEANUP_VECTOR is the lowest
+		 * priority external vector, so on return from this
+		 * interrupt the device interrupt will happen first.
 		 */
-		if (irr  & (1 << (vector % 32))) {
+		irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
+		if (irr & (1U << (vector % 32))) {
 			apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
-			goto unlock;
+			continue;
 		}
+		hlist_del_init(&apicd->clist);
 		__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
-		cpumask_clear_cpu(me, apicd->old_domain);
-unlock:
-		raw_spin_unlock(&desc->lock);
+		apicd->cfg.old_vector = 0;
 	}
 
 	raw_spin_unlock(&vector_lock);
-
 	exiting_irq();
 }
 
+static void __send_cleanup_vector(struct apic_chip_data *apicd)
+{
+	unsigned int cpu;
+
+	raw_spin_lock(&vector_lock);
+	apicd->move_in_progress = 0;
+	cpu = apicd->prev_cpu;
+	if (cpu_online(cpu)) {
+		hlist_add_head(&apicd->clist, per_cpu_ptr(&cleanup_list, cpu));
+		apic->send_IPI(cpu, IRQ_MOVE_CLEANUP_VECTOR);
+	} else {
+		apicd->cfg.old_vector = 0;
+	}
+	raw_spin_unlock(&vector_lock);
+}
+
+void send_cleanup_vector(struct irq_cfg *cfg)
+{
+	struct apic_chip_data *apicd;
+
+	apicd = container_of(cfg, struct apic_chip_data, cfg);
+	if (apicd->move_in_progress)
+		__send_cleanup_vector(apicd);
+}
+
 static void __irq_complete_move(struct irq_cfg *cfg, unsigned vector)
 {
-	unsigned me;
 	struct apic_chip_data *apicd;
 
 	apicd = container_of(cfg, struct apic_chip_data, cfg);
 	if (likely(!apicd->move_in_progress))
 		return;
 
-	me = smp_processor_id();
-	if (vector == apicd->cfg.vector && cpumask_test_cpu(me, apicd->domain))
+	if (vector == apicd->cfg.vector && apicd->cpu == smp_processor_id())
 		__send_cleanup_vector(apicd);
 }
 
@@ -686,10 +635,9 @@ void irq_complete_move(struct irq_cfg *c
  */
 void irq_force_complete_move(struct irq_desc *desc)
 {
-	struct irq_data *irqd;
 	struct apic_chip_data *apicd;
-	struct irq_cfg *cfg;
-	unsigned int cpu;
+	struct irq_data *irqd;
+	unsigned int vector;
 
 	/*
 	 * The function is called for all descriptors regardless of which
@@ -701,42 +649,30 @@ void irq_force_complete_move(struct irq_
 	 * (apic_chip_data) before touching it any further.
 	 */
 	irqd = irq_domain_get_irq_data(x86_vector_domain,
-					  irq_desc_get_irq(desc));
+				       irq_desc_get_irq(desc));
 	if (!irqd)
 		return;
 
+	raw_spin_lock(&vector_lock);
 	apicd = apic_chip_data(irqd);
-	cfg = apicd ? &apicd->cfg : NULL;
+	if (!apicd)
+		goto unlock;
 
-	if (!cfg)
-		return;
+	/*
+	 * If old_vector is empty, no action required.
+	 */
+	vector = apicd->cfg.old_vector;
+	if (!vector)
+		goto unlock;
 
 	/*
-	 * This is tricky. If the cleanup of @data->old_domain has not been
+	 * This is tricky. If the cleanup of the old vector has not been
 	 * done yet, then the following setaffinity call will fail with
 	 * -EBUSY. This can leave the interrupt in a stale state.
 	 *
 	 * All CPUs are stuck in stop machine with interrupts disabled so
 	 * calling __irq_complete_move() would be completely pointless.
-	 */
-	raw_spin_lock(&vector_lock);
-	/*
-	 * Clean out all offline cpus (including the outgoing one) from the
-	 * old_domain mask.
-	 */
-	cpumask_and(apicd->old_domain, apicd->old_domain, cpu_online_mask);
-
-	/*
-	 * If move_in_progress is cleared and the old_domain mask is empty,
-	 * then there is nothing to cleanup. fixup_irqs() will take care of
-	 * the stale vectors on the outgoing cpu.
-	 */
-	if (!apicd->move_in_progress && cpumask_empty(apicd->old_domain)) {
-		raw_spin_unlock(&vector_lock);
-		return;
-	}
-
-	/*
+	 *
 	 * 1) The interrupt is in move_in_progress state. That means that we
 	 *    have not seen an interrupt since the io_apic was reprogrammed to
 	 *    the new vector.
@@ -778,18 +714,15 @@ void irq_force_complete_move(struct irq_
 		 * area arises.
 		 */
 		pr_warn("IRQ fixup: irq %d move in progress, old vector %d\n",
-			irqd->irq, cfg->old_vector);
+			irqd->irq, vector);
 	}
-	/*
-	 * If old_domain is not empty, then other cpus still have the irq
-	 * descriptor set in their vector array. Clean it up.
-	 */
-	for_each_cpu(cpu, apicd->old_domain)
-		per_cpu(vector_irq, cpu)[cfg->old_vector] = VECTOR_UNUSED;
-
+	per_cpu(vector_irq, apicd->prev_cpu)[vector] = VECTOR_UNUSED;
 	/* Cleanup the left overs of the (half finished) move */
 	cpumask_clear(apicd->old_domain);
+	apicd->cfg.old_vector = 0;
 	apicd->move_in_progress = 0;
+	hlist_del_init(&apicd->clist);
+unlock:
 	raw_spin_unlock(&vector_lock);
 }
 #endif

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 31/52] x86/ioapic: Mark legacy vectors at reallocation time
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (29 preceding siblings ...)
  2017-09-13 21:29 ` [patch 30/52] x86/vector: Simplify vector move cleanup Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 32/52] x86/apic: Get rid of the legacy irq data storage Thomas Gleixner
                   ` (22 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-ioapic--Mark-legacy-vectors-at-reallocation-time.patch --]
[-- Type: text/plain, Size: 1079 bytes --]

When the legacy PIC vectors are taken over by the IO APIC, the current
vector assignment code is tricked into reusing the vector by allocating the
APIC data in the early boot process. This can be avoided by marking the
allocation as a legacy PIC takeover. Preparatory patch for further cleanups.
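
The flag's consumer, added in the next patch of this series, then keeps the
legacy takeover pinned to its ISA vector on CPU 0:

	if (info->flags & X86_IRQ_ALLOC_LEGACY) {
		apicd->cfg.vector = ISA_IRQ_VECTOR(virq + i);
		apicd->cpu = 0;
		cpumask_copy(apicd->domain, cpumask_of(0));
	}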

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/irqdomain.h |    1 +
 arch/x86/kernel/apic/io_apic.c   |    1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/include/asm/irqdomain.h
+++ b/arch/x86/include/asm/irqdomain.h
@@ -8,6 +8,7 @@
 enum {
 	/* Allocate contiguous CPU vectors */
 	X86_IRQ_ALLOC_CONTIGUOUS_VECTORS		= 0x1,
+	X86_IRQ_ALLOC_LEGACY				= 0x2,
 };
 
 extern struct irq_domain *x86_vector_domain;
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1013,6 +1013,7 @@ static int alloc_isa_irq_from_domain(str
 					  info->ioapic_pin))
 			return -ENOMEM;
 	} else {
+		info->flags |= X86_IRQ_ALLOC_LEGACY;
 		irq = __irq_domain_alloc_irqs(domain, irq, 1, node, info, true,
 					      NULL);
 		if (irq >= 0) {

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 32/52] x86/apic: Get rid of the legacy irq data storage
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (30 preceding siblings ...)
  2017-09-13 21:29 ` [patch 31/52] x86/ioapic: Mark legacy vectors at reallocation time Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 33/52] x86/vector: Remove pointless pointer checks Thomas Gleixner
                   ` (21 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Get-rid-of-the-legacy-irq-data-storage.patch --]
[-- Type: text/plain, Size: 3399 bytes --]

Now that the legacy PIC takeover by the IOAPIC is marked accordingly, the
early boot allocation of APIC data is no longer necessary. Use the regular
allocation mechanism as it is used by non-legacy interrupts and fill in the
known information (vector and affinity) so the allocator reuses the vector.
This is important as the timer check might move timer interrupt 0 back to
the PIC in case delivery through the IOAPIC fails.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   52 ++++++++++--------------------------------
 1 file changed, 13 insertions(+), 39 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -36,9 +36,6 @@ EXPORT_SYMBOL_GPL(x86_vector_domain);
 static DEFINE_RAW_SPINLOCK(vector_lock);
 static cpumask_var_t vector_cpumask, vector_searchmask, searched_cpumask;
 static struct irq_chip lapic_controller;
-#ifdef	CONFIG_X86_IO_APIC
-static struct apic_chip_data *legacy_irq_data[NR_IRQS_LEGACY];
-#endif
 #ifdef CONFIG_SMP
 static DEFINE_PER_CPU(struct hlist_head, cleanup_list);
 #endif
@@ -317,10 +314,6 @@ static void x86_vector_free_irqs(struct
 			irq_domain_reset_irq_data(irqd);
 			raw_spin_unlock_irqrestore(&vector_lock, flags);
 			free_apic_chip_data(apicd);
-#ifdef	CONFIG_X86_IO_APIC
-			if (virq + i < nr_legacy_irqs())
-				legacy_irq_data[virq + i] = NULL;
-#endif
 		}
 	}
 }
@@ -344,12 +337,8 @@ static int x86_vector_alloc_irqs(struct
 		irqd = irq_domain_get_irq_data(domain, virq + i);
 		BUG_ON(!irqd);
 		node = irq_data_get_node(irqd);
-#ifdef	CONFIG_X86_IO_APIC
-		if (virq + i < nr_legacy_irqs() && legacy_irq_data[virq + i])
-			apicd = legacy_irq_data[virq + i];
-		else
-#endif
-			apicd = alloc_apic_chip_data(node);
+		WARN_ON_ONCE(irqd->chip_data);
+		apicd = alloc_apic_chip_data(node);
 		if (!apicd) {
 			err = -ENOMEM;
 			goto error;
@@ -359,6 +348,17 @@ static int x86_vector_alloc_irqs(struct
 		irqd->chip_data = apicd;
 		irqd->hwirq = virq + i;
 		irqd_set_single_target(irqd);
+		/*
+		 * Make sure, that the legacy to IOAPIC transition stays on
+		 * the same vector. This is required for check_timer() to
+		 * work correctly as it might switch back to legacy mode.
+		 */
+		if (info->flags & X86_IRQ_ALLOC_LEGACY) {
+			apicd->cfg.vector = ISA_IRQ_VECTOR(virq + i);
+			apicd->cpu = 0;
+			cpumask_copy(apicd->domain, cpumask_of(0));
+		}
+
 		err = assign_irq_vector_policy(virq + i, node, apicd, info,
 					       irqd);
 		if (err)
@@ -404,36 +404,10 @@ int __init arch_probe_nr_irqs(void)
 	return legacy_pic->probe();
 }
 
-#ifdef	CONFIG_X86_IO_APIC
-static void __init init_legacy_irqs(void)
-{
-	int i, node = cpu_to_node(0);
-	struct apic_chip_data *apicd;
-
-	/*
-	 * For legacy IRQ's, start with assigning irq0 to irq15 to
-	 * ISA_IRQ_VECTOR(i) for all cpu's.
-	 */
-	for (i = 0; i < nr_legacy_irqs(); i++) {
-		apicd = legacy_irq_data[i] = alloc_apic_chip_data(node);
-		BUG_ON(!apicd);
-
-		apicd->cfg.vector = ISA_IRQ_VECTOR(i);
-		cpumask_copy(apicd->domain, cpumask_of(0));
-		apicd->cpu = 0;
-		irq_set_chip_data(i, apicd);
-	}
-}
-#else
-static inline void init_legacy_irqs(void) { }
-#endif
-
 int __init arch_early_irq_init(void)
 {
 	struct fwnode_handle *fn;
 
-	init_legacy_irqs();
-
 	fn = irq_domain_alloc_named_fwnode("VECTOR");
 	BUG_ON(!fn);
 	x86_vector_domain = irq_domain_create_tree(fn, &x86_vector_domain_ops,

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 33/52] x86/vector: Remove pointless pointer checks
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (31 preceding siblings ...)
  2017-09-13 21:29 ` [patch 32/52] x86/apic: Get rid of the legacy irq data storage Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 34/52] x86/vector: Move helper functions around Thomas Gleixner
                   ` (20 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Remove-pointless-pointer-checks.patch --]
[-- Type: text/plain, Size: 814 bytes --]

The info pointer checks in assign_irq_vector_policy() are pointless because
the pointer cannot be NULL; otherwise the calling code would have crashed
already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/arch/x86/kernel/apic/vector.c
===================================================================
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -254,7 +254,7 @@ static int assign_irq_vector_policy(int
 				    struct irq_alloc_info *info,
 				    struct irq_data *irqd)
 {
-	if (info && info->mask)
+	if (info->mask)
 		return assign_irq_vector(irq, apicd, info->mask, irqd);
 	if (node != NUMA_NO_NODE &&
 	    assign_irq_vector(irq, apicd, cpumask_of_node(node), irqd) == 0)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 34/52] x86/vector: Move helper functions around
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (32 preceding siblings ...)
  2017-09-13 21:29 ` [patch 33/52] x86/vector: Remove pointless pointer checks Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 35/52] x86/apic: Add replacement for cpu_mask_to_apicid() Thomas Gleixner
                   ` (19 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Move-helper-functions-around.patch --]
[-- Type: text/plain, Size: 1463 bytes --]

Move the helper functions to a different place as they would end up in the
middle of management functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

Index: b/arch/x86/kernel/apic/vector.c
===================================================================
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -53,6 +53,21 @@ void unlock_vector_lock(void)
 	raw_spin_unlock(&vector_lock);
 }
 
+void init_irq_alloc_info(struct irq_alloc_info *info,
+			 const struct cpumask *mask)
+{
+	memset(info, 0, sizeof(*info));
+	info->mask = mask;
+}
+
+void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
+{
+	if (src)
+		*dst = *src;
+	else
+		memset(dst, 0, sizeof(*dst));
+}
+
 static struct apic_chip_data *apic_chip_data(struct irq_data *irqd)
 {
 	if (!irqd)
@@ -282,21 +297,6 @@ static void clear_irq_vector(int irq, st
 	hlist_del_init(&apicd->clist);
 }
 
-void init_irq_alloc_info(struct irq_alloc_info *info,
-			 const struct cpumask *mask)
-{
-	memset(info, 0, sizeof(*info));
-	info->mask = mask;
-}
-
-void copy_irq_alloc_info(struct irq_alloc_info *dst, struct irq_alloc_info *src)
-{
-	if (src)
-		*dst = *src;
-	else
-		memset(dst, 0, sizeof(*dst));
-}
-
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 35/52] x86/apic: Add replacement for cpu_mask_to_apicid()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (33 preceding siblings ...)
  2017-09-13 21:29 ` [patch 34/52] x86/vector: Move helper functions around Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 36/52] x86/irq/vector: Initialize matrix allocator Thomas Gleixner
                   ` (18 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Add-replacement-for-cpu_mask_to_apicid--.patch --]
[-- Type: text/plain, Size: 6601 bytes --]

As preparation for replacing the vector allocator, provide a new function
which takes a cpu number instead of a cpu mask to calculate/lookup the
resulting APIC destination id.
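
A hypothetical call site, to show the intended shape of the new callback
(the actual conversion happens in later patches of this series):

	/* Single-CPU affinity: the destination becomes a per-CPU lookup
	 * instead of a cpumask calculation. */
	apicd->cfg.dest_apicid = apic->calc_dest_apicid(apicd->cpu);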

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h           |    5 +++++
 arch/x86/kernel/apic/apic_common.c    |   10 ++++++++++
 arch/x86/kernel/apic/apic_flat_64.c   |    2 ++
 arch/x86/kernel/apic/apic_noop.c      |    1 +
 arch/x86/kernel/apic/apic_numachip.c  |    2 ++
 arch/x86/kernel/apic/bigsmp_32.c      |    1 +
 arch/x86/kernel/apic/probe_32.c       |    1 +
 arch/x86/kernel/apic/x2apic_cluster.c |    6 ++++++
 arch/x86/kernel/apic/x2apic_phys.c    |    1 +
 arch/x86/kernel/apic/x2apic_uv_x.c    |    6 ++++++
 arch/x86/xen/apic.c                   |    1 +
 11 files changed, 36 insertions(+)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -294,6 +294,7 @@ struct apic {
 	int	(*cpu_mask_to_apicid)(const struct cpumask *cpumask,
 				      struct irq_data *irqdata,
 				      unsigned int *apicid);
+	u32	(*calc_dest_apicid)(unsigned int cpu);
 
 	/* ICR related functions */
 	u64	(*icr_read)(void);
@@ -477,6 +478,10 @@ static inline unsigned int read_apic_id(
 extern int default_apic_id_valid(int apicid);
 extern int default_acpi_madt_oem_check(char *, char *);
 extern void default_setup_apic_routing(void);
+
+extern u32 apic_default_calc_apicid(unsigned int cpu);
+extern u32 apic_flat_calc_apicid(unsigned int cpu);
+
 extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask,
 				   struct irq_data *irqdata,
 				   unsigned int *apicid);
--- a/arch/x86/kernel/apic/apic_common.c
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -6,6 +6,11 @@
 #include <linux/irq.h>
 #include <asm/apic.h>
 
+u32 apic_default_calc_apicid(unsigned int cpu)
+{
+	return per_cpu(x86_cpu_to_apicid, cpu);
+}
+
 int default_cpu_mask_to_apicid(const struct cpumask *msk, struct irq_data *irqd,
 			       unsigned int *apicid)
 {
@@ -18,6 +23,11 @@ int default_cpu_mask_to_apicid(const str
 	return 0;
 }
 
+u32 apic_flat_calc_apicid(unsigned int cpu)
+{
+	return 1U << cpu;
+}
+
 int flat_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqd,
 			    unsigned int *apicid)
 
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -172,6 +172,7 @@ static struct apic apic_flat __ro_after_
 	.set_apic_id			= set_apic_id,
 
 	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single,
 	.send_IPI_mask			= flat_send_IPI_mask,
@@ -267,6 +268,7 @@ static struct apic apic_physflat __ro_af
 	.set_apic_id			= set_apic_id,
 
 	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single_phys,
 	.send_IPI_mask			= default_send_IPI_mask_sequence_phys,
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -142,6 +142,7 @@ struct apic apic_noop __ro_after_init =
 	.set_apic_id			= NULL,
 
 	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 	.send_IPI			= noop_send_IPI,
 	.send_IPI_mask			= noop_send_IPI_mask,
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -267,6 +267,7 @@ static const struct apic apic_numachip1
 	.set_apic_id			= numachip1_set_apic_id,
 
 	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= numachip_send_IPI_one,
 	.send_IPI_mask			= numachip_send_IPI_mask,
@@ -317,6 +318,7 @@ static const struct apic apic_numachip2
 	.set_apic_id			= numachip2_set_apic_id,
 
 	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= numachip_send_IPI_one,
 	.send_IPI_mask			= numachip_send_IPI_mask,
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -172,6 +172,7 @@ static struct apic apic_bigsmp __ro_afte
 	.set_apic_id			= NULL,
 
 	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single_phys,
 	.send_IPI_mask			= default_send_IPI_mask_sequence_phys,
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -127,6 +127,7 @@ static struct apic apic_default __ro_aft
 	.set_apic_id			= NULL,
 
 	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single,
 	.send_IPI_mask			= default_send_IPI_mask_logical,
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -114,6 +114,11 @@ x2apic_cpu_mask_to_apicid(const struct c
 	return 0;
 }
 
+static u32 x2apic_calc_apicid(unsigned int cpu)
+{
+	return per_cpu(x86_cpu_to_logical_apicid, cpu);
+}
+
 static void init_x2apic_ldr(void)
 {
 	struct cluster_mask *cmsk = this_cpu_read(cluster_masks);
@@ -245,6 +250,7 @@ static struct apic apic_x2apic_cluster _
 	.set_apic_id			= x2apic_set_apic_id,
 
 	.cpu_mask_to_apicid		= x2apic_cpu_mask_to_apicid,
+	.calc_dest_apicid		= x2apic_calc_apicid,
 
 	.send_IPI			= x2apic_send_IPI,
 	.send_IPI_mask			= x2apic_send_IPI_mask,
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -165,6 +165,7 @@ static struct apic apic_x2apic_phys __ro
 	.set_apic_id			= x2apic_set_apic_id,
 
 	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= x2apic_send_IPI,
 	.send_IPI_mask			= x2apic_send_IPI_mask,
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -537,6 +537,11 @@ uv_cpu_mask_to_apicid(const struct cpuma
 	return ret;
 }
 
+static u32 apic_uv_calc_apicid(unsigned int cpu)
+{
+	return apic_default_calc_apicid(cpu) | uv_apicid_hibits;
+}
+
 static unsigned int x2apic_get_apic_id(unsigned long x)
 {
 	unsigned int id;
@@ -602,6 +607,7 @@ static struct apic apic_x2apic_uv_x __ro
 	.set_apic_id			= set_apic_id,
 
 	.cpu_mask_to_apicid		= uv_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_uv_calc_apicid,
 
 	.send_IPI			= uv_send_IPI_one,
 	.send_IPI_mask			= uv_send_IPI_mask,
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -178,6 +178,7 @@ static struct apic xen_pv_apic = {
 	.set_apic_id 			= xen_set_apic_id, /* Can be NULL on 32-bit. */
 
 	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
+	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 #ifdef CONFIG_SMP
 	.send_IPI_mask 			= xen_send_IPI_mask,

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 36/52] x86/irq/vector: Initialize matrix allocator
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (34 preceding siblings ...)
  2017-09-13 21:29 ` [patch 35/52] x86/apic: Add replacement for cpu_mask_to_apicid() Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 37/52] x86/vector: Add vector domain debugfs support Thomas Gleixner
                   ` (17 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-irq-vector--Initialize-matrix-allocator.patch --]
[-- Type: text/plain, Size: 7036 bytes --]

Initialize the matrix allocator and add the proper accounting points to the
code.

No functional change, just preparation.
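
Condensed from the hunks below, the accounting points are:

	/* Early boot (arch_early_irq_init): create the matrix covering
	 * the external vector space. */
	vector_matrix = irq_alloc_matrix(NR_VECTORS, FIRST_EXTERNAL_VECTOR,
					 FIRST_SYSTEM_VECTOR);

	/* After the IDT gates are set up: reserve the system vectors and
	 * mark the preallocated legacy vectors. */
	lapic_assign_system_vectors();

	/* CPU hotplug: bring the per-CPU maps in and out of service. */
	lapic_online();		/* irq_matrix_online(vector_matrix)  */
	lapic_offline();	/* irq_matrix_offline(vector_matrix) */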

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/Kconfig              |    1 
 arch/x86/include/asm/apic.h   |    6 ++++
 arch/x86/include/asm/hw_irq.h |    3 +-
 arch/x86/kernel/apic/vector.c |   56 +++++++++++++++++++++++++++++++++++++++---
 arch/x86/kernel/i8259.c       |    1 
 arch/x86/kernel/irqinit.c     |    1 
 arch/x86/kernel/smpboot.c     |    3 +-
 7 files changed, 65 insertions(+), 6 deletions(-)

Index: b/arch/x86/Kconfig
===================================================================
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -93,6 +93,7 @@ config X86
 	select GENERIC_FIND_FIRST_BIT
 	select GENERIC_IOMAP
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK	if SMP
+	select GENERIC_IRQ_MATRIX_ALLOCATOR	if X86_LOCAL_APIC
 	select GENERIC_IRQ_MIGRATION		if SMP
 	select GENERIC_IRQ_PROBE
 	select GENERIC_IRQ_SHOW
Index: b/arch/x86/include/asm/apic.h
===================================================================
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -161,6 +161,10 @@ static inline int apic_is_clustered_box(
 #endif
 
 extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 msg_type, u8 mask);
+extern void lapic_assign_system_vectors(void);
+extern void lapic_assign_legacy_vector(unsigned int isairq, bool replace);
+extern void lapic_online(void);
+extern void lapic_offline(void);
 
 #else /* !CONFIG_X86_LOCAL_APIC */
 static inline void lapic_shutdown(void) { }
@@ -170,6 +174,8 @@ static inline void disable_local_APIC(vo
 # define setup_boot_APIC_clock x86_init_noop
 # define setup_secondary_APIC_clock x86_init_noop
 static inline void lapic_update_tsc_freq(void) { }
+static inline void lapic_assign_system_vectors(void) { }
+static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
 #endif /* !CONFIG_X86_LOCAL_APIC */
 
 #ifdef CONFIG_X86_X2APIC
Index: b/arch/x86/include/asm/hw_irq.h
===================================================================
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -15,6 +15,8 @@
 
 #include <asm/irq_vectors.h>
 
+#define IRQ_MATRIX_BITS		NR_VECTORS
+
 #ifndef __ASSEMBLY__
 
 #include <linux/percpu.h>
@@ -130,7 +132,6 @@ extern struct irq_cfg *irq_cfg(unsigned
 extern struct irq_cfg *irqd_cfg(struct irq_data *irq_data);
 extern void lock_vector_lock(void);
 extern void unlock_vector_lock(void);
-extern void setup_vector_irq(int cpu);
 #ifdef CONFIG_SMP
 extern void send_cleanup_vector(struct irq_cfg *);
 extern void irq_complete_move(struct irq_cfg *cfg);
Index: b/arch/x86/kernel/apic/vector.c
===================================================================
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -36,6 +36,7 @@ EXPORT_SYMBOL_GPL(x86_vector_domain);
 static DEFINE_RAW_SPINLOCK(vector_lock);
 static cpumask_var_t vector_cpumask, vector_searchmask, searched_cpumask;
 static struct irq_chip lapic_controller;
+static struct irq_matrix *vector_matrix;
 #ifdef CONFIG_SMP
 static DEFINE_PER_CPU(struct hlist_head, cleanup_list);
 #endif
@@ -404,6 +405,36 @@ int __init arch_probe_nr_irqs(void)
 	return legacy_pic->probe();
 }
 
+void lapic_assign_legacy_vector(unsigned int irq, bool replace)
+{
+	/*
+	 * Use assign system here so it wont get accounted as allocated
+	 * and moveable in the cpu hotplug check and it prevents managed
+	 * irq reservation from touching it.
+	 */
+	irq_matrix_assign_system(vector_matrix, ISA_IRQ_VECTOR(irq), replace);
+}
+
+void __init lapic_assign_system_vectors(void)
+{
+	unsigned int i, vector = 0;
+
+	for_each_set_bit_from(vector, system_vectors, NR_VECTORS)
+		irq_matrix_assign_system(vector_matrix, vector, false);
+
+	if (nr_legacy_irqs() > 1)
+		lapic_assign_legacy_vector(PIC_CASCADE_IR, false);
+
+	/* System vectors are reserved, online it */
+	irq_matrix_online(vector_matrix);
+
+	/* Mark the preallocated legacy interrupts */
+	for (i = 0; i < nr_legacy_irqs(); i++) {
+		if (i != PIC_CASCADE_IR)
+			irq_matrix_assign(vector_matrix, ISA_IRQ_VECTOR(i));
+	}
+}
+
 int __init arch_early_irq_init(void)
 {
 	struct fwnode_handle *fn;
@@ -423,6 +454,14 @@ int __init arch_early_irq_init(void)
 	BUG_ON(!alloc_cpumask_var(&vector_searchmask, GFP_KERNEL));
 	BUG_ON(!alloc_cpumask_var(&searched_cpumask, GFP_KERNEL));
 
+	/*
+	 * Allocate the vector matrix allocator data structure and limit the
+	 * search area.
+	 */
+	vector_matrix = irq_alloc_matrix(NR_VECTORS, FIRST_EXTERNAL_VECTOR,
+					 FIRST_SYSTEM_VECTOR);
+	BUG_ON(!vector_matrix);
+
 	return arch_early_ioapic_init();
 }
 
@@ -454,14 +493,16 @@ static struct irq_desc *__setup_vector_i
 	return irq_to_desc(isairq);
 }
 
-/*
- * Setup the vector to irq mappings. Must be called with vector_lock held.
- */
-void setup_vector_irq(int cpu)
+/* Online the local APIC infrastructure and initialize the vectors */
+void lapic_online(void)
 {
 	unsigned int vector;
 
 	lockdep_assert_held(&vector_lock);
+
+	/* Online the vector matrix array for this CPU */
+	irq_matrix_online(vector_matrix);
+
 	/*
 	 * The interrupt affinity logic never targets interrupts to offline
 	 * CPUs. The exception are the legacy PIC interrupts. In general
@@ -482,6 +523,13 @@ void setup_vector_irq(int cpu)
 	vector_update_shutdown_irqs();
 }
 
+void lapic_offline(void)
+{
+	lock_vector_lock();
+	irq_matrix_offline(vector_matrix);
+	unlock_vector_lock();
+}
+
 static int apic_retrigger_irq(struct irq_data *irqd)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
Index: b/arch/x86/kernel/i8259.c
===================================================================
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -113,6 +113,7 @@ static void make_8259A_irq(unsigned int
 	io_apic_irqs &= ~(1<<irq);
 	irq_set_chip_and_handler(irq, &i8259A_chip, handle_level_irq);
 	enable_irq(irq);
+	lapic_assign_legacy_vector(irq, true);
 }
 
 /*
Index: b/arch/x86/kernel/irqinit.c
===================================================================
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -93,6 +93,7 @@ void __init native_init_IRQ(void)
 	x86_init.irqs.pre_vector_init();
 
 	idt_setup_apic_and_irq_gates();
+	lapic_assign_system_vectors();
 
 	if (!acpi_ioapic && !of_ioapic && nr_legacy_irqs())
 		setup_irq(2, &irq2);
Index: b/arch/x86/kernel/smpboot.c
===================================================================
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -257,7 +257,7 @@ static void notrace start_secondary(void
 	 * from seeing a half valid vector space.
 	 */
 	lock_vector_lock();
-	setup_vector_irq(smp_processor_id());
+	lapic_online();
 	set_cpu_online(smp_processor_id(), true);
 	unlock_vector_lock();
 	cpu_set_state_online(smp_processor_id());
@@ -1550,6 +1550,7 @@ void cpu_disable_common(void)
 	remove_cpu_from_maps(cpu);
 	unlock_vector_lock();
 	fixup_irqs();
+	lapic_offline();
 }
 
 int native_cpu_disable(void)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 37/52] x86/vector: Add vector domain debugfs support
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (35 preceding siblings ...)
  2017-09-13 21:29 ` [patch 36/52] x86/irq/vector: Initialize matrix allocator Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 38/52] x86/smpboot: Set online before setting up vectors Thomas Gleixner
                   ` (16 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Add-vector-domain-debugfs-support.patch --]
[-- Type: text/plain, Size: 2762 bytes --]

Add the debug callback for the vector domain. Invoked for the domain it
gives detailed information about vector usage via the matrix allocator
debug function; invoked for a particular interrupt it shows the vector and
target information.

Extra information for the Vector domain:

Online bitmaps:       32
Global available:   6352
Global reserved:       5
Total allocated:      20
System: 41: 0-19,32,50,128,238-255
 | CPU | avl | man | act | vectors
     0   183     4    19  33-48,51-53
     1   199     4     1  33
     2   199     4     0  

Extra information for interrupts:

     Vector:    42
     Target:     4

This allows a detailed analysis of the vector usage and the association to
interrupts and devices.
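
With CONFIG_GENERIC_IRQ_DEBUGFS enabled the output is available through the
generic irq debugfs files, typically /sys/kernel/debug/irq/domains/VECTOR
for the domain view and /sys/kernel/debug/irq/irqs/<n> for a particular
interrupt.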

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   50 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -11,6 +11,7 @@
  * published by the Free Software Foundation.
  */
 #include <linux/interrupt.h>
+#include <linux/seq_file.h>
 #include <linux/init.h>
 #include <linux/compiler.h>
 #include <linux/slab.h>
@@ -373,9 +374,54 @@ static int x86_vector_alloc_irqs(struct
 	return err;
 }
 
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+void x86_vector_debug_show(struct seq_file *m, struct irq_domain *d,
+			   struct irq_data *irqd, int ind)
+{
+	unsigned int cpu, vec, prev_cpu, prev_vec;
+	struct apic_chip_data *apicd;
+	unsigned long flags;
+	int irq;
+
+	if (!irqd) {
+		irq_matrix_debug_show(m, vector_matrix, ind);
+		return;
+	}
+
+	irq = irqd->irq;
+	if (irq < nr_legacy_irqs() && !test_bit(irq, &io_apic_irqs)) {
+		seq_printf(m, "%*sVector: %5d\n", ind, "", ISA_IRQ_VECTOR(irq));
+		seq_printf(m, "%*sTarget: Legacy PIC all CPUs\n", ind, "");
+		return;
+	}
+
+	apicd = irqd->chip_data;
+	if (!apicd) {
+		seq_printf(m, "%*sVector: Not assigned\n", ind, "");
+		return;
+	}
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	cpu = apicd->cpu;
+	vec = apicd->cfg.vector;
+	prev_cpu = apicd->prev_cpu;
+	prev_vec = apicd->cfg.old_vector;
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+	seq_printf(m, "%*sVector: %5u\n", ind, "", vec);
+	seq_printf(m, "%*sTarget: %5u\n", ind, "", cpu);
+	if (prev_vec) {
+		seq_printf(m, "%*sPrevious vector: %5u\n", ind, "", prev_vec);
+		seq_printf(m, "%*sPrevious target: %5u\n", ind, "", prev_cpu);
+	}
+}
+#endif
+
 static const struct irq_domain_ops x86_vector_domain_ops = {
-	.alloc	= x86_vector_alloc_irqs,
-	.free	= x86_vector_free_irqs,
+	.alloc		= x86_vector_alloc_irqs,
+	.free		= x86_vector_free_irqs,
+#ifdef CONFIG_GENERIC_IRQ_DEBUGFS
+	.debug_show	= x86_vector_debug_show,
+#endif
 };
 
 int __init arch_probe_nr_irqs(void)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 38/52] x86/smpboot: Set online before setting up vectors
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (36 preceding siblings ...)
  2017-09-13 21:29 ` [patch 37/52] x86/vector: Add vector domain debugfs support Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 39/52] x86/vector: Add tracepoints for vector management Thomas Gleixner
                   ` (15 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-smpboot--Set-online-before-setting-up-vectors.patch --]
[-- Type: text/plain, Size: 1255 bytes --]

There is no reason to delay setting the CPU online until the vectors for
the upcoming CPU have been established. The vector space is protected by
the vector lock so no changes can happen.

Marking the CPU online before setting up the vector space makes tracing
work in the early vector management cpu online code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/smpboot.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -251,14 +251,14 @@ static void notrace start_secondary(void
 	check_tsc_sync_target();
 
 	/*
-	 * Lock vector_lock and initialize the vectors on this cpu
-	 * before setting the cpu online. We must set it online with
-	 * vector_lock held to prevent a concurrent setup/teardown
-	 * from seeing a half valid vector space.
+	 * Lock vector_lock, set CPU online and bring the vector
+	 * allocator online. Online must be set with vector_lock held
+	 * to prevent a concurrent irq setup/teardown from seeing a
+	 * half valid vector space.
 	 */
 	lock_vector_lock();
-	lapic_online();
 	set_cpu_online(smp_processor_id(), true);
+	lapic_online();
 	unlock_vector_lock();
 	cpu_set_state_online(smp_processor_id());
 	x86_platform.nmi_init();

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 39/52] x86/vector: Add tracepoints for vector management
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (37 preceding siblings ...)
  2017-09-13 21:29 ` [patch 38/52] x86/smpboot: Set online before setting up vectors Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 40/52] x86/vector: Use matrix allocator for vector assignment Thomas Gleixner
                   ` (14 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Add-tracepoints-for-vector-management.patch --]
[-- Type: text/plain, Size: 6696 bytes --]

Add tracepoints for analysing the new vector management.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/trace/irq_vectors.h |  244 +++++++++++++++++++++++++++++++
 arch/x86/kernel/apic/vector.c            |    2 
 2 files changed, 246 insertions(+)

--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -137,6 +137,250 @@ DEFINE_IRQ_VECTOR_EVENT(deferred_error_a
 DEFINE_IRQ_VECTOR_EVENT(thermal_apic);
 #endif
 
+TRACE_EVENT(vector_config,
+
+	TP_PROTO(unsigned int irq, unsigned int vector,
+		 unsigned int cpu, unsigned int apicdest),
+
+	TP_ARGS(irq, vector, cpu, apicdest),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	unsigned int,	vector		)
+		__field(	unsigned int,	cpu		)
+		__field(	unsigned int,	apicdest	)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->vector		= vector;
+		__entry->cpu		= cpu;
+		__entry->apicdest	= apicdest;
+	),
+
+	TP_printk("irq=%u vector=%u cpu=%u apicdest=0x%08x",
+		  __entry->irq, __entry->vector, __entry->cpu,
+		  __entry->apicdest)
+);
+
+DECLARE_EVENT_CLASS(vector_mod,
+
+	TP_PROTO(unsigned int irq, unsigned int vector,
+		 unsigned int cpu, unsigned int prev_vector,
+		 unsigned int prev_cpu),
+
+	TP_ARGS(irq, vector, cpu, prev_vector, prev_cpu),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	unsigned int,	vector		)
+		__field(	unsigned int,	cpu		)
+		__field(	unsigned int,	prev_vector	)
+		__field(	unsigned int,	prev_cpu	)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->vector		= vector;
+		__entry->cpu		= cpu;
+		__entry->prev_vector	= prev_vector;
+		__entry->prev_cpu	= prev_cpu;
+
+	),
+
+	TP_printk("irq=%u vector=%u cpu=%u prev_vector=%u prev_cpu=%u",
+		  __entry->irq, __entry->vector, __entry->cpu,
+		  __entry->prev_vector, __entry->prev_cpu)
+);
+
+#define DEFINE_IRQ_VECTOR_MOD_EVENT(name)				\
+DEFINE_EVENT_FN(vector_mod, name,					\
+	TP_PROTO(unsigned int irq, unsigned int vector,			\
+		 unsigned int cpu, unsigned int prev_vector,		\
+		 unsigned int prev_cpu),				\
+	TP_ARGS(irq, vector, cpu, prev_vector, prev_cpu), NULL, NULL);	\
+
+DEFINE_IRQ_VECTOR_MOD_EVENT(vector_update);
+DEFINE_IRQ_VECTOR_MOD_EVENT(vector_clear);
+
+DECLARE_EVENT_CLASS(vector_reserve,
+
+	TP_PROTO(unsigned int irq, int ret),
+
+	TP_ARGS(irq, ret),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq	)
+		__field(	int,		ret	)
+	),
+
+	TP_fast_assign(
+		__entry->irq = irq;
+		__entry->ret = ret;
+	),
+
+	TP_printk("irq=%u ret=%d", __entry->irq, __entry->ret)
+);
+
+#define DEFINE_IRQ_VECTOR_RESERVE_EVENT(name)	\
+DEFINE_EVENT_FN(vector_reserve, name,	\
+	TP_PROTO(unsigned int irq, int ret),	\
+	TP_ARGS(irq, ret), NULL, NULL);		\
+
+DEFINE_IRQ_VECTOR_RESERVE_EVENT(vector_reserve_managed);
+DEFINE_IRQ_VECTOR_RESERVE_EVENT(vector_reserve);
+
+TRACE_EVENT(vector_alloc,
+
+	TP_PROTO(unsigned int irq, unsigned int vector, bool reserved,
+		 int ret),
+
+	TP_ARGS(irq, vector, ret, reserved),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	unsigned int,	vector		)
+		__field(	bool,		reserved	)
+		__field(	int,		ret		)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->vector		= ret < 0 ? 0 : vector;
+		__entry->reserved	= reserved;
+		__entry->ret		= ret > 0 ? 0 : ret;
+	),
+
+	TP_printk("irq=%u vector=%u reserved=%d ret=%d",
+		  __entry->irq, __entry->vector,
+		  __entry->reserved, __entry->ret)
+);
+
+TRACE_EVENT(vector_alloc_managed,
+
+	TP_PROTO(unsigned int irq, unsigned int vector,
+		 int ret),
+
+	TP_ARGS(irq, vector, ret),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	unsigned int,	vector		)
+		__field(	int,		ret		)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->vector		= ret < 0 ? 0 : vector;
+		__entry->ret		= ret > 0 ? 0 : ret;
+	),
+
+	TP_printk("irq=%u vector=%u ret=%d",
+		  __entry->irq, __entry->vector, __entry->ret)
+);
+
+DECLARE_EVENT_CLASS(vector_activate,
+
+	TP_PROTO(unsigned int irq, bool is_managed, bool can_reserve,
+		 bool early),
+
+	TP_ARGS(irq, is_managed, can_reserve, early),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	bool,		is_managed	)
+		__field(	bool,		can_reserve	)
+		__field(	bool,		early		)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->is_managed	= is_managed;
+		__entry->can_reserve	= can_reserve;
+		__entry->early		= early;
+	),
+
+	TP_printk("irq=%u is_managed=%d can_reserve=%d early=%d",
+		  __entry->irq, __entry->is_managed, __entry->can_reserve,
+		  __entry->early)
+);
+
+#define DEFINE_IRQ_VECTOR_ACTIVATE_EVENT(name)				\
+DEFINE_EVENT_FN(vector_activate, name,					\
+	TP_PROTO(unsigned int irq, bool is_managed,			\
+		 bool can_reserve, bool early),				\
+	TP_ARGS(irq, is_managed, can_reserve, early), NULL, NULL);	\
+
+DEFINE_IRQ_VECTOR_ACTIVATE_EVENT(vector_activate);
+DEFINE_IRQ_VECTOR_ACTIVATE_EVENT(vector_deactivate);
+
+TRACE_EVENT(vector_teardown,
+
+	TP_PROTO(unsigned int irq, bool is_managed, bool has_reserved),
+
+	TP_ARGS(irq, is_managed, has_reserved),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	bool,		is_managed	)
+		__field(	bool,		has_reserved	)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->is_managed	= is_managed;
+		__entry->has_reserved	= has_reserved;
+	),
+
+	TP_printk("irq=%u is_managed=%d has_reserved=%d",
+		  __entry->irq, __entry->is_managed, __entry->has_reserved)
+);
+
+TRACE_EVENT(vector_setup,
+
+	TP_PROTO(unsigned int irq, bool is_legacy, int ret),
+
+	TP_ARGS(irq, is_legacy, ret),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	bool,		is_legacy	)
+		__field(	int,		ret		)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->is_legacy	= is_legacy;
+		__entry->ret		= ret;
+	),
+
+	TP_printk("irq=%u is_legacy=%d ret=%d",
+		  __entry->irq, __entry->is_legacy, __entry->ret)
+);
+
+TRACE_EVENT(vector_free_moved,
+
+	TP_PROTO(unsigned int irq, unsigned int vector, bool is_managed),
+
+	TP_ARGS(irq, vector, is_managed),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	irq		)
+		__field(	unsigned int,	vector		)
+		__field(	bool,		is_managed	)
+	),
+
+	TP_fast_assign(
+		__entry->irq		= irq;
+		__entry->vector		= vector;
+		__entry->is_managed	= is_managed;
+	),
+
+	TP_printk("irq=%u vector=%u is_managed=%d",
+		  __entry->irq, __entry->vector, __entry->is_managed)
+);
+
+
 #endif /* CONFIG_X86_LOCAL_APIC */
 
 #undef TRACE_INCLUDE_PATH
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -22,6 +22,8 @@
 #include <asm/desc.h>
 #include <asm/irq_remapping.h>
 
+#include <asm/trace/irq_vectors.h>
+
 struct apic_chip_data {
 	struct irq_cfg		cfg;
 	unsigned int		cpu;

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 40/52] x86/vector: Use matrix allocator for vector assignment
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (38 preceding siblings ...)
  2017-09-13 21:29 ` [patch 39/52] x86/vector: Add tracepoints for vector management Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 41/52] x86/apic: Remove unused callbacks Thomas Gleixner
                   ` (13 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-irq-vector--Use-matrix-allocator-for-vector-management.patch --]
[-- Type: text/plain, Size: 17879 bytes --]

Replace the magic vector allocation code with a simple bitmap matrix
allocator. This avoids loops and hoops over CPUs and vector arrays, so in
case of densely used vector spaces it's way faster.
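
To illustrate the idea, here is a minimal stand-alone sketch (hypothetical
names and sizes; the real irq_matrix additionally tracks per-CPU allocation
counts so it can prefer the least loaded CPU):

    #include <stdbool.h>
    #include <stdint.h>

    #define NR_CPUS      8
    #define NR_VECTORS   256
    #define FIRST_VECTOR 32   /* vectors 0-31 are reserved for exceptions */

    /* One allocation bitmap per CPU: a set bit means the vector is in use */
    static uint64_t vec_map[NR_CPUS][NR_VECTORS / 64];

    static bool map_test(const uint64_t *map, unsigned int bit)
    {
        return map[bit / 64] & (1ULL << (bit % 64));
    }

    static void map_set(uint64_t *map, unsigned int bit)
    {
        map[bit / 64] |= 1ULL << (bit % 64);
    }

    /*
     * Find a free vector on any CPU in @cpus (a plain CPU bitmask here).
     * Returns the vector and stores the chosen CPU, or -1 when the vector
     * space is exhausted. No priority level spreading, no retry loops -
     * just a linear bitmap search.
     */
    static int matrix_alloc(uint32_t cpus, unsigned int *cpu_out)
    {
        for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
            if (!(cpus & (1U << cpu)))
                continue;
            for (unsigned int vec = FIRST_VECTOR; vec < NR_VECTORS; vec++) {
                if (map_test(vec_map[cpu], vec))
                    continue;
                map_set(vec_map[cpu], vec);
                *cpu_out = cpu;
                return vec;
            }
        }
        return -1;
    }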

This also gets rid of the magic 'spread the vectors across priority
levels' heuristics in the current allocator:

The comment in __assign_irq_vector says:

   * NOTE! The local APIC isn't very good at handling
   * multiple interrupts at the same interrupt level.
   * As the interrupt level is determined by taking the
   * vector number and shifting that right by 4, we
   * want to spread these out a bit so that they don't
   * all fall in the same interrupt level.                         

After doing some palaeontological research, the following was found in the
PPro Developer Manual Volume 3:

     "7.4.2. Valid Interrupts

     The local and I/O APICs support 240 distinct vectors in the range of 16
     to 255. Interrupt priority is implied by its vector, according to the
     following relationship: priority = vector / 16

     One is the lowest priority and 15 is the highest. Vectors 16 through
     31 are reserved for exclusive use by the processor. The remaining
     vectors are for general use. The processor's local APIC includes an
     in-service entry and a holding entry for each priority level. To avoid
     losing interrupts, software should allocate no more than 2 interrupt
     vectors per priority."
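
As a concrete illustration of that relationship (the helper name is made
up):

     /* priority = vector / 16, per the manual text quoted above */
     static inline unsigned int vector_priority(unsigned int vector)
     {
             return vector >> 4;
     }

     /* e.g. vector 0x22 -> priority 2, vector 0xEF -> priority 14 */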

The current SDM says nothing about that; instead it states:

     "If more than one interrupt is generated with the same vector number,
      the local APIC can set the bit for the vector both in the IRR and the
      ISR. This means that for the Pentium 4 and Intel Xeon processors, the
      IRR and ISR can queue two interrupts for each interrupt vector: one
      in the IRR and one in the ISR. Any additional interrupts issued for
      the same interrupt vector are collapsed into the single bit in the
      IRR.

      For the P6 family and Pentium processors, the IRR and ISR registers
      can queue no more than two interrupts per interrupt vector and will
      reject other interrupts that are received within the same vector."
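
To picture what that queueing means for a single vector V (illustration
only, not kernel code):

     /*
      * State of one vector V in the local APIC:
      *
      *   ISR bit V set -> an interrupt with vector V is being serviced
      *   IRR bit V set -> one more interrupt with vector V is pending
      *
      * A further interrupt with vector V arriving while both bits are set
      * is collapsed into the already set IRR bit (Pentium 4/Xeon) or
      * rejected with a retry indication to the sender (P6/Pentium).
      */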

   This means that on P6/Pentium the APIC will reject a new message and
   tell the sender to retry, which increases the load on the APIC bus and
   achieves nothing more.

There is no affirmative answer from Intel on that, but it's a sane approach
to remove the spreading for the following reasons:

    1) No other (relevant Open Source) operating system bothers to
       implement this or even mentions it.

    2) The current allocator has no enforcement for this and especially the
       legacy interrupts, which are the main source of interrupts on these
       P6 and older systems, are allocated linearly in the same priority
       level and just work.

    3) Current machines have no problem with that at all, as verified with
       some experiments.

    4) AMD at least confirmed that such an issue is unknown.

    5) P6 and older are dinosaurs, almost 20 years EOL, so there is really
       no reason to worry about that too much.


Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |  290 ++++++++++++++++--------------------------
 1 file changed, 117 insertions(+), 173 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -28,16 +28,15 @@ struct apic_chip_data {
 	struct irq_cfg		cfg;
 	unsigned int		cpu;
 	unsigned int		prev_cpu;
+	unsigned int		irq;
 	struct hlist_node	clist;
-	cpumask_var_t		domain;
-	cpumask_var_t		old_domain;
 	u8			move_in_progress : 1;
 };
 
 struct irq_domain *x86_vector_domain;
 EXPORT_SYMBOL_GPL(x86_vector_domain);
 static DEFINE_RAW_SPINLOCK(vector_lock);
-static cpumask_var_t vector_cpumask, vector_searchmask, searched_cpumask;
+static cpumask_var_t vector_searchmask;
 static struct irq_chip lapic_controller;
 static struct irq_matrix *vector_matrix;
 #ifdef CONFIG_SMP
@@ -101,194 +100,124 @@ static struct apic_chip_data *alloc_apic
 	struct apic_chip_data *apicd;
 
 	apicd = kzalloc_node(sizeof(*apicd), GFP_KERNEL, node);
-	if (!apicd)
-		return NULL;
-	if (!zalloc_cpumask_var_node(&apicd->domain, GFP_KERNEL, node))
-		goto out_data;
-	if (!zalloc_cpumask_var_node(&apicd->old_domain, GFP_KERNEL, node))
-		goto out_domain;
-	INIT_HLIST_NODE(&apicd->clist);
+	if (apicd)
+		INIT_HLIST_NODE(&apicd->clist);
 	return apicd;
-out_domain:
-	free_cpumask_var(apicd->domain);
-out_data:
-	kfree(apicd);
-	return NULL;
 }
 
 static void free_apic_chip_data(struct apic_chip_data *apicd)
 {
-	if (apicd) {
-		free_cpumask_var(apicd->domain);
-		free_cpumask_var(apicd->old_domain);
-		kfree(apicd);
-	}
+	kfree(apicd);
 }
 
-static int __assign_irq_vector(int irq, struct apic_chip_data *d,
-			       const struct cpumask *mask,
-			       struct irq_data *irqd)
+static void apic_update_irq_cfg(struct irq_data *irqd)
 {
-	/*
-	 * NOTE! The local APIC isn't very good at handling
-	 * multiple interrupts at the same interrupt level.
-	 * As the interrupt level is determined by taking the
-	 * vector number and shifting that right by 4, we
-	 * want to spread these out a bit so that they don't
-	 * all fall in the same interrupt level.
-	 *
-	 * Also, we've got to be careful not to trash gate
-	 * 0x80, because int 0x80 is hm, kind of importantish. ;)
-	 */
-	static int current_vector = FIRST_EXTERNAL_VECTOR + VECTOR_OFFSET_START;
-	static int current_offset = VECTOR_OFFSET_START % 16;
-	int cpu, vector;
-
-	/*
-	 * If there is still a move in progress or the previous move has not
-	 * been cleaned up completely, tell the caller to come back later.
-	 */
-	if (d->cfg.old_vector)
-		return -EBUSY;
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
 
-	/* Only try and allocate irqs on cpus that are present */
-	cpumask_clear(d->old_domain);
-	cpumask_clear(searched_cpumask);
-	cpu = cpumask_first_and(mask, cpu_online_mask);
-	while (cpu < nr_cpu_ids) {
-		int new_cpu, offset;
+	lockdep_assert_held(&vector_lock);
 
-		cpumask_copy(vector_cpumask, cpumask_of(cpu));
+	apicd->cfg.dest_apicid = apic->calc_dest_apicid(apicd->cpu);
+	irq_data_update_effective_affinity(irqd, cpumask_of(apicd->cpu));
+	trace_vector_config(irqd->irq, apicd->cfg.vector, apicd->cpu,
+			    apicd->cfg.dest_apicid);
+}
 
-		/*
-		 * Clear the offline cpus from @vector_cpumask for searching
-		 * and verify whether the result overlaps with @mask. If true,
-		 * then the call to apic->cpu_mask_to_apicid() will
-		 * succeed as well. If not, no point in trying to find a
-		 * vector in this mask.
-		 */
-		cpumask_and(vector_searchmask, vector_cpumask, cpu_online_mask);
-		if (!cpumask_intersects(vector_searchmask, mask))
-			goto next_cpu;
-
-		if (cpumask_subset(vector_cpumask, d->domain)) {
-			if (cpumask_equal(vector_cpumask, d->domain))
-				goto success;
-			/*
-			 * Mark the cpus which are not longer in the mask for
-			 * cleanup.
-			 */
-			cpumask_andnot(d->old_domain, d->domain, vector_cpumask);
-			vector = d->cfg.vector;
-			goto update;
-		}
+static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
+			       unsigned int newcpu)
+{
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	struct irq_desc *desc = irq_data_to_desc(irqd);
 
-		vector = current_vector;
-		offset = current_offset;
-next:
-		vector += 16;
-		if (vector >= FIRST_SYSTEM_VECTOR) {
-			offset = (offset + 1) % 16;
-			vector = FIRST_EXTERNAL_VECTOR + offset;
-		}
+	lockdep_assert_held(&vector_lock);
 
-		/* If the search wrapped around, try the next cpu */
-		if (unlikely(current_vector == vector))
-			goto next_cpu;
-
-		if (test_bit(vector, system_vectors))
-			goto next;
-
-		for_each_cpu(new_cpu, vector_searchmask) {
-			if (!IS_ERR_OR_NULL(per_cpu(vector_irq, new_cpu)[vector]))
-				goto next;
-		}
-		/* Found one! */
-		current_vector = vector;
-		current_offset = offset;
-		/* Schedule the old vector for cleanup on all cpus */
-		if (d->cfg.vector)
-			cpumask_copy(d->old_domain, d->domain);
-		for_each_cpu(new_cpu, vector_searchmask)
-			per_cpu(vector_irq, new_cpu)[vector] = irq_to_desc(irq);
-		goto update;
+	trace_vector_update(irqd->irq, newvec, newcpu, apicd->cfg.vector,
+			    apicd->cpu);
 
-next_cpu:
-		/*
-		 * We exclude the current @vector_cpumask from the requested
-		 * @mask and try again with the next online cpu in the
-		 * result. We cannot modify @mask, so we use @vector_cpumask
-		 * as a temporary buffer here as it will be reassigned when
-		 * calling apic->vector_allocation_domain() above.
-		 */
-		cpumask_or(searched_cpumask, searched_cpumask, vector_cpumask);
-		cpumask_andnot(vector_cpumask, mask, searched_cpumask);
-		cpu = cpumask_first_and(vector_cpumask, cpu_online_mask);
-		continue;
+	/* Setup the vector move, if required  */
+	if (apicd->cfg.vector && cpu_online(apicd->cpu)) {
+		apicd->move_in_progress = true;
+		apicd->cfg.old_vector = apicd->cfg.vector;
+		apicd->prev_cpu = apicd->cpu;
+	} else {
+		apicd->cfg.old_vector = 0;
 	}
-	return -ENOSPC;
 
-update:
+	apicd->cfg.vector = newvec;
+	apicd->cpu = newcpu;
+	BUG_ON(!IS_ERR_OR_NULL(per_cpu(vector_irq, newcpu)[newvec]));
+	per_cpu(vector_irq, newcpu)[newvec] = desc;
+}
+
+static int allocate_vector(struct irq_data *irqd, const struct cpumask *dest)
+{
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	int vector = apicd->cfg.vector;
+	unsigned int cpu = apicd->cpu;
+
 	/*
-	 * Exclude offline cpus from the cleanup mask and set the
-	 * move_in_progress flag when the result is not empty.
+	 * If the current target CPU is online and in the new requested
+	 * affinity mask, there is no point in moving the interrupt from
+	 * one CPU to another.
 	 */
-	cpumask_and(d->old_domain, d->old_domain, cpu_online_mask);
-	d->move_in_progress = !cpumask_empty(d->old_domain);
-	d->cfg.old_vector = d->move_in_progress ? d->cfg.vector : 0;
-	d->prev_cpu = d->cpu;
-	d->cfg.vector = vector;
-	cpumask_copy(d->domain, vector_cpumask);
-success:
-	/*
-	 * Cache destination APIC IDs into cfg->dest_apicid. This cannot fail
-	 * as we already established, that mask & d->domain & cpu_online_mask
-	 * is not empty.
-	 *
-	 * vector_searchmask is a subset of d->domain and has the offline
-	 * cpus masked out.
-	 */
-	cpumask_and(vector_searchmask, vector_searchmask, mask);
-	BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, irqd,
-					&d->cfg.dest_apicid));
-	d->cpu = cpumask_first(vector_searchmask);
+	if (vector && cpu_online(cpu) && cpumask_test_cpu(cpu, dest))
+		return 0;
+
+	vector = irq_matrix_alloc(vector_matrix, dest, false, &cpu);
+	if (vector > 0)
+		apic_update_vector(irqd, vector, cpu);
+	trace_vector_alloc(irqd->irq, vector, false, vector);
+	return vector;
+}
+
+static int assign_vector_locked(struct irq_data *irqd,
+				const struct cpumask *dest)
+{
+	int vector = allocate_vector(irqd, dest);
+
+	if (vector < 0)
+		return vector;
+
+	apic_update_irq_cfg(irqd);
 	return 0;
 }
 
-static int assign_irq_vector(int irq, struct apic_chip_data *apicd,
-			     const struct cpumask *mask,
-			     struct irq_data *irqd)
+static int assign_irq_vector(struct irq_data *irqd, const struct cpumask *dest)
 {
-	int err;
 	unsigned long flags;
+	int ret;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
-	err = __assign_irq_vector(irq, apicd, mask, irqd);
+	cpumask_and(vector_searchmask, dest, cpu_online_mask);
+	ret = assign_vector_locked(irqd, vector_searchmask);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
-	return err;
+	return ret;
 }
 
-static int assign_irq_vector_policy(int irq, int node,
-				    struct apic_chip_data *apicd,
-				    struct irq_alloc_info *info,
-				    struct irq_data *irqd)
+static int assign_irq_vector_policy(struct irq_data *irqd,
+				    struct irq_alloc_info *info, int node)
 {
 	if (info->mask)
-		return assign_irq_vector(irq, apicd, info->mask, irqd);
+		return assign_irq_vector(irqd, info->mask);
 	if (node != NUMA_NO_NODE &&
-	    assign_irq_vector(irq, apicd, cpumask_of_node(node), irqd) == 0)
+	    !assign_irq_vector(irqd, cpumask_of_node(node)))
 		return 0;
-	return assign_irq_vector(irq, apicd, cpu_online_mask, irqd);
+	return assign_irq_vector(irqd, cpu_online_mask);
 }
 
-static void clear_irq_vector(int irq, struct apic_chip_data *apicd)
+static void clear_irq_vector(struct irq_data *irqd)
 {
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
 	unsigned int vector = apicd->cfg.vector;
 
+	lockdep_assert_held(&vector_lock);
 	if (!vector)
 		return;
 
+	trace_vector_clear(irqd->irq, vector, apicd->cpu, apicd->cfg.old_vector,
+			   apicd->prev_cpu);
+
 	per_cpu(vector_irq, apicd->cpu)[vector] = VECTOR_UNUSED;
+	irq_matrix_free(vector_matrix, apicd->cpu, vector, false);
 	apicd->cfg.vector = 0;
 
 	/* Clean up move in progress */
@@ -297,6 +226,8 @@ static void clear_irq_vector(int irq, st
 		return;
 
 	per_cpu(vector_irq, apicd->prev_cpu)[vector] = VECTOR_UNUSED;
+	irq_matrix_free(vector_matrix, apicd->prev_cpu, vector, false);
+	apicd->cfg.old_vector = 0;
 	apicd->move_in_progress = 0;
 	hlist_del_init(&apicd->clist);
 }
@@ -313,7 +244,7 @@ static void x86_vector_free_irqs(struct
 		irqd = irq_domain_get_irq_data(x86_vector_domain, virq + i);
 		if (irqd && irqd->chip_data) {
 			raw_spin_lock_irqsave(&vector_lock, flags);
-			clear_irq_vector(virq + i, irqd->chip_data);
+			clear_irq_vector(irqd);
 			apicd = irqd->chip_data;
 			irq_domain_reset_irq_data(irqd);
 			raw_spin_unlock_irqrestore(&vector_lock, flags);
@@ -328,6 +259,7 @@ static int x86_vector_alloc_irqs(struct
 	struct irq_alloc_info *info = arg;
 	struct apic_chip_data *apicd;
 	struct irq_data *irqd;
+	unsigned long flags;
 	int i, err, node;
 
 	if (disable_apic)
@@ -348,23 +280,30 @@ static int x86_vector_alloc_irqs(struct
 			goto error;
 		}
 
+		apicd->irq = virq + i;
 		irqd->chip = &lapic_controller;
 		irqd->chip_data = apicd;
 		irqd->hwirq = virq + i;
 		irqd_set_single_target(irqd);
 		/*
-		 * Make sure, that the legacy to IOAPIC transition stays on
-		 * the same vector. This is required for check_timer() to
-		 * work correctly as it might switch back to legacy mode.
+		 * Legacy vectors are already assigned when the IOAPIC
+		 * takes them over. They stay on the same vector. This is
+		 * required for check_timer() to work correctly as it might
+		 * switch back to legacy mode. Only update the hardware
+		 * config.
 		 */
 		if (info->flags & X86_IRQ_ALLOC_LEGACY) {
 			apicd->cfg.vector = ISA_IRQ_VECTOR(virq + i);
 			apicd->cpu = 0;
-			cpumask_copy(apicd->domain, cpumask_of(0));
+			trace_vector_setup(virq + i, true, 0);
+			raw_spin_lock_irqsave(&vector_lock, flags);
+			apic_update_irq_cfg(irqd);
+			raw_spin_unlock_irqrestore(&vector_lock, flags);
+			continue;
 		}
 
-		err = assign_irq_vector_policy(virq + i, node, apicd, info,
-					       irqd);
+		err = assign_irq_vector_policy(irqd, info, node);
+		trace_vector_setup(virq + i, false, err);
 		if (err)
 			goto error;
 	}
@@ -498,9 +437,7 @@ int __init arch_early_irq_init(void)
 	arch_init_msi_domain(x86_vector_domain);
 	arch_init_htirq_domain(x86_vector_domain);
 
-	BUG_ON(!alloc_cpumask_var(&vector_cpumask, GFP_KERNEL));
 	BUG_ON(!alloc_cpumask_var(&vector_searchmask, GFP_KERNEL));
-	BUG_ON(!alloc_cpumask_var(&searched_cpumask, GFP_KERNEL));
 
 	/*
 	 * Allocate the vector matrix allocator data structure and limit the
@@ -523,8 +460,10 @@ static void vector_update_shutdown_irqs(
 		struct irq_data *irqd = irq_desc_get_irq_data(desc);
 		struct apic_chip_data *ad = apic_chip_data(irqd);
 
-		if (ad && ad->cfg.vector && ad->cpu == smp_processor_id())
-			this_cpu_write(vector_irq[ad->cfg.vector], desc);
+		if (!ad || !ad->cfg.vector || ad->cpu != smp_processor_id())
+			continue;
+		this_cpu_write(vector_irq[ad->cfg.vector], desc);
+		irq_matrix_assign(vector_matrix, ad->cfg.vector);
 	}
 }
 
@@ -600,8 +539,7 @@ void apic_ack_edge(struct irq_data *irqd
 static int apic_set_affinity(struct irq_data *irqd,
 			     const struct cpumask *dest, bool force)
 {
-	struct apic_chip_data *apicd = irqd->chip_data;
-	int err, irq = irqd->irq;
+	int err;
 
 	if (!IS_ENABLED(CONFIG_SMP))
 		return -EPERM;
@@ -609,7 +547,7 @@ static int apic_set_affinity(struct irq_
 	if (!cpumask_intersects(dest, cpu_online_mask))
 		return -EINVAL;
 
-	err = assign_irq_vector(irq, apicd, dest, irqd);
+	err = assign_irq_vector(irqd, dest);
 	return err ? err : IRQ_SET_MASK_OK;
 }
 
@@ -622,6 +560,19 @@ static struct irq_chip lapic_controller
 
 #ifdef CONFIG_SMP
 
+static void free_moved_vector(struct apic_chip_data *apicd)
+{
+	unsigned int vector = apicd->cfg.old_vector;
+	unsigned int cpu = apicd->prev_cpu;
+
+	trace_vector_free_moved(apicd->irq, vector, false);
+	irq_matrix_free(vector_matrix, cpu, vector, false);
+	__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
+	hlist_del_init(&apicd->clist);
+	apicd->cfg.old_vector = 0;
+	apicd->move_in_progress = 0;
+}
+
 asmlinkage __visible void __irq_entry smp_irq_move_cleanup_interrupt(void)
 {
 	struct hlist_head *clhead = this_cpu_ptr(&cleanup_list);
@@ -649,9 +600,7 @@ asmlinkage __visible void __irq_entry sm
 			apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
 			continue;
 		}
-		hlist_del_init(&apicd->clist);
-		__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
-		apicd->cfg.old_vector = 0;
+		free_moved_vector(apicd);
 	}
 
 	raw_spin_unlock(&vector_lock);
@@ -786,12 +735,7 @@ void irq_force_complete_move(struct irq_
 		pr_warn("IRQ fixup: irq %d move in progress, old vector %d\n",
 			irqd->irq, vector);
 	}
-	per_cpu(vector_irq, apicd->prev_cpu)[vector] = VECTOR_UNUSED;
-	/* Cleanup the left overs of the (half finished) move */
-	cpumask_clear(apicd->old_domain);
-	apicd->cfg.old_vector = 0;
-	apicd->move_in_progress = 0;
-	hlist_del_init(&apicd->clist);
+	free_moved_vector(apicd);
 unlock:
 	raw_spin_unlock(&vector_lock);
 }

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 41/52] x86/apic: Remove unused callbacks
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (39 preceding siblings ...)
  2017-09-13 21:29 ` [patch 40/52] x86/vector: Use matrix allocator for vector assignment Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 42/52] x86/vector: Compile SMP only code conditionally Thomas Gleixner
                   ` (12 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic--Remove-unused-callbacks.patch --]
[-- Type: text/plain, Size: 13402 bytes --]

Now that the old allocator is gone, these apic functions are unused. Remove
them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/apic_common.c    |   48 ----------------------------------
 arch/x86/kernel/apic/apic_flat_64.c   |    4 --
 arch/x86/kernel/apic/apic_noop.c      |   10 -------
 arch/x86/kernel/apic/apic_numachip.c  |    4 --
 arch/x86/kernel/apic/bigsmp_32.c      |    2 -
 arch/x86/kernel/apic/probe_32.c       |    2 -
 arch/x86/kernel/apic/x2apic_cluster.c |   48 ----------------------------------
 arch/x86/kernel/apic/x2apic_phys.c    |    2 -
 arch/x86/kernel/apic/x2apic_uv_x.c    |   14 ---------
 arch/x86/kernel/vsmp_64.c             |   19 -------------
 arch/x86/xen/apic.c                   |    2 -
 11 files changed, 155 deletions(-)

--- a/arch/x86/kernel/apic/apic_common.c
+++ b/arch/x86/kernel/apic/apic_common.c
@@ -11,64 +11,16 @@ u32 apic_default_calc_apicid(unsigned in
 	return per_cpu(x86_cpu_to_apicid, cpu);
 }
 
-int default_cpu_mask_to_apicid(const struct cpumask *msk, struct irq_data *irqd,
-			       unsigned int *apicid)
-{
-	unsigned int cpu = cpumask_first(msk);
-
-	if (cpu >= nr_cpu_ids)
-		return -EINVAL;
-	*apicid = per_cpu(x86_cpu_to_apicid, cpu);
-	irq_data_update_effective_affinity(irqd, cpumask_of(cpu));
-	return 0;
-}
-
 u32 apic_flat_calc_apicid(unsigned int cpu)
 {
 	return 1U << cpu;
 }
 
-int flat_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqd,
-			    unsigned int *apicid)
-
-{
-	struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqd);
-	unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS;
-
-	if (!cpu_mask)
-		return -EINVAL;
-	*apicid = (unsigned int)cpu_mask;
-	cpumask_bits(effmsk)[0] = cpu_mask;
-	return 0;
-}
-
 bool default_check_apicid_used(physid_mask_t *map, int apicid)
 {
 	return physid_isset(apicid, *map);
 }
 
-void flat_vector_allocation_domain(int cpu, struct cpumask *retmask,
-				   const struct cpumask *mask)
-{
-	/*
-	 * Careful. Some cpus do not strictly honor the set of cpus
-	 * specified in the interrupt destination when using lowest
-	 * priority interrupt delivery mode.
-	 *
-	 * In particular there was a hyperthreading cpu observed to
-	 * deliver interrupts to the wrong hyperthread when only one
-	 * hyperthread was specified in the interrupt desitination.
-	 */
-	cpumask_clear(retmask);
-	cpumask_bits(retmask)[0] = APIC_ALL_CPUS;
-}
-
-void default_vector_allocation_domain(int cpu, struct cpumask *retmask,
-				      const struct cpumask *mask)
-{
-	cpumask_copy(retmask, cpumask_of(cpu));
-}
-
 void default_ioapic_phys_id_map(physid_mask_t *phys_map, physid_mask_t *retmap)
 {
 	*retmap = *phys_map;
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -158,7 +158,6 @@ static struct apic apic_flat __ro_after_
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= flat_vector_allocation_domain,
 	.init_apic_ldr			= flat_init_apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
@@ -171,7 +170,6 @@ static struct apic apic_flat __ro_after_
 	.get_apic_id			= flat_get_apic_id,
 	.set_apic_id			= set_apic_id,
 
-	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single,
@@ -253,7 +251,6 @@ static struct apic apic_physflat __ro_af
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= default_vector_allocation_domain,
 	/* not needed, but shouldn't hurt: */
 	.init_apic_ldr			= flat_init_apic_ldr,
 
@@ -267,7 +264,6 @@ static struct apic apic_physflat __ro_af
 	.get_apic_id			= flat_get_apic_id,
 	.set_apic_id			= set_apic_id,
 
-	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single_phys,
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -83,14 +83,6 @@ static int noop_apic_id_registered(void)
 	return physid_isset(0, phys_cpu_present_map);
 }
 
-static void noop_vector_allocation_domain(int cpu, struct cpumask *retmask,
-					  const struct cpumask *mask)
-{
-	if (cpu != 0)
-		pr_warning("APIC: Vector allocated for non-BSP cpu\n");
-	cpumask_copy(retmask, cpumask_of(cpu));
-}
-
 static u32 noop_apic_read(u32 reg)
 {
 	WARN_ON_ONCE(boot_cpu_has(X86_FEATURE_APIC) && !disable_apic);
@@ -125,7 +117,6 @@ struct apic apic_noop __ro_after_init =
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= default_check_apicid_used,
 
-	.vector_allocation_domain	= noop_vector_allocation_domain,
 	.init_apic_ldr			= noop_init_apic_ldr,
 
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
@@ -141,7 +132,6 @@ struct apic apic_noop __ro_after_init =
 	.get_apic_id			= noop_get_apic_id,
 	.set_apic_id			= NULL,
 
-	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 	.send_IPI			= noop_send_IPI,
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -253,7 +253,6 @@ static const struct apic apic_numachip1
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= default_vector_allocation_domain,
 	.init_apic_ldr			= flat_init_apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
@@ -266,7 +265,6 @@ static const struct apic apic_numachip1
 	.get_apic_id			= numachip1_get_apic_id,
 	.set_apic_id			= numachip1_set_apic_id,
 
-	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= numachip_send_IPI_one,
@@ -304,7 +302,6 @@ static const struct apic apic_numachip2
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= default_vector_allocation_domain,
 	.init_apic_ldr			= flat_init_apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
@@ -317,7 +314,6 @@ static const struct apic apic_numachip2
 	.get_apic_id			= numachip2_get_apic_id,
 	.set_apic_id			= numachip2_set_apic_id,
 
-	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= numachip_send_IPI_one,
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -158,7 +158,6 @@ static struct apic apic_bigsmp __ro_afte
 	.dest_logical			= 0,
 	.check_apicid_used		= bigsmp_check_apicid_used,
 
-	.vector_allocation_domain	= default_vector_allocation_domain,
 	.init_apic_ldr			= bigsmp_init_apic_ldr,
 
 	.ioapic_phys_id_map		= bigsmp_ioapic_phys_id_map,
@@ -171,7 +170,6 @@ static struct apic apic_bigsmp __ro_afte
 	.get_apic_id			= bigsmp_get_apic_id,
 	.set_apic_id			= NULL,
 
-	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single_phys,
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -113,7 +113,6 @@ static struct apic apic_default __ro_aft
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= default_check_apicid_used,
 
-	.vector_allocation_domain	= flat_vector_allocation_domain,
 	.init_apic_ldr			= default_init_apic_ldr,
 
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map,
@@ -126,7 +125,6 @@ static struct apic apic_default __ro_aft
 	.get_apic_id			= default_get_apic_id,
 	.set_apic_id			= NULL,
 
-	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 	.send_IPI			= default_send_IPI_single,
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -91,29 +91,6 @@ static void x2apic_send_IPI_all(int vect
 	__x2apic_send_IPI_mask(cpu_online_mask, vector, APIC_DEST_ALLINC);
 }
 
-static int
-x2apic_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata,
-			  unsigned int *apicid)
-{
-	struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata);
-	struct cluster_mask *cmsk;
-	unsigned int cpu;
-	u32 dest = 0;
-
-	cpu = cpumask_first(mask);
-	if (cpu >= nr_cpu_ids)
-		return -EINVAL;
-
-	cmsk = per_cpu(cluster_masks, cpu);
-	cpumask_clear(effmsk);
-	for_each_cpu_and(cpu, &cmsk->mask, mask) {
-		dest |= per_cpu(x86_cpu_to_logical_apicid, cpu);
-		cpumask_set_cpu(cpu, effmsk);
-	}
-	*apicid = dest;
-	return 0;
-}
-
 static u32 x2apic_calc_apicid(unsigned int cpu)
 {
 	return per_cpu(x86_cpu_to_logical_apicid, cpu);
@@ -198,29 +175,6 @@ static int x2apic_cluster_probe(void)
 	return 1;
 }
 
-/*
- * Each x2apic cluster is an allocation domain.
- */
-static void cluster_vector_allocation_domain(int cpu, struct cpumask *retmask,
-					     const struct cpumask *mask)
-{
-	struct cluster_mask *cmsk = per_cpu(cluster_masks, cpu);
-
-	/*
-	 * To minimize vector pressure, default case of boot, device bringup
-	 * etc will use a single cpu for the interrupt destination.
-	 *
-	 * On explicit migration requests coming from irqbalance etc,
-	 * interrupts will be routed to the x2apic cluster (cluster-id
-	 * derived from the first cpu in the mask) members specified
-	 * in the mask.
-	 */
-	if (cpumask_equal(mask, cpu_online_mask))
-		cpumask_copy(retmask, cpumask_of(cpu));
-	else
-		cpumask_and(retmask, mask, &cmsk->mask);
-}
-
 static struct apic apic_x2apic_cluster __ro_after_init = {
 
 	.name				= "cluster x2apic",
@@ -236,7 +190,6 @@ static struct apic apic_x2apic_cluster _
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= cluster_vector_allocation_domain,
 	.init_apic_ldr			= init_x2apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
@@ -249,7 +202,6 @@ static struct apic apic_x2apic_cluster _
 	.get_apic_id			= x2apic_get_apic_id,
 	.set_apic_id			= x2apic_set_apic_id,
 
-	.cpu_mask_to_apicid		= x2apic_cpu_mask_to_apicid,
 	.calc_dest_apicid		= x2apic_calc_apicid,
 
 	.send_IPI			= x2apic_send_IPI,
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -151,7 +151,6 @@ static struct apic apic_x2apic_phys __ro
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= default_vector_allocation_domain,
 	.init_apic_ldr			= init_x2apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
@@ -164,7 +163,6 @@ static struct apic apic_x2apic_phys __ro
 	.get_apic_id			= x2apic_get_apic_id,
 	.set_apic_id			= x2apic_set_apic_id,
 
-	.cpu_mask_to_apicid		= default_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_default_calc_apicid,
 
 	.send_IPI			= x2apic_send_IPI,
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -525,18 +525,6 @@ static void uv_init_apic_ldr(void)
 {
 }
 
-static int
-uv_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata,
-		      unsigned int *apicid)
-{
-	int ret = default_cpu_mask_to_apicid(mask, irqdata, apicid);
-
-	if (!ret)
-		*apicid |= uv_apicid_hibits;
-
-	return ret;
-}
-
 static u32 apic_uv_calc_apicid(unsigned int cpu)
 {
 	return apic_default_calc_apicid(cpu) | uv_apicid_hibits;
@@ -593,7 +581,6 @@ static struct apic apic_x2apic_uv_x __ro
 	.dest_logical			= APIC_DEST_LOGICAL,
 	.check_apicid_used		= NULL,
 
-	.vector_allocation_domain	= default_vector_allocation_domain,
 	.init_apic_ldr			= uv_init_apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
@@ -606,7 +593,6 @@ static struct apic apic_x2apic_uv_x __ro
 	.get_apic_id			= x2apic_get_apic_id,
 	.set_apic_id			= set_apic_id,
 
-	.cpu_mask_to_apicid		= uv_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_uv_calc_apicid,
 
 	.send_IPI			= uv_send_IPI_one,
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -26,9 +26,6 @@
 
 #define TOPOLOGY_REGISTER_OFFSET 0x10
 
-/* Flag below is initialized once during vSMP PCI initialization. */
-static int irq_routing_comply = 1;
-
 #if defined CONFIG_PCI && defined CONFIG_PARAVIRT
 /*
  * Interrupt control on vSMPowered systems:
@@ -105,9 +102,6 @@ static void __init set_vsmp_pv_ops(void)
 	if (cap & ctl & BIT(8)) {
 		ctl &= ~BIT(8);
 
-		/* Interrupt routing set to ignore */
-		irq_routing_comply = 0;
-
 #ifdef CONFIG_PROC_FS
 		/* Don't let users change irq affinity via procfs */
 		no_irq_affinity = 1;
@@ -211,23 +205,10 @@ static int apicid_phys_pkg_id(int initia
 	return hard_smp_processor_id() >> index_msb;
 }
 
-/*
- * In vSMP, all cpus should be capable of handling interrupts, regardless of
- * the APIC used.
- */
-static void fill_vector_allocation_domain(int cpu, struct cpumask *retmask,
-					  const struct cpumask *mask)
-{
-	cpumask_setall(retmask);
-}
-
 static void vsmp_apic_post_init(void)
 {
 	/* need to update phys_pkg_id */
 	apic->phys_pkg_id = apicid_phys_pkg_id;
-
-	if (!irq_routing_comply)
-		apic->vector_allocation_domain = fill_vector_allocation_domain;
 }
 
 void __init vsmp_init(void)
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -164,7 +164,6 @@ static struct apic xen_pv_apic = {
 	/* .dest_logical      -  default_send_IPI_ use it but we use our own. */
 	.check_apicid_used		= default_check_apicid_used, /* Used on 32-bit */
 
-	.vector_allocation_domain	= flat_vector_allocation_domain,
 	.init_apic_ldr			= xen_noop, /* setup_local_APIC calls it */
 
 	.ioapic_phys_id_map		= default_ioapic_phys_id_map, /* Used on 32-bit */
@@ -177,7 +176,6 @@ static struct apic xen_pv_apic = {
 	.get_apic_id 			= xen_get_apic_id,
 	.set_apic_id 			= xen_set_apic_id, /* Can be NULL on 32-bit. */
 
-	.cpu_mask_to_apicid		= flat_cpu_mask_to_apicid,
 	.calc_dest_apicid		= apic_flat_calc_apicid,
 
 #ifdef CONFIG_SMP

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 42/52] x86/vector: Compile SMP only code conditionally
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (40 preceding siblings ...)
  2017-09-13 21:29 ` [patch 41/52] x86/apic: Remove unused callbacks Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 43/52] x86/vector: Untangle internal state from irq_cfg Thomas Gleixner
                   ` (11 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Compile-SMP-only-code-conditionally.patch --]
[-- Type: text/plain, Size: 1561 bytes --]

No point in compiling this for UP.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -450,6 +450,7 @@ int __init arch_early_irq_init(void)
 	return arch_early_ioapic_init();
 }
 
+#ifdef CONFIG_SMP
 /* Temporary hack to keep things working */
 static void vector_update_shutdown_irqs(void)
 {
@@ -517,6 +518,25 @@ void lapic_offline(void)
 	unlock_vector_lock();
 }
 
+static int apic_set_affinity(struct irq_data *irqd,
+			     const struct cpumask *dest, bool force)
+{
+	int err;
+
+	if (!IS_ENABLED(CONFIG_SMP))
+		return -EPERM;
+
+	if (!cpumask_intersects(dest, cpu_online_mask))
+		return -EINVAL;
+
+	err = assign_irq_vector(irqd, dest);
+	return err ? err : IRQ_SET_MASK_OK;
+}
+
+#else
+# define apic_set_affinity	NULL
+#endif
+
 static int apic_retrigger_irq(struct irq_data *irqd)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
@@ -536,21 +556,6 @@ void apic_ack_edge(struct irq_data *irqd
 	ack_APIC_irq();
 }
 
-static int apic_set_affinity(struct irq_data *irqd,
-			     const struct cpumask *dest, bool force)
-{
-	int err;
-
-	if (!IS_ENABLED(CONFIG_SMP))
-		return -EPERM;
-
-	if (!cpumask_intersects(dest, cpu_online_mask))
-		return -EINVAL;
-
-	err = assign_irq_vector(irqd, dest);
-	return err ? err : IRQ_SET_MASK_OK;
-}
-
 static struct irq_chip lapic_controller = {
 	.name			= "APIC",
 	.irq_ack		= apic_ack_edge,

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 43/52] x86/vector: Untangle internal state from irq_cfg
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (41 preceding siblings ...)
  2017-09-13 21:29 ` [patch 42/52] x86/vector: Compile SMP only code conditionally Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 44/52] x86/apic/msi: Force reactivation of interrupts at startup time Thomas Gleixner
                   ` (10 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Untangle-internal-state-from-irq_cfg.patch --]
[-- Type: text/plain, Size: 9619 bytes --]

The vector management state is not required to live in irq_cfg. irq_cfg is
only relevant for the dependent irq domains (IOAPIC, DMAR, MSI ...).

The separation of the vector management state allows directing a shut down
interrupt to a special shutdown vector w/o confusing the internal state of
the vector management.

Preparatory change for the rework of managed interrupts and the global
vector reservation scheme.
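
The resulting split looks roughly like this (abbreviated from the patch
below):

    struct irq_cfg {                        /* consumed by IOAPIC/DMAR/MSI */
            unsigned int    dest_apicid;
            unsigned int    vector;
    };

    struct apic_chip_data {
            struct irq_cfg  hw_irq_cfg;     /* what the hardware is told */
            unsigned int    vector;         /* vector management internals */
            unsigned int    prev_vector;
            /* ... */
    };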

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/hw_irq.h |    3 -
 arch/x86/kernel/apic/vector.c |   88 ++++++++++++++++++++++--------------------
 2 files changed, 49 insertions(+), 42 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -124,8 +124,7 @@ struct irq_alloc_info {
 
 struct irq_cfg {
 	unsigned int		dest_apicid;
-	u8			vector;
-	u8			old_vector;
+	unsigned int		vector;
 };
 
 extern struct irq_cfg *irq_cfg(unsigned int irq);
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -25,7 +25,9 @@
 #include <asm/trace/irq_vectors.h>
 
 struct apic_chip_data {
-	struct irq_cfg		cfg;
+	struct irq_cfg		hw_irq_cfg;
+	unsigned int		vector;
+	unsigned int		prev_vector;
 	unsigned int		cpu;
 	unsigned int		prev_cpu;
 	unsigned int		irq;
@@ -86,7 +88,7 @@ struct irq_cfg *irqd_cfg(struct irq_data
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
 
-	return apicd ? &apicd->cfg : NULL;
+	return apicd ? &apicd->hw_irq_cfg : NULL;
 }
 EXPORT_SYMBOL_GPL(irqd_cfg);
 
@@ -110,16 +112,18 @@ static void free_apic_chip_data(struct a
 	kfree(apicd);
 }
 
-static void apic_update_irq_cfg(struct irq_data *irqd)
+static void apic_update_irq_cfg(struct irq_data *irqd, unsigned int vector,
+				unsigned int cpu)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
 
 	lockdep_assert_held(&vector_lock);
 
-	apicd->cfg.dest_apicid = apic->calc_dest_apicid(apicd->cpu);
-	irq_data_update_effective_affinity(irqd, cpumask_of(apicd->cpu));
-	trace_vector_config(irqd->irq, apicd->cfg.vector, apicd->cpu,
-			    apicd->cfg.dest_apicid);
+	apicd->hw_irq_cfg.vector = vector;
+	apicd->hw_irq_cfg.dest_apicid = apic->calc_dest_apicid(cpu);
+	irq_data_update_effective_affinity(irqd, cpumask_of(cpu));
+	trace_vector_config(irqd->irq, vector, cpu,
+			    apicd->hw_irq_cfg.dest_apicid);
 }
 
 static void apic_update_vector(struct irq_data *irqd, unsigned int newvec,
@@ -130,19 +134,19 @@ static void apic_update_vector(struct ir
 
 	lockdep_assert_held(&vector_lock);
 
-	trace_vector_update(irqd->irq, newvec, newcpu, apicd->cfg.vector,
+	trace_vector_update(irqd->irq, newvec, newcpu, apicd->vector,
 			    apicd->cpu);
 
 	/* Setup the vector move, if required  */
-	if (apicd->cfg.vector && cpu_online(apicd->cpu)) {
+	if (apicd->vector && cpu_online(apicd->cpu)) {
 		apicd->move_in_progress = true;
-		apicd->cfg.old_vector = apicd->cfg.vector;
+		apicd->prev_vector = apicd->vector;
 		apicd->prev_cpu = apicd->cpu;
 	} else {
-		apicd->cfg.old_vector = 0;
+		apicd->prev_vector = 0;
 	}
 
-	apicd->cfg.vector = newvec;
+	apicd->vector = newvec;
 	apicd->cpu = newcpu;
 	BUG_ON(!IS_ERR_OR_NULL(per_cpu(vector_irq, newcpu)[newvec]));
 	per_cpu(vector_irq, newcpu)[newvec] = desc;
@@ -151,8 +155,10 @@ static void apic_update_vector(struct ir
 static int allocate_vector(struct irq_data *irqd, const struct cpumask *dest)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
-	int vector = apicd->cfg.vector;
 	unsigned int cpu = apicd->cpu;
+	int vector = apicd->vector;
+
+	lockdep_assert_held(&vector_lock);
 
 	/*
 	 * If the current target CPU is online and in the new requested
@@ -172,12 +178,13 @@ static int allocate_vector(struct irq_da
 static int assign_vector_locked(struct irq_data *irqd,
 				const struct cpumask *dest)
 {
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
 	int vector = allocate_vector(irqd, dest);
 
 	if (vector < 0)
 		return vector;
 
-	apic_update_irq_cfg(irqd);
+	apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu);
 	return 0;
 }
 
@@ -207,27 +214,28 @@ static int assign_irq_vector_policy(stru
 static void clear_irq_vector(struct irq_data *irqd)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
-	unsigned int vector = apicd->cfg.vector;
+	unsigned int vector = apicd->vector;
 
 	lockdep_assert_held(&vector_lock);
+
 	if (!vector)
 		return;
 
-	trace_vector_clear(irqd->irq, vector, apicd->cpu, apicd->cfg.old_vector,
+	trace_vector_clear(irqd->irq, vector, apicd->cpu, apicd->prev_vector,
 			   apicd->prev_cpu);
 
 	per_cpu(vector_irq, apicd->cpu)[vector] = VECTOR_UNUSED;
 	irq_matrix_free(vector_matrix, apicd->cpu, vector, false);
-	apicd->cfg.vector = 0;
+	apicd->vector = 0;
 
 	/* Clean up move in progress */
-	vector = apicd->cfg.old_vector;
+	vector = apicd->prev_vector;
 	if (!vector)
 		return;
 
 	per_cpu(vector_irq, apicd->prev_cpu)[vector] = VECTOR_UNUSED;
 	irq_matrix_free(vector_matrix, apicd->prev_cpu, vector, false);
-	apicd->cfg.old_vector = 0;
+	apicd->prev_vector = 0;
 	apicd->move_in_progress = 0;
 	hlist_del_init(&apicd->clist);
 }
@@ -293,11 +301,11 @@ static int x86_vector_alloc_irqs(struct
 		 * config.
 		 */
 		if (info->flags & X86_IRQ_ALLOC_LEGACY) {
-			apicd->cfg.vector = ISA_IRQ_VECTOR(virq + i);
+			apicd->vector = ISA_IRQ_VECTOR(virq + i);
 			apicd->cpu = 0;
 			trace_vector_setup(virq + i, true, 0);
 			raw_spin_lock_irqsave(&vector_lock, flags);
-			apic_update_irq_cfg(irqd);
+			apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu);
 			raw_spin_unlock_irqrestore(&vector_lock, flags);
 			continue;
 		}
@@ -319,7 +327,7 @@ static int x86_vector_alloc_irqs(struct
 void x86_vector_debug_show(struct seq_file *m, struct irq_domain *d,
 			   struct irq_data *irqd, int ind)
 {
-	unsigned int cpu, vec, prev_cpu, prev_vec;
+	unsigned int cpu, vector, prev_cpu, prev_vector;
 	struct apic_chip_data *apicd;
 	unsigned long flags;
 	int irq;
@@ -344,14 +352,14 @@ void x86_vector_debug_show(struct seq_fi
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
 	cpu = apicd->cpu;
-	vec = apicd->cfg.vector;
+	vector = apicd->vector;
 	prev_cpu = apicd->prev_cpu;
-	prev_vec = apicd->cfg.old_vector;
+	prev_vector = apicd->prev_vector;
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
-	seq_printf(m, "%*sVector: %5u\n", ind, "", vec);
+	seq_printf(m, "%*sVector: %5u\n", ind, "", vector);
 	seq_printf(m, "%*sTarget: %5u\n", ind, "", cpu);
-	if (prev_vec) {
-		seq_printf(m, "%*sPrevious vector: %5u\n", ind, "", prev_vec);
+	if (prev_vector) {
+		seq_printf(m, "%*sPrevious vector: %5u\n", ind, "", prev_vector);
 		seq_printf(m, "%*sPrevious target: %5u\n", ind, "", prev_cpu);
 	}
 }
@@ -461,10 +469,10 @@ static void vector_update_shutdown_irqs(
 		struct irq_data *irqd = irq_desc_get_irq_data(desc);
 		struct apic_chip_data *ad = apic_chip_data(irqd);
 
-		if (!ad || !ad->cfg.vector || ad->cpu != smp_processor_id())
+		if (!ad || !ad->vector || ad->cpu != smp_processor_id())
 			continue;
-		this_cpu_write(vector_irq[ad->cfg.vector], desc);
-		irq_matrix_assign(vector_matrix, ad->cfg.vector);
+		this_cpu_write(vector_irq[ad->vector], desc);
+		irq_matrix_assign(vector_matrix, ad->vector);
 	}
 }
 
@@ -543,7 +551,7 @@ static int apic_retrigger_irq(struct irq
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
-	apic->send_IPI(apicd->cpu, apicd->cfg.vector);
+	apic->send_IPI(apicd->cpu, apicd->vector);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 
 	return 1;
@@ -567,14 +575,14 @@ static struct irq_chip lapic_controller
 
 static void free_moved_vector(struct apic_chip_data *apicd)
 {
-	unsigned int vector = apicd->cfg.old_vector;
+	unsigned int vector = apicd->prev_vector;
 	unsigned int cpu = apicd->prev_cpu;
 
 	trace_vector_free_moved(apicd->irq, vector, false);
 	irq_matrix_free(vector_matrix, cpu, vector, false);
 	__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
 	hlist_del_init(&apicd->clist);
-	apicd->cfg.old_vector = 0;
+	apicd->prev_vector = 0;
 	apicd->move_in_progress = 0;
 }
 
@@ -589,7 +597,7 @@ asmlinkage __visible void __irq_entry sm
 	raw_spin_lock(&vector_lock);
 
 	hlist_for_each_entry_safe(apicd, tmp, clhead, clist) {
-		unsigned int irr, vector = apicd->cfg.old_vector;
+		unsigned int irr, vector = apicd->prev_vector;
 
 		/*
 		 * Paranoia: Check if the vector that needs to be cleaned
@@ -623,7 +631,7 @@ static void __send_cleanup_vector(struct
 		hlist_add_head(&apicd->clist, per_cpu_ptr(&cleanup_list, cpu));
 		apic->send_IPI(cpu, IRQ_MOVE_CLEANUP_VECTOR);
 	} else {
-		apicd->cfg.old_vector = 0;
+		apicd->prev_vector = 0;
 	}
 	raw_spin_unlock(&vector_lock);
 }
@@ -632,7 +640,7 @@ void send_cleanup_vector(struct irq_cfg
 {
 	struct apic_chip_data *apicd;
 
-	apicd = container_of(cfg, struct apic_chip_data, cfg);
+	apicd = container_of(cfg, struct apic_chip_data, hw_irq_cfg);
 	if (apicd->move_in_progress)
 		__send_cleanup_vector(apicd);
 }
@@ -641,11 +649,11 @@ static void __irq_complete_move(struct i
 {
 	struct apic_chip_data *apicd;
 
-	apicd = container_of(cfg, struct apic_chip_data, cfg);
+	apicd = container_of(cfg, struct apic_chip_data, hw_irq_cfg);
 	if (likely(!apicd->move_in_progress))
 		return;
 
-	if (vector == apicd->cfg.vector && apicd->cpu == smp_processor_id())
+	if (vector == apicd->vector && apicd->cpu == smp_processor_id())
 		__send_cleanup_vector(apicd);
 }
 
@@ -683,9 +691,9 @@ void irq_force_complete_move(struct irq_
 		goto unlock;
 
 	/*
-	 * If old_vector is empty, no action required.
+	 * If prev_vector is empty, no action required.
 	 */
-	vector = apicd->cfg.old_vector;
+	vector = apicd->prev_vector;
 	if (!vector)
 		goto unlock;
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 44/52] x86/apic/msi: Force reactivation of interrupts at startup time
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (42 preceding siblings ...)
  2017-09-13 21:29 ` [patch 43/52] x86/vector: Untangle internal state from irq_cfg Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29   ` Thomas Gleixner
                   ` (9 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-apic-msi--Force-reactivation-of-interrupts-at-startup-time.patch --]
[-- Type: text/plain, Size: 1369 bytes --]

MSI(X) interrupts need a valid vector configuration early at allocation
time, i.e. before the PCI core enables MSI(X).

With managed interrupts and the new global reservation scheme, the early
configuration will not assign a real device vector, but a special shutdown
vector. When the irq is started up, the interrupt must be reconfigured.
Tell the MSI irqdomain core about it.
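
The intended lifecycle then is (conceptual sketch based on the above, not
actual core code):

    /*
     *   alloc / early activation  -> special shutdown vector programmed
     *   request_irq() / startup   -> interrupt reactivated, real vector
     *                                assigned and written to the device
     */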

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/msi.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -129,7 +129,7 @@ static struct msi_domain_ops pci_msi_dom
 
 static struct msi_domain_info pci_msi_domain_info = {
 	.flags		= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
-			  MSI_FLAG_PCI_MSIX,
+			  MSI_FLAG_PCI_MSIX | MSI_FLAG_MUST_REACTIVATE,
 	.ops		= &pci_msi_domain_ops,
 	.chip		= &pci_msi_controller,
 	.handler	= handle_edge_irq,
@@ -167,7 +167,8 @@ static struct irq_chip pci_msi_ir_contro
 
 static struct msi_domain_info pci_msi_ir_domain_info = {
 	.flags		= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
-			  MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX,
+			  MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX |
+			  MSI_FLAG_MUST_REACTIVATE,
 	.ops		= &pci_msi_domain_ops,
 	.chip		= &pci_msi_ir_controller,
 	.handler	= handle_edge_irq,

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 45/52] iommu/vt-d: Reevaluate vector configuration on activate()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
@ 2017-09-13 21:29   ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors Thomas Gleixner
                     ` (52 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven,
	iommu

[-- Attachment #1: iommu-vt-d--Reevaluate-vector-configuration-on-activate--.patch --]
[-- Type: text/plain, Size: 2784 bytes --]

With the upcoming reservation/management scheme, early activation will
assign a special vector. The final activation at request_irq() assigns a
real vector, which needs to be updated in the tables.

Split out the reconfiguration code from set_affinity() and use it for
reactivation.
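
The two call sites introduced below differ only in the force argument
(summarized from the patch):

    intel_ir_set_affinity()        -> intel_ir_reconfigure_irte(data, false);
    intel_irq_remapping_activate() -> intel_ir_reconfigure_irte(irq_data, true);

    /*
     * With force == true (activation) the IRTE is written to hardware only
     * when the interrupt is in remapped mode; with force == false (affinity
     * change) it is written unconditionally.
     */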

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/intel_irq_remapping.c |   38 +++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1121,6 +1121,24 @@ struct irq_remap_ops intel_irq_remap_ops
 	.get_irq_domain		= intel_get_irq_domain,
 };
 
+static void intel_ir_reconfigure_irte(struct irq_data *irqd, bool force)
+{
+	struct intel_ir_data *ir_data = irqd->chip_data;
+	struct irte *irte = &ir_data->irte_entry;
+	struct irq_cfg *cfg = irqd_cfg(irqd);
+
+	/*
+	 * Atomically updates the IRTE with the new destination, vector
+	 * and flushes the interrupt entry cache.
+	 */
+	irte->vector = cfg->vector;
+	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
+
+	/* Update the hardware only if the interrupt is in remapped mode. */
+	if (!force || ir_data->irq_2_iommu.mode == IRQ_REMAPPING)
+		modify_irte(&ir_data->irq_2_iommu, irte);
+}
+
 /*
  * Migrate the IO-APIC irq in the presence of intr-remapping.
  *
@@ -1139,27 +1157,15 @@ static int
 intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask,
 		      bool force)
 {
-	struct intel_ir_data *ir_data = data->chip_data;
-	struct irte *irte = &ir_data->irte_entry;
-	struct irq_cfg *cfg = irqd_cfg(data);
 	struct irq_data *parent = data->parent_data;
+	struct irq_cfg *cfg = irqd_cfg(data);
 	int ret;
 
 	ret = parent->chip->irq_set_affinity(parent, mask, force);
 	if (ret < 0 || ret == IRQ_SET_MASK_OK_DONE)
 		return ret;
 
-	/*
-	 * Atomically updates the IRTE with the new destination, vector
-	 * and flushes the interrupt entry cache.
-	 */
-	irte->vector = cfg->vector;
-	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
-
-	/* Update the hardware only if the interrupt is in remapped mode. */
-	if (ir_data->irq_2_iommu.mode == IRQ_REMAPPING)
-		modify_irte(&ir_data->irq_2_iommu, irte);
-
+	intel_ir_reconfigure_irte(data, false);
 	/*
 	 * After this point, all the interrupts will start arriving
 	 * at the new destination. So, time to cleanup the previous
@@ -1392,9 +1398,7 @@ static void intel_irq_remapping_free(str
 static int intel_irq_remapping_activate(struct irq_domain *domain,
 					struct irq_data *irq_data, bool early)
 {
-	struct intel_ir_data *data = irq_data->chip_data;
-
-	modify_irte(&data->irq_2_iommu, &data->irte_entry);
+	intel_ir_reconfigure_irte(irq_data, true);
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 46/52] iommu/amd: Reevaluate vector configuration on activate()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
@ 2017-09-13 21:29   ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors Thomas Gleixner
                     ` (52 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven,
	iommu

[-- Attachment #1: iommu-amd--Reevaluate-vector-configuration-on-activate--.patch --]
[-- Type: text/plain, Size: 2821 bytes --]

With the upcoming reservation/management scheme, early activation will
assign a special vector. The final activation at request_irq() assigns a
real vector, which needs to be updated in the tables.

Split out the reconfiguration code from set_affinity() and use it for
reactivation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/amd_iommu.c |   39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4170,16 +4170,25 @@ static void irq_remapping_free(struct ir
 	irq_domain_free_irqs_common(domain, virq, nr_irqs);
 }
 
+static void amd_ir_update_irte(struct irq_data *irqd, struct amd_iommu *iommu,
+			       struct amd_ir_data *ir_data,
+			       struct irq_2_irte *irte_info,
+			       struct irq_cfg *cfg);
+
 static int irq_remapping_activate(struct irq_domain *domain,
 				  struct irq_data *irq_data, bool early)
 {
 	struct amd_ir_data *data = irq_data->chip_data;
 	struct irq_2_irte *irte_info = &data->irq_2_irte;
 	struct amd_iommu *iommu = amd_iommu_rlookup_table[irte_info->devid];
+	struct irq_cfg *cfg = irqd_cfg(irq_data);
 
-	if (iommu)
-		iommu->irte_ops->activate(data->entry, irte_info->devid,
-					  irte_info->index);
+	if (!iommu)
+		return 0;
+
+	iommu->irte_ops->activate(data->entry, irte_info->devid,
+				  irte_info->index);
+	amd_ir_update_irte(irq_data, iommu, data, irte_info, cfg);
 	return 0;
 }
 
@@ -4267,6 +4276,22 @@ static int amd_ir_set_vcpu_affinity(stru
 	return modify_irte_ga(irte_info->devid, irte_info->index, irte, ir_data);
 }
 
+
+static void amd_ir_update_irte(struct irq_data *irqd, struct amd_iommu *iommu,
+			       struct amd_ir_data *ir_data,
+			       struct irq_2_irte *irte_info,
+			       struct irq_cfg *cfg)
+{
+
+	/*
+	 * Atomically updates the IRTE with the new destination, vector
+	 * and flushes the interrupt entry cache.
+	 */
+	iommu->irte_ops->set_affinity(ir_data->entry, irte_info->devid,
+				      irte_info->index, cfg->vector,
+				      cfg->dest_apicid);
+}
+
 static int amd_ir_set_affinity(struct irq_data *data,
 			       const struct cpumask *mask, bool force)
 {
@@ -4284,13 +4309,7 @@ static int amd_ir_set_affinity(struct ir
 	if (ret < 0 || ret == IRQ_SET_MASK_OK_DONE)
 		return ret;
 
-	/*
-	 * Atomically updates the IRTE with the new destination, vector
-	 * and flushes the interrupt entry cache.
-	 */
-	iommu->irte_ops->set_affinity(ir_data->entry, irte_info->devid,
-			    irte_info->index, cfg->vector, cfg->dest_apicid);
-
+	amd_ir_update_irte(data, iommu, ir_data, irte_info, cfg);
 	/*
 	 * After this point, all the interrupts will start arriving
 	 * at the new destination. So, time to cleanup the previous

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 47/52] x86/io_apic: Reevaluate vector configuration on activate()
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (45 preceding siblings ...)
  2017-09-13 21:29   ` Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 48/52] x86/vector: Handle managed interrupts proper Thomas Gleixner
                   ` (6 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-io_apic--Reevaluate-vector-configuration-on-activate--.patch --]
[-- Type: text/plain, Size: 2547 bytes --]

With the upcoming reservation/management scheme, early activation will
assign a special vector. The final activation at request_irq() assigns a
real vector, which needs to be updated in the ioapic.

Split out the reconfiguration code from set_affinity() and use it for
reactivation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/io_apic.c |   37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1821,26 +1821,36 @@ static void ioapic_ir_ack_level(struct i
 	eoi_ioapic_pin(data->entry.vector, data);
 }
 
+static void ioapic_configure_entry(struct irq_data *irqd)
+{
+	struct mp_chip_data *mpd = irqd->chip_data;
+	struct irq_cfg *cfg = irqd_cfg(irqd);
+	struct irq_pin_list *entry;
+
+	/*
+	 * Only update when the parent is the vector domain, don't touch it
+	 * if the parent is the remapping domain. Check the installed
+	 * ioapic chip to verify that.
+	 */
+	if (irqd->chip == &ioapic_chip) {
+		mpd->entry.dest = cfg->dest_apicid;
+		mpd->entry.vector = cfg->vector;
+	}
+	for_each_irq_pin(entry, mpd->irq_2_pin)
+		__ioapic_write_entry(entry->apic, entry->pin, mpd->entry);
+}
+
 static int ioapic_set_affinity(struct irq_data *irq_data,
 			       const struct cpumask *mask, bool force)
 {
 	struct irq_data *parent = irq_data->parent_data;
-	struct mp_chip_data *data = irq_data->chip_data;
-	struct irq_pin_list *entry;
-	struct irq_cfg *cfg;
 	unsigned long flags;
 	int ret;
 
 	ret = parent->chip->irq_set_affinity(parent, mask, force);
 	raw_spin_lock_irqsave(&ioapic_lock, flags);
-	if (ret >= 0 && ret != IRQ_SET_MASK_OK_DONE) {
-		cfg = irqd_cfg(irq_data);
-		data->entry.dest = cfg->dest_apicid;
-		data->entry.vector = cfg->vector;
-		for_each_irq_pin(entry, data->irq_2_pin)
-			__ioapic_write_entry(entry->apic, entry->pin,
-					     data->entry);
-	}
+	if (ret >= 0 && ret != IRQ_SET_MASK_OK_DONE)
+		ioapic_configure_entry(irq_data);
 	raw_spin_unlock_irqrestore(&ioapic_lock, flags);
 
 	return ret;
@@ -2939,12 +2949,9 @@ int mp_irqdomain_activate(struct irq_dom
 			  struct irq_data *irq_data, bool early)
 {
 	unsigned long flags;
-	struct irq_pin_list *entry;
-	struct mp_chip_data *data = irq_data->chip_data;
 
 	raw_spin_lock_irqsave(&ioapic_lock, flags);
-	for_each_irq_pin(entry, data->irq_2_pin)
-		__ioapic_write_entry(entry->apic, entry->pin, data->entry);
+	ioapic_configure_entry(irq_data);
 	raw_spin_unlock_irqrestore(&ioapic_lock, flags);
 	return 0;
 }

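For reference, the parent-domain check in ioapic_configure_entry() can be
restated as a tiny stand-alone predicate (the struct layouts are
simplified stand-ins; only the pointer comparison matters):

	struct irq_chip;
	struct irq_data { const struct irq_chip *chip; };
	extern const struct irq_chip ioapic_chip;

	/*
	 * The RTE carries the real vector/destination only when the
	 * parent is the vector domain, i.e. the plain IO-APIC chip is
	 * installed. Behind the remapping domain those fields encode
	 * the remap handle and must not be overwritten.
	 */
	static bool rte_holds_vector(const struct irq_data *irqd)
	{
		return irqd->chip == &ioapic_chip;
	}
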
^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 48/52] x86/vector: Handle managed interrupts proper
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (46 preceding siblings ...)
  2017-09-13 21:29 ` [patch 47/52] x86/io_apic: " Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 49/52] x86/vector/msi: Switch to global reservation mode Thomas Gleixner
                   ` (5 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Support-managed-interrupts-proper.patch --]
[-- Type: text/plain, Size: 9970 bytes --]

Managed interrupts need to reserve interrupt vectors permanently, but as
long as the interrupt is deactivated, the vector should not be active.

Reserve a new system vector which is used to initialize MSI/DMAR/IOAPIC
entries at allocation time. At that point the interrupts are disabled in
the corresponding MSI/DMAR/IOAPIC devices, so the vector should never be
sent to any CPU.

When the managed interrupt is started up, a real vector is assigned from
the managed vector space and configured in MSI/DMAR/IOAPIC.

This allows a clear separation of inactive and active modes and
simplifies the final decision whether the global vector space is
sufficient for CPU offline operations.

The vector space can be reserved even on offline CPUs and will survive CPU
offline/online operations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/irq_vectors.h |    8 -
 arch/x86/kernel/apic/vector.c      |  190 +++++++++++++++++++++++++++++++++----
 2 files changed, 174 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -101,12 +101,8 @@
 #define POSTED_INTR_NESTED_VECTOR	0xf0
 #endif
 
-/*
- * Local APIC timer IRQ vector is on a different priority level,
- * to work around the 'lost local interrupt if more than 2 IRQ
- * sources per level' errata.
- */
-#define LOCAL_TIMER_VECTOR		0xef
+#define MANAGED_IRQ_SHUTDOWN_VECTOR	0xef
+#define LOCAL_TIMER_VECTOR		0xee
 
 #define NR_VECTORS			 256
 
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -32,7 +32,8 @@ struct apic_chip_data {
 	unsigned int		prev_cpu;
 	unsigned int		irq;
 	struct hlist_node	clist;
-	u8			move_in_progress : 1;
+	unsigned int		move_in_progress	: 1,
+				is_managed		: 1;
 };
 
 struct irq_domain *x86_vector_domain;
@@ -152,6 +153,28 @@ static void apic_update_vector(struct ir
 	per_cpu(vector_irq, newcpu)[newvec] = desc;
 }
 
+static void vector_assign_managed_shutdown(struct irq_data *irqd)
+{
+	unsigned int cpu = cpumask_first(cpu_online_mask);
+
+	apic_update_irq_cfg(irqd, MANAGED_IRQ_SHUTDOWN_VECTOR, cpu);
+}
+
+static int reserve_managed_vector(struct irq_data *irqd)
+{
+	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	unsigned long flags;
+	int ret;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	apicd->is_managed = true;
+	ret = irq_matrix_reserve_managed(vector_matrix, affmsk);
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+	trace_vector_reserve_managed(irqd->irq, ret);
+	return ret;
+}
+
 static int allocate_vector(struct irq_data *irqd, const struct cpumask *dest)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
@@ -200,20 +223,65 @@ static int assign_irq_vector(struct irq_
 	return ret;
 }
 
-static int assign_irq_vector_policy(struct irq_data *irqd,
-				    struct irq_alloc_info *info, int node)
+static int assign_irq_vector_any_locked(struct irq_data *irqd)
+{
+	int node = irq_data_get_node(irqd);
+
+	if (node != NUMA_NO_NODE) {
+		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
+			return 0;
+	}
+	return assign_vector_locked(irqd, cpu_online_mask);
+}
+
+static int assign_irq_vector_any(struct irq_data *irqd)
+{
+	unsigned long flags;
+	int ret;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	ret = assign_irq_vector_any_locked(irqd);
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+	return ret;
+}
+
+static int
+assign_irq_vector_policy(struct irq_data *irqd, struct irq_alloc_info *info)
 {
+	if (irqd_affinity_is_managed(irqd))
+		return reserve_managed_vector(irqd);
 	if (info->mask)
 		return assign_irq_vector(irqd, info->mask);
-	if (node != NUMA_NO_NODE &&
-	    !assign_irq_vector(irqd, cpumask_of_node(node)))
+	return assign_irq_vector_any(irqd);
+}
+
+static int
+assign_managed_vector(struct irq_data *irqd, const struct cpumask *dest)
+{
+	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	int vector, cpu;
+
+	cpumask_and(vector_searchmask, vector_searchmask, affmsk);
+	cpu = cpumask_first(vector_searchmask);
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+	/* set_affinity might call here for nothing */
+	if (apicd->vector && cpumask_test_cpu(apicd->cpu, vector_searchmask))
 		return 0;
-	return assign_irq_vector(irqd, cpu_online_mask);
+	vector = irq_matrix_alloc_managed(vector_matrix, cpu);
+	trace_vector_alloc_managed(irqd->irq, vector, vector);
+	if (vector < 0)
+		return vector;
+	apic_update_vector(irqd, vector, cpu);
+	apic_update_irq_cfg(irqd, vector, cpu);
+	return 0;
 }
 
 static void clear_irq_vector(struct irq_data *irqd)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	bool managed = irqd_affinity_is_managed(irqd);
 	unsigned int vector = apicd->vector;
 
 	lockdep_assert_held(&vector_lock);
@@ -225,7 +293,7 @@ static void clear_irq_vector(struct irq_
 			   apicd->prev_cpu);
 
 	per_cpu(vector_irq, apicd->cpu)[vector] = VECTOR_UNUSED;
-	irq_matrix_free(vector_matrix, apicd->cpu, vector, false);
+	irq_matrix_free(vector_matrix, apicd->cpu, vector, managed);
 	apicd->vector = 0;
 
 	/* Clean up move in progress */
@@ -234,12 +302,86 @@ static void clear_irq_vector(struct irq_
 		return;
 
 	per_cpu(vector_irq, apicd->prev_cpu)[vector] = VECTOR_UNUSED;
-	irq_matrix_free(vector_matrix, apicd->prev_cpu, vector, false);
+	irq_matrix_free(vector_matrix, apicd->prev_cpu, vector, managed);
 	apicd->prev_vector = 0;
 	apicd->move_in_progress = 0;
 	hlist_del_init(&apicd->clist);
 }
 
+static void x86_vector_deactivate(struct irq_domain *dom, struct irq_data *irqd)
+{
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	unsigned long flags;
+
+	trace_vector_deactivate(irqd->irq, apicd->is_managed,
+				false, false);
+
+	if (apicd->is_managed)
+		return;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	clear_irq_vector(irqd);
+	vector_assign_managed_shutdown(irqd);
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+}
+
+static int activate_managed(struct irq_data *irqd)
+{
+	const struct cpumask *dest = irq_data_get_affinity_mask(irqd);
+	int ret;
+
+	cpumask_and(vector_searchmask, dest, cpu_online_mask);
+	if (WARN_ON_ONCE(cpumask_empty(vector_searchmask))) {
+		/* Something in the core code broke! Survive gracefully */
+		pr_err("Managed startup for irq %u, but no CPU\n", irqd->irq);
+		return -EINVAL;
+	}
+
+	ret = assign_managed_vector(irqd, vector_searchmask);
+	/*
+	 * This should not happen. The vector reservation got buggered.  Handle
+	 * it gracefully.
+	 */
+	if (WARN_ON_ONCE(ret < 0)) {
+		pr_err("Managed startup irq %u, no vector available\n",
+		       irqd->irq);
+	}
+	return ret;
+}
+
+static int x86_vector_activate(struct irq_domain *dom, struct irq_data *irqd,
+			       bool early)
+{
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	unsigned long flags;
+	int ret = 0;
+
+	trace_vector_activate(irqd->irq, apicd->is_managed,
+				false, early);
+
+	if (!apicd->is_managed)
+		return 0;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	if (early || irqd_is_managed_and_shutdown(irqd))
+		vector_assign_managed_shutdown(irqd);
+	else
+		ret = activate_managed(irqd);
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+	return ret;
+}
+
+static void vector_free_reserved_and_managed(struct irq_data *irqd)
+{
+	const struct cpumask *dest = irq_data_get_affinity_mask(irqd);
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+
+	trace_vector_teardown(irqd->irq, apicd->is_managed, false);
+
+	if (apicd->is_managed)
+		irq_matrix_remove_managed(vector_matrix, dest);
+}
+
 static void x86_vector_free_irqs(struct irq_domain *domain,
 				 unsigned int virq, unsigned int nr_irqs)
 {
@@ -253,6 +395,7 @@ static void x86_vector_free_irqs(struct
 		if (irqd && irqd->chip_data) {
 			raw_spin_lock_irqsave(&vector_lock, flags);
 			clear_irq_vector(irqd);
+			vector_free_reserved_and_managed(irqd);
 			apicd = irqd->chip_data;
 			irq_domain_reset_irq_data(irqd);
 			raw_spin_unlock_irqrestore(&vector_lock, flags);
@@ -310,7 +453,7 @@ static int x86_vector_alloc_irqs(struct
 			continue;
 		}
 
-		err = assign_irq_vector_policy(irqd, info, node);
+		err = assign_irq_vector_policy(irqd, info);
 		trace_vector_setup(virq + i, false, err);
 		if (err)
 			goto error;
@@ -368,6 +511,8 @@ void x86_vector_debug_show(struct seq_fi
 static const struct irq_domain_ops x86_vector_domain_ops = {
 	.alloc		= x86_vector_alloc_irqs,
 	.free		= x86_vector_free_irqs,
+	.activate	= x86_vector_activate,
+	.deactivate	= x86_vector_deactivate,
 #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
 	.debug_show	= x86_vector_debug_show,
 #endif
@@ -531,13 +676,13 @@ static int apic_set_affinity(struct irq_
 {
 	int err;
 
-	if (!IS_ENABLED(CONFIG_SMP))
-		return -EPERM;
-
-	if (!cpumask_intersects(dest, cpu_online_mask))
-		return -EINVAL;
-
-	err = assign_irq_vector(irqd, dest);
+	raw_spin_lock(&vector_lock);
+	cpumask_and(vector_searchmask, dest, cpu_online_mask);
+	if (irqd_affinity_is_managed(irqd))
+		err = assign_managed_vector(irqd, vector_searchmask);
+	else
+		err = assign_vector_locked(irqd, vector_searchmask);
+	raw_spin_unlock(&vector_lock);
 	return err ? err : IRQ_SET_MASK_OK;
 }
 
@@ -577,9 +722,18 @@ static void free_moved_vector(struct api
 {
 	unsigned int vector = apicd->prev_vector;
 	unsigned int cpu = apicd->prev_cpu;
+	bool managed = apicd->is_managed;
+
+	/*
+	 * This should never happen. Managed interrupts are not
+	 * migrated except on CPU down, which does not involve the
+	 * cleanup vector. But try to keep the accounting correct
+	 * nevertheless.
+	 */
+	WARN_ON_ONCE(managed);
 
-	trace_vector_free_moved(apicd->irq, vector, false);
-	irq_matrix_free(vector_matrix, cpu, vector, false);
+	trace_vector_free_moved(apicd->irq, vector, managed);
+	irq_matrix_free(vector_matrix, cpu, vector, managed);
 	__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
 	hlist_del_init(&apicd->clist);
 	apicd->prev_vector = 0;

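To keep the moving parts straight, a hedged sketch of the managed
interrupt life cycle this patch introduces (the function names are from
the diff above; the enum itself is purely illustrative):

	enum managed_lifecycle {
		/* alloc: irq_matrix_reserve_managed() marks the vectors
		 * reserved on every CPU in the affinity mask */
		MANAGED_RESERVED,
		/* activate (early): the MSI/DMAR/IOAPIC entry points at
		 * MANAGED_IRQ_SHUTDOWN_VECTOR; the device has the
		 * interrupt masked, so it never fires */
		MANAGED_SHUTDOWN,
		/* startup: irq_matrix_alloc_managed() hands out a real
		 * vector from the reserved space. Shutdown returns it to
		 * the reserved space; free drops the reservation via
		 * irq_matrix_remove_managed() */
		MANAGED_ACTIVE,
	};
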
^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 49/52] x86/vector/msi: Switch to global reservation mode
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (47 preceding siblings ...)
  2017-09-13 21:29 ` [patch 48/52] x86/vector: Handle managed interrupts proper Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 50/52] x86/vector: Switch IOAPIC " Thomas Gleixner
                   ` (4 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Switch-MSI-to-global-reservation-mode.patch --]
[-- Type: text/plain, Size: 6789 bytes --]

Devices with many queues allocate a huge number of interrupts and get a
vector assigned for each of them, even if the queues are not active and
the interrupts are never requested. This causes problems with the
decision whether the global vector space is sufficient for CPU hot
unplug operations.

Change it to a reservation scheme, which allows overcommitment.

When the interrupt is allocated and initialized, the vector assignment
merely updates the reservation request counter in the matrix allocator.
This counter is used to emit warnings when the reservation exceeds the
available vector space, but it does not affect CPU offline operations.
As with managed interrupts, the corresponding MSI/DMAR/IOAPIC entries
are directed to the special shutdown vector.

When the interrupt is requested, the activation code tries to assign a
real vector. If that succeeds, the interrupt is started up and
functional.

If it fails, request_irq() subsequently fails with -ENOSPC.

This allows a clear separation of inactive and active modes and
simplifies the final decision whether the global vector space is
sufficient for CPU offline operations.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   97 +++++++++++++++++++++++++++---------------
 1 file changed, 63 insertions(+), 34 deletions(-)

Index: b/arch/x86/kernel/apic/vector.c
===================================================================
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -33,7 +33,9 @@ struct apic_chip_data {
 	unsigned int		irq;
 	struct hlist_node	clist;
 	unsigned int		move_in_progress	: 1,
-				is_managed		: 1;
+				is_managed		: 1,
+				can_reserve		: 1,
+				has_reserved		: 1;
 };
 
 struct irq_domain *x86_vector_domain;
@@ -175,9 +177,31 @@ static int reserve_managed_vector(struct
 	return ret;
 }
 
+static void reserve_irq_vector_locked(struct irq_data *irqd)
+{
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+
+	irq_matrix_reserve(vector_matrix);
+	apicd->can_reserve = true;
+	apicd->has_reserved = true;
+	trace_vector_reserve(irqd->irq, 0);
+	vector_assign_managed_shutdown(irqd);
+}
+
+static int reserve_irq_vector(struct irq_data *irqd)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	reserve_irq_vector_locked(irqd);
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+	return 0;
+}
+
 static int allocate_vector(struct irq_data *irqd, const struct cpumask *dest)
 {
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	bool resvd = apicd->has_reserved;
 	unsigned int cpu = apicd->cpu;
 	int vector = apicd->vector;
 
@@ -191,10 +215,10 @@ static int allocate_vector(struct irq_da
 	if (vector && cpu_online(cpu) && cpumask_test_cpu(cpu, dest))
 		return 0;
 
-	vector = irq_matrix_alloc(vector_matrix, dest, false, &cpu);
+	vector = irq_matrix_alloc(vector_matrix, dest, resvd, &cpu);
 	if (vector > 0)
 		apic_update_vector(irqd, vector, cpu);
-	trace_vector_alloc(irqd->irq, vector, false, vector);
+	trace_vector_alloc(irqd->irq, vector, resvd, vector);
 	return vector;
 }
 
@@ -252,7 +276,11 @@ assign_irq_vector_policy(struct irq_data
 		return reserve_managed_vector(irqd);
 	if (info->mask)
 		return assign_irq_vector(irqd, info->mask);
-	return assign_irq_vector_any(irqd);
+	if (info->type != X86_IRQ_ALLOC_TYPE_MSI &&
+	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
+		return assign_irq_vector_any(irqd);
+	/* For MSI(X) make only a global reservation with no guarantee */
+	return reserve_irq_vector(irqd);
 }
 
 static int
@@ -314,17 +342,35 @@ static void x86_vector_deactivate(struct
 	unsigned long flags;
 
 	trace_vector_deactivate(irqd->irq, apicd->is_managed,
-				false, false);
+				apicd->can_reserve, false);
 
-	if (apicd->is_managed)
+	/* Regular fixed assigned interrupt */
+	if (!apicd->is_managed && !apicd->can_reserve)
+		return;
+	/* If the interrupt has a global reservation, nothing to do */
+	if (apicd->has_reserved)
 		return;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
 	clear_irq_vector(irqd);
-	vector_assign_managed_shutdown(irqd);
+	if (apicd->can_reserve)
+		reserve_irq_vector_locked(irqd);
+	else
+		vector_assign_managed_shutdown(irqd);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 }
 
+static int activate_reserved(struct irq_data *irqd)
+{
+	struct apic_chip_data *apicd = apic_chip_data(irqd);
+	int ret;
+
+	ret = assign_irq_vector_any_locked(irqd);
+	if (!ret)
+		apicd->has_reserved = false;
+	return ret;
+}
+
 static int activate_managed(struct irq_data *irqd)
 {
 	const struct cpumask *dest = irq_data_get_affinity_mask(irqd);
@@ -357,16 +403,19 @@ static int x86_vector_activate(struct ir
 	int ret = 0;
 
 	trace_vector_activate(irqd->irq, apicd->is_managed,
-				false, early);
+			      apicd->can_reserve, early);
 
-	if (!apicd->is_managed)
+	/* Nothing to do for fixed assigned vectors */
+	if (!apicd->can_reserve && !apicd->is_managed)
 		return 0;
 
 	raw_spin_lock_irqsave(&vector_lock, flags);
 	if (early || irqd_is_managed_and_shutdown(irqd))
 		vector_assign_managed_shutdown(irqd);
-	else
+	else if (apicd->is_managed)
 		ret = activate_managed(irqd);
+	else if (apicd->has_reserved)
+		ret = activate_reserved(irqd);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);
 	return ret;
 }
@@ -376,8 +425,11 @@ static void vector_free_reserved_and_man
 	const struct cpumask *dest = irq_data_get_affinity_mask(irqd);
 	struct apic_chip_data *apicd = apic_chip_data(irqd);
 
-	trace_vector_teardown(irqd->irq, apicd->is_managed, false);
+	trace_vector_teardown(irqd->irq, apicd->is_managed,
+			      apicd->has_reserved);
 
+	if (apicd->has_reserved)
+		irq_matrix_remove_reserved(vector_matrix);
 	if (apicd->is_managed)
 		irq_matrix_remove_managed(vector_matrix, dest);
 }
@@ -604,22 +656,6 @@ int __init arch_early_irq_init(void)
 }
 
 #ifdef CONFIG_SMP
-/* Temporary hack to keep things working */
-static void vector_update_shutdown_irqs(void)
-{
-	struct irq_desc *desc;
-	int irq;
-
-	for_each_irq_desc(irq, desc) {
-		struct irq_data *irqd = irq_desc_get_irq_data(desc);
-		struct apic_chip_data *ad = apic_chip_data(irqd);
-
-		if (!ad || !ad->vector || ad->cpu != smp_processor_id())
-			continue;
-		this_cpu_write(vector_irq[ad->vector], desc);
-		irq_matrix_assign(vector_matrix, ad->vector);
-	}
-}
 
 static struct irq_desc *__setup_vector_irq(int vector)
 {
@@ -655,13 +691,6 @@ void lapic_online(void)
 	 */
 	for (vector = 0; vector < NR_VECTORS; vector++)
 		this_cpu_write(vector_irq[vector], __setup_vector_irq(vector));
-
-	/*
-	 * Until the rewrite of the managed interrupt management is in
-	 * place it's necessary to walk the irq descriptors and check for
-	 * interrupts which are targeted at this CPU.
-	 */
-	vector_update_shutdown_irqs();
 }
 
 void lapic_offline(void)

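The dispatch that results from the two new bits can be written out as a
stand-alone decision function (a sketch; the bit names match the diff,
everything else is illustrative):

	#include <stdbool.h>

	enum act { ACT_NOTHING, ACT_SHUTDOWN_VEC, ACT_MANAGED, ACT_RESERVED };

	static enum act activate_action(bool is_managed, bool can_reserve,
					bool has_reserved, bool early,
					bool managed_and_shutdown)
	{
		if (!can_reserve && !is_managed)
			return ACT_NOTHING;	   /* fixed assigned vector */
		if (early || managed_and_shutdown)
			return ACT_SHUTDOWN_VEC;   /* park on shutdown vector */
		if (is_managed)
			return ACT_MANAGED;	   /* activate_managed() */
		if (has_reserved)
			return ACT_RESERVED;	   /* activate_reserved() */
		return ACT_NOTHING;
	}
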
^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 50/52] x86/vector: Switch IOAPIC to global reservation mode
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (48 preceding siblings ...)
  2017-09-13 21:29 ` [patch 49/52] x86/vector/msi: Switch to global reservation mode Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 51/52] x86/irq: Simplify hotplug vector accounting Thomas Gleixner
                   ` (3 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Switch-IOAPIC-to-global-reservation-mode.patch --]
[-- Type: text/plain, Size: 3251 bytes --]

IOAPICs install and allocate vectors for inactive interrupts. This results
in problems on CPU offline and wastes vector resources for nothing.

Handle inactive IOAPIC interrupts in the same way as inactive MSI
interrupts and switch them to the global reservation mode.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   56 ++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 23 deletions(-)

Index: b/arch/x86/kernel/apic/vector.c
===================================================================
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -258,17 +258,6 @@ static int assign_irq_vector_any_locked(
 	return assign_vector_locked(irqd, cpu_online_mask);
 }
 
-static int assign_irq_vector_any(struct irq_data *irqd)
-{
-	unsigned long flags;
-	int ret;
-
-	raw_spin_lock_irqsave(&vector_lock, flags);
-	ret = assign_irq_vector_any_locked(irqd);
-	raw_spin_unlock_irqrestore(&vector_lock, flags);
-	return ret;
-}
-
 static int
 assign_irq_vector_policy(struct irq_data *irqd, struct irq_alloc_info *info)
 {
@@ -276,10 +265,10 @@ assign_irq_vector_policy(struct irq_data
 		return reserve_managed_vector(irqd);
 	if (info->mask)
 		return assign_irq_vector(irqd, info->mask);
-	if (info->type != X86_IRQ_ALLOC_TYPE_MSI &&
-	    info->type != X86_IRQ_ALLOC_TYPE_MSIX)
-		return assign_irq_vector_any(irqd);
-	/* For MSI(X) make only a global reservation with no guarantee */
+	/*
+	 * Make only a global reservation with no guarantee. A real vector
+	 * is associated at activation time.
+	 */
 	return reserve_irq_vector(irqd);
 }
 
@@ -456,13 +445,39 @@ static void x86_vector_free_irqs(struct
 	}
 }
 
+static bool vector_configure_legacy(unsigned int virq, struct irq_data *irqd,
+				    struct apic_chip_data *apicd)
+{
+	unsigned long flags;
+	bool realloc = false;
+
+	apicd->vector = ISA_IRQ_VECTOR(virq);
+	apicd->cpu = 0;
+	apicd->can_reserve = true;
+
+	raw_spin_lock_irqsave(&vector_lock, flags);
+	/*
+	 * If the interrupt is activated, then it must stay at this vector
+	 * position. That's usually the timer interrupt (0).
+	 */
+	if (irqd_is_activated(irqd)) {
+		trace_vector_setup(virq, true, 0);
+		apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu);
+	} else {
+		/* Release the vector */
+		clear_irq_vector(irqd);
+		realloc = true;
+	}
+	raw_spin_unlock_irqrestore(&vector_lock, flags);
+	return realloc;
+}
+
 static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 				 unsigned int nr_irqs, void *arg)
 {
 	struct irq_alloc_info *info = arg;
 	struct apic_chip_data *apicd;
 	struct irq_data *irqd;
-	unsigned long flags;
 	int i, err, node;
 
 	if (disable_apic)
@@ -496,13 +511,8 @@ static int x86_vector_alloc_irqs(struct
 		 * config.
 		 */
 		if (info->flags & X86_IRQ_ALLOC_LEGACY) {
-			apicd->vector = ISA_IRQ_VECTOR(virq + i);
-			apicd->cpu = 0;
-			trace_vector_setup(virq + i, true, 0);
-			raw_spin_lock_irqsave(&vector_lock, flags);
-			apic_update_irq_cfg(irqd, apicd->vector, apicd->cpu);
-			raw_spin_unlock_irqrestore(&vector_lock, flags);
-			continue;
+			if (!vector_configure_legacy(virq + i, irqd, apicd))
+				continue;
 		}
 
 		err = assign_irq_vector_policy(irqd, info);

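After this change the allocation policy has exactly three cases, which
the reworked assign_irq_vector_policy() above encodes. As a comment-only
summary (illustrative, not kernel code):

	/*
	 * 1) managed affinity    -> reserve_managed_vector(): permanent
	 *    per-CPU reservation
	 * 2) explicit info->mask -> assign_irq_vector(): immediate fixed
	 *    assignment
	 * 3) everything else     -> reserve_irq_vector(): global
	 *    reservation only; a real vector is associated at
	 *    activation time
	 */
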
^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 51/52] x86/irq: Simplify hotplug vector accounting
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (49 preceding siblings ...)
  2017-09-13 21:29 ` [patch 50/52] x86/vector: Switch IOAPIC " Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-13 21:29 ` [patch 52/52] x86/vector: Respect affinity mask in irq descriptor Thomas Gleixner
                   ` (2 subsequent siblings)
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-irq--Simplify-hotplug-vector-accounting.patch --]
[-- Type: text/plain, Size: 6891 bytes --]

Before a CPU is taken offline, the number of active interrupt vectors on
the outgoing CPU and the number of vectors available on the other online
CPUs are counted and compared. If there are more active vectors than
available vectors on the other CPUs, the CPU hot-unplug operation is
aborted. This again uses a loop-based search and is inaccurate.

The bitmap matrix allocator has accurate accounting information and can
tell exactly whether the vector space is sufficient or not.

Emit a message when the number of globally reserved (unallocated)
vectors is larger than the number of available vectors after offlining a
CPU, because after that point request_irq() might fail.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h   |    1 
 arch/x86/include/asm/irq.h    |    4 -
 arch/x86/kernel/apic/vector.c |   32 +++++++++++++
 arch/x86/kernel/irq.c         |   99 ------------------------------------------
 arch/x86/kernel/smpboot.c     |    2 
 5 files changed, 33 insertions(+), 105 deletions(-)

Index: b/arch/x86/include/asm/apic.h
===================================================================
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -377,6 +377,7 @@ extern struct apic *__apicdrivers[], *__
  */
 #ifdef CONFIG_SMP
 extern int wakeup_secondary_cpu_via_nmi(int apicid, unsigned long start_eip);
+extern int lapic_can_unplug_cpu(void);
 #endif
 
 #ifdef CONFIG_X86_LOCAL_APIC
Index: b/arch/x86/include/asm/irq.h
===================================================================
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -25,11 +25,7 @@ extern void irq_ctx_init(int cpu);
 
 struct irq_desc;
 
-#ifdef CONFIG_HOTPLUG_CPU
-#include <linux/cpumask.h>
-extern int check_irq_vectors_for_cpu_disable(void);
 extern void fixup_irqs(void);
-#endif
 
 #ifdef CONFIG_HAVE_KVM
 extern void kvm_set_posted_intr_wakeup_handler(void (*handler)(void));
Index: b/arch/x86/kernel/apic/vector.c
===================================================================
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -945,7 +945,37 @@ void irq_force_complete_move(struct irq_
 unlock:
 	raw_spin_unlock(&vector_lock);
 }
-#endif
+
+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * Note: This is not accurate accounting, but it is good enough to
+ * prevent the actual interrupt move from running out of vectors.
+ */
+int lapic_can_unplug_cpu(void)
+{
+	unsigned int rsvd, avl, tomove, cpu = smp_processor_id();
+	int ret = 0;
+
+	raw_spin_lock(&vector_lock);
+	tomove = irq_matrix_allocated(vector_matrix);
+	avl = irq_matrix_available(vector_matrix, true);
+	if (avl < tomove) {
+		pr_warn("CPU %u has %u vectors, %u available. Cannot disable CPU\n",
+			cpu, tomove, avl);
+		ret = -ENOSPC;
+		goto out;
+	}
+	rsvd = irq_matrix_reserved(vector_matrix);
+	if (avl < rsvd) {
+		pr_warn("Reserved vectors %u > available %u. IRQ request may fail\n",
+			rsvd, avl);
+	}
+out:
+	raw_spin_unlock(&vector_lock);
+	return ret;
+}
+#endif /* HOTPLUG_CPU */
+#endif /* SMP */
 
 static void __init print_APIC_field(int base)
 {
Index: b/arch/x86/kernel/irq.c
===================================================================
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -333,105 +333,6 @@ EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wa
 
 
 #ifdef CONFIG_HOTPLUG_CPU
-
-/* These two declarations are only used in check_irq_vectors_for_cpu_disable()
- * below, which is protected by stop_machine().  Putting them on the stack
- * results in a stack frame overflow.  Dynamically allocating could result in a
- * failure so declare these two cpumasks as global.
- */
-static struct cpumask affinity_new, online_new;
-
-/*
- * This cpu is going to be removed and its vectors migrated to the remaining
- * online cpus.  Check to see if there are enough vectors in the remaining cpus.
- * This function is protected by stop_machine().
- */
-int check_irq_vectors_for_cpu_disable(void)
-{
-	unsigned int this_cpu, vector, this_count, count;
-	struct irq_desc *desc;
-	struct irq_data *data;
-	int cpu;
-
-	this_cpu = smp_processor_id();
-	cpumask_copy(&online_new, cpu_online_mask);
-	cpumask_clear_cpu(this_cpu, &online_new);
-
-	this_count = 0;
-	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
-		desc = __this_cpu_read(vector_irq[vector]);
-		if (IS_ERR_OR_NULL(desc))
-			continue;
-		/*
-		 * Protect against concurrent action removal, affinity
-		 * changes etc.
-		 */
-		raw_spin_lock(&desc->lock);
-		data = irq_desc_get_irq_data(desc);
-		cpumask_copy(&affinity_new,
-			     irq_data_get_affinity_mask(data));
-		cpumask_clear_cpu(this_cpu, &affinity_new);
-
-		/* Do not count inactive or per-cpu irqs. */
-		if (!irq_desc_has_action(desc) || irqd_is_per_cpu(data)) {
-			raw_spin_unlock(&desc->lock);
-			continue;
-		}
-
-		raw_spin_unlock(&desc->lock);
-		/*
-		 * A single irq may be mapped to multiple cpu's
-		 * vector_irq[] (for example IOAPIC cluster mode).  In
-		 * this case we have two possibilities:
-		 *
-		 * 1) the resulting affinity mask is empty; that is
-		 * this the down'd cpu is the last cpu in the irq's
-		 * affinity mask, or
-		 *
-		 * 2) the resulting affinity mask is no longer a
-		 * subset of the online cpus but the affinity mask is
-		 * not zero; that is the down'd cpu is the last online
-		 * cpu in a user set affinity mask.
-		 */
-		if (cpumask_empty(&affinity_new) ||
-		    !cpumask_subset(&affinity_new, &online_new))
-			this_count++;
-	}
-	/* No need to check any further. */
-	if (!this_count)
-		return 0;
-
-	count = 0;
-	for_each_online_cpu(cpu) {
-		if (cpu == this_cpu)
-			continue;
-		/*
-		 * We scan from FIRST_EXTERNAL_VECTOR to first system
-		 * vector. If the vector is marked in the used vectors
-		 * bitmap or an irq is assigned to it, we don't count
-		 * it as available.
-		 *
-		 * As this is an inaccurate snapshot anyway, we can do
-		 * this w/o holding vector_lock.
-		 */
-		for (vector = FIRST_EXTERNAL_VECTOR;
-		     vector < FIRST_SYSTEM_VECTOR; vector++) {
-			if (!test_bit(vector, system_vectors) &&
-			    IS_ERR_OR_NULL(per_cpu(vector_irq, cpu)[vector])) {
-				if (++count == this_count)
-					return 0;
-			}
-		}
-	}
-
-	if (count < this_count) {
-		pr_warn("CPU %d disable failed: CPU has %u vectors assigned and there are only %u available.\n",
-			this_cpu, this_count, count);
-		return -ERANGE;
-	}
-	return 0;
-}
-
 /* A cpu has been removed from cpu_online_mask.  Reset irq affinities. */
 void fixup_irqs(void)
 {
Index: b/arch/x86/kernel/smpboot.c
===================================================================
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1557,7 +1557,7 @@ int native_cpu_disable(void)
 {
 	int ret;
 
-	ret = check_irq_vectors_for_cpu_disable();
+	ret = lapic_can_unplug_cpu();
 	if (ret)
 		return ret;
 

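A worked example of the new check (all numbers invented for
illustration):

	/*
	 * Snapshot on the outgoing CPU:
	 *   tomove = irq_matrix_allocated() = 180  vectors to migrate
	 *   avl    = irq_matrix_available() = 150  free on the others
	 * 180 > 150, so lapic_can_unplug_cpu() returns -ENOSPC and the
	 * unplug is refused before any interrupt is touched.
	 *
	 *   rsvd   = irq_matrix_reserved()  = 300  global reservations
	 * If avl < rsvd the CPU still goes offline, but a warning is
	 * emitted because a later request_irq() may fail with -ENOSPC.
	 */
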
^ permalink raw reply	[flat|nested] 59+ messages in thread

* [patch 52/52] x86/vector: Respect affinity mask in irq descriptor
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (50 preceding siblings ...)
  2017-09-13 21:29 ` [patch 51/52] x86/irq: Simplify hotplug vector accounting Thomas Gleixner
@ 2017-09-13 21:29 ` Thomas Gleixner
  2017-09-14 11:21 ` [patch 00/52] x86: Rework the vector management Juergen Gross
  2017-09-19  9:12 ` Yu Chen
  53 siblings, 0 replies; 59+ messages in thread
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

[-- Attachment #1: x86-vector--Respect-affinity-mask-in-irq-descriptor.patch --]
[-- Type: text/plain, Size: 1713 bytes --]

The interrupt descriptor has a preset affinity mask at allocation
time, which is usually the default affinity mask.

The current code does not respect that mask and places the vector at
some random CPU, which gets corrected later by a set_affinity() call.
That's silly because the vector allocation can respect the mask upfront
and place the interrupt on a CPU which is in the mask. If that fails,
the affinity is broken and an interrupt is assigned on any online CPU.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |   21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -249,12 +249,25 @@ static int assign_irq_vector(struct irq_
 
 static int assign_irq_vector_any_locked(struct irq_data *irqd)
 {
+	/* Get the affinity mask - either irq_default_affinity or (user) set */
+	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
 	int node = irq_data_get_node(irqd);
 
-	if (node != NUMA_NO_NODE) {
-		if (!assign_vector_locked(irqd, cpumask_of_node(node)))
-			return 0;
-	}
+	if (node == NUMA_NO_NODE)
+		goto all;
+	/* Try the intersection of @affmsk and node mask */
+	cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
+	if (!assign_vector_locked(irqd, vector_searchmask))
+		return 0;
+	/* Try the node mask */
+	if (!assign_vector_locked(irqd, cpumask_of_node(node)))
+		return 0;
+all:
+	/* Try the full affinity mask */
+	cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
+	if (!assign_vector_locked(irqd, vector_searchmask))
+		return 0;
+	/* Try the full online mask */
 	return assign_vector_locked(irqd, cpu_online_mask);
 }
 
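Spelled out, assign_irq_vector_any_locked() now tries four masks in
order (a summary of the diff above, not new logic):

	/*
	 * 1) cpumask_of_node(node) & affinity mask
	 * 2) cpumask_of_node(node)
	 * 3) affinity mask & cpu_online_mask
	 * 4) cpu_online_mask
	 *
	 * The first assign_vector_locked() that succeeds wins. Steps 1
	 * and 3 honor the requested affinity; steps 2 and 4 are the
	 * fallbacks which break it.
	 */
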

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [patch 00/52] x86: Rework the vector management
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (51 preceding siblings ...)
  2017-09-13 21:29 ` [patch 52/52] x86/vector: Respect affinity mask in irq descriptor Thomas Gleixner
@ 2017-09-14 11:21 ` Juergen Gross
  2017-09-20 10:21   ` Paolo Bonzini
  2017-09-19  9:12 ` Yu Chen
  53 siblings, 1 reply; 59+ messages in thread
From: Juergen Gross @ 2017-09-14 11:21 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Tony Luck, K. Y. Srinivasan,
	Alok Kataria, Steven Rostedt, Arjan van de Ven

On 13/09/17 23:29, Thomas Gleixner wrote:
> Sorry for the large CC list, but this is a major surgery.
> 
> The vector management in x86 including the surrounding code is a
> conglomorate of ancient bits and pieces which have been subject to
> 'modernization' and featuritis over the years. The most obscure parts are
> the vector allocation mechanics, the cleanup vector handling and the cpu
> hotplug machinery. Replacing these pieces of art was on my todo list for a
> long time.
> 
> Recent attempts to 'solve' CPU offline / hibernation issues which are
> partially caused by the current vector management implementation made me
> look for real. Further information in this thread:
> 
>     http://lkml.kernel.org/r/cover.1504235838.git.yu.c.chen@intel.com
> 
> Aside of drivers allocating gazillion of interrupts, there are quite some
> things which can be addressed in the x86 vector management and in the core
> code.
> 
>   - Multi CPU affinities:
> 
>     A dubious property which is not available on all machines and causes
>     major complexity both in the allocator and the cleanup/hotplug
>     management. See:
> 
>        http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos
> 
>   - Priority level spreading:
> 
>     An obscure and undocumented property which I think is sufficiently
>     argued to be not required in:
> 
>        http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos
> 
>   - Allocation of vectors when interrupt descriptors are allocated.
> 
>     This is a historical implementation detail, which is not really
>     required when the vector allocation is delayed up to the point when
>     request_irq() is invoked. This might make request_irq() fail, when the
>     vector space is exhausted, but drivers should handle request_irq()
>     fails anyway.
> 
>     The upside of changing this is that the active vector space becomes
>     smaller especially on hibernation/cpu offline when drivers shut down
>     queue interrupts of outgoing CPUs.
> 
>     Some of this is already addressed with the managed interrupt facility,
>     but that was bolted on top of the existing vector management because
>     proper integration was not possible at that point. I take the blame
>     for this, but the tradeoff of not doing it would have been more
>     broken driver boiler plate code all over the place. So I went for the
>     lesser of two evils.
> 
>   - Allocation of vectors on the wrong place
> 
>     Even for managed interrupts the vector allocation at descriptor
>     allocation happens on the wrong place and gets fixed after the fact
>     with a call to set_affinity(). In case of not remapped interrupts
>     this results in at least one interrupt on the wrong CPU before it is
>     migrated to the desired target.
> 
>   - Lack of instrumentation
>  
>     All of this is a black box which allows no insight into the actual
>     vector usage.
> 
> The series addresses these points and converts the x86 vector management to
> a bitmap based allocator which provides proper reservation management for
> 'managed interrupts' and best effort reservation for regular interrupts.
> The latter allows overcommitment, which 'fixes' some of hotplug/hibernation
> problems in a clean way. It can't fix all of them depending on the driver
> involved.
> 
> This rework is no excuse for driver writers to do exhaustive vector
> allocations instead of utilizing the managed interrupt infrastructure, but
> it addresses long standing issues in this code with the side effect of
> mitigating some of the driver oddities. The proper solution for multi queue
> management are 'managed interrupts' which has been proven in the block-mq
> work as they solve issues which are worked around in other drivers in
> creative ways with lots of copied code and often enough broken attempts to
> handle interrupt affinity and CPU hotplug problems.
> 
> The new bitmap allocator and the x86 vector management code are
> instrumented with tracepoints and the irq domain debugfs files allow deep
> insight into the vector allocation and reservations.
> 
> The patches work on machines with and without interrupt remapping and
> inside of KVM guests of various flavours, though I have no idea what I
> broke on the way with other hypervisors, posted interrupts etc. So I kindly
> ask for your support in testing and review.
> 
> The series applies on top of Linus tree and is available as git branch:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/apic
> 
> Note, that this branch is Linus tree plus scheduler and x86 fixes which I
> required to do proper testing. They have outstanding pull requests and
> might be merged already when you read this.
> 
> Thanks,
> 
> 	tglx
> ---
>  arch/x86/include/asm/x2apic.h              |   49 -
>  b/arch/x86/Kconfig                         |    1 
>  b/arch/x86/include/asm/apic.h              |  255 +-----
>  b/arch/x86/include/asm/desc.h              |    2 
>  b/arch/x86/include/asm/hw_irq.h            |    6 
>  b/arch/x86/include/asm/io_apic.h           |    2 
>  b/arch/x86/include/asm/irq.h               |    4 
>  b/arch/x86/include/asm/irq_vectors.h       |    8 
>  b/arch/x86/include/asm/irqdomain.h         |    5 
>  b/arch/x86/include/asm/kvm_host.h          |    2 
>  b/arch/x86/include/asm/trace/irq_vectors.h |  244 ++++++
>  b/arch/x86/kernel/apic/Makefile            |    2 
>  b/arch/x86/kernel/apic/apic.c              |   38 -
>  b/arch/x86/kernel/apic/apic_common.c       |   46 +
>  b/arch/x86/kernel/apic/apic_flat_64.c      |   10 
>  b/arch/x86/kernel/apic/apic_noop.c         |   25 
>  b/arch/x86/kernel/apic/apic_numachip.c     |   12 
>  b/arch/x86/kernel/apic/bigsmp_32.c         |    8 
>  b/arch/x86/kernel/apic/htirq.c             |    5 
>  b/arch/x86/kernel/apic/io_apic.c           |   94 --
>  b/arch/x86/kernel/apic/msi.c               |    5 
>  b/arch/x86/kernel/apic/probe_32.c          |   29 
>  b/arch/x86/kernel/apic/vector.c            | 1090 +++++++++++++++++------------
>  b/arch/x86/kernel/apic/x2apic.h            |    9 
>  b/arch/x86/kernel/apic/x2apic_cluster.c    |  196 +----
>  b/arch/x86/kernel/apic/x2apic_phys.c       |   44 +
>  b/arch/x86/kernel/apic/x2apic_uv_x.c       |   17 
>  b/arch/x86/kernel/i8259.c                  |    1 
>  b/arch/x86/kernel/idt.c                    |   12 
>  b/arch/x86/kernel/irq.c                    |  101 --
>  b/arch/x86/kernel/irqinit.c                |    1 
>  b/arch/x86/kernel/setup.c                  |   12 
>  b/arch/x86/kernel/smpboot.c                |   14 
>  b/arch/x86/kernel/traps.c                  |    2 
>  b/arch/x86/kernel/vsmp_64.c                |   19 
>  b/arch/x86/platform/uv/uv_irq.c            |    5 
>  b/arch/x86/xen/apic.c                      |    6 
>  b/drivers/gpio/gpio-xgene-sb.c             |    7 
>  b/drivers/iommu/amd_iommu.c                |   44 -
>  b/drivers/iommu/intel_irq_remapping.c      |   43 -
>  b/drivers/irqchip/irq-gic-v3-its.c         |    5 
>  b/drivers/pinctrl/stm32/pinctrl-stm32.c    |    5 
>  b/include/linux/irq.h                      |   22 
>  b/include/linux/irqdesc.h                  |    1 
>  b/include/linux/irqdomain.h                |   14 
>  b/include/linux/msi.h                      |    5 
>  b/include/trace/events/irq_matrix.h        |  201 +++++
>  b/kernel/irq/Kconfig                       |    3 
>  b/kernel/irq/Makefile                      |    1 
>  b/kernel/irq/autoprobe.c                   |    2 
>  b/kernel/irq/chip.c                        |   37 
>  b/kernel/irq/debugfs.c                     |   12 
>  b/kernel/irq/internals.h                   |   19 
>  b/kernel/irq/irqdesc.c                     |    3 
>  b/kernel/irq/irqdomain.c                   |   43 -
>  b/kernel/irq/manage.c                      |   18 
>  b/kernel/irq/matrix.c                      |  443 +++++++++++
>  b/kernel/irq/msi.c                         |   32 
>  58 files changed, 2133 insertions(+), 1208 deletions(-)

Complete series tested with paravirt + xen enabled 64 bit kernel:

bare metal boot okay
boot as Xen dom0 okay
boot as Xen pv-domU okay
boot as Xen HVM-domU with PV-drivers okay
Vcpu onlining/offlining in pv-domU okay

So you can add my:

Tested-by: Juergen Gross <jgross@suse.com>
Acked-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [tip:irq/urgent] genirq: Fix cpumask check in __irq_startup_managed()
  2017-09-13 21:29 ` [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed() Thomas Gleixner
@ 2017-09-16 18:24   ` tip-bot for Thomas Gleixner
  0 siblings, 0 replies; 59+ messages in thread
From: tip-bot for Thomas Gleixner @ 2017-09-16 18:24 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tony.luck, bp, lenb, marc.zyngier, joro, rostedt, yu.c.chen,
	jgross, kys, peterz, mingo, hch, boris.ostrovsky, pbonzini,
	linux-kernel, dan.j.williams, rjw, akataria, hpa, rui.zhang,
	arjan, tglx

Commit-ID:  9cb067ef8a10bb13112e4d1c0ea996ec96527422
Gitweb:     http://git.kernel.org/tip/9cb067ef8a10bb13112e4d1c0ea996ec96527422
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 13 Sep 2017 23:29:03 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 16 Sep 2017 20:20:56 +0200

genirq: Fix cpumask check in __irq_startup_managed()

The result of cpumask_any_and() is invalid when it is greater than or
equal to nr_cpu_ids. The current check only tests for greater than. Fix
it.

Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170913213152.272283444@linutronix.de

---
 kernel/irq/chip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index f51b7b6..6fc89fd 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -202,7 +202,7 @@ __irq_startup_managed(struct irq_desc *desc, struct cpumask *aff, bool force)
 
 	irqd_clr_managed_shutdown(d);
 
-	if (cpumask_any_and(aff, cpu_online_mask) > nr_cpu_ids) {
+	if (cpumask_any_and(aff, cpu_online_mask) >= nr_cpu_ids) {
 		/*
 		 * Catch code which fiddles with enable_irq() on a managed
 		 * and potentially shutdown IRQ. Chained interrupt

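The boundary case is worth spelling out (a sketch; the surrounding
function is abridged):

	/*
	 * cpumask_any_and() returns a CPU number below nr_cpu_ids on
	 * success and exactly nr_cpu_ids when the intersection is
	 * empty, e.g. 4 on a 4-CPU system. The old '>' test therefore
	 * treated the empty-mask case as a valid CPU and let the
	 * managed startup proceed instead of aborting it.
	 */
	unsigned int cpu = cpumask_any_and(aff, cpu_online_mask);

	if (cpu >= nr_cpu_ids)		/* also catches cpu == nr_cpu_ids */
		return IRQ_STARTUP_ABORT;	/* as the real function does */
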
^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [patch 00/52] x86: Rework the vector management
  2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
                   ` (52 preceding siblings ...)
  2017-09-14 11:21 ` [patch 00/52] x86: Rework the vector management Juergen Gross
@ 2017-09-19  9:12 ` Yu Chen
  53 siblings, 0 replies; 59+ messages in thread
From: Yu Chen @ 2017-09-19  9:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Rui Zhang, Rafael J. Wysocki, Len Brown,
	Dan Williams, Christoph Hellwig, Paolo Bonzini, Joerg Roedel,
	Boris Ostrovsky, Juergen Gross, Tony Luck, K. Y. Srinivasan,
	Alok Kataria, Steven Rostedt, Arjan van de Ven

On Wed, Sep 13, 2017 at 11:29:02PM +0200, Thomas Gleixner wrote:
> Sorry for the large CC list, but this is a major surgery.
> 
> The vector management in x86 including the surrounding code is a
> conglomorate of ancient bits and pieces which have been subject to
> 'modernization' and featuritis over the years. The most obscure parts are
> the vector allocation mechanics, the cleanup vector handling and the cpu
> hotplug machinery. Replacing these pieces of art was on my todo list for a
> long time.
> 
> Recent attempts to 'solve' CPU offline / hibernation issues which are
> partially caused by the current vector management implementation made me
> look for real. Further information in this thread:
> 
>     http://lkml.kernel.org/r/cover.1504235838.git.yu.c.chen@intel.com
> 
> Aside of drivers allocating gazillion of interrupts, there are quite some
> things which can be addressed in the x86 vector management and in the core
> code.
> 
>   - Multi CPU affinities:
> 
>     A dubious property which is not available on all machines and causes
>     major complexity both in the allocator and the cleanup/hotplug
>     management. See:
> 
>        http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos
> 
>   - Priority level spreading:
> 
>     An obscure and undocumented property which I think is sufficiently
>     argued to be not required in:
> 
>        http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos
> 
>   - Allocation of vectors when interrupt descriptors are allocated.
> 
>     This is a historical implementation detail, which is not really
>     required when the vector allocation is delayed up to the point when
>     request_irq() is invoked. This might make request_irq() fail, when the
>     vector space is exhausted, but drivers should handle request_irq()
>     fails anyway.
> 
>     The upside of changing this is that the active vector space becomes
>     smaller especially on hibernation/cpu offline when drivers shut down
>     queue interrupts of outgoing CPUs.
> 
>     Some of this is already addressed with the managed interrupt facility,
>     but that was bolted on top of the existing vector management because
>     proper integration was not possible at that point. I take the blame
>     for this, but the tradeoff of not doing it would have been more
>     broken driver boiler plate code all over the place. So I went for the
>     lesser of two evils.
> 
>   - Allocation of vectors on the wrong place
> 
>     Even for managed interrupts the vector allocation at descriptor
>     allocation happens on the wrong place and gets fixed after the fact
>     with a call to set_affinity(). In case of not remapped interrupts
>     this results in at least one interrupt on the wrong CPU before it is
>     migrated to the desired target.
> 
>   - Lack of instrumentation
>  
>     All of this is a black box which allows no insight into the actual
>     vector usage.
> 
> The series addresses these points and converts the x86 vector management to
> a bitmap based allocator which provides proper reservation management for
> 'managed interrupts' and best effort reservation for regular interrupts.
> The latter allows overcommitment, which 'fixes' some of the
> hotplug/hibernation problems in a clean way. It can't fix all of them;
> that depends on the driver involved.
> 
> This rework is no excuse for driver writers to do exhaustive vector
> allocations instead of utilizing the managed interrupt infrastructure, but
> it addresses long standing issues in this code with the side effect of
> mitigating some of the driver oddities. The proper solution for multi-queue
> management is 'managed interrupts', which have been proven in the block-mq
> work: they solve issues which are worked around in other drivers in creative
> ways, with lots of copied code and often enough broken attempts to handle
> interrupt affinity and CPU hotplug problems.
> 
> The new bitmap allocator and the x86 vector management code are
> instrumented with tracepoints and the irq domain debugfs files allow deep
> insight into the vector allocation and reservations.
> 
> The patches work on machines with and without interrupt remapping and
> inside of KVM guests of various flavours, though I have no idea what I
> broke on the way with other hypervisors, posted interrupts etc. So I kindly
> ask for your support in testing and review.
> 
> The series applies on top of Linus tree and is available as git branch:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/apic
> 
> Note that this branch is Linus's tree plus scheduler and x86 fixes which I
> required for proper testing. They have outstanding pull requests and might
> already be merged by the time you read this.
> 
> Thanks,
> 
> 	tglx
> ---
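A practical consequence of the delayed vector allocation described in the
cover letter above: request_irq() itself can now fail when the vector space
is exhausted, so drivers must check its return value. A minimal,
hypothetical sketch ('my_handler' and 'my_dev' are placeholders, not names
from this series):

	/*
	 * With vectors allocated at request_irq() time, an exhausted
	 * vector space surfaces here as an error (e.g. -ENOSPC from
	 * the matrix allocator) instead of at descriptor allocation.
	 */
	err = request_irq(irq, my_handler, 0, "my_dev", my_dev);
	if (err)
		goto err_unwind;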
Tested on top of:
commit e1b476ae32fcfa59fc6752b4b01988e759269dc3
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Thu Sep 14 09:53:10 2017 +0200

    x86/vector: Exclude IRQ0 from reservation mode

from the WIP.x86/apic branch, on a platform with 16 cores (32 logical CPUs):
bootup okay, cpu[1-31] offline/online okay.
Before offline:

name:   VECTOR
 size:   0
 mapped: 484
 flags:  0x00000041
Online bitmaps:       32
Global available:   6419
Global reserved:     407
Total allocated:      77
System: 41: 0-19,32,50,128,238-255
 | CPU | avl | man | act | vectors
     0   126     0    77  33-49,51-110
     1   203     0     0  
     2   203     0     0  
     3   203     0     0  
     4   203     0     0  
     5   203     0     0  
     6   203     0     0  
     7   203     0     0  
     8   203     0     0  
     9   203     0     0  
    10   203     0     0  
    11   203     0     0  
    12   203     0     0  
    13   203     0     0  
    14   203     0     0  
    15   203     0     0  
    16   203     0     0  
    17   203     0     0  
    18   203     0     0  
    19   203     0     0  
    20   203     0     0  
    21   203     0     0  
    22   203     0     0  
    23   203     0     0  
    24   203     0     0  
    25   203     0     0  
    26   203     0     0  
    27   203     0     0  
    28   203     0     0  
    29   203     0     0  
    30   203     0     0  
    31   203     0     0 

After offline:

name:   VECTOR
 size:   0
 mapped: 484
 flags:  0x00000041
Online bitmaps:        1
Global available:    126
Global reserved:     407
Total allocated:      77
System: 41: 0-19,32,50,128,238-255
 | CPU | avl | man | act | vectors
     0   126     0    77  33-49,51-110

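For readers of the dump above: a simplified sketch of the per-CPU
bookkeeping behind the avl/man/act columns (field names follow the debugfs
output, not necessarily the exact kernel/irq/matrix.c layout):

	struct cpumap_sketch {
		unsigned int	available;	/* avl: vectors still free */
		unsigned int	managed;	/* man: reserved for managed irqs */
		unsigned int	allocated;	/* act: vectors in active use */
		/* one bit per vector in the 256-entry x86 vector space */
		unsigned long	alloc_map[BITS_TO_LONGS(256)];
	};
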
 Thanks,
 	Yu


* Re: [patch 00/52] x86: Rework the vector management
  2017-09-14 11:21 ` [patch 00/52] x86: Rework the vector management Juergen Gross
@ 2017-09-20 10:21   ` Paolo Bonzini
  0 siblings, 0 replies; 59+ messages in thread
From: Paolo Bonzini @ 2017-09-20 10:21 UTC (permalink / raw)
  To: Juergen Gross, Thomas Gleixner, LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Joerg Roedel,
	Boris Ostrovsky, Tony Luck, K. Y. Srinivasan, Alok Kataria,
	Steven Rostedt, Arjan van de Ven

On 14/09/2017 13:21, Juergen Gross wrote:
> Complete series tested with a paravirt + Xen enabled 64-bit kernel:
> 
> bare metal boot okay
> boot as Xen dom0 okay
> boot as Xen pv-domU okay
> boot as Xen HVM-domU with PV-drivers okay
> vCPU onlining/offlining in pv-domU okay

Intel has now tested posted interrupts and found no regressions.

Thanks,

Paolo


Thread overview: 59+ messages
2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
2017-09-13 21:29 ` [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed() Thomas Gleixner
2017-09-16 18:24   ` [tip:irq/urgent] " tip-bot for Thomas Gleixner
2017-09-13 21:29 ` [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors Thomas Gleixner
2017-09-13 21:29 ` [patch 03/52] genirq/msi: Capture device name for debugfs Thomas Gleixner
2017-09-13 21:29 ` [patch 04/52] irqdomain/debugfs: Provide domain specific debug callback Thomas Gleixner
2017-09-13 21:29 ` [patch 05/52] genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY Thomas Gleixner
2017-09-13 21:29 ` [patch 06/52] genirq: Set managed shut down flag at init Thomas Gleixner
2017-09-13 21:29 ` [patch 07/52] genirq: Separate activation and startup Thomas Gleixner
2017-09-13 21:29 ` [patch 08/52] genirq/irqdomain: Update irq_domain_ops.activate() signature Thomas Gleixner
2017-09-13 21:29 ` [patch 09/52] genirq/irqdomain: Allow irq_domain_activate_irq() to fail Thomas Gleixner
2017-09-13 21:29 ` [patch 10/52] genirq/irqdomain: Propagate early activation Thomas Gleixner
2017-09-13 21:29 ` [patch 11/52] genirq/irqdomain: Add force reactivation flag to irq domains Thomas Gleixner
2017-09-13 21:29 ` [patch 12/52] genirq: Implement bitmap matrix allocator Thomas Gleixner
2017-09-13 21:29 ` [patch 13/52] genirq/matrix: Add tracepoints Thomas Gleixner
2017-09-13 21:29 ` [patch 14/52] x86/apic: Deinline x2apic functions Thomas Gleixner
2017-09-13 21:29 ` [patch 15/52] x86/apic: Sanitize return value of apic.set_apic_id() Thomas Gleixner
2017-09-13 21:29 ` [patch 16/52] x86/apic: Sanitize return value of check_apicid_used() Thomas Gleixner
2017-09-13 21:29 ` [patch 17/52] x86/apic: Move probe32 specific APIC functions Thomas Gleixner
2017-09-13 21:29 ` [patch 18/52] x86/apic: Move APIC noop specific functions Thomas Gleixner
2017-09-13 21:29 ` [patch 19/52] x86/apic: Sanitize 32/64bit APIC callbacks Thomas Gleixner
2017-09-13 21:29 ` [patch 20/52] x86/apic: Move common " Thomas Gleixner
2017-09-13 21:29 ` [patch 21/52] x86/apic: Reorganize struct apic Thomas Gleixner
2017-09-13 21:29 ` [patch 22/52] x86/apic/x2apic: Simplify cluster management Thomas Gleixner
2017-09-13 21:29 ` [patch 23/52] x86/apic: Get rid of apic->target_cpus Thomas Gleixner
2017-09-13 21:29 ` [patch 24/52] x86/vector: Rename used_vectors to system_vectors Thomas Gleixner
2017-09-13 21:29 ` [patch 25/52] x86/apic: Get rid of multi CPU affinity Thomas Gleixner
2017-09-13 21:29 ` [patch 26/52] x86/ioapic: Remove obsolete post hotplug update Thomas Gleixner
2017-09-13 21:29 ` [patch 27/52] x86/vector: Simplify the CPU hotplug vector update Thomas Gleixner
2017-09-13 21:29 ` [patch 28/52] x86/vector: Cleanup variable names Thomas Gleixner
2017-09-13 21:29 ` [patch 29/52] x86/vector: Store the single CPU targets in apic data Thomas Gleixner
2017-09-13 21:29 ` [patch 30/52] x86/vector: Simplify vector move cleanup Thomas Gleixner
2017-09-13 21:29 ` [patch 31/52] x86/ioapic: Mark legacy vectors at reallocation time Thomas Gleixner
2017-09-13 21:29 ` [patch 32/52] x86/apic: Get rid of the legacy irq data storage Thomas Gleixner
2017-09-13 21:29 ` [patch 33/52] x86/vector: Remove pointless pointer checks Thomas Gleixner
2017-09-13 21:29 ` [patch 34/52] x86/vector: Move helper functions around Thomas Gleixner
2017-09-13 21:29 ` [patch 35/52] x86/apic: Add replacement for cpu_mask_to_apicid() Thomas Gleixner
2017-09-13 21:29 ` [patch 36/52] x86/irq/vector: Initialize matrix allocator Thomas Gleixner
2017-09-13 21:29 ` [patch 37/52] x86/vector: Add vector domain debugfs support Thomas Gleixner
2017-09-13 21:29 ` [patch 38/52] x86/smpboot: Set online before setting up vectors Thomas Gleixner
2017-09-13 21:29 ` [patch 39/52] x86/vector: Add tracepoints for vector management Thomas Gleixner
2017-09-13 21:29 ` [patch 40/52] x86/vector: Use matrix allocator for vector assignment Thomas Gleixner
2017-09-13 21:29 ` [patch 41/52] x86/apic: Remove unused callbacks Thomas Gleixner
2017-09-13 21:29 ` [patch 42/52] x86/vector: Compile SMP only code conditionally Thomas Gleixner
2017-09-13 21:29 ` [patch 43/52] x86/vector: Untangle internal state from irq_cfg Thomas Gleixner
2017-09-13 21:29 ` [patch 44/52] x86/apic/msi: Force reactivation of interrupts at startup time Thomas Gleixner
2017-09-13 21:29 ` [patch 45/52] iommu/vt-d: Reevaluate vector configuration on activate() Thomas Gleixner
2017-09-13 21:29   ` Thomas Gleixner
2017-09-13 21:29 ` [patch 46/52] iommu/amd: " Thomas Gleixner
2017-09-13 21:29   ` Thomas Gleixner
2017-09-13 21:29 ` [patch 47/52] x86/io_apic: " Thomas Gleixner
2017-09-13 21:29 ` [patch 48/52] x86/vector: Handle managed interrupts proper Thomas Gleixner
2017-09-13 21:29 ` [patch 49/52] x86/vector/msi: Switch to global reservation mode Thomas Gleixner
2017-09-13 21:29 ` [patch 50/52] x86/vector: Switch IOAPIC " Thomas Gleixner
2017-09-13 21:29 ` [patch 51/52] x86/irq: Simplify hotplug vector accounting Thomas Gleixner
2017-09-13 21:29 ` [patch 52/52] x86/vector: Respect affinity mask in irq descriptor Thomas Gleixner
2017-09-14 11:21 ` [patch 00/52] x86: Rework the vector management Juergen Gross
2017-09-20 10:21   ` Paolo Bonzini
2017-09-19  9:12 ` Yu Chen
