* [patch 00/52] x86: Rework the vector management
@ 2017-09-13 21:29 Thomas Gleixner
From: Thomas Gleixner @ 2017-09-13 21:29 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Anvin, Marc Zyngier, Peter Zijlstra,
	Borislav Petkov, Chen Yu, Rui Zhang, Rafael J. Wysocki,
	Len Brown, Dan Williams, Christoph Hellwig, Paolo Bonzini,
	Joerg Roedel, Boris Ostrovsky, Juergen Gross, Tony Luck,
	K. Y. Srinivasan, Alok Kataria, Steven Rostedt, Arjan van de Ven

Sorry for the large CC list, but this is a major surgery.

The vector management in x86, including the surrounding code, is a
conglomerate of ancient bits and pieces which have been subject to
'modernization' and featuritis over the years. The most obscure parts are
the vector allocation mechanics, the cleanup vector handling and the CPU
hotplug machinery. Replacing these pieces of art has been on my todo list
for a long time.

Recent attempts to 'solve' CPU offline / hibernation issues which are
partially caused by the current vector management implementation made me
look for real. Further information in this thread:

    http://lkml.kernel.org/r/cover.1504235838.git.yu.c.chen@intel.com

Aside from drivers allocating a gazillion interrupts, there are quite a
few things which can be addressed in the x86 vector management and in the
core code.

  - Multi CPU affinities:

    A dubious property which is not available on all machines and causes
    major complexity both in the allocator and the cleanup/hotplug
    management. See:

       http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos

  - Priority level spreading:

    An obscure and undocumented property which I think is sufficiently
    argued to be not required in:

       http://lkml.kernel.org/r/alpine.DEB.2.20.1709071045440.1827@nanos

  - Allocation of vectors when interrupt descriptors are allocated.

    This is a historical implementation detail, which is not really
    required when the vector allocation is delayed up to the point when
    request_irq() is invoked. This might make request_irq() fail when the
    vector space is exhausted, but drivers should handle request_irq()
    failures anyway.

    The upside of changing this is that the active vector space becomes
    smaller, especially on hibernation/CPU offline, when drivers shut down
    the queue interrupts of outgoing CPUs.

    Some of this is already addressed with the managed interrupt facility,
    but that was bolted on top of the existing vector management because
    proper integration was not possible at that point. I take the blame
    for this, but the tradeoff of not doing it would have been more
    broken driver boilerplate code all over the place. So I went for the
    lesser of two evils.

  - Allocation of vectors in the wrong place

    Even for managed interrupts the vector allocation at descriptor
    allocation happens in the wrong place and gets fixed after the fact
    with a call to set_affinity(). For interrupts which are not remapped
    this results in at least one interrupt on the wrong CPU before it is
    migrated to the desired target.

  - Lack of instrumentation
 
    All of this is a black box which allows no insight into the actual
    vector usage.
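
The point about delaying vector allocation until request_irq() can be
illustrated with a small userspace toy model. This is a minimal sketch, not
kernel code: every toy_* name and the tiny vector space are hypothetical and
exist only for illustration. Descriptor setup consumes no vector, the request
step can fail with -ENOSPC, and the driver-style caller unwinds what it
already requested.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a tiny vector space on a single CPU. */
#define TOY_NR_VECTORS 4

struct toy_cpu {
	bool used[TOY_NR_VECTORS];	/* true: vector is assigned */
};

struct toy_irq {
	int vector;			/* -1 until toy_request_irq() succeeds */
};

/* Descriptor setup: deliberately consumes no vector. */
void toy_irq_desc_init(struct toy_irq *irq)
{
	irq->vector = -1;
}

/* Request time: this is the point where a vector is actually taken. */
int toy_request_irq(struct toy_cpu *cpu, struct toy_irq *irq)
{
	assert(irq->vector == -1);

	for (int v = 0; v < TOY_NR_VECTORS; v++) {
		if (!cpu->used[v]) {
			cpu->used[v] = true;
			irq->vector = v;
			return 0;
		}
	}
	return -ENOSPC;			/* vector space exhausted */
}

/* Give the vector back; safe to call on an irq without a vector. */
void toy_free_irq(struct toy_cpu *cpu, struct toy_irq *irq)
{
	if (irq->vector >= 0) {
		cpu->used[irq->vector] = false;
		irq->vector = -1;
	}
}

/*
 * Driver-style setup: request all queue interrupts and, if any request
 * fails, release the ones which were already requested.
 */
int toy_driver_setup(struct toy_cpu *cpu, struct toy_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_irq_desc_init(&irqs[i]);

	for (size_t i = 0; i < n; i++) {
		int ret = toy_request_irq(cpu, &irqs[i]);

		if (ret) {
			while (i-- > 0)
				toy_free_irq(cpu, &irqs[i]);
			return ret;
		}
	}
	return 0;
}
```

In this model, asking for more interrupts than TOY_NR_VECTORS makes
toy_driver_setup() return -ENOSPC and leaves the vector space empty again,
which is the kind of error handling drivers are expected to have anyway.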

The series addresses these points and converts the x86 vector management to
a bitmap based allocator which provides proper reservation management for
'managed interrupts' and best effort reservation for regular interrupts.
The latter allows overcommitment, which 'fixes' some of the
hotplug/hibernation problems in a clean way. It can't fix all of them,
depending on the driver involved.
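
The difference between the two reservation modes can be sketched with a
userspace toy model. This is an illustration under stated assumptions, not
the allocator from this series: all toy_* names, the two CPUs and the
8 vectors per CPU are hypothetical. Managed reservations set aside a real
bit on a CPU, while best effort reservations are pure accounting and may
therefore exceed the available vector space.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_NR_CPUS	2
#define TOY_NR_BITS	8	/* vectors per CPU in this toy model */

struct toy_matrix {
	uint8_t alloc[TOY_NR_CPUS];	/* bit set: vector in use */
	uint8_t managed[TOY_NR_CPUS];	/* bit set: reserved for a managed irq */
	unsigned int global_reserved;	/* best effort count, may overcommit */
};

/* Lowest clear bit in @map, or -1 if all TOY_NR_BITS bits are set. */
static int toy_find_free(uint8_t map)
{
	for (int bit = 0; bit < TOY_NR_BITS; bit++)
		if (!(map & (1u << bit)))
			return bit;
	return -1;
}

/* Guaranteed reservation: set aside one vector on @cpu for a managed irq. */
int toy_reserve_managed(struct toy_matrix *m, unsigned int cpu)
{
	int bit;

	assert(cpu < TOY_NR_CPUS);
	bit = toy_find_free((uint8_t)(m->alloc[cpu] | m->managed[cpu]));
	if (bit < 0)
		return -ENOSPC;
	m->managed[cpu] |= (uint8_t)(1u << bit);
	return 0;
}

/* Activate a managed irq: take one of the vectors reserved on @cpu. */
int toy_alloc_managed(struct toy_matrix *m, unsigned int cpu)
{
	uint8_t busy;
	int bit;

	assert(cpu < TOY_NR_CPUS);
	/* Everything which is in use or not managed-reserved is off limits. */
	busy = (uint8_t)(m->alloc[cpu] | ~m->managed[cpu]);
	bit = toy_find_free(busy);
	if (bit < 0)
		return -ENOSPC;
	m->alloc[cpu] |= (uint8_t)(1u << bit);
	return bit;
}

/* Best effort reservation: accounting only, so it cannot fail. */
void toy_reserve(struct toy_matrix *m)
{
	m->global_reserved++;
}

/*
 * Regular irq: turn a best effort reservation into a real vector on @cpu.
 * Fails when no vector outside the managed space is free.
 */
int toy_alloc(struct toy_matrix *m, unsigned int cpu)
{
	int bit;

	assert(cpu < TOY_NR_CPUS);
	bit = toy_find_free((uint8_t)(m->alloc[cpu] | m->managed[cpu]));
	if (bit < 0)
		return -ENOSPC;
	m->alloc[cpu] |= (uint8_t)(1u << bit);
	if (m->global_reserved)
		m->global_reserved--;
	return bit;
}
```

In this model a managed interrupt can still be activated after regular
interrupts have used up every other vector on the CPU, whereas a regular
interrupt only finds out at allocation time whether its best effort
reservation can be honoured.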

This rework is no excuse for driver writers to do exhaustive vector
allocations instead of utilizing the managed interrupt infrastructure, but
it addresses long standing issues in this code with the side effect of
mitigating some of the driver oddities. The proper solution for multi queue
management is 'managed interrupts', which have been proven in the block-mq
work: they solve issues which are worked around in other drivers in
creative ways with lots of copied code and often enough broken attempts to
handle interrupt affinity and CPU hotplug problems.

The new bitmap allocator and the x86 vector management code are
instrumented with tracepoints and the irq domain debugfs files allow deep
insight into the vector allocation and reservations.

The patches work on machines with and without interrupt remapping and
inside of KVM guests of various flavours, though I have no idea what I
broke on the way with other hypervisors, posted interrupts etc. So I kindly
ask for your support in testing and review.

The series applies on top of Linus' tree and is available as a git branch:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/apic

Note that this branch is Linus' tree plus scheduler and x86 fixes which I
required to do proper testing. They have outstanding pull requests and
might be merged already when you read this.

Thanks,

	tglx
---
 arch/x86/include/asm/x2apic.h              |   49 -
 b/arch/x86/Kconfig                         |    1 
 b/arch/x86/include/asm/apic.h              |  255 +-----
 b/arch/x86/include/asm/desc.h              |    2 
 b/arch/x86/include/asm/hw_irq.h            |    6 
 b/arch/x86/include/asm/io_apic.h           |    2 
 b/arch/x86/include/asm/irq.h               |    4 
 b/arch/x86/include/asm/irq_vectors.h       |    8 
 b/arch/x86/include/asm/irqdomain.h         |    5 
 b/arch/x86/include/asm/kvm_host.h          |    2 
 b/arch/x86/include/asm/trace/irq_vectors.h |  244 ++++++
 b/arch/x86/kernel/apic/Makefile            |    2 
 b/arch/x86/kernel/apic/apic.c              |   38 -
 b/arch/x86/kernel/apic/apic_common.c       |   46 +
 b/arch/x86/kernel/apic/apic_flat_64.c      |   10 
 b/arch/x86/kernel/apic/apic_noop.c         |   25 
 b/arch/x86/kernel/apic/apic_numachip.c     |   12 
 b/arch/x86/kernel/apic/bigsmp_32.c         |    8 
 b/arch/x86/kernel/apic/htirq.c             |    5 
 b/arch/x86/kernel/apic/io_apic.c           |   94 --
 b/arch/x86/kernel/apic/msi.c               |    5 
 b/arch/x86/kernel/apic/probe_32.c          |   29 
 b/arch/x86/kernel/apic/vector.c            | 1090 +++++++++++++++++------------
 b/arch/x86/kernel/apic/x2apic.h            |    9 
 b/arch/x86/kernel/apic/x2apic_cluster.c    |  196 +----
 b/arch/x86/kernel/apic/x2apic_phys.c       |   44 +
 b/arch/x86/kernel/apic/x2apic_uv_x.c       |   17 
 b/arch/x86/kernel/i8259.c                  |    1 
 b/arch/x86/kernel/idt.c                    |   12 
 b/arch/x86/kernel/irq.c                    |  101 --
 b/arch/x86/kernel/irqinit.c                |    1 
 b/arch/x86/kernel/setup.c                  |   12 
 b/arch/x86/kernel/smpboot.c                |   14 
 b/arch/x86/kernel/traps.c                  |    2 
 b/arch/x86/kernel/vsmp_64.c                |   19 
 b/arch/x86/platform/uv/uv_irq.c            |    5 
 b/arch/x86/xen/apic.c                      |    6 
 b/drivers/gpio/gpio-xgene-sb.c             |    7 
 b/drivers/iommu/amd_iommu.c                |   44 -
 b/drivers/iommu/intel_irq_remapping.c      |   43 -
 b/drivers/irqchip/irq-gic-v3-its.c         |    5 
 b/drivers/pinctrl/stm32/pinctrl-stm32.c    |    5 
 b/include/linux/irq.h                      |   22 
 b/include/linux/irqdesc.h                  |    1 
 b/include/linux/irqdomain.h                |   14 
 b/include/linux/msi.h                      |    5 
 b/include/trace/events/irq_matrix.h        |  201 +++++
 b/kernel/irq/Kconfig                       |    3 
 b/kernel/irq/Makefile                      |    1 
 b/kernel/irq/autoprobe.c                   |    2 
 b/kernel/irq/chip.c                        |   37 
 b/kernel/irq/debugfs.c                     |   12 
 b/kernel/irq/internals.h                   |   19 
 b/kernel/irq/irqdesc.c                     |    3 
 b/kernel/irq/irqdomain.c                   |   43 -
 b/kernel/irq/manage.c                      |   18 
 b/kernel/irq/matrix.c                      |  443 +++++++++++
 b/kernel/irq/msi.c                         |   32 
 58 files changed, 2133 insertions(+), 1208 deletions(-)


Thread overview: 59+ messages
2017-09-13 21:29 [patch 00/52] x86: Rework the vector management Thomas Gleixner
2017-09-13 21:29 ` [patch 01/52] genirq: Fix cpumask check in __irq_startup_managed() Thomas Gleixner
2017-09-16 18:24   ` [tip:irq/urgent] " tip-bot for Thomas Gleixner
2017-09-13 21:29 ` [patch 02/52] genirq/debugfs: Show debug information for all irq descriptors Thomas Gleixner
2017-09-13 21:29 ` [patch 03/52] genirq/msi: Capture device name for debugfs Thomas Gleixner
2017-09-13 21:29 ` [patch 04/52] irqdomain/debugfs: Provide domain specific debug callback Thomas Gleixner
2017-09-13 21:29 ` [patch 05/52] genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY Thomas Gleixner
2017-09-13 21:29 ` [patch 06/52] genirq: Set managed shut down flag at init Thomas Gleixner
2017-09-13 21:29 ` [patch 07/52] genirq: Separate activation and startup Thomas Gleixner
2017-09-13 21:29 ` [patch 08/52] genirq/irqdomain: Update irq_domain_ops.activate() signature Thomas Gleixner
2017-09-13 21:29 ` [patch 09/52] genirq/irqdomain: Allow irq_domain_activate_irq() to fail Thomas Gleixner
2017-09-13 21:29 ` [patch 10/52] genirq/irqdomain: Propagate early activation Thomas Gleixner
2017-09-13 21:29 ` [patch 11/52] genirq/irqdomain: Add force reactivation flag to irq domains Thomas Gleixner
2017-09-13 21:29 ` [patch 12/52] genirq: Implement bitmap matrix allocator Thomas Gleixner
2017-09-13 21:29 ` [patch 13/52] genirq/matrix: Add tracepoints Thomas Gleixner
2017-09-13 21:29 ` [patch 14/52] x86/apic: Deinline x2apic functions Thomas Gleixner
2017-09-13 21:29 ` [patch 15/52] x86/apic: Sanitize return value of apic.set_apic_id() Thomas Gleixner
2017-09-13 21:29 ` [patch 16/52] x86/apic: Sanitize return value of check_apicid_used() Thomas Gleixner
2017-09-13 21:29 ` [patch 17/52] x86/apic: Move probe32 specific APIC functions Thomas Gleixner
2017-09-13 21:29 ` [patch 18/52] x86/apic: Move APIC noop specific functions Thomas Gleixner
2017-09-13 21:29 ` [patch 19/52] x86/apic: Sanitize 32/64bit APIC callbacks Thomas Gleixner
2017-09-13 21:29 ` [patch 20/52] x86/apic: Move common " Thomas Gleixner
2017-09-13 21:29 ` [patch 21/52] x86/apic: Reorganize struct apic Thomas Gleixner
2017-09-13 21:29 ` [patch 22/52] x86/apic/x2apic: Simplify cluster management Thomas Gleixner
2017-09-13 21:29 ` [patch 23/52] x86/apic: Get rid of apic->target_cpus Thomas Gleixner
2017-09-13 21:29 ` [patch 24/52] x86/vector: Rename used_vectors to system_vectors Thomas Gleixner
2017-09-13 21:29 ` [patch 25/52] x86/apic: Get rid of multi CPU affinity Thomas Gleixner
2017-09-13 21:29 ` [patch 26/52] x86/ioapic: Remove obsolete post hotplug update Thomas Gleixner
2017-09-13 21:29 ` [patch 27/52] x86/vector: Simplify the CPU hotplug vector update Thomas Gleixner
2017-09-13 21:29 ` [patch 28/52] x86/vector: Cleanup variable names Thomas Gleixner
2017-09-13 21:29 ` [patch 29/52] x86/vector: Store the single CPU targets in apic data Thomas Gleixner
2017-09-13 21:29 ` [patch 30/52] x86/vector: Simplify vector move cleanup Thomas Gleixner
2017-09-13 21:29 ` [patch 31/52] x86/ioapic: Mark legacy vectors at reallocation time Thomas Gleixner
2017-09-13 21:29 ` [patch 32/52] x86/apic: Get rid of the legacy irq data storage Thomas Gleixner
2017-09-13 21:29 ` [patch 33/52] x86/vector: Remove pointless pointer checks Thomas Gleixner
2017-09-13 21:29 ` [patch 34/52] x86/vector: Move helper functions around Thomas Gleixner
2017-09-13 21:29 ` [patch 35/52] x86/apic: Add replacement for cpu_mask_to_apicid() Thomas Gleixner
2017-09-13 21:29 ` [patch 36/52] x86/irq/vector: Initialize matrix allocator Thomas Gleixner
2017-09-13 21:29 ` [patch 37/52] x86/vector: Add vector domain debugfs support Thomas Gleixner
2017-09-13 21:29 ` [patch 38/52] x86/smpboot: Set online before setting up vectors Thomas Gleixner
2017-09-13 21:29 ` [patch 39/52] x86/vector: Add tracepoints for vector management Thomas Gleixner
2017-09-13 21:29 ` [patch 40/52] x86/vector: Use matrix allocator for vector assignment Thomas Gleixner
2017-09-13 21:29 ` [patch 41/52] x86/apic: Remove unused callbacks Thomas Gleixner
2017-09-13 21:29 ` [patch 42/52] x86/vector: Compile SMP only code conditionally Thomas Gleixner
2017-09-13 21:29 ` [patch 43/52] x86/vector: Untangle internal state from irq_cfg Thomas Gleixner
2017-09-13 21:29 ` [patch 44/52] x86/apic/msi: Force reactivation of interrupts at startup time Thomas Gleixner
2017-09-13 21:29 ` [patch 45/52] iommu/vt-d: Reevaluate vector configuration on activate() Thomas Gleixner
2017-09-13 21:29   ` Thomas Gleixner
2017-09-13 21:29 ` [patch 46/52] iommu/amd: " Thomas Gleixner
2017-09-13 21:29   ` Thomas Gleixner
2017-09-13 21:29 ` [patch 47/52] x86/io_apic: " Thomas Gleixner
2017-09-13 21:29 ` [patch 48/52] x86/vector: Handle managed interrupts proper Thomas Gleixner
2017-09-13 21:29 ` [patch 49/52] x86/vector/msi: Switch to global reservation mode Thomas Gleixner
2017-09-13 21:29 ` [patch 50/52] x86/vector: Switch IOAPIC " Thomas Gleixner
2017-09-13 21:29 ` [patch 51/52] x86/irq: Simplify hotplug vector accounting Thomas Gleixner
2017-09-13 21:29 ` [patch 52/52] x86/vector: Respect affinity mask in irq descriptor Thomas Gleixner
2017-09-14 11:21 ` [patch 00/52] x86: Rework the vector management Juergen Gross
2017-09-20 10:21   ` Paolo Bonzini
2017-09-19  9:12 ` Yu Chen
