This started out with 5 patches from Christoph, who wanted to add a mechanism for interrupts with managed affinities: spread them over all present CPUs and, instead of migrating them on CPU hotplug, shut them down into managed shutdown state when the last CPU in the affinity set goes offline, then start them up again when a CPU which belongs to the affinity set comes online again. See:

  http://lkml.kernel.org/r/20170603140403.27379-1-hch@lst.de

After staring at it for a while it became clear that the approach is not sufficient to deal with all the oddities of the x86 interrupt handling. The mechanism to migrate interrupts inside the affinity set, or to shut them down when the last CPU of the set goes offline, is the same as the general cpu hotplug migration mechanism. x86 does not yet use the generic cpu hotplug irq migration code, so supporting this feature would require changes to both architecture and core code.

After staring long enough, I decided to extend the core migration code so it can handle the x86 oddities as well. There are a few subtle changes in that code which might affect the existing users (ARM64/POWERPC), but AFAICT they should not change the behaviour. Please look carefully.

While doing this I stumbled over a bunch of other details, which I addressed in separate patches.

Once again the missing ability to debug all of this turned out to be a major pain, so I hacked up a quick debugfs tool which helped me to sort out the details. I rewrote it properly and it is included at the start of the series. This also required extending the fwnode code so x86 can supply unique domain names on domain creation.

The series applies on

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core

and is also available via git from:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.irq

Thanks,

	tglx

----
 arch/x86/Kconfig                      |    2
 arch/x86/include/asm/apic.h           |   36 +----
 arch/x86/include/asm/irq.h            |    1
 arch/x86/include/asm/irq_remapping.h  |    3
 arch/x86/kernel/apic/apic.c           |   35 +++--
 arch/x86/kernel/apic/apic_flat_64.c   |    4
 arch/x86/kernel/apic/apic_noop.c      |    2
 arch/x86/kernel/apic/apic_numachip.c  |    4
 arch/x86/kernel/apic/bigsmp_32.c      |    2
 arch/x86/kernel/apic/htirq.c          |   21 ++-
 arch/x86/kernel/apic/io_apic.c        |   22 +++
 arch/x86/kernel/apic/msi.c            |   55 ++++++--
 arch/x86/kernel/apic/probe_32.c       |    2
 arch/x86/kernel/apic/vector.c         |   49 +++++--
 arch/x86/kernel/apic/x2apic_cluster.c |   36 ++---
 arch/x86/kernel/apic/x2apic_phys.c    |    2
 arch/x86/kernel/apic/x2apic_uv_x.c    |   26 +---
 arch/x86/kernel/irq.c                 |   78 ------------
 arch/x86/platform/uv/uv_irq.c         |   18 ++
 arch/x86/xen/apic.c                   |    2
 drivers/iommu/amd_iommu.c             |   22 ++-
 drivers/iommu/intel_irq_remapping.c   |   31 +++-
 drivers/pci/host/vmd.c                |    8 +
 drivers/xen/events/events_base.c      |    6
 include/linux/cpuhotplug.h            |    1
 include/linux/irq.h                   |   61 +++++++++
 include/linux/irqdesc.h               |    4
 include/linux/irqdomain.h             |   37 +++++
 kernel/cpu.c                          |    5
 kernel/irq/Kconfig                    |   15 ++
 kernel/irq/Makefile                   |    1
 kernel/irq/affinity.c                 |   76 +++++++++--
 kernel/irq/autoprobe.c                |    4
 kernel/irq/chip.c                     |   91 ++++++++++++--
 kernel/irq/cpuhotplug.c               |  150 +++++++++++++++++++----
 kernel/irq/debugfs.c                  |  220 ++++++++++++++++++++++++++++++++++
 kernel/irq/internals.h                |  100 +++++++++++++++
 kernel/irq/irqdesc.c                  |   30 +++-
 kernel/irq/irqdomain.c                |  170 ++++++++++++++++++++++++--
 kernel/irq/manage.c                   |  102 ++++-----------
 kernel/irq/migration.c                |   30 ++++
 kernel/irq/msi.c                      |    3
 kernel/irq/proc.c                     |  110 ++++++++++++++---
 43 files changed, 1294 insertions(+), 383 deletions(-)
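For readers who have not used managed interrupt affinities before, here is a minimal, hypothetical driver sketch of how such interrupts come into existence in the first place (the function and queue names are made up; the relevant pieces are PCI_IRQ_AFFINITY and the per-vector request_irq()). Interrupts allocated this way are the ones which this series shuts down into managed shutdown state and restarts across CPU hotplug instead of breaking their affinity:

#include <linux/interrupt.h>
#include <linux/pci.h>

static irqreturn_t example_queue_irq(int irq, void *data)
{
	/* Per-queue interrupt handling would go here */
	return IRQ_HANDLED;
}

/* Hypothetical probe helper: one managed MSI-X vector per queue */
static int example_setup_irqs(struct pci_dev *pdev, unsigned int nr_queues)
{
	int i, nvec, err;

	/* PCI_IRQ_AFFINITY asks the PCI/MSI core to spread the vectors with managed affinity */
	nvec = pci_alloc_irq_vectors(pdev, 1, nr_queues,
				     PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);
	if (nvec < 0)
		return nvec;

	for (i = 0; i < nvec; i++) {
		err = request_irq(pci_irq_vector(pdev, i), example_queue_irq,
				  0, "example-queue", pdev);
		if (err)
			goto out_free;
	}
	return 0;

out_free:
	while (--i >= 0)
		free_irq(pci_irq_vector(pdev, i), pdev);
	pci_free_irq_vectors(pdev);
	return err;
}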
[-- Attachment #1: x86-apic--Add-name-to-irq-chip.patch --]
[-- Type: text/plain, Size: 496 bytes --]

Add the missing name, so debugging will work properly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/vector.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -534,6 +534,7 @@ static int apic_set_affinity(struct irq_
 }
 
 static struct irq_chip lapic_controller = {
+	.name			= "APIC",
 	.irq_ack		= apic_ack_edge,
 	.irq_set_affinity	= apic_set_affinity,
 	.irq_retrigger		= apic_retrigger_irq,
[-- Attachment #1: iommu-amd--Add-name-to-irq-chip.patch --]
[-- Type: text/plain, Size: 881 bytes --]

Add the missing name, so debugging will work properly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/amd_iommu.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4386,10 +4386,11 @@ static void ir_compose_msi_msg(struct ir
 }
 
 static struct irq_chip amd_ir_chip = {
-	.irq_ack = ir_ack_apic_edge,
-	.irq_set_affinity = amd_ir_set_affinity,
-	.irq_set_vcpu_affinity = amd_ir_set_vcpu_affinity,
-	.irq_compose_msi_msg = ir_compose_msi_msg,
+	.name			= "AMD-IR",
+	.irq_ack		= ir_ack_apic_edge,
+	.irq_set_affinity	= amd_ir_set_affinity,
+	.irq_set_vcpu_affinity	= amd_ir_set_vcpu_affinity,
+	.irq_compose_msi_msg	= ir_compose_msi_msg,
 };
 
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
[-- Attachment #1: iommu-vt-d--Add-name-to-irq-chip.patch --]
[-- Type: text/plain, Size: 951 bytes --]

Add the missing name, so debugging will work properly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
---
 drivers/iommu/intel_irq_remapping.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1205,10 +1205,11 @@ static int intel_ir_set_vcpu_affinity(st
 }
 
 static struct irq_chip intel_ir_chip = {
-	.irq_ack = ir_ack_apic_edge,
-	.irq_set_affinity = intel_ir_set_affinity,
-	.irq_compose_msi_msg = intel_ir_compose_msi_msg,
-	.irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity,
+	.name			= "INTEL-IR",
+	.irq_ack		= ir_ack_apic_edge,
+	.irq_set_affinity	= intel_ir_set_affinity,
+	.irq_compose_msi_msg	= intel_ir_compose_msi_msg,
+	.irq_set_vcpu_affinity	= intel_ir_set_vcpu_affinity,
 };
 
 static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
[-- Attachment #1: genirq-msi--Prevent-overwriting-domain-name.patch --] [-- Type: text/plain, Size: 675 bytes --] Prevent overwriting an already assigned domain name. Remove the extra check for chip->name, because if domain->name is NULL overwriting it with NULL is not a problem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/msi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -274,7 +274,8 @@ struct irq_domain *msi_create_irq_domain domain = irq_domain_create_hierarchy(parent, IRQ_DOMAIN_FLAG_MSI, 0, fwnode, &msi_domain_ops, info); - if (domain && info->chip && info->chip->name) + + if (domain && !domain->name && info->chip) domain->name = info->chip->name; return domain;
[-- Attachment #1: genirq--Allow-fwnode-to-carry-name-information-only.patch --] [-- Type: text/plain, Size: 6232 bytes --] In order to provide proper debug interface it's required to have domain names available when the domain is added. Non fwnode based architectures like x86 have no way to do so. It's not possible to use domain ops or host data for this as domain ops might be the same for several instances, but the names have to be unique. Extend the irqchip fwnode to allow transporting the domain name. If no node is supplied, create a 'unknown-N' placeholder. Warn if an invalid node is supplied and treat it like no node. This happens e.g. with i2 devices which hand in random crap. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irqdomain.h | 31 ++++++++++++++++- kernel/irq/irqdomain.c | 83 ++++++++++++++++++++++++++++++++++++++-------- 2 files changed, 100 insertions(+), 14 deletions(-) --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -189,6 +189,9 @@ enum { /* Irq domain implements MSI remapping */ IRQ_DOMAIN_FLAG_MSI_REMAP = (1 << 5), + /* Irq domain name was allocated in __irq_domain_add() */ + IRQ_DOMAIN_NAME_ALLOCATED = (1 << 6), + /* * Flags starting from IRQ_DOMAIN_FLAG_NONCORE are reserved * for implementation specific purposes and ignored by the @@ -203,7 +206,33 @@ static inline struct device_node *irq_do } #ifdef CONFIG_IRQ_DOMAIN -struct fwnode_handle *irq_domain_alloc_fwnode(void *data); +struct fwnode_handle *__irq_domain_alloc_fwnode(unsigned int type, int id, + const char *name, void *data); + +enum { + IRQCHIP_FWNODE_REAL, + IRQCHIP_FWNODE_NAMED, + IRQCHIP_FWNODE_NAMED_ID, +}; + +static inline +struct fwnode_handle *irq_domain_alloc_named_fwnode(const char *name) +{ + return __irq_domain_alloc_fwnode(IRQCHIP_FWNODE_NAMED, 0, name, NULL); +} + +static inline +struct fwnode_handle *irq_domain_alloc_named_id_fwnode(const char *name, int id) +{ + return __irq_domain_alloc_fwnode(IRQCHIP_FWNODE_NAMED_ID, id, name, + NULL); +} + +static inline struct fwnode_handle *irq_domain_alloc_fwnode(void *data) +{ + return __irq_domain_alloc_fwnode(IRQCHIP_FWNODE_REAL, 0, NULL, data); +} + void irq_domain_free_fwnode(struct fwnode_handle *fwnode); struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, irq_hw_number_t hwirq_max, int direct_max, --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -26,39 +26,61 @@ static struct irq_domain *irq_default_do static void irq_domain_check_hierarchy(struct irq_domain *domain); struct irqchip_fwid { - struct fwnode_handle fwnode; - char *name; - void *data; + struct fwnode_handle fwnode; + unsigned int type; + char *name; + void *data; }; /** * irq_domain_alloc_fwnode - Allocate a fwnode_handle suitable for * identifying an irq domain - * @data: optional user-provided data + * @type: Type of irqchip_fwnode. See linux/irqdomain.h + * @name: Optional user provided domain name + * @id: Optional user provided id if name != NULL + * @data: Optional user-provided data * - * Allocate a struct device_node, and return a poiner to the embedded + * Allocate a struct irqchip_fwid, and return a poiner to the embedded * fwnode_handle (or NULL on failure). + * + * Note: The types IRQCHIP_FWNODE_NAMED and IRQCHIP_FWNODE_NAMED_ID are + * solely to transport name information to irqdomain creation code. The + * node is not stored. For other types the pointer is kept in the irq + * domain struct. 
*/ -struct fwnode_handle *irq_domain_alloc_fwnode(void *data) +struct fwnode_handle *__irq_domain_alloc_fwnode(unsigned int type, int id, + const char *name, void *data) { struct irqchip_fwid *fwid; - char *name; + char *n; fwid = kzalloc(sizeof(*fwid), GFP_KERNEL); - name = kasprintf(GFP_KERNEL, "irqchip@%p", data); - if (!fwid || !name) { + switch (type) { + case IRQCHIP_FWNODE_NAMED: + n = kasprintf(GFP_KERNEL, "%s", name); + break; + case IRQCHIP_FWNODE_NAMED_ID: + n = kasprintf(GFP_KERNEL, "%s-%d", name, id); + break; + default: + n = kasprintf(GFP_KERNEL, "irqchip@%p", data); + break; + } + + if (!fwid || !n) { kfree(fwid); - kfree(name); + kfree(n); return NULL; } - fwid->name = name; + fwid->type = type; + fwid->name = n; fwid->data = data; fwid->fwnode.type = FWNODE_IRQCHIP; return &fwid->fwnode; } -EXPORT_SYMBOL_GPL(irq_domain_alloc_fwnode); +EXPORT_SYMBOL_GPL(__irq_domain_alloc_fwnode); /** * irq_domain_free_fwnode - Free a non-OF-backed fwnode_handle @@ -97,20 +119,53 @@ struct irq_domain *__irq_domain_add(stru void *host_data) { struct device_node *of_node = to_of_node(fwnode); + struct irqchip_fwid *fwid; struct irq_domain *domain; + static atomic_t unknown_domains; + domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size), GFP_KERNEL, of_node_to_nid(of_node)); if (WARN_ON(!domain)) return NULL; + if (fwnode && is_fwnode_irqchip(fwnode)) { + fwid = container_of(fwnode, struct irqchip_fwid, fwnode); + + switch (fwid->type) { + case IRQCHIP_FWNODE_NAMED: + case IRQCHIP_FWNODE_NAMED_ID: + domain->name = kstrdup(fwid->name, GFP_KERNEL); + if (!domain->name) { + kfree(domain); + return NULL; + } + domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; + break; + default: + domain->fwnode = fwnode; + domain->name = fwid->name; + break; + } + } else { + if (fwnode && !is_fwnode_irqchip(fwnode)) + pr_err("Invalid fwnode supplied for irqdomain\n"); + + domain->name = kasprintf(GFP_KERNEL, "unknown-%d", + atomic_inc_return(&unknown_domains)); + if (!domain->name) { + kfree(domain); + return NULL; + } + domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; + } + of_node_get(of_node); /* Fill structure */ INIT_RADIX_TREE(&domain->revmap_tree, GFP_KERNEL); domain->ops = ops; domain->host_data = host_data; - domain->fwnode = fwnode; domain->hwirq_max = hwirq_max; domain->revmap_size = size; domain->revmap_direct_max_irq = direct_max; @@ -152,6 +207,8 @@ void irq_domain_remove(struct irq_domain pr_debug("Removed domain %s\n", domain->name); of_node_put(irq_domain_get_of_node(domain)); + if (domain->flags & IRQ_DOMAIN_NAME_ALLOCATED) + kfree(domain->name); kfree(domain); } EXPORT_SYMBOL_GPL(irq_domain_remove);
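As a quick usage illustration of the new interface (a sketch only; the real x86 conversions follow in the subsequent patches, and example_create_domain()/example_domain_ops are hypothetical): a driver without a firmware node creates a named domain by allocating a NAMED or NAMED_ID fwnode, handing it to the domain creation call, and freeing it again, since __irq_domain_add() copies the name for these types:

static const struct irq_domain_ops example_domain_ops;

static struct irq_domain *example_create_domain(struct irq_domain *parent, int id)
{
	struct fwnode_handle *fn;
	struct irq_domain *d;

	/* "EXAMPLE" and id stand in for a real device name and instance number */
	fn = irq_domain_alloc_named_id_fwnode("EXAMPLE", id);
	if (!fn)
		return NULL;

	d = irq_domain_create_tree(fn, &example_domain_ops, NULL);
	/*
	 * NAMED/NAMED_ID fwnodes only transport the name; the domain code
	 * copied it, so the handle can be freed right away.
	 */
	kfree(fn);
	if (d)
		d->parent = parent;
	return d;
}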
[-- Attachment #1: x86-vector--Create-named-irq-domain.patch --] [-- Type: text/plain, Size: 822 bytes --] Use the fwnode to create a named domain so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/vector.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -429,11 +429,16 @@ static void init_legacy_irqs(void) { } int __init arch_early_irq_init(void) { + struct fwnode_handle *fn; + init_legacy_irqs(); - x86_vector_domain = irq_domain_add_tree(NULL, &x86_vector_domain_ops, - NULL); + fn = irq_domain_alloc_named_fwnode("VECTOR"); + BUG_ON(!fn); + x86_vector_domain = irq_domain_create_tree(fn, &x86_vector_domain_ops, + NULL); BUG_ON(x86_vector_domain == NULL); + kfree(fn); irq_set_default_host(x86_vector_domain); arch_init_msi_domain(x86_vector_domain);
[-- Attachment #1: x86-ioapic--Create-named-irq-domain.patch --]
[-- Type: text/plain, Size: 1426 bytes --]

Use the fwnode to create a named domain so diagnosis works, but only when the ioapic is not device tree based.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/io_apic.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2223,6 +2223,8 @@ static int mp_irqdomain_create(int ioapi
 	struct ioapic *ip = &ioapics[ioapic];
 	struct ioapic_domain_cfg *cfg = &ip->irqdomain_cfg;
 	struct mp_ioapic_gsi *gsi_cfg = mp_ioapic_gsi_routing(ioapic);
+	struct fwnode_handle *fn;
+	char *name = "IO-APIC";
 
 	if (cfg->type == IOAPIC_DOMAIN_INVALID)
 		return 0;
@@ -2233,9 +2235,25 @@ static int mp_irqdomain_create(int ioapi
 	parent = irq_remapping_get_ir_irq_domain(&info);
 	if (!parent)
 		parent = x86_vector_domain;
+	else
+		name = "IO-APIC-IR";
 
+	/* Handle device tree enumerated APICs proper */
+	if (cfg->dev) {
+		fn = of_node_to_fwnode(cfg->dev);
+	} else {
+		fn = irq_domain_alloc_named_id_fwnode(name, ioapic);
+		if (!fn)
+			return -ENOMEM;
+	}
+
+	ip->irqdomain = irq_domain_create_linear(fn, hwirqs, cfg->ops,
+						 (void *)(long)ioapic);
+
+	/* Release fw handle if it was allocated above */
+	if (!cfg->dev)
+		kfree(fn);
-	ip->irqdomain = irq_domain_add_linear(cfg->dev, hwirqs, cfg->ops,
-					      (void *)(long)ioapic);
 
 	if (!ip->irqdomain)
 		return -ENOMEM;
[-- Attachment #1: x86-htirq--Create-named-domain.patch --] [-- Type: text/plain, Size: 1177 bytes --] Use the fwnode to create a named domain so diagnosis works. Mark the init function __init while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/htirq.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) --- a/arch/x86/kernel/apic/htirq.c +++ b/arch/x86/kernel/apic/htirq.c @@ -150,16 +150,27 @@ static const struct irq_domain_ops htirq .deactivate = htirq_domain_deactivate, }; -void arch_init_htirq_domain(struct irq_domain *parent) +void __init arch_init_htirq_domain(struct irq_domain *parent) { + struct fwnode_handle *fn; + if (disable_apic) return; - htirq_domain = irq_domain_add_tree(NULL, &htirq_domain_ops, NULL); + fn = irq_domain_alloc_named_fwnode("PCI-HT"); + if (!fn) + goto warn; + + htirq_domain = irq_domain_create_tree(fn, &htirq_domain_ops, NULL); + kfree(fn); if (!htirq_domain) - pr_warn("failed to initialize irqdomain for HTIRQ.\n"); - else - htirq_domain->parent = parent; + goto warn; + + htirq_domain->parent = parent; + return; + +warn: + pr_warn("Failed to initialize irqdomain for HTIRQ.\n"); } int arch_setup_ht_irq(int idx, int pos, struct pci_dev *dev,
[-- Attachment #1: x86-uv--Create-named-irq-domain.patch --] [-- Type: text/plain, Size: 941 bytes --] Use the fwnode to create a named domain so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/platform/uv/uv_irq.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) --- a/arch/x86/platform/uv/uv_irq.c +++ b/arch/x86/platform/uv/uv_irq.c @@ -160,13 +160,21 @@ static struct irq_domain *uv_get_irq_dom { static struct irq_domain *uv_domain; static DEFINE_MUTEX(uv_lock); + struct fwnode_handle *fn; mutex_lock(&uv_lock); - if (uv_domain == NULL) { - uv_domain = irq_domain_add_tree(NULL, &uv_domain_ops, NULL); - if (uv_domain) - uv_domain->parent = x86_vector_domain; - } + if (uv_domain) + goto out; + + fn = irq_domain_alloc_named_fwnode("UV-CORE"); + if (!fn) + goto out; + + uv_domain = irq_domain_create_tree(fn, &uv_domain_ops, NULL); + kfree(fn); + if (uv_domain) + uv_domain->parent = x86_vector_domain; +out: mutex_unlock(&uv_lock); return uv_domain;
[-- Attachment #1: x86-msi--Provide-new-iommu-irqdomain-interface.patch --] [-- Type: text/plain, Size: 1673 bytes --] Provide a new interface for creating the iommu remapping domains, so that the caller can supply a name and a id in order to create named irqdomains. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org --- arch/x86/include/asm/irq_remapping.h | 2 ++ arch/x86/kernel/apic/msi.c | 15 +++++++++++++++ 2 files changed, 17 insertions(+) --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -56,6 +56,8 @@ irq_remapping_get_irq_domain(struct irq_ /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */ extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent); +extern struct irq_domain * +arch_create_remap_msi_irq_domain(struct irq_domain *par, const char *n, int id); /* Get parent irqdomain for interrupt remapping irqdomain */ static inline struct irq_domain *arch_get_ir_parent_domain(void) --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -167,10 +167,25 @@ static struct msi_domain_info pci_msi_ir .handler_name = "edge", }; +struct irq_domain *arch_create_remap_msi_irq_domain(struct irq_domain *parent, + const char *name, int id) +{ + struct fwnode_handle *fn; + struct irq_domain *d; + + fn = irq_domain_alloc_named_id_fwnode(name, id); + if (!fn) + return NULL; + d = pci_msi_create_irq_domain(fn, &pci_msi_ir_domain_info, parent); + kfree(fn); + return d; +} + struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent) { return pci_msi_create_irq_domain(NULL, &pci_msi_ir_domain_info, parent); } + #endif #ifdef CONFIG_DMAR_TABLE
[-- Attachment #1: iommu-vt-d--Use-named-irq-domain-interface.patch --] [-- Type: text/plain, Size: 1573 bytes --] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org --- drivers/iommu/intel_irq_remapping.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) --- a/drivers/iommu/intel_irq_remapping.c +++ b/drivers/iommu/intel_irq_remapping.c @@ -500,8 +500,9 @@ static void iommu_enable_irq_remapping(s static int intel_setup_irq_remapping(struct intel_iommu *iommu) { struct ir_table *ir_table; - struct page *pages; + struct fwnode_handle *fn; unsigned long *bitmap; + struct page *pages; if (iommu->ir_table) return 0; @@ -525,15 +526,24 @@ static int intel_setup_irq_remapping(str goto out_free_pages; } - iommu->ir_domain = irq_domain_add_hierarchy(arch_get_ir_parent_domain(), - 0, INTR_REMAP_TABLE_ENTRIES, - NULL, &intel_ir_domain_ops, - iommu); + fn = irq_domain_alloc_named_id_fwnode("INTEL-IR", iommu->seq_id); + if (!fn) + goto out_free_bitmap; + + iommu->ir_domain = + irq_domain_create_hierarchy(arch_get_ir_parent_domain(), + 0, INTR_REMAP_TABLE_ENTRIES, + fn, &intel_ir_domain_ops, + iommu); + kfree(fn); if (!iommu->ir_domain) { pr_err("IR%d: failed to allocate irqdomain\n", iommu->seq_id); goto out_free_bitmap; } - iommu->ir_msi_domain = arch_create_msi_irq_domain(iommu->ir_domain); + iommu->ir_msi_domain = + arch_create_remap_msi_irq_domain(iommu->ir_domain, + "INTEL-IR-MSI", + iommu->seq_id); ir_table->base = page_address(pages); ir_table->bitmap = bitmap;
[-- Attachment #1: iommu-amd--Use-named-irq-domain-interface.patch --] [-- Type: text/plain, Size: 1025 bytes --] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org --- drivers/iommu/amd_iommu.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -4395,13 +4395,20 @@ static struct irq_chip amd_ir_chip = { int amd_iommu_create_irq_domain(struct amd_iommu *iommu) { - iommu->ir_domain = irq_domain_add_tree(NULL, &amd_ir_domain_ops, iommu); + struct fwnode_handle *fn; + + fn = irq_domain_alloc_named_id_fwnode("AMD-IR", iommu->index); + if (!fn) + return -ENOMEM; + iommu->ir_domain = irq_domain_create_tree(fn, &amd_ir_domain_ops, iommu); + kfree(fn); if (!iommu->ir_domain) return -ENOMEM; iommu->ir_domain->parent = arch_get_ir_parent_domain(); - iommu->msi_domain = arch_create_msi_irq_domain(iommu->ir_domain); - + iommu->msi_domain = arch_create_remap_msi_irq_domain(iommu->ir_domain, + "AMD-IR-MSI", + iommu->index); return 0; }
[-- Attachment #1: x86-msi--Remove-unused-remap-irq-domain-interface.patch --] [-- Type: text/plain, Size: 1002 bytes --] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/irq_remapping.h | 1 - arch/x86/kernel/apic/msi.c | 6 ------ 2 files changed, 7 deletions(-) --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -55,7 +55,6 @@ extern struct irq_domain * irq_remapping_get_irq_domain(struct irq_alloc_info *info); /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */ -extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent); extern struct irq_domain * arch_create_remap_msi_irq_domain(struct irq_domain *par, const char *n, int id); --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -180,12 +180,6 @@ struct irq_domain *arch_create_remap_msi kfree(fn); return d; } - -struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent) -{ - return pci_msi_create_irq_domain(NULL, &pci_msi_ir_domain_info, parent); -} - #endif #ifdef CONFIG_DMAR_TABLE
[-- Attachment #1: x86-msi--Create-named-irq-domains.patch --] [-- Type: text/plain, Size: 2357 bytes --] Use the fwnode to create named irq domains so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/msi.c | 42 +++++++++++++++++++++++++++++++++--------- 1 file changed, 33 insertions(+), 9 deletions(-) --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -136,13 +136,20 @@ static struct msi_domain_info pci_msi_do .handler_name = "edge", }; -void arch_init_msi_domain(struct irq_domain *parent) +void __init arch_init_msi_domain(struct irq_domain *parent) { + struct fwnode_handle *fn; + if (disable_apic) return; - msi_default_domain = pci_msi_create_irq_domain(NULL, - &pci_msi_domain_info, parent); + fn = irq_domain_alloc_named_fwnode("PCI-MSI"); + if (fn) { + msi_default_domain = + pci_msi_create_irq_domain(fn, &pci_msi_domain_info, + parent); + kfree(fn); + } if (!msi_default_domain) pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n"); } @@ -230,13 +237,20 @@ static struct irq_domain *dmar_get_irq_d { static struct irq_domain *dmar_domain; static DEFINE_MUTEX(dmar_lock); + struct fwnode_handle *fn; mutex_lock(&dmar_lock); - if (dmar_domain == NULL) - dmar_domain = msi_create_irq_domain(NULL, &dmar_msi_domain_info, + if (dmar_domain) + goto out; + + fn = irq_domain_alloc_named_fwnode("DMAR-MSI"); + if (fn) { + dmar_domain = msi_create_irq_domain(fn, &dmar_msi_domain_info, x86_vector_domain); + kfree(fn); + } +out: mutex_unlock(&dmar_lock); - return dmar_domain; } @@ -326,9 +340,10 @@ static struct msi_domain_info hpet_msi_d struct irq_domain *hpet_create_irq_domain(int hpet_id) { - struct irq_domain *parent; - struct irq_alloc_info info; struct msi_domain_info *domain_info; + struct irq_domain *parent, *d; + struct irq_alloc_info info; + struct fwnode_handle *fn; if (x86_vector_domain == NULL) return NULL; @@ -349,7 +364,16 @@ struct irq_domain *hpet_create_irq_domai else hpet_msi_controller.name = "IR-HPET-MSI"; - return msi_create_irq_domain(NULL, domain_info, parent); + fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name, + hpet_id); + if (!fn) { + kfree(domain_info); + return NULL; + } + + d = msi_create_irq_domain(fn, domain_info, parent); + kfree(fn); + return d; } int hpet_assign_irq(struct irq_domain *domain, struct hpet_dev *dev,
[-- Attachment #1: PCI--vmd--Create-named-irq-domain.patch --] [-- Type: text/plain, Size: 1071 bytes --] Use the fwnode to create a named domain so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org --- drivers/pci/host/vmd.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) --- a/drivers/pci/host/vmd.c +++ b/drivers/pci/host/vmd.c @@ -554,6 +554,7 @@ static int vmd_find_free_domain(void) static int vmd_enable_domain(struct vmd_dev *vmd) { struct pci_sysdata *sd = &vmd->sysdata; + struct fwnode_handle *fn; struct resource *res; u32 upper_bits; unsigned long flags; @@ -617,8 +618,13 @@ static int vmd_enable_domain(struct vmd_ sd->node = pcibus_to_node(vmd->dev->bus); - vmd->irq_domain = pci_msi_create_irq_domain(NULL, &vmd_msi_domain_info, + fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain); + if (!fn) + return -ENODEV; + + vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info, x86_vector_domain); + kfree(fn); if (!vmd->irq_domain) return -ENODEV;
[-- Attachment #1: genirq-irqdomain--Add-map-counter.patch --] [-- Type: text/plain, Size: 1926 bytes --] Add a map counter instead of counting radix tree entries for diagnosis. That also gives correct information for linear domains. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irqdomain.h | 2 ++ kernel/irq/irqdomain.c | 4 ++++ 2 files changed, 6 insertions(+) --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -130,6 +130,7 @@ struct irq_domain_chip_generic; * @host_data: private data pointer for use by owner. Not touched by irq_domain * core code. * @flags: host per irq_domain flags + * @mapcount: The number of mapped interrupts * * Optional elements * @of_node: Pointer to device tree nodes associated with the irq_domain. Used @@ -152,6 +153,7 @@ struct irq_domain { const struct irq_domain_ops *ops; void *host_data; unsigned int flags; + unsigned int mapcount; /* Optional data */ struct fwnode_handle *fwnode; --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -401,6 +401,7 @@ void irq_domain_disassociate(struct irq_ irq_data->domain = NULL; irq_data->hwirq = 0; + domain->mapcount--; /* Clear reverse map for this hwirq */ if (hwirq < domain->revmap_size) { @@ -452,6 +453,7 @@ int irq_domain_associate(struct irq_doma domain->name = irq_data->chip->name; } + domain->mapcount++; if (hwirq < domain->revmap_size) { domain->linear_revmap[hwirq] = virq; } else { @@ -1059,6 +1061,7 @@ static void irq_domain_insert_irq(int vi struct irq_domain *domain = data->domain; irq_hw_number_t hwirq = data->hwirq; + domain->mapcount++; if (hwirq < domain->revmap_size) { domain->linear_revmap[hwirq] = virq; } else { @@ -1088,6 +1091,7 @@ static void irq_domain_remove_irq(int vi struct irq_domain *domain = data->domain; irq_hw_number_t hwirq = data->hwirq; + domain->mapcount--; if (hwirq < domain->revmap_size) { domain->linear_revmap[hwirq] = 0; } else {
[-- Attachment #1: genirq-debugfs--Add-proper-debugfs-interface.patch --] [-- Type: text/plain, Size: 15558 bytes --] Debugging (hierarchical) interupt domains is tedious as there is no information about the hierarchy and no information about states of interrupts in the various domain levels. Add a debugfs directory 'irq' and subdirectories 'domains' and 'irqs'. The domains directory contains the domain files. The content is information about the domain. If the domain is part of a hierarchy then the parent domains are printed as well. # ls /sys/kernel/debug/irq/domains/ default INTEL-IR-2 INTEL-IR-MSI-2 IO-APIC-IR-2 PCI-MSI DMAR-MSI INTEL-IR-3 INTEL-IR-MSI-3 IO-APIC-IR-3 unknown-1 INTEL-IR-0 INTEL-IR-MSI-0 IO-APIC-IR-0 IO-APIC-IR-4 VECTOR INTEL-IR-1 INTEL-IR-MSI-1 IO-APIC-IR-1 PCI-HT # cat /sys/kernel/debug/irq/domains/VECTOR name: VECTOR size: 0 mapped: 216 flags: 0x00000041 # cat /sys/kernel/debug/irq/domains/IO-APIC-IR-0 name: IO-APIC-IR-0 size: 24 mapped: 19 flags: 0x00000041 parent: INTEL-IR-3 name: INTEL-IR-3 size: 65536 mapped: 167 flags: 0x00000041 parent: VECTOR name: VECTOR size: 0 mapped: 216 flags: 0x00000041 Unfortunately there is no per cpu information about the VECTOR domain (yet). The irqs directory contains detailed information about mapped interrupts. # cat /sys/kernel/debug/irq/irqs/3 handler: handle_edge_irq status: 0x00004000 istate: 0x00000000 ddepth: 1 wdepth: 0 dstate: 0x01018000 IRQD_IRQ_DISABLED IRQD_SINGLE_TARGET IRQD_MOVE_PCNTXT node: 0 affinity: 0-143 effectiv: 0 pending: domain: IO-APIC-IR-0 hwirq: 0x3 chip: IR-IO-APIC flags: 0x10 IRQCHIP_SKIP_SET_WAKE parent: domain: INTEL-IR-3 hwirq: 0x20000 chip: INTEL-IR flags: 0x0 parent: domain: VECTOR hwirq: 0x3 chip: APIC flags: 0x0 This was developed to simplify the debugging of the managed affinity changes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irqdesc.h | 4 include/linux/irqdomain.h | 4 kernel/irq/Kconfig | 11 ++ kernel/irq/Makefile | 1 kernel/irq/debugfs.c | 215 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/irq/internals.h | 22 ++++ kernel/irq/irqdesc.c | 1 kernel/irq/irqdomain.c | 87 ++++++++++++++++++ kernel/irq/manage.c | 1 9 files changed, 345 insertions(+), 1 deletion(-) --- a/include/linux/irqdesc.h +++ b/include/linux/irqdesc.h @@ -46,6 +46,7 @@ struct pt_regs; * @rcu: rcu head for delayed free * @kobj: kobject used to represent this struct in sysfs * @dir: /proc/irq/ procfs entry + * @debugfs_file: dentry for the debugfs file * @name: flow handler name for /proc/interrupts output */ struct irq_desc { @@ -88,6 +89,9 @@ struct irq_desc { #ifdef CONFIG_PROC_FS struct proc_dir_entry *dir; #endif +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS + struct dentry *debugfs_file; +#endif #ifdef CONFIG_SPARSE_IRQ struct rcu_head rcu; struct kobject kobj; --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -139,6 +139,7 @@ struct irq_domain_chip_generic; * setting up one or more generic chips for interrupt controllers * drivers using the generic chip library which uses this pointer. * @parent: Pointer to parent irq_domain to support hierarchy irq_domains + * @debugfs_file: dentry for the domain debugfs file * * Revmap data, used internally by irq_domain * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that @@ -162,6 +163,9 @@ struct irq_domain { #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY struct irq_domain *parent; #endif +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS + struct dentry *debugfs_file; +#endif /* reverse map data. 
The linear map gets appended to the irq_domain */ irq_hw_number_t hwirq_max; --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -108,4 +108,15 @@ config SPARSE_IRQ If you don't know what to do here, say N. +config GENERIC_IRQ_DEBUGFS + bool "Expose irq internals in debugfs" + depends on DEBUG_FS + default n + ---help--- + + Exposes internal state information through debugfs. Mostly for + developers and debugging of hard to diagnose interrupt problems. + + If you don't know what to do here, say N. + endmenu --- a/kernel/irq/Makefile +++ b/kernel/irq/Makefile @@ -10,3 +10,4 @@ obj-$(CONFIG_PM_SLEEP) += pm.o obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o obj-$(CONFIG_GENERIC_IRQ_IPI) += ipi.o obj-$(CONFIG_SMP) += affinity.o +obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o --- /dev/null +++ b/kernel/irq/debugfs.c @@ -0,0 +1,215 @@ +/* + * Copyright 2017 Thomas Gleixner <tglx@linutronix.de> + * + * This file is licensed under the GPL V2. + */ +#include <linux/debugfs.h> +#include <linux/irqdomain.h> +#include <linux/irq.h> + +#include "internals.h" + +static struct dentry *irq_dir; + +struct irq_bit_descr { + unsigned int mask; + char *name; +}; +#define BIT_MASK_DESCR(m) { .mask = m, .name = #m } + +static void irq_debug_show_bits(struct seq_file *m, int ind, unsigned int state, + const struct irq_bit_descr *sd, int size) +{ + int i; + + for (i = 0; i < size; i++, sd++) { + if (state & sd->mask) + seq_printf(m, "%*s%s\n", ind + 12, "", sd->name); + } +} + +#ifdef CONFIG_SMP +static void irq_debug_show_masks(struct seq_file *m, struct irq_desc *desc) +{ + struct irq_data *data = irq_desc_get_irq_data(desc); + struct cpumask *msk; + + msk = irq_data_get_affinity_mask(data); + seq_printf(m, "affinity: %*pbl\n", cpumask_pr_args(msk)); +#ifdef CONFIG_GENERIC_PENDING_IRQ + msk = desc->pending_mask; + seq_printf(m, "pending: %*pbl\n", cpumask_pr_args(msk)); +#endif +} +#else +static void irq_debug_show_masks(struct seq_file *m, struct irq_desc *desc) { } +#endif + +static const struct irq_bit_descr irqchip_flags[] = { + BIT_MASK_DESCR(IRQCHIP_SET_TYPE_MASKED), + BIT_MASK_DESCR(IRQCHIP_EOI_IF_HANDLED), + BIT_MASK_DESCR(IRQCHIP_MASK_ON_SUSPEND), + BIT_MASK_DESCR(IRQCHIP_ONOFFLINE_ENABLED), + BIT_MASK_DESCR(IRQCHIP_SKIP_SET_WAKE), + BIT_MASK_DESCR(IRQCHIP_ONESHOT_SAFE), + BIT_MASK_DESCR(IRQCHIP_EOI_THREADED), +}; + +static void +irq_debug_show_chip(struct seq_file *m, struct irq_data *data, int ind) +{ + struct irq_chip *chip = data->chip; + + if (!chip) { + seq_printf(m, "chip: None\n"); + return; + } + seq_printf(m, "%*schip: %s\n", ind, "", chip->name); + seq_printf(m, "%*sflags: 0x%lx\n", ind + 1, "", chip->flags); + irq_debug_show_bits(m, ind, chip->flags, irqchip_flags, + ARRAY_SIZE(irqchip_flags)); +} + +static void +irq_debug_show_data(struct seq_file *m, struct irq_data *data, int ind) +{ + seq_printf(m, "%*sdomain: %s\n", ind, "", + data->domain ? 
data->domain->name : ""); + seq_printf(m, "%*shwirq: 0x%lx\n", ind + 1, "", data->hwirq); + irq_debug_show_chip(m, data, ind + 1); +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY + if (!data->parent_data) + return; + seq_printf(m, "%*sparent:\n", ind + 1, ""); + irq_debug_show_data(m, data->parent_data, ind + 4); +#endif +} + +static const struct irq_bit_descr irqdata_states[] = { + BIT_MASK_DESCR(IRQ_TYPE_EDGE_RISING), + BIT_MASK_DESCR(IRQ_TYPE_EDGE_FALLING), + BIT_MASK_DESCR(IRQ_TYPE_LEVEL_HIGH), + BIT_MASK_DESCR(IRQ_TYPE_LEVEL_LOW), + BIT_MASK_DESCR(IRQD_LEVEL), + + BIT_MASK_DESCR(IRQD_ACTIVATED), + BIT_MASK_DESCR(IRQD_IRQ_STARTED), + BIT_MASK_DESCR(IRQD_IRQ_DISABLED), + BIT_MASK_DESCR(IRQD_IRQ_MASKED), + BIT_MASK_DESCR(IRQD_IRQ_INPROGRESS), + + BIT_MASK_DESCR(IRQD_PER_CPU), + BIT_MASK_DESCR(IRQD_NO_BALANCING), + + BIT_MASK_DESCR(IRQD_MOVE_PCNTXT), + BIT_MASK_DESCR(IRQD_AFFINITY_SET), + BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING), + BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED), + BIT_MASK_DESCR(IRQD_MANAGED_SHUTDOWN), + + BIT_MASK_DESCR(IRQD_FORWARDED_TO_VCPU), + + BIT_MASK_DESCR(IRQD_WAKEUP_STATE), + BIT_MASK_DESCR(IRQD_WAKEUP_ARMED), +}; + +static const struct irq_bit_descr irqdesc_states[] = { + BIT_MASK_DESCR(_IRQ_NOPROBE), + BIT_MASK_DESCR(_IRQ_NOREQUEST), + BIT_MASK_DESCR(_IRQ_NOTHREAD), + BIT_MASK_DESCR(_IRQ_NOAUTOEN), + BIT_MASK_DESCR(_IRQ_NESTED_THREAD), + BIT_MASK_DESCR(_IRQ_PER_CPU_DEVID), + BIT_MASK_DESCR(_IRQ_IS_POLLED), + BIT_MASK_DESCR(_IRQ_DISABLE_UNLAZY), +}; + +static const struct irq_bit_descr irqdesc_istates[] = { + BIT_MASK_DESCR(IRQS_AUTODETECT), + BIT_MASK_DESCR(IRQS_SPURIOUS_DISABLED), + BIT_MASK_DESCR(IRQS_POLL_INPROGRESS), + BIT_MASK_DESCR(IRQS_ONESHOT), + BIT_MASK_DESCR(IRQS_REPLAY), + BIT_MASK_DESCR(IRQS_WAITING), + BIT_MASK_DESCR(IRQS_PENDING), + BIT_MASK_DESCR(IRQS_SUSPENDED), +}; + + +static int irq_debug_show(struct seq_file *m, void *p) +{ + struct irq_desc *desc = m->private; + struct irq_data *data; + + raw_spin_lock_irq(&desc->lock); + data = irq_desc_get_irq_data(desc); + seq_printf(m, "handler: %pf\n", desc->handle_irq); + seq_printf(m, "status: 0x%08x\n", desc->status_use_accessors); + irq_debug_show_bits(m, 0, desc->status_use_accessors, irqdesc_states, + ARRAY_SIZE(irqdesc_states)); + seq_printf(m, "istate: 0x%08x\n", desc->istate); + irq_debug_show_bits(m, 0, desc->istate, irqdesc_istates, + ARRAY_SIZE(irqdesc_istates)); + seq_printf(m, "ddepth: %u\n", desc->depth); + seq_printf(m, "wdepth: %u\n", desc->wake_depth); + seq_printf(m, "dstate: 0x%08x\n", irqd_get(data)); + irq_debug_show_bits(m, 0, irqd_get(data), irqdata_states, + ARRAY_SIZE(irqdata_states)); + seq_printf(m, "node: %d\n", irq_data_get_node(data)); + irq_debug_show_masks(m, desc); + irq_debug_show_data(m, data, 0); + raw_spin_unlock_irq(&desc->lock); + return 0; +} + +static int irq_debug_open(struct inode *inode, struct file *file) +{ + return single_open(file, irq_debug_show, inode->i_private); +} + +static const struct file_operations dfs_irq_ops = { + .open = irq_debug_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc) +{ + char name [10]; + + if (!irq_dir || !desc || desc->debugfs_file) + return; + + sprintf(name, "%d", irq); + desc->debugfs_file = debugfs_create_file(name, 0x444, irq_dir, desc, + &dfs_irq_ops); +} + +void irq_remove_debugfs_entry(struct irq_desc *desc) +{ + if (desc->debugfs_file) + debugfs_remove(desc->debugfs_file); +} + +static int __init irq_debugfs_init(void) +{ + struct 
dentry *root_dir; + int irq; + + root_dir = debugfs_create_dir("irq", NULL); + if (!root_dir) + return -ENOMEM; + + irq_domain_debugfs_init(root_dir); + + irq_dir = debugfs_create_dir("irqs", root_dir); + + irq_lock_sparse(); + for_each_active_irq(irq) + irq_add_debugfs_entry(irq, irq_to_desc(irq)); + irq_unlock_sparse(); + + return 0; +} +__initcall(irq_debugfs_init); --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -169,6 +169,11 @@ irq_put_desc_unlock(struct irq_desc *des #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) +static inline unsigned int irqd_get(struct irq_data *d) +{ + return __irqd_to_state(d); +} + /* * Manipulation functions for irq_data.state */ @@ -226,3 +231,20 @@ irq_pm_install_action(struct irq_desc *d static inline void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action) { } #endif + +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS +void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc); +void irq_remove_debugfs_entry(struct irq_desc *desc); +# ifdef CONFIG_IRQ_DOMAIN +void irq_domain_debugfs_init(struct dentry *root); +# else +static inline void irq_domain_debugfs_init(struct dentry *root); +# endif +#else /* CONFIG_GENERIC_IRQ_DEBUGFS */ +static inline void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *d) +{ +} +static inline void irq_remove_debugfs_entry(struct irq_desc *d) +{ +} +#endif /* CONFIG_GENERIC_IRQ_DEBUGFS */ --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -394,6 +394,7 @@ static void free_desc(unsigned int irq) { struct irq_desc *desc = irq_to_desc(irq); + irq_remove_debugfs_entry(desc); unregister_irq_proc(irq, desc); /* --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -29,9 +29,17 @@ struct irqchip_fwid { struct fwnode_handle fwnode; unsigned int type; char *name; - void *data; + void *data; }; +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS +static void debugfs_add_domain_dir(struct irq_domain *d); +static void debugfs_remove_domain_dir(struct irq_domain *d); +#else +static inline void debugfs_add_domain_dir(struct irq_domain *d) { } +static inline void debugfs_remove_domain_dir(struct irq_domain *d) { } +#endif + /** * irq_domain_alloc_fwnode - Allocate a fwnode_handle suitable for * identifying an irq domain @@ -172,6 +180,7 @@ struct irq_domain *__irq_domain_add(stru irq_domain_check_hierarchy(domain); mutex_lock(&irq_domain_mutex); + debugfs_add_domain_dir(domain); list_add(&domain->link, &irq_domain_list); mutex_unlock(&irq_domain_mutex); @@ -191,6 +200,7 @@ EXPORT_SYMBOL_GPL(__irq_domain_add); void irq_domain_remove(struct irq_domain *domain) { mutex_lock(&irq_domain_mutex); + debugfs_remove_domain_dir(domain); WARN_ON(!radix_tree_empty(&domain->revmap_tree)); @@ -1577,3 +1587,78 @@ static void irq_domain_check_hierarchy(s { } #endif /* CONFIG_IRQ_DOMAIN_HIERARCHY */ + +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS +static struct dentry *domain_dir; + +static void +irq_domain_debug_show_one(struct seq_file *m, struct irq_domain *d, int ind) +{ + seq_printf(m, "%*sname: %s\n", ind, "", d->name); + seq_printf(m, "%*ssize: %u\n", ind + 1, "", + d->revmap_size + d->revmap_direct_max_irq); + seq_printf(m, "%*smapped: %u\n", ind + 1, "", d->mapcount); + seq_printf(m, "%*sflags: 0x%08x\n", ind +1 , "", d->flags); +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY + if (!d->parent) + return; + seq_printf(m, "%*sparent: %s\n", ind + 1, "", d->parent->name); + irq_domain_debug_show_one(m, d->parent, ind + 4); +#endif +} + +static int irq_domain_debug_show(struct seq_file *m, void *p) +{ + struct 
irq_domain *d = m->private; + + /* Default domain? Might be NULL */ + if (!d) { + if (!irq_default_domain) + return 0; + d = irq_default_domain; + } + irq_domain_debug_show_one(m, d, 0); + return 0; +} + +static int irq_domain_debug_open(struct inode *inode, struct file *file) +{ + return single_open(file, irq_domain_debug_show, inode->i_private); +} + +static const struct file_operations dfs_domain_ops = { + .open = irq_domain_debug_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static void debugfs_add_domain_dir(struct irq_domain *d) +{ + if (!d->name || !domain_dir || d->debugfs_file) + return; + d->debugfs_file = debugfs_create_file(d->name, 0x444, domain_dir, d, + &dfs_domain_ops); +} + +static void debugfs_remove_domain_dir(struct irq_domain *d) +{ + if (d->debugfs_file) + debugfs_remove(d->debugfs_file); +} + +void __init irq_domain_debugfs_init(struct dentry *root) +{ + struct irq_domain *d; + + domain_dir = debugfs_create_dir("domains", root); + if (!domain_dir) + return; + + debugfs_create_file("default", 0444, domain_dir, NULL, &dfs_domain_ops); + mutex_lock(&irq_domain_mutex); + list_for_each_entry(d, &irq_domain_list, link) + debugfs_add_domain_dir(d); + mutex_unlock(&irq_domain_mutex); +} +#endif --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1396,6 +1396,7 @@ static int wake_up_process(new->secondary->thread); register_irq_proc(irq, desc); + irq_add_debugfs_entry(irq, desc); new->dir = NULL; register_handler_proc(irq, new); free_cpumask_var(mask);
[-- Attachment #1: genirq--Add-missing-comment-for-IRQD_STARTED.patch --] [-- Type: text/plain, Size: 481 bytes --] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irq.h | 1 + 1 file changed, 1 insertion(+) --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -199,6 +199,7 @@ struct irq_data { * IRQD_WAKEUP_ARMED - Wakeup mode armed * IRQD_FORWARDED_TO_VCPU - The interrupt is forwarded to a VCPU * IRQD_AFFINITY_MANAGED - Affinity is auto-managed by the kernel + * IRQD_IRQ_STARTED - Startup state of the interrupt */ enum { IRQD_TRIGGER_MASK = 0xf,
[-- Attachment #1: genirq--Provide-irq_fixup_move_pending--.patch --]
[-- Type: text/plain, Size: 2376 bytes --]

If a CPU goes offline, the interrupts are migrated away, but a pending interrupt move which has not yet been made effective is kept pending even if the outgoing CPU is the sole target of the pending affinity mask. Worse, the pending affinity mask is discarded even if it would contain a valid subset of the online CPUs.

Implement a helper function which makes it possible to avoid these issues.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irq.h    |    5 +++++
 kernel/irq/migration.c |   30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -490,9 +490,14 @@ extern void irq_migrate_all_off_this_cpu
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
 void irq_move_irq(struct irq_data *data);
 void irq_move_masked_irq(struct irq_data *data);
+bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear);
 #else
 static inline void irq_move_irq(struct irq_data *data) { }
 static inline void irq_move_masked_irq(struct irq_data *data) { }
+static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear)
+{
+	return false;
+}
 #endif
 
 extern int no_irq_affinity;
--- a/kernel/irq/migration.c
+++ b/kernel/irq/migration.c
@@ -4,6 +4,36 @@
 
 #include "internals.h"
 
+/**
+ * irq_fixup_move_pending - Cleanup irq move pending from a dying CPU
+ * @desc:		Interrupt descriptor to clean up
+ * @force_clear:	If set clear the move pending bit unconditionally.
+ *			If not set, clear it only when the dying CPU is the
+ *			last one in the pending mask.
+ *
+ * Returns true if the pending bit was set and the pending mask contains an
+ * online CPU other than the dying CPU.
+ */
+bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear)
+{
+	struct irq_data *data = irq_desc_get_irq_data(desc);
+
+	if (!irqd_is_setaffinity_pending(data))
+		return false;
+
+	/*
+	 * The outgoing CPU might be the last online target in a pending
+	 * interrupt move. If that's the case clear the pending move bit.
+	 */
+	if (cpumask_any_and(desc->pending_mask, cpu_online_mask) >= nr_cpu_ids) {
+		irqd_clr_move_pending(data);
+		return false;
+	}
+	if (force_clear)
+		irqd_clr_move_pending(data);
+	return true;
+}
+
 void irq_move_masked_irq(struct irq_data *idata)
 {
 	struct irq_desc *desc = irq_data_to_desc(idata);
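To make the intended calling convention concrete, here is a simplified sketch of how a CPU-offline fixup path would use the helper (example_pick_target() is hypothetical; the real user is the x86 fixup_irqs() change in the next patch, and the force_clear == false mode is used there on the "nothing to migrate" path to just discard a stale pending move). The caller is expected to hold desc->lock:

static const struct cpumask *example_pick_target(struct irq_desc *desc)
{
	struct irq_data *data = irq_desc_get_irq_data(desc);
	const struct cpumask *affinity = irq_data_get_affinity_mask(data);

	/*
	 * force_clear == true: if a move is pending and still has an online
	 * target, migrate to the pending mask so the last affinity change
	 * does not get lost. The pending state is cleared either way.
	 */
	if (irq_fixup_move_pending(desc, true))
		affinity = desc->pending_mask;

	/* No online CPU left in the mask: fall back to all online CPUs */
	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids)
		affinity = cpu_online_mask;

	return affinity;
}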
[-- Attachment #1: x86-irq--Cleanup-pending-irq-move-in-fixup_irqs--.patch --]
[-- Type: text/plain, Size: 2500 bytes --]

If a CPU goes offline, the interrupts are migrated away, but a pending interrupt move which has not yet been made effective is kept pending even if the outgoing CPU is the sole target of the pending affinity mask. Worse, the pending affinity mask is discarded even if it would contain a valid subset of the online CPUs.

Use the newly introduced helper to:

 - Discard a pending move when the outgoing CPU is the only target in the
   pending mask.

 - Use the pending mask instead of the affinity mask to find a valid target
   for the CPU if the pending mask intersects with the online CPUs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/irq.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -440,9 +440,9 @@ void fixup_irqs(void)
 	int ret;
 
 	for_each_irq_desc(irq, desc) {
+		const struct cpumask *affinity;
 		int break_affinity = 0;
 		int set_affinity = 1;
-		const struct cpumask *affinity;
 
 		if (!desc)
 			continue;
@@ -454,19 +454,36 @@ void fixup_irqs(void)
 		data = irq_desc_get_irq_data(desc);
 		affinity = irq_data_get_affinity_mask(data);
+
 		if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
 		    cpumask_subset(affinity, cpu_online_mask)) {
+			irq_fixup_move_pending(desc, false);
 			raw_spin_unlock(&desc->lock);
 			continue;
 		}
 
 		/*
-		 * Complete the irq move. This cpu is going down and for
-		 * non intr-remapping case, we can't wait till this interrupt
-		 * arrives at this cpu before completing the irq move.
+		 * Complete an eventually pending irq move cleanup. If this
+		 * interrupt was moved in hard irq context, then the
+		 * vectors need to be cleaned up. It can't wait until this
+		 * interrupt actually happens and this CPU was involved.
 		 */
 		irq_force_complete_move(desc);
 
+		/*
+		 * If there is a setaffinity pending, then try to reuse the
+		 * pending mask, so the last change of the affinity does
+		 * not get lost. If there is no move pending or the pending
+		 * mask does not contain any online CPU, use the current
+		 * affinity mask.
+		 */
+		if (irq_fixup_move_pending(desc, true))
+			affinity = desc->pending_mask;
+
+		/*
+		 * If the mask does not contain an online CPU, break
+		 * affinity and use cpu_online_mask as fall back.
+		 */
 		if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
 			break_affinity = 1;
 			affinity = cpu_online_mask;
[-- Attachment #1: genirq--Remove-mask-argument-from-setup_affinity--.patch --] [-- Type: text/plain, Size: 5781 bytes --] No point to have this alloc/free dance of cpumasks. Provide a static mask for setup_affinity() and protect it proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/internals.h | 2 - kernel/irq/manage.c | 53 +++++++++++++++++++++---------------------------- kernel/irq/proc.c | 8 ++++--- 3 files changed, 29 insertions(+), 34 deletions(-) --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -109,7 +109,7 @@ static inline void unregister_handler_pr extern bool irq_can_set_affinity_usr(unsigned int irq); -extern int irq_select_affinity_usr(unsigned int irq, struct cpumask *mask); +extern int irq_select_affinity_usr(unsigned int irq); extern void irq_set_thread_affinity(struct irq_desc *desc); --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -345,15 +345,18 @@ EXPORT_SYMBOL_GPL(irq_set_affinity_notif /* * Generic version of the affinity autoselector. */ -static int setup_affinity(struct irq_desc *desc, struct cpumask *mask) +static int irq_setup_affinity(struct irq_desc *desc) { struct cpumask *set = irq_default_affinity; - int node = irq_desc_get_node(desc); + int ret, node = irq_desc_get_node(desc); + static DEFINE_RAW_SPINLOCK(mask_lock); + static struct cpumask mask; /* Excludes PER_CPU and NO_BALANCE interrupts */ if (!__irq_can_set_affinity(desc)) return 0; + raw_spin_lock(&mask_lock); /* * Preserve the managed affinity setting and a userspace affinity * setup, but make sure that one of the targets is online. @@ -367,43 +370,42 @@ static int setup_affinity(struct irq_des irqd_clear(&desc->irq_data, IRQD_AFFINITY_SET); } - cpumask_and(mask, cpu_online_mask, set); + cpumask_and(&mask, cpu_online_mask, set); if (node != NUMA_NO_NODE) { const struct cpumask *nodemask = cpumask_of_node(node); /* make sure at least one of the cpus in nodemask is online */ - if (cpumask_intersects(mask, nodemask)) - cpumask_and(mask, mask, nodemask); + if (cpumask_intersects(&mask, nodemask)) + cpumask_and(&mask, &mask, nodemask); } - irq_do_set_affinity(&desc->irq_data, mask, false); - return 0; + ret = irq_do_set_affinity(&desc->irq_data, &mask, false); + raw_spin_unlock(&mask_lock); + return ret; } #else /* Wrapper for ALPHA specific affinity selector magic */ -static inline int setup_affinity(struct irq_desc *d, struct cpumask *mask) +int irq_setup_affinity(struct irq_desc *desc) { - return irq_select_affinity(irq_desc_get_irq(d)); + return irq_select_affinity(irq_desc_get_irq(desc)); } #endif /* - * Called when affinity is set via /proc/irq + * Called when a bogus affinity is set via /proc/irq */ -int irq_select_affinity_usr(unsigned int irq, struct cpumask *mask) +int irq_select_affinity_usr(unsigned int irq) { struct irq_desc *desc = irq_to_desc(irq); unsigned long flags; int ret; raw_spin_lock_irqsave(&desc->lock, flags); - ret = setup_affinity(desc, mask); + ret = irq_setup_affinity(desc); raw_spin_unlock_irqrestore(&desc->lock, flags); return ret; } - #else -static inline int -setup_affinity(struct irq_desc *desc, struct cpumask *mask) +static inline int setup_affinity(struct irq_desc *desc) { return 0; } @@ -1128,7 +1130,6 @@ static int struct irqaction *old, **old_ptr; unsigned long flags, thread_mask = 0; int ret, nested, shared = 0; - cpumask_var_t mask; if (!desc) return -EINVAL; @@ -1187,11 +1188,6 @@ static int } } - if (!alloc_cpumask_var(&mask, GFP_KERNEL)) { - ret = -ENOMEM; - goto out_thread; - } - /* * Drivers are often written to work 
w/o knowledge about the * underlying irq chip implementation, so a request for a @@ -1256,7 +1252,7 @@ static int */ if (thread_mask == ~0UL) { ret = -EBUSY; - goto out_mask; + goto out_unlock; } /* * The thread_mask for the action is or'ed to @@ -1300,7 +1296,7 @@ static int pr_err("Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n", irq); ret = -EINVAL; - goto out_mask; + goto out_unlock; } if (!shared) { @@ -1308,7 +1304,7 @@ static int if (ret) { pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n", new->name, irq, desc->irq_data.chip->name); - goto out_mask; + goto out_unlock; } init_waitqueue_head(&desc->wait_for_threads); @@ -1319,7 +1315,7 @@ static int new->flags & IRQF_TRIGGER_MASK); if (ret) - goto out_mask; + goto out_unlock; } desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \ @@ -1355,7 +1351,7 @@ static int } /* Set default affinity mask once everything is setup */ - setup_affinity(desc, mask); + irq_setup_affinity(desc); } else if (new->flags & IRQF_TRIGGER_MASK) { unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK; @@ -1399,8 +1395,6 @@ static int irq_add_debugfs_entry(irq, desc); new->dir = NULL; register_handler_proc(irq, new); - free_cpumask_var(mask); - return 0; mismatch: @@ -1413,9 +1407,8 @@ static int } ret = -EBUSY; -out_mask: +out_unlock: raw_spin_unlock_irqrestore(&desc->lock, flags); - free_cpumask_var(mask); out_thread: if (new->thread) { --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -120,9 +120,11 @@ static ssize_t write_irq_affinity(int ty * one online CPU still has to be targeted. */ if (!cpumask_intersects(new_value, cpu_online_mask)) { - /* Special case for empty set - allow the architecture - code to set default SMP affinity. */ - err = irq_select_affinity_usr(irq, new_value) ? -EINVAL : count; + /* + * Special case for empty set - allow the architecture code + * to set default SMP affinity. + */ + err = irq_select_affinity_usr(irq) ? -EINVAL : count; } else { irq_set_affinity(irq, new_value); err = count;
[-- Attachment #1: genirq--Rename-setup_affinity---to-irq_setup_affinity--.patch --] [-- Type: text/plain, Size: 1447 bytes --] Rename it with a proper irq_ prefix and make it available for other files in the core code. Preparatory patch for moving the irq affinity setup around. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/internals.h | 6 ++++++ kernel/irq/manage.c | 7 +------ 2 files changed, 7 insertions(+), 6 deletions(-) --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -116,6 +116,12 @@ extern void irq_set_thread_affinity(stru extern int irq_do_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force); +#ifdef CONFIG_SMP +extern int irq_setup_affinity(struct irq_desc *desc); +#else +static inline int irq_setup_affinity(struct irq_desc *desc) { return 0; } +#endif + /* Inline functions for support of irq chips on slow busses */ static inline void chip_bus_lock(struct irq_desc *desc) { --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -345,7 +345,7 @@ EXPORT_SYMBOL_GPL(irq_set_affinity_notif /* * Generic version of the affinity autoselector. */ -static int irq_setup_affinity(struct irq_desc *desc) +int irq_setup_affinity(struct irq_desc *desc) { struct cpumask *set = irq_default_affinity; int ret, node = irq_desc_get_node(desc); @@ -404,11 +404,6 @@ int irq_select_affinity_usr(unsigned int raw_spin_unlock_irqrestore(&desc->lock, flags); return ret; } -#else -static inline int setup_affinity(struct irq_desc *desc) -{ - return 0; -} #endif /**
[-- Attachment #1: genirq--Move-initial-affinity-setup-to-irq_startup--.patch --]
[-- Type: text/plain, Size: 1851 bytes --]

The startup vs. setaffinity ordering of interrupts depends on the IRQF_NOAUTOEN flag. Chained interrupts do not get any affinity assignment at all.

A regular interrupt is started up and then the affinity is set. An IRQF_NOAUTOEN marked interrupt is not started up, but the affinity is set nevertheless.

Move the affinity setup to irq_startup() so the ordering is always the same and chained interrupts get the proper default affinity assigned as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/irq/chip.c   |    2 ++
 kernel/irq/manage.c |   15 ++++++---------
 2 files changed, 8 insertions(+), 9 deletions(-)

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -213,6 +213,8 @@ int irq_startup(struct irq_desc *desc, b
 			irq_enable(desc);
 		}
 		irq_state_set_started(desc);
+		/* Set default affinity mask once everything is setup */
+		irq_setup_affinity(desc);
 	}
 
 	if (resend)
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1325,6 +1325,12 @@ static int
 	if (new->flags & IRQF_ONESHOT)
 		desc->istate |= IRQS_ONESHOT;
 
+	/* Exclude IRQ from balancing if requested */
+	if (new->flags & IRQF_NOBALANCING) {
+		irq_settings_set_no_balancing(desc);
+		irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
+	}
+
 	if (irq_settings_can_autoenable(desc)) {
 		irq_startup(desc, true);
 	} else {
@@ -1339,15 +1345,6 @@ static int
 		desc->depth = 1;
 	}
 
-	/* Exclude IRQ from balancing if requested */
-	if (new->flags & IRQF_NOBALANCING) {
-		irq_settings_set_no_balancing(desc);
-		irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
-	}
-
-	/* Set default affinity mask once everything is setup */
-	irq_setup_affinity(desc);
-
 	} else if (new->flags & IRQF_TRIGGER_MASK) {
 		unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK;
 		unsigned int omsk = irqd_get_trigger_type(&desc->irq_data);
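For context, a minimal sketch of the request pattern this reordering affects (example_handler()/example_request() are made up): with IRQF_NOAUTOEN the interrupt is not started at request time, so after this change the default affinity is applied in irq_startup(), i.e. when the interrupt is actually started, rather than while it is still shut down:

#include <linux/interrupt.h>

static irqreturn_t example_handler(int irq, void *dev_id)
{
	return IRQ_HANDLED;
}

static int example_request(unsigned int irq, void *dev)
{
	int err;

	/* IRQF_NOAUTOEN: __setup_irq() does not start the interrupt here */
	err = request_irq(irq, example_handler, IRQF_NOAUTOEN, "example", dev);
	if (err)
		return err;

	/*
	 * The interrupt is started up later, when the driver enables it.
	 * With this patch the default affinity is applied at startup time,
	 * matching the ordering of a regular request_irq().
	 */
	enable_irq(irq);
	return 0;
}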
[-- Attachment #1: genirq_Move_pending_helpers_to_internal.h.patch --] [-- Type: text/plain, Size: 2822 bytes --] From: Christoph Hellwig <hch@lst.de> So that the affinity code can reuse them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/internals.h | 38 ++++++++++++++++++++++++++++++++++++++ kernel/irq/manage.c | 28 ---------------------------- 2 files changed, 38 insertions(+), 28 deletions(-) --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -238,6 +238,44 @@ static inline void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action) { } #endif +#ifdef CONFIG_GENERIC_PENDING_IRQ +static inline bool irq_can_move_pcntxt(struct irq_data *data) +{ + return irqd_can_move_in_process_context(data); +} +static inline bool irq_move_pending(struct irq_data *data) +{ + return irqd_is_setaffinity_pending(data); +} +static inline void +irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) +{ + cpumask_copy(desc->pending_mask, mask); +} +static inline void +irq_get_pending(struct cpumask *mask, struct irq_desc *desc) +{ + cpumask_copy(mask, desc->pending_mask); +} +#else /* CONFIG_GENERIC_PENDING_IRQ */ +static inline bool irq_can_move_pcntxt(struct irq_data *data) +{ + return true; +} +static inline bool irq_move_pending(struct irq_data *data) +{ + return false; +} +static inline void +irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) +{ +} +static inline void +irq_get_pending(struct cpumask *mask, struct irq_desc *desc) +{ +} +#endif /* CONFIG_GENERIC_PENDING_IRQ */ + #ifdef CONFIG_GENERIC_IRQ_DEBUGFS void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc); void irq_remove_debugfs_entry(struct irq_desc *desc); --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -168,34 +168,6 @@ void irq_set_thread_affinity(struct irq_ set_bit(IRQTF_AFFINITY, &action->thread_flags); } -#ifdef CONFIG_GENERIC_PENDING_IRQ -static inline bool irq_can_move_pcntxt(struct irq_data *data) -{ - return irqd_can_move_in_process_context(data); -} -static inline bool irq_move_pending(struct irq_data *data) -{ - return irqd_is_setaffinity_pending(data); -} -static inline void -irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) -{ - cpumask_copy(desc->pending_mask, mask); -} -static inline void -irq_get_pending(struct cpumask *mask, struct irq_desc *desc) -{ - cpumask_copy(mask, desc->pending_mask); -} -#else -static inline bool irq_can_move_pcntxt(struct irq_data *data) { return true; } -static inline bool irq_move_pending(struct irq_data *data) { return false; } -static inline void -irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) { } -static inline void -irq_get_pending(struct cpumask *mask, struct irq_desc *desc) { } -#endif - int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force) {
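For reference, the typical consumer of these helpers looks roughly like irq_set_affinity_locked() in kernel/irq/manage.c: apply the mask immediately when the chip can be reprogrammed from process context, otherwise record it in the pending mask so the migration code picks it up from the next hard interrupt. A minimal sketch under that assumption (example_apply_affinity() is an illustrative stand-in, not code from this series, and locking/notifier handling is omitted):

/* Sketch only: would live next to the real code and include "internals.h" */
static int example_apply_affinity(struct irq_desc *desc,
				  const struct cpumask *mask, bool force)
{
	struct irq_data *data = irq_desc_get_irq_data(desc);

	if (irq_can_move_pcntxt(data)) {
		/* Chip can be reprogrammed right here in process context */
		return irq_do_set_affinity(data, mask, force);
	}

	/* Otherwise remember the request; it is applied from hard irq context */
	irqd_set_move_pending(data);
	irq_copy_pending(desc, mask);
	return 0;
}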
[-- Attachment #1: genirq-cpuhotplug--Remove-irq-disabling-logic.patch --] [-- Type: text/plain, Size: 981 bytes --] This is called from stop_machine() with interrupts disabled. No point in disabling them some more. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -59,11 +59,8 @@ static bool migrate_one_irq(struct irq_d */ void irq_migrate_all_off_this_cpu(void) { - unsigned int irq; struct irq_desc *desc; - unsigned long flags; - - local_irq_save(flags); + unsigned int irq; for_each_active_irq(irq) { bool affinity_broken; @@ -73,10 +70,9 @@ void irq_migrate_all_off_this_cpu(void) affinity_broken = migrate_one_irq(desc); raw_spin_unlock(&desc->lock); - if (affinity_broken) - pr_warn_ratelimited("IRQ%u no longer affine to CPU%u\n", + if (affinity_broken) { + pr_warn_ratelimited("IRQ %u: no longer affine to CPU%u\n", irq, smp_processor_id()); + } } - - local_irq_restore(flags); }
[-- Attachment #1: genirq-cpuhotplug--Dont-claim-success-on-error.patch --] [-- Type: text/plain, Size: 888 bytes --] In case the affinity of an interrupt was broken, a printk is emitted. But if the affinity cannot be set at all due to a missing irq_set_affinity() callback or due to a failing callback, the message is still printed, preceded by a warning/error. That makes no sense whatsoever. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -37,11 +37,14 @@ static bool migrate_one_irq(struct irq_d c = irq_data_get_irq_chip(d); if (!c->irq_set_affinity) { pr_debug("IRQ%u: unable to set affinity\n", d->irq); + ret = false; } else { int r = irq_do_set_affinity(d, affinity, false); - if (r) + if (r) { pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", d->irq, r); + ret = false; + } } return ret;
[-- Attachment #1: genirq-cpuhotplug--Reorder-check-logic.patch --] [-- Type: text/plain, Size: 1858 bytes --] Move the checks for a valid irq chip and the irq_set_affinity() callback right in front of the whole migration logic. No point in doing a gazillion of other things when the interrupt cannot be migrated at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -17,9 +17,20 @@ static bool migrate_one_irq(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); + struct irq_chip *chip = irq_data_get_irq_chip(d); const struct cpumask *affinity = d->common->affinity; - struct irq_chip *c; - bool ret = false; + bool brokeaff = false; + int err; + + /* + * IRQ chip might be already torn down, but the irq descriptor is + * still in the radix tree. Also if the chip has no affinity setter, + * nothing can be done here. + */ + if (!chip || !chip->irq_set_affinity) { + pr_debug("IRQ %u: Unable to migrate away\n", d->irq); + return false; + } /* * If this is a per-CPU interrupt, or the affinity does not @@ -31,23 +42,16 @@ static bool migrate_one_irq(struct irq_d if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { affinity = cpu_online_mask; - ret = true; + brokeaff = true; } - c = irq_data_get_irq_chip(d); - if (!c->irq_set_affinity) { - pr_debug("IRQ%u: unable to set affinity\n", d->irq); - ret = false; - } else { - int r = irq_do_set_affinity(d, affinity, false); - if (r) { - pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", - d->irq, r); - ret = false; - } + err = irq_do_set_affinity(d, affinity, false); + if (err) { + pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", + d->irq, err); + return false; } - - return ret; + return brokeaff; } /**
[-- Attachment #1: genirq-cpuhotplug--Do-not-migrated-shutdown-irqs.patch --] [-- Type: text/plain, Size: 954 bytes --] Interrupts which are shut down are subject to migration attempts as well. That's pointless because the interrupt cannot fire and the next startup will move it to the proper place anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -33,10 +33,15 @@ static bool migrate_one_irq(struct irq_d } /* - * If this is a per-CPU interrupt, or the affinity does not - * include this CPU, then we have nothing to do. + * No move required, if: + * - Interrupt is per cpu + * - Interrupt is not started + * - Affinity mask does not include this CPU. + * + * Note: Do not check desc->action as this might be a chained + * interrupt. */ - if (irqd_is_per_cpu(d) || + if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !cpumask_test_cpu(smp_processor_id(), affinity)) return false;
[-- Attachment #1: genirq-cpuhotplug--Add-support-for-cleaning-up-move-in-progress.patch --] [-- Type: text/plain, Size: 3920 bytes --] In order to move x86 to the generic hotplug migration code, add support for cleaning up move in progress bits. On architectures which do not have this x86-specific (mis)feature, this is optimized out by the compiler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/irq.h | 1 - include/linux/irq.h | 2 ++ kernel/irq/cpuhotplug.c | 28 ++++++++++++++++++++++++++-- kernel/irq/internals.h | 10 +++++++++- 4 files changed, 37 insertions(+), 4 deletions(-) --- a/arch/x86/include/asm/irq.h +++ b/arch/x86/include/asm/irq.h @@ -29,7 +29,6 @@ struct irq_desc; #include <linux/cpumask.h> extern int check_irq_vectors_for_cpu_disable(void); extern void fixup_irqs(void); -extern void irq_force_complete_move(struct irq_desc *desc); #endif #ifdef CONFIG_HAVE_KVM --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -490,10 +490,12 @@ extern void irq_migrate_all_off_this_cpu #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ) void irq_move_irq(struct irq_data *data); void irq_move_masked_irq(struct irq_data *data); +void irq_force_complete_move(struct irq_desc *desc); bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear); #else static inline void irq_move_irq(struct irq_data *data) { } static inline void irq_move_masked_irq(struct irq_data *data) { } +static inline void irq_force_complete_move(struct irq_desc *desc) { } static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear) { return false; --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -18,7 +18,7 @@ static bool migrate_one_irq(struct irq_d { struct irq_data *d = irq_desc_get_irq_data(desc); struct irq_chip *chip = irq_data_get_irq_chip(d); - const struct cpumask *affinity = d->common->affinity; + const struct cpumask *affinity; bool brokeaff = false; int err; @@ -41,9 +41,33 @@ static bool migrate_one_irq(struct irq_d * Note: Do not check desc->action as this might be a chained * interrupt. */ + affinity = irq_data_get_affinity_mask(d); if (irqd_is_per_cpu(d) || !irqd_is_started(d) || - !cpumask_test_cpu(smp_processor_id(), affinity)) + !cpumask_test_cpu(smp_processor_id(), affinity)) { + /* + * If an irq move is pending, abort it if the dying CPU is + * the sole target. + */ + irq_fixup_move_pending(desc, false); return false; + } + + /* + * Complete an eventually pending irq move cleanup. If this + * interrupt was moved in hard irq context, then the vectors need + * to be cleaned up. It can't wait until this interrupt actually + * happens and this CPU was involved. + */ + irq_force_complete_move(desc); + + /* + * If there is a setaffinity pending, then try to reuse the pending + * mask, so the last change of the affinity does not get lost. If + * there is no move pending or the pending mask does not contain + * any online CPU, use the current affinity mask.
+ */ + if (irq_fixup_move_pending(desc, true)) + affinity = irq_desc_get_pending_mask(desc); if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { affinity = cpu_online_mask; --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -257,6 +257,10 @@ irq_get_pending(struct cpumask *mask, st { cpumask_copy(mask, desc->pending_mask); } +static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc) +{ + return desc->pending_mask; +} #else /* CONFIG_GENERIC_PENDING_IRQ */ static inline bool irq_can_move_pcntxt(struct irq_data *data) { @@ -274,7 +278,11 @@ static inline void irq_get_pending(struct cpumask *mask, struct irq_desc *desc) { } -#endif /* CONFIG_GENERIC_PENDING_IRQ */ +static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc) +{ + return NULL; +} +#endif /* !CONFIG_GENERIC_PENDING_IRQ */ #ifdef CONFIG_GENERIC_IRQ_DEBUGFS void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc);
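The irq_fixup_move_pending() helper itself lands in kernel/irq/migration.c (visible in the diffstat elsewhere in this series) and is not part of the hunks above. A rough sketch of the semantics the hotplug code relies on -- example_fixup_move_pending() is only an illustration of those semantics, not the actual implementation: return true only when a set-affinity request is pending and its mask still contains an online CPU, drop stale requests, and clear the pending state when force_clear is set:

/* Sketch only, assuming CONFIG_GENERIC_PENDING_IRQ and "internals.h" */
static bool example_fixup_move_pending(struct irq_desc *desc, bool force_clear)
{
	struct irq_data *data = irq_desc_get_irq_data(desc);

	if (!irqd_is_setaffinity_pending(data))
		return false;

	/* A pending move with no online target left is simply discarded */
	if (!cpumask_intersects(desc->pending_mask, cpu_online_mask)) {
		irqd_clr_move_pending(data);
		return false;
	}

	if (force_clear)
		irqd_clr_move_pending(data);
	return true;
}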
[-- Attachment #1: genirq-cpuhotplug--Add-support-for-conditional-masking.patch --] [-- Type: text/plain, Size: 1421 bytes --] Interrupts which cannot be migrated in process context need to be masked before the affinity is changed forcefully. Add support for that. This will be compiled out for architectures which do not have this x86-specific issue. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -18,6 +18,7 @@ static bool migrate_one_irq(struct irq_d { struct irq_data *d = irq_desc_get_irq_data(desc); struct irq_chip *chip = irq_data_get_irq_chip(d); + bool maskchip = !irq_can_move_pcntxt(d) && !irqd_irq_masked(d); const struct cpumask *affinity; bool brokeaff = false; int err; @@ -69,6 +70,10 @@ static bool migrate_one_irq(struct irq_d if (irq_fixup_move_pending(desc, true)) affinity = irq_desc_get_pending_mask(desc); + /* Mask the chip for interrupts which cannot move in process context */ + if (maskchip && chip->irq_mask) + chip->irq_mask(d); + if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { affinity = cpu_online_mask; brokeaff = true; @@ -78,8 +83,12 @@ static bool migrate_one_irq(struct irq_d if (err) { pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", d->irq, err); - return false; + brokeaff = false; } + + if (maskchip && chip->irq_unmask) + chip->irq_unmask(d); + return brokeaff; }
[-- Attachment #1: genirq-cpuhotplug--Set-force-affinity-flag-on-hotplug-migration.patch --] [-- Type: text/plain, Size: 560 bytes --] Set the force migration flag when migrating interrupts away from an outgoing CPU. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -79,7 +79,7 @@ static bool migrate_one_irq(struct irq_d brokeaff = true; } - err = irq_do_set_affinity(d, affinity, false); + err = irq_do_set_affinity(d, affinity, true); if (err) { pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", d->irq, err);
[-- Attachment #1: x86-irq--Restructure-fixup_irqs--.patch --] [-- Type: text/plain, Size: 2731 bytes --] Reorder fixup_irqs() so it matches the flow in the generic migration code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/irq.c | 46 ++++++++++++++++++++-------------------------- 1 file changed, 20 insertions(+), 26 deletions(-) --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -433,7 +433,6 @@ int check_irq_vectors_for_cpu_disable(vo void fixup_irqs(void) { unsigned int irq, vector; - static int warned; struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; @@ -441,18 +440,27 @@ void fixup_irqs(void) for_each_irq_desc(irq, desc) { const struct cpumask *affinity; - int break_affinity = 0; - int set_affinity = 1; + bool broke_affinity = false; if (!desc) continue; - if (irq == 2) - continue; /* interrupt's are disabled at this point */ raw_spin_lock(&desc->lock); data = irq_desc_get_irq_data(desc); + chip = irq_data_get_irq_chip(data); + /* + * The interrupt descriptor might have been cleaned up + * already, but it is not yet removed from the radix + * tree. If the chip does not have an affinity setter, + * nothing to do here. + */ + if (!chip || !chip->irq_set_affinity) { + raw_spin_unlock(&desc->lock); + continue; + } + affinity = irq_data_get_affinity_mask(data); if (!irq_has_action(irq) || irqd_is_per_cpu(data) || @@ -485,30 +493,18 @@ void fixup_irqs(void) * affinity and use cpu_online_mask as fall back. */ if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { - break_affinity = 1; + broke_affinity = true; affinity = cpu_online_mask; } - chip = irq_data_get_irq_chip(data); - /* - * The interrupt descriptor might have been cleaned up - * already, but it is not yet removed from the radix tree - */ - if (!chip) { - raw_spin_unlock(&desc->lock); - continue; - } - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) chip->irq_mask(data); - if (chip->irq_set_affinity) { - ret = chip->irq_set_affinity(data, affinity, true); - if (ret == -ENOSPC) - pr_crit("IRQ %d set affinity failed because there are no available vectors. The device assigned to this IRQ is unstable.\n", irq); - } else { - if (!(warned++)) - set_affinity = 0; + ret = chip->irq_set_affinity(data, affinity, true); + if (ret) { + pr_crit("IRQ %u: Force affinity failed (%d)\n", + irq, ret); + broke_affinity = false; } /* @@ -522,10 +518,8 @@ void fixup_irqs(void) raw_spin_unlock(&desc->lock); - if (break_affinity && set_affinity) + if (broke_affinity) pr_notice("Broke affinity for irq %i\n", irq); - else if (!set_affinity) - pr_notice("Cannot set affinity for irq %i\n", irq); }
[-- Attachment #1: x86-irq--Use-irq_migrate_all_off_this_cpu--.patch --] [-- Type: text/plain, Size: 3777 bytes --] The generic migration code supports all the required features already. Remove the x86 specific implementation and use the generic one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/Kconfig | 1 arch/x86/kernel/irq.c | 89 +------------------------------------------------- 2 files changed, 3 insertions(+), 87 deletions(-) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -87,6 +87,7 @@ config X86 select GENERIC_EARLY_IOREMAP select GENERIC_FIND_FIRST_BIT select GENERIC_IOMAP + select GENERIC_IRQ_MIGRATION if SMP select GENERIC_IRQ_PROBE select GENERIC_IRQ_SHOW select GENERIC_PENDING_IRQ if SMP --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -432,95 +432,12 @@ int check_irq_vectors_for_cpu_disable(vo /* A cpu has been removed from cpu_online_mask. Reset irq affinities. */ void fixup_irqs(void) { - unsigned int irq, vector; + unsigned int irr, vector; struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; - int ret; - for_each_irq_desc(irq, desc) { - const struct cpumask *affinity; - bool broke_affinity = false; - - if (!desc) - continue; - - /* interrupt's are disabled at this point */ - raw_spin_lock(&desc->lock); - - data = irq_desc_get_irq_data(desc); - chip = irq_data_get_irq_chip(data); - /* - * The interrupt descriptor might have been cleaned up - * already, but it is not yet removed from the radix - * tree. If the chip does not have an affinity setter, - * nothing to do here. - */ - if (!chip || !chip->irq_set_affinity) { - raw_spin_unlock(&desc->lock); - continue; - } - - affinity = irq_data_get_affinity_mask(data); - - if (!irq_has_action(irq) || irqd_is_per_cpu(data) || - cpumask_subset(affinity, cpu_online_mask)) { - irq_fixup_move_pending(desc, false); - raw_spin_unlock(&desc->lock); - continue; - } - - /* - * Complete an eventually pending irq move cleanup. If this - * interrupt was moved in hard irq context, then the - * vectors need to be cleaned up. It can't wait until this - * interrupt actually happens and this CPU was involved. - */ - irq_force_complete_move(desc); - - /* - * If there is a setaffinity pending, then try to reuse the - * pending mask, so the last change of the affinity does - * not get lost. If there is no move pending or the pending - * mask does not contain any online CPU, use the current - * affinity mask. - */ - if (irq_fixup_move_pending(desc, true)) - affinity = desc->pending_mask; - - /* - * If the mask does not contain an offline CPU, break - * affinity and use cpu_online_mask as fall back. - */ - if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { - broke_affinity = true; - affinity = cpu_online_mask; - } - - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) - chip->irq_mask(data); - - ret = chip->irq_set_affinity(data, affinity, true); - if (ret) { - pr_crit("IRQ %u: Force affinity failed (%d)\n", - irq, ret); - broke_affinity = false; - } - - /* - * We unmask if the irq was not marked masked by the - * core code. That respects the lazy irq disable - * behaviour. - */ - if (!irqd_can_move_in_process_context(data) && - !irqd_irq_masked(data) && chip->irq_unmask) - chip->irq_unmask(data); - - raw_spin_unlock(&desc->lock); - - if (broke_affinity) - pr_notice("Broke affinity for irq %i\n", irq); - } + irq_migrate_all_off_this_cpu(); /* * We can remove mdelay() and then send spuriuous interrupts to @@ -539,8 +456,6 @@ void fixup_irqs(void) * nothing else will touch it.
*/ for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { - unsigned int irr; - if (IS_ERR_OR_NULL(__this_cpu_read(vector_irq[vector]))) continue;
[-- Attachment #1: genirq--Move-irq_fixup_move_pending---to-core.patch --] [-- Type: text/plain, Size: 1569 bytes --] Now that x86 uses the generic code, the function declaration and inline stub can move to the core internal header. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irq.h | 5 ----- kernel/irq/internals.h | 5 +++++ 2 files changed, 5 insertions(+), 5 deletions(-) --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -491,15 +491,10 @@ extern void irq_migrate_all_off_this_cpu void irq_move_irq(struct irq_data *data); void irq_move_masked_irq(struct irq_data *data); void irq_force_complete_move(struct irq_desc *desc); -bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear); #else static inline void irq_move_irq(struct irq_data *data) { } static inline void irq_move_masked_irq(struct irq_data *data) { } static inline void irq_force_complete_move(struct irq_desc *desc) { } -static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear) -{ - return false; -} #endif extern int no_irq_affinity; --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -261,6 +261,7 @@ static inline struct cpumask *irq_desc_g { return desc->pending_mask; } +bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear); #else /* CONFIG_GENERIC_PENDING_IRQ */ static inline bool irq_can_move_pcntxt(struct irq_data *data) { @@ -282,6 +283,10 @@ static inline struct cpumask *irq_desc_g { return NULL; } +static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear) +{ + return false; +} #endif /* !CONFIG_GENERIC_PENDING_IRQ */ #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
[-- Attachment #1: genirq--Remove-pointless-arg-from-show_irq_affinity.patch --] [-- Type: text/plain, Size: 971 bytes --] The third argument of the internal helper function is unused. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/proc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -37,7 +37,7 @@ static struct proc_dir_entry *root_irq_d #ifdef CONFIG_SMP -static int show_irq_affinity(int type, struct seq_file *m, void *v) +static int show_irq_affinity(int type, struct seq_file *m) { struct irq_desc *desc = irq_to_desc((long)m->private); const struct cpumask *mask = desc->irq_common_data.affinity; @@ -80,12 +80,12 @@ static int irq_affinity_hint_proc_show(s int no_irq_affinity; static int irq_affinity_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(0, m, v); + return show_irq_affinity(0, m); } static int irq_affinity_list_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(1, m, v); + return show_irq_affinity(1, m); }
[-- Attachment #1: genirq--Remove-pointless-gfp-argument.patch --] [-- Type: text/plain, Size: 2128 bytes --] All callers hand in GFP_KERNEL. No point in having an extra argument for that. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/irqdesc.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -54,14 +54,14 @@ static void __init init_irq_default_affi #endif #ifdef CONFIG_SMP -static int alloc_masks(struct irq_desc *desc, gfp_t gfp, int node) +static int alloc_masks(struct irq_desc *desc, int node) { if (!zalloc_cpumask_var_node(&desc->irq_common_data.affinity, - gfp, node)) + GFP_KERNEL, node)) return -ENOMEM; #ifdef CONFIG_GENERIC_PENDING_IRQ - if (!zalloc_cpumask_var_node(&desc->pending_mask, gfp, node)) { + if (!zalloc_cpumask_var_node(&desc->pending_mask, GFP_KERNEL, node)) { free_cpumask_var(desc->irq_common_data.affinity); return -ENOMEM; } @@ -86,7 +86,7 @@ static void desc_smp_init(struct irq_des #else static inline int -alloc_masks(struct irq_desc *desc, gfp_t gfp, int node) { return 0; } +alloc_masks(struct irq_desc *desc, int node) { return 0; } static inline void desc_smp_init(struct irq_desc *desc, int node, const struct cpumask *affinity) { } #endif @@ -344,9 +344,8 @@ static struct irq_desc *alloc_desc(int i struct module *owner) { struct irq_desc *desc; - gfp_t gfp = GFP_KERNEL; - desc = kzalloc_node(sizeof(*desc), gfp, node); + desc = kzalloc_node(sizeof(*desc), GFP_KERNEL, node); if (!desc) return NULL; /* allocate based on nr_cpu_ids */ @@ -354,7 +353,7 @@ static struct irq_desc *alloc_desc(int i if (!desc->kstat_irqs) goto err_desc; - if (alloc_masks(desc, gfp, node)) + if (alloc_masks(desc, node)) goto err_kstat; raw_spin_lock_init(&desc->lock); @@ -525,7 +524,7 @@ int __init early_irq_init(void) for (i = 0; i < count; i++) { desc[i].kstat_irqs = alloc_percpu(unsigned int); - alloc_masks(&desc[i], GFP_KERNEL, node); + alloc_masks(&desc[i], node); raw_spin_lock_init(&desc[i].lock); lockdep_set_class(&desc[i].lock, &irq_desc_lock_class); desc_set_defaults(i, &desc[i], node, NULL, NULL);
[-- Attachment #1: genirq-proc--Replace-ever-repeating-type-cast.patch --] [-- Type: text/plain, Size: 1543 bytes --] The proc file setup repeats the same ugly type cast for the irq number over and over. Do it once and hand in the local void pointer. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/proc.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -326,6 +326,7 @@ void register_handler_proc(unsigned int void register_irq_proc(unsigned int irq, struct irq_desc *desc) { static DEFINE_MUTEX(register_lock); + void __maybe_unused *irqp = (void *)(unsigned long) irq; char name [MAX_NAMELEN]; if (!root_irq_dir || (desc->irq_data.chip == &no_irq_chip)) @@ -351,20 +352,19 @@ void register_irq_proc(unsigned int irq, #ifdef CONFIG_SMP /* create /proc/irq/<irq>/smp_affinity */ proc_create_data("smp_affinity", 0644, desc->dir, - &irq_affinity_proc_fops, (void *)(long)irq); + &irq_affinity_proc_fops, irqp); /* create /proc/irq/<irq>/affinity_hint */ proc_create_data("affinity_hint", 0444, desc->dir, - &irq_affinity_hint_proc_fops, (void *)(long)irq); + &irq_affinity_hint_proc_fops, irqp); /* create /proc/irq/<irq>/smp_affinity_list */ proc_create_data("smp_affinity_list", 0644, desc->dir, - &irq_affinity_list_proc_fops, (void *)(long)irq); + &irq_affinity_list_proc_fops, irqp); proc_create_data("node", 0444, desc->dir, - &irq_node_proc_fops, (void *)(long)irq); + &irq_node_proc_fops, irqp); #endif - proc_create_data("spurious", 0444, desc->dir, &irq_spurious_proc_fops, (void *)(long)irq);
[-- Attachment #1: genirq--Introduce-effective-affinity-mask.patch --] [-- Type: text/plain, Size: 8454 bytes --] There is currently no way to evaluate the effective affinity mask of a given interrupt. Many irq chips allow only a single target CPU or a subset of CPUs in the affinity mask. Updating the mask at the time of setting the affinity to the subset would be counterproductive because information for cpu hotplug about assigned interrupt affinities gets lost. On CPU hotplug it's also pointless to force migrate an interrupt, which is not targeted at the CPU effectively. But currently the information is not available. Provide a seperate mask to be updated by the irq_chip->irq_set_affinity() implementations. Implement the read only proc files so the user can see the effective mask as well w/o trying to deduce it from /proc/interrupts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irq.h | 29 ++++++++++++++++ kernel/irq/Kconfig | 4 ++ kernel/irq/debugfs.c | 4 ++ kernel/irq/irqdesc.c | 14 +++++++ kernel/irq/proc.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++---- 5 files changed, 134 insertions(+), 7 deletions(-) --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -136,6 +136,9 @@ struct irq_domain; * @affinity: IRQ affinity on SMP. If this is an IPI * related irq, then this is the mask of the * CPUs to which an IPI can be sent. + * @effective_affinity: The effective IRQ affinity on SMP as some irq + * chips do not allow multi CPU destinations. + * A subset of @affinity. * @msi_desc: MSI descriptor * @ipi_offset: Offset of first IPI target cpu in @affinity. Optional. */ @@ -147,6 +150,9 @@ struct irq_common_data { void *handler_data; struct msi_desc *msi_desc; cpumask_var_t affinity; +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + cpumask_var_t effective_affinity; +#endif #ifdef CONFIG_GENERIC_IRQ_IPI unsigned int ipi_offset; #endif @@ -736,6 +742,29 @@ static inline struct cpumask *irq_data_g return d->common->affinity; } +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK +static inline +struct cpumask *irq_data_get_effective_affinity_mask(struct irq_data *d) +{ + return d->common->effective_affinity; +} +static inline void irq_data_update_effective_affinity(struct irq_data *d, + const struct cpumask *m) +{ + cpumask_copy(d->common->effective_affinity, m); +} +#else +static inline void irq_data_update_effective_affinity(struct irq_data *d, + const struct cpumask *m) +{ +} +static inline +struct cpumask *irq_data_get_effective_affinity_mask(struct irq_data *d) +{ + return d->common->affinity; +} +#endif + unsigned int arch_dynirq_lower_bound(unsigned int from); int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node, --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -21,6 +21,10 @@ config GENERIC_IRQ_SHOW config GENERIC_IRQ_SHOW_LEVEL bool +# Supports effective affinity mask +config GENERIC_IRQ_EFFECTIVE_AFF_MASK + bool + # Facility to allocate a hardware interrupt. This is legacy support # and should not be used in new code. Use irq domains instead. 
config GENERIC_IRQ_LEGACY_ALLOC_HWIRQ --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -31,6 +31,10 @@ static void irq_debug_show_masks(struct msk = irq_data_get_affinity_mask(data); seq_printf(m, "affinity: %*pbl\n", cpumask_pr_args(msk)); +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + msk = irq_data_get_effective_affinity_mask(data); + seq_printf(m, "effectiv: %*pbl\n", cpumask_pr_args(msk)); +#endif #ifdef CONFIG_GENERIC_PENDING_IRQ msk = desc->pending_mask; seq_printf(m, "pending: %*pbl\n", cpumask_pr_args(msk)); --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -60,8 +60,19 @@ static int alloc_masks(struct irq_desc * GFP_KERNEL, node)) return -ENOMEM; +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + if (!zalloc_cpumask_var_node(&desc->irq_common_data.effective_affinity, + GFP_KERNEL, node)) { + free_cpumask_var(desc->irq_common_data.affinity); + return -ENOMEM; + } +#endif + #ifdef CONFIG_GENERIC_PENDING_IRQ if (!zalloc_cpumask_var_node(&desc->pending_mask, GFP_KERNEL, node)) { +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + free_cpumask_var(desc->irq_common_data.effective_affinity); +#endif free_cpumask_var(desc->irq_common_data.affinity); return -ENOMEM; } @@ -324,6 +335,9 @@ static void free_masks(struct irq_desc * free_cpumask_var(desc->pending_mask); #endif free_cpumask_var(desc->irq_common_data.affinity); +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + free_cpumask_var(desc->irq_common_data.effective_affinity); +#endif } #else static inline void free_masks(struct irq_desc *desc) { } --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -37,19 +37,47 @@ static struct proc_dir_entry *root_irq_d #ifdef CONFIG_SMP +enum { + AFFINITY, + AFFINITY_LIST, + EFFECTIVE, + EFFECTIVE_LIST, +}; + static int show_irq_affinity(int type, struct seq_file *m) { struct irq_desc *desc = irq_to_desc((long)m->private); - const struct cpumask *mask = desc->irq_common_data.affinity; + const struct cpumask *mask; + switch (type) { + case AFFINITY: + case AFFINITY_LIST: + mask = desc->irq_common_data.affinity; #ifdef CONFIG_GENERIC_PENDING_IRQ - if (irqd_is_setaffinity_pending(&desc->irq_data)) - mask = desc->pending_mask; + if (irqd_is_setaffinity_pending(&desc->irq_data)) + mask = desc->pending_mask; #endif - if (type) + break; + case EFFECTIVE: + case EFFECTIVE_LIST: +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + mask = desc->irq_common_data.effective_affinity; + break; +#else + return -EINVAL; +#endif + }; + + switch (type) { + case AFFINITY_LIST: + case EFFECTIVE_LIST: seq_printf(m, "%*pbl\n", cpumask_pr_args(mask)); - else + break; + case AFFINITY: + case EFFECTIVE: seq_printf(m, "%*pb\n", cpumask_pr_args(mask)); + break; + } return 0; } @@ -80,12 +108,12 @@ static int irq_affinity_hint_proc_show(s int no_irq_affinity; static int irq_affinity_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(0, m); + return show_irq_affinity(AFFINITY, m); } static int irq_affinity_list_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(1, m); + return show_irq_affinity(AFFINITY_LIST, m); } @@ -185,6 +213,44 @@ static const struct file_operations irq_ .write = irq_affinity_list_proc_write, }; +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK +static int irq_effective_aff_proc_show(struct seq_file *m, void *v) +{ + return show_irq_affinity(EFFECTIVE, m); +} + +static int irq_effective_aff_list_proc_show(struct seq_file *m, void *v) +{ + return show_irq_affinity(EFFECTIVE_LIST, m); +} + +static int irq_effective_aff_proc_open(struct inode *inode, struct file *file) +{ + 
return single_open(file, irq_effective_aff_proc_show, PDE_DATA(inode)); +} + +static int irq_effective_aff_list_proc_open(struct inode *inode, + struct file *file) +{ + return single_open(file, irq_effective_aff_list_proc_show, + PDE_DATA(inode)); +} + +static const struct file_operations irq_effective_aff_proc_fops = { + .open = irq_effective_aff_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static const struct file_operations irq_effective_aff_list_proc_fops = { + .open = irq_effective_aff_list_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; +#endif + static int default_affinity_show(struct seq_file *m, void *v) { seq_printf(m, "%*pb\n", cpumask_pr_args(irq_default_affinity)); @@ -364,6 +430,12 @@ void register_irq_proc(unsigned int irq, proc_create_data("node", 0444, desc->dir, &irq_node_proc_fops, irqp); +# ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + proc_create_data("effective_affinity", 0444, desc->dir, + &irq_effective_aff_proc_fops, irqp); + proc_create_data("effective_affinity_list", 0444, desc->dir, + &irq_effective_aff_list_proc_fops, irqp); +# endif #endif proc_create_data("spurious", 0444, desc->dir, &irq_spurious_proc_fops, (void *)(long)irq); @@ -383,6 +455,10 @@ void unregister_irq_proc(unsigned int ir remove_proc_entry("affinity_hint", desc->dir); remove_proc_entry("smp_affinity_list", desc->dir); remove_proc_entry("node", desc->dir); +# ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + remove_proc_entry("effective_affinity", desc->dir); + remove_proc_entry("effective_affinity_list", desc->dir); +# endif #endif remove_proc_entry("spurious", desc->dir);
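To illustrate how an irq chip is expected to feed this new mask, here is a hedged sketch of a single-target irq_set_affinity() callback using the accessor introduced above. example_irq_set_affinity() and example_hw_route_irq() are made-up names for illustration only; the pattern matches what the xen/events patch later in this series does:

/* Sketch only: a chip which can route an interrupt to exactly one CPU */
static int example_irq_set_affinity(struct irq_data *d,
				    const struct cpumask *mask, bool force)
{
	unsigned int cpu = cpumask_first_and(mask, cpu_online_mask);

	if (cpu >= nr_cpu_ids)
		return -EINVAL;

	example_hw_route_irq(irqd_to_hwirq(d), cpu);	/* made-up hardware access */

	/* Record which CPU the hardware is really aiming at now */
	irq_data_update_effective_affinity(d, cpumask_of(cpu));
	return IRQ_SET_MASK_OK;
}

With that in place, /proc/irq/<N>/effective_affinity (added above) shows the single CPU actually in use, while smp_affinity keeps showing the full requested mask.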
[-- Attachment #1: genirq-cpuhotplug--Use-effective-affinity-mask.patch --] [-- Type: text/plain, Size: 1743 bytes --] If the architecture supports the effective affinity mask, migrating interrupts away which are not targeted by the effective mask is pointless. They can stay in the user or system supplied affinity mask, but won't be targeted at any given point as the affinity setter functions need to validate against the online cpu mask anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -14,6 +14,14 @@ #include "internals.h" +/* For !GENERIC_IRQ_EFFECTIVE_AFF_MASK this looks at general affinity mask */ +static inline bool irq_needs_fixup(struct irq_data *d) +{ + const struct cpumask *m = irq_data_get_effective_affinity_mask(d); + + return cpumask_test_cpu(smp_processor_id(), m); +} + static bool migrate_one_irq(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); @@ -42,9 +50,7 @@ static bool migrate_one_irq(struct irq_d * Note: Do not check desc->action as this might be a chained * interrupt. */ - affinity = irq_data_get_affinity_mask(d); - if (irqd_is_per_cpu(d) || !irqd_is_started(d) || - !cpumask_test_cpu(smp_processor_id(), affinity)) { + if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !irq_needs_fixup(d)) { /* * If an irq move is pending, abort it if the dying CPU is * the sole target. @@ -69,6 +75,8 @@ static bool migrate_one_irq(struct irq_d */ if (irq_fixup_move_pending(desc, true)) affinity = irq_desc_get_pending_mask(desc); + else + affinity = irq_data_get_affinity_mask(d); /* Mask the chip for interrupts which cannot move in process context */ if (maskchip && chip->irq_mask)
[-- Attachment #1: x86-apic--Move-flat_cpu_mask_to_apicid_and---into-C-source.patch --] [-- Type: text/plain, Size: 2103 bytes --] No point in having inlines assigned to function pointers at multiple places. Just bloats the text. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/apic.h | 28 ++++++---------------------- arch/x86/kernel/apic/apic.c | 16 ++++++++++++++++ 2 files changed, 22 insertions(+), 22 deletions(-) --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -540,28 +540,12 @@ static inline int default_phys_pkg_id(in #endif -static inline int -flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) -{ - unsigned long cpu_mask = cpumask_bits(cpumask)[0] & - cpumask_bits(andmask)[0] & - cpumask_bits(cpu_online_mask)[0] & - APIC_ALL_CPUS; - - if (likely(cpu_mask)) { - *apicid = (unsigned int)cpu_mask; - return 0; - } else { - return -EINVAL; - } -} - -extern int -default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); +extern int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, + const struct cpumask *andmask, + unsigned int *apicid); +extern int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, + const struct cpumask *andmask, + unsigned int *apicid); static inline void flat_vector_allocation_domain(int cpu, struct cpumask *retmask, --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2220,6 +2220,22 @@ int default_cpu_mask_to_apicid_and(const return -EINVAL; } +int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, + const struct cpumask *andmask, + unsigned int *apicid) +{ + unsigned long cpu_mask = cpumask_bits(cpumask)[0] & + cpumask_bits(andmask)[0] & + cpumask_bits(cpu_online_mask)[0] & + APIC_ALL_CPUS; + + if (likely(cpu_mask)) { + *apicid = (unsigned int)cpu_mask; + return 0; + } + return -EINVAL; +} + /* * Override the generic EOI implementation with an optimized version. * Only called during early boot when only one CPU is active and with
[-- Attachment #1: x86-uv--Use-default_cpu_mask_to_apicid_and--.patch --] [-- Type: text/plain, Size: 1014 bytes --] Same functionality, except for the extra bits ORed into the apicid. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/x2apic_uv_x.c | 19 ++++--------------- 1 file changed, 4 insertions(+), 15 deletions(-) --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -530,23 +530,12 @@ uv_cpu_mask_to_apicid_and(const struct c const struct cpumask *andmask, unsigned int *apicid) { - int unsigned cpu; + int ret = default_cpu_mask_to_apicid_and(cpumask, andmask, apicid); - /* - * We're using fixed IRQ delivery, can only return one phys APIC ID. - * May as well be the first. - */ - for_each_cpu_and(cpu, cpumask, andmask) { - if (cpumask_test_cpu(cpu, cpu_online_mask)) - break; - } + if (!ret) + *apicid |= uv_apicid_hibits; - if (likely(cpu < nr_cpu_ids)) { - *apicid = per_cpu(x86_cpu_to_apicid, cpu) | uv_apicid_hibits; - return 0; - } - - return -EINVAL; + return ret; } static unsigned int x2apic_get_apic_id(unsigned long x)
[-- Attachment #1: x86-apic--Move-online-masking-to-core-code.patch --] [-- Type: text/plain, Size: 3215 bytes --] All implementations of apic->cpu_mask_to_apicid_and() mask out the offline cpus. The callsite already has a mask available, which has the offline CPUs removed. Use that and remove the extra bits. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/apic.c | 27 +++++++++------------------ arch/x86/kernel/apic/vector.c | 5 ++++- arch/x86/kernel/apic/x2apic_cluster.c | 25 +++++++++---------------- 3 files changed, 22 insertions(+), 35 deletions(-) --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2205,19 +2205,12 @@ int default_cpu_mask_to_apicid_and(const const struct cpumask *andmask, unsigned int *apicid) { - unsigned int cpu; + unsigned int cpu = cpumask_first_and(cpumask, andmask); - for_each_cpu_and(cpu, cpumask, andmask) { - if (cpumask_test_cpu(cpu, cpu_online_mask)) - break; - } - - if (likely(cpu < nr_cpu_ids)) { - *apicid = per_cpu(x86_cpu_to_apicid, cpu); - return 0; - } - - return -EINVAL; + if (cpu >= nr_cpu_ids) + return -EINVAL; + *apicid = per_cpu(x86_cpu_to_apicid, cpu); + return 0; } int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, @@ -2226,14 +2219,12 @@ int flat_cpu_mask_to_apicid_and(const st { unsigned long cpu_mask = cpumask_bits(cpumask)[0] & cpumask_bits(andmask)[0] & - cpumask_bits(cpu_online_mask)[0] & APIC_ALL_CPUS; - if (likely(cpu_mask)) { - *apicid = (unsigned int)cpu_mask; - return 0; - } - return -EINVAL; + if (!cpu_mask) + return -EINVAL; + *apicid = (unsigned int)cpu_mask; + return 0; } /* --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -221,8 +221,11 @@ static int __assign_irq_vector(int irq, * Cache destination APIC IDs into cfg->dest_apicid. This cannot fail * as we already established, that mask & d->domain & cpu_online_mask * is not empty. + * + * vector_searchmask is a subset of d->domain and has the offline + * cpus masked out. */ - BUG_ON(apic->cpu_mask_to_apicid_and(mask, d->domain, + BUG_ON(apic->cpu_mask_to_apicid_and(mask, vector_searchmask, &d->cfg.dest_apicid)); return 0; } --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -108,31 +108,24 @@ x2apic_cpu_mask_to_apicid_and(const stru const struct cpumask *andmask, unsigned int *apicid) { + unsigned int cpu; u32 dest = 0; u16 cluster; - int i; - for_each_cpu_and(i, cpumask, andmask) { - if (!cpumask_test_cpu(i, cpu_online_mask)) - continue; - dest = per_cpu(x86_cpu_to_logical_apicid, i); - cluster = x2apic_cluster(i); - break; - } - - if (!dest) + cpu = cpumask_first_and(cpumask, andmask); + if (cpu >= nr_cpu_ids) return -EINVAL; - for_each_cpu_and(i, cpumask, andmask) { - if (!cpumask_test_cpu(i, cpu_online_mask)) - continue; - if (cluster != x2apic_cluster(i)) + dest = per_cpu(x86_cpu_to_logical_apicid, cpu); + cluster = x2apic_cluster(cpu); + + for_each_cpu_and(cpu, cpumask, andmask) { + if (cluster != x2apic_cluster(cpu)) continue; - dest |= per_cpu(x86_cpu_to_logical_apicid, i); + dest |= per_cpu(x86_cpu_to_logical_apicid, cpu); } *apicid = dest; - return 0; }
[-- Attachment #1: x86-apic--Move-cpumask-and-to-core-code.patch --] [-- Type: text/plain, Size: 9500 bytes --] All implementations of apic->cpu_mask_to_apicid_and() and the two incoming cpumasks to search for the target. Move that operation to the call site and rename it to cpu_mask_to_apicid() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/apic.h | 15 ++++++--------- arch/x86/kernel/apic/apic.c | 14 ++++---------- arch/x86/kernel/apic/apic_flat_64.c | 4 ++-- arch/x86/kernel/apic/apic_noop.c | 2 +- arch/x86/kernel/apic/apic_numachip.c | 4 ++-- arch/x86/kernel/apic/bigsmp_32.c | 2 +- arch/x86/kernel/apic/probe_32.c | 2 +- arch/x86/kernel/apic/vector.c | 6 +++--- arch/x86/kernel/apic/x2apic_cluster.c | 10 ++++------ arch/x86/kernel/apic/x2apic_phys.c | 2 +- arch/x86/kernel/apic/x2apic_uv_x.c | 8 +++----- arch/x86/xen/apic.c | 2 +- 12 files changed, 29 insertions(+), 42 deletions(-) --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -296,9 +296,8 @@ struct apic { /* Can't be NULL on 64-bit */ unsigned long (*set_apic_id)(unsigned int id); - int (*cpu_mask_to_apicid_and)(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); + int (*cpu_mask_to_apicid)(const struct cpumask *cpumask, + unsigned int *apicid); /* ipi */ void (*send_IPI)(int cpu, int vector); @@ -540,12 +539,10 @@ static inline int default_phys_pkg_id(in #endif -extern int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); -extern int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); +extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask, + unsigned int *apicid); +extern int default_cpu_mask_to_apicid(const struct cpumask *cpumask, + unsigned int *apicid); static inline void flat_vector_allocation_domain(int cpu, struct cpumask *retmask, --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2201,11 +2201,9 @@ void default_init_apic_ldr(void) apic_write(APIC_LDR, val); } -int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +int default_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { - unsigned int cpu = cpumask_first_and(cpumask, andmask); + unsigned int cpu = cpumask_first(mask); if (cpu >= nr_cpu_ids) return -EINVAL; @@ -2213,13 +2211,9 @@ int default_cpu_mask_to_apicid_and(const return 0; } -int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +int flat_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { - unsigned long cpu_mask = cpumask_bits(cpumask)[0] & - cpumask_bits(andmask)[0] & - APIC_ALL_CPUS; + unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS; if (!cpu_mask) return -EINVAL; --- a/arch/x86/kernel/apic/apic_flat_64.c +++ b/arch/x86/kernel/apic/apic_flat_64.c @@ -172,7 +172,7 @@ static struct apic apic_flat __ro_after_ .get_apic_id = flat_get_apic_id, .set_apic_id = set_apic_id, - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single, .send_IPI_mask = flat_send_IPI_mask, @@ -268,7 +268,7 @@ static struct apic apic_physflat __ro_af .get_apic_id = flat_get_apic_id, .set_apic_id = set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, 
.send_IPI = default_send_IPI_single_phys, .send_IPI_mask = default_send_IPI_mask_sequence_phys, --- a/arch/x86/kernel/apic/apic_noop.c +++ b/arch/x86/kernel/apic/apic_noop.c @@ -141,7 +141,7 @@ struct apic apic_noop __ro_after_init = .get_apic_id = noop_get_apic_id, .set_apic_id = NULL, - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, .send_IPI = noop_send_IPI, .send_IPI_mask = noop_send_IPI_mask, --- a/arch/x86/kernel/apic/apic_numachip.c +++ b/arch/x86/kernel/apic/apic_numachip.c @@ -267,7 +267,7 @@ static const struct apic apic_numachip1 .get_apic_id = numachip1_get_apic_id, .set_apic_id = numachip1_set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = numachip_send_IPI_one, .send_IPI_mask = numachip_send_IPI_mask, @@ -318,7 +318,7 @@ static const struct apic apic_numachip2 .get_apic_id = numachip2_get_apic_id, .set_apic_id = numachip2_set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = numachip_send_IPI_one, .send_IPI_mask = numachip_send_IPI_mask, --- a/arch/x86/kernel/apic/bigsmp_32.c +++ b/arch/x86/kernel/apic/bigsmp_32.c @@ -172,7 +172,7 @@ static struct apic apic_bigsmp __ro_afte .get_apic_id = bigsmp_get_apic_id, .set_apic_id = NULL, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single_phys, .send_IPI_mask = default_send_IPI_mask_sequence_phys, --- a/arch/x86/kernel/apic/probe_32.c +++ b/arch/x86/kernel/apic/probe_32.c @@ -102,7 +102,7 @@ static struct apic apic_default __ro_aft .get_apic_id = default_get_apic_id, .set_apic_id = NULL, - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single, .send_IPI_mask = default_send_IPI_mask_logical, --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -141,7 +141,7 @@ static int __assign_irq_vector(int irq, /* * Clear the offline cpus from @vector_cpumask for searching * and verify whether the result overlaps with @mask. If true, - * then the call to apic->cpu_mask_to_apicid_and() will + * then the call to apic->cpu_mask_to_apicid() will * succeed as well. If not, no point in trying to find a * vector in this mask. */ @@ -225,8 +225,8 @@ static int __assign_irq_vector(int irq, * vector_searchmask is a subset of d->domain and has the offline * cpus masked out. 
*/ - BUG_ON(apic->cpu_mask_to_apicid_and(mask, vector_searchmask, - &d->cfg.dest_apicid)); + cpumask_and(vector_searchmask, vector_searchmask, mask); + BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, &d->cfg.dest_apicid)); return 0; } --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -104,22 +104,20 @@ static void x2apic_send_IPI_all(int vect } static int -x2apic_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +x2apic_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { unsigned int cpu; u32 dest = 0; u16 cluster; - cpu = cpumask_first_and(cpumask, andmask); + cpu = cpumask_first(mask); if (cpu >= nr_cpu_ids) return -EINVAL; dest = per_cpu(x86_cpu_to_logical_apicid, cpu); cluster = x2apic_cluster(cpu); - for_each_cpu_and(cpu, cpumask, andmask) { + for_each_cpu(cpu, mask) { if (cluster != x2apic_cluster(cpu)) continue; dest |= per_cpu(x86_cpu_to_logical_apicid, cpu); @@ -249,7 +247,7 @@ static struct apic apic_x2apic_cluster _ .get_apic_id = x2apic_get_apic_id, .set_apic_id = x2apic_set_apic_id, - .cpu_mask_to_apicid_and = x2apic_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = x2apic_cpu_mask_to_apicid, .send_IPI = x2apic_send_IPI, .send_IPI_mask = x2apic_send_IPI_mask, --- a/arch/x86/kernel/apic/x2apic_phys.c +++ b/arch/x86/kernel/apic/x2apic_phys.c @@ -127,7 +127,7 @@ static struct apic apic_x2apic_phys __ro .get_apic_id = x2apic_get_apic_id, .set_apic_id = x2apic_set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = x2apic_send_IPI, .send_IPI_mask = x2apic_send_IPI_mask, --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -526,11 +526,9 @@ static void uv_init_apic_ldr(void) } static int -uv_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +uv_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { - int ret = default_cpu_mask_to_apicid_and(cpumask, andmask, apicid); + int ret = default_cpu_mask_to_apicid(mask, apicid); if (!ret) *apicid |= uv_apicid_hibits; @@ -603,7 +601,7 @@ static struct apic apic_x2apic_uv_x __ro .get_apic_id = x2apic_get_apic_id, .set_apic_id = set_apic_id, - .cpu_mask_to_apicid_and = uv_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = uv_cpu_mask_to_apicid, .send_IPI = uv_send_IPI_one, .send_IPI_mask = uv_send_IPI_mask, --- a/arch/x86/xen/apic.c +++ b/arch/x86/xen/apic.c @@ -178,7 +178,7 @@ static struct apic xen_pv_apic = { .get_apic_id = xen_get_apic_id, .set_apic_id = xen_set_apic_id, /* Can be NULL on 32-bit. */ - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, #ifdef CONFIG_SMP .send_IPI_mask = xen_send_IPI_mask,
[-- Attachment #1: x86-apic--Add-irq_data-argument-to-apic--cpu_mask_to_apicid--.patch --] [-- Type: text/plain, Size: 5733 bytes --] The decision to which CPUs an interrupt is effectively routed happens in the various apic->cpu_mask_to_apicid() implementations To support effective affinity masks this information needs to be updated in irq_data. Add a pointer to irq_data to the callbacks and feed it through the call chain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/apic.h | 5 +++++ arch/x86/kernel/apic/apic.c | 9 +++++++-- arch/x86/kernel/apic/vector.c | 25 +++++++++++++++---------- arch/x86/kernel/apic/x2apic_cluster.c | 3 ++- arch/x86/kernel/apic/x2apic_uv_x.c | 5 +++-- 5 files changed, 32 insertions(+), 15 deletions(-) --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -252,6 +252,8 @@ static inline int x2apic_enabled(void) { #define x2apic_supported() (0) #endif /* !CONFIG_X86_X2APIC */ +struct irq_data; + /* * Copyright 2004 James Cleverdon, IBM. * Subject to the GNU Public License, v.2 @@ -297,6 +299,7 @@ struct apic { unsigned long (*set_apic_id)(unsigned int id); int (*cpu_mask_to_apicid)(const struct cpumask *cpumask, + struct irq_data *irqdata, unsigned int *apicid); /* ipi */ @@ -540,8 +543,10 @@ static inline int default_phys_pkg_id(in #endif extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask, + struct irq_data *irqdata, unsigned int *apicid); extern int default_cpu_mask_to_apicid(const struct cpumask *cpumask, + struct irq_data *irqdata, unsigned int *apicid); static inline void --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2201,7 +2201,9 @@ void default_init_apic_ldr(void) apic_write(APIC_LDR, val); } -int default_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +int default_cpu_mask_to_apicid(const struct cpumask *mask, + struct irq_data *irqdata, + unsigned int *apicid) { unsigned int cpu = cpumask_first(mask); @@ -2211,7 +2213,10 @@ int default_cpu_mask_to_apicid(const str return 0; } -int flat_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +int flat_cpu_mask_to_apicid(const struct cpumask *mask, + struct irq_data *irqdata, + unsigned int *apicid) + { unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS; --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -103,7 +103,8 @@ static void free_apic_chip_data(struct a } static int __assign_irq_vector(int irq, struct apic_chip_data *d, - const struct cpumask *mask) + const struct cpumask *mask, + struct irq_data *irqdata) { /* * NOTE! The local APIC isn't very good at handling @@ -226,32 +227,35 @@ static int __assign_irq_vector(int irq, * cpus masked out. 
*/ cpumask_and(vector_searchmask, vector_searchmask, mask); - BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, &d->cfg.dest_apicid)); + BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, irqdata, + &d->cfg.dest_apicid)); return 0; } static int assign_irq_vector(int irq, struct apic_chip_data *data, - const struct cpumask *mask) + const struct cpumask *mask, + struct irq_data *irqdata) { int err; unsigned long flags; raw_spin_lock_irqsave(&vector_lock, flags); - err = __assign_irq_vector(irq, data, mask); + err = __assign_irq_vector(irq, data, mask, irqdata); raw_spin_unlock_irqrestore(&vector_lock, flags); return err; } static int assign_irq_vector_policy(int irq, int node, struct apic_chip_data *data, - struct irq_alloc_info *info) + struct irq_alloc_info *info, + struct irq_data *irqdata) { if (info && info->mask) - return assign_irq_vector(irq, data, info->mask); + return assign_irq_vector(irq, data, info->mask, irqdata); if (node != NUMA_NO_NODE && - assign_irq_vector(irq, data, cpumask_of_node(node)) == 0) + assign_irq_vector(irq, data, cpumask_of_node(node), irqdata) == 0) return 0; - return assign_irq_vector(irq, data, apic->target_cpus()); + return assign_irq_vector(irq, data, apic->target_cpus(), irqdata); } static void clear_irq_vector(int irq, struct apic_chip_data *data) @@ -363,7 +367,8 @@ static int x86_vector_alloc_irqs(struct irq_data->chip = &lapic_controller; irq_data->chip_data = data; irq_data->hwirq = virq + i; - err = assign_irq_vector_policy(virq + i, node, data, info); + err = assign_irq_vector_policy(virq + i, node, data, info, + irq_data); if (err) goto error; } @@ -537,7 +542,7 @@ static int apic_set_affinity(struct irq_ if (!cpumask_intersects(dest, cpu_online_mask)) return -EINVAL; - err = assign_irq_vector(irq, data, dest); + err = assign_irq_vector(irq, data, dest, irq_data); return err ? err : IRQ_SET_MASK_OK; } --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -104,7 +104,8 @@ static void x2apic_send_IPI_all(int vect } static int -x2apic_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +x2apic_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, + unsigned int *apicid) { unsigned int cpu; u32 dest = 0; --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -526,9 +526,10 @@ static void uv_init_apic_ldr(void) } static int -uv_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +uv_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, + unsigned int *apicid) { - int ret = default_cpu_mask_to_apicid(mask, apicid); + int ret = default_cpu_mask_to_apicid(mask, irqdata, apicid); if (!ret) *apicid |= uv_apicid_hibits;
[-- Attachment #1: xen-events--Add-support-for-effective-affinity-mask.patch --] [-- Type: text/plain, Size: 709 bytes --] Update the effective affinity mask when an interrupt was successfully targeted to a CPU. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- drivers/xen/events/events_base.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1343,8 +1343,12 @@ static int set_affinity_irq(struct irq_d bool force) { unsigned tcpu = cpumask_first_and(dest, cpu_online_mask); + int ret = rebind_irq_to_cpu(data->irq, tcpu); - return rebind_irq_to_cpu(data->irq, tcpu); + if (!ret) + irq_data_update_effective_affinity(data, cpumask_of(tcpu)); + + return ret; } static void enable_dynirq(struct irq_data *data)
[-- Attachment #1: x86-apic--Implement-effective-irq-mask-update.patch --] [-- Type: text/plain, Size: 2198 bytes --] Add the effective irq mask update to the apic implementations and enable effective irq masks for x86. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/Kconfig | 1 + arch/x86/kernel/apic/apic.c | 3 +++ arch/x86/kernel/apic/x2apic_cluster.c | 4 ++++ 3 files changed, 8 insertions(+) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -87,6 +87,7 @@ config X86 select GENERIC_EARLY_IOREMAP select GENERIC_FIND_FIRST_BIT select GENERIC_IOMAP + select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP select GENERIC_IRQ_MIGRATION if SMP select GENERIC_IRQ_PROBE select GENERIC_IRQ_SHOW --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2210,6 +2210,7 @@ int default_cpu_mask_to_apicid(const str if (cpu >= nr_cpu_ids) return -EINVAL; *apicid = per_cpu(x86_cpu_to_apicid, cpu); + irq_data_update_effective_affinity(irqdata, cpumask_of(cpu)); return 0; } @@ -2218,11 +2219,13 @@ int flat_cpu_mask_to_apicid(const struct unsigned int *apicid) { + struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata); unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS; if (!cpu_mask) return -EINVAL; *apicid = (unsigned int)cpu_mask; + cpumask_bits(effmsk)[0] = cpu_mask; return 0; } --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -4,6 +4,7 @@ #include <linux/kernel.h> #include <linux/ctype.h> #include <linux/dmar.h> +#include <linux/irq.h> #include <linux/cpu.h> #include <asm/smp.h> @@ -107,6 +108,7 @@ static int x2apic_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, unsigned int *apicid) { + struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata); unsigned int cpu; u32 dest = 0; u16 cluster; @@ -118,10 +120,12 @@ x2apic_cpu_mask_to_apicid(const struct c dest = per_cpu(x86_cpu_to_logical_apicid, cpu); cluster = x2apic_cluster(cpu); + cpumask_clear(effmsk); for_each_cpu(cpu, mask) { if (cluster != x2apic_cluster(cpu)) continue; dest |= per_cpu(x86_cpu_to_logical_apicid, cpu); + cpumask_set_cpu(cpu, effmsk); } *apicid = dest;
[-- Attachment #1: genirq--Introduce-IRQD_MANAGED_SHUTDOWN.patch --] [-- Type: text/plain, Size: 2197 bytes --] Affinity managed interrupts should keep their assigned affinity across CPU hotplug. To avoid magic hackery in device drivers, the core code shall manage them transparently. This will set these interrupts into a managed shutdown state when the last CPU of the assigned affinity mask goes offline. The interrupt will be restarted when one of the CPUs in the assigned affinity mask comes back online. Introduce the necessary state flag and the accessor functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irq.h | 8 ++++++++ kernel/irq/internals.h | 10 ++++++++++ 2 files changed, 18 insertions(+) --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -206,6 +206,8 @@ struct irq_data { * IRQD_FORWARDED_TO_VCPU - The interrupt is forwarded to a VCPU * IRQD_AFFINITY_MANAGED - Affinity is auto-managed by the kernel * IRQD_IRQ_STARTED - Startup state of the interrupt + * IRQD_MANAGED_SHUTDOWN - Interrupt was shutdown due to empty affinity + * mask. Applies only to affinity managed irqs. */ enum { IRQD_TRIGGER_MASK = 0xf, @@ -224,6 +226,7 @@ enum { IRQD_FORWARDED_TO_VCPU = (1 << 20), IRQD_AFFINITY_MANAGED = (1 << 21), IRQD_IRQ_STARTED = (1 << 22), + IRQD_MANAGED_SHUTDOWN = (1 << 23), }; #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) @@ -342,6 +345,11 @@ static inline bool irqd_is_started(struc return __irqd_to_state(d) & IRQD_IRQ_STARTED; } +static inline bool irqd_is_managed_shutdown(struct irq_data *d) +{ + return __irqd_to_state(d) & IRQD_MANAGED_SHUTDOWN; +} + #undef __irqd_to_state static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d) --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -193,6 +193,16 @@ static inline void irqd_clr_move_pending __irqd_to_state(d) &= ~IRQD_SETAFFINITY_PENDING; } +static inline void irqd_set_managed_shutdown(struct irq_data *d) +{ + __irqd_to_state(d) |= IRQD_MANAGED_SHUTDOWN; +} + +static inline void irqd_clr_managed_shutdown(struct irq_data *d) +{ + __irqd_to_state(d) &= ~IRQD_MANAGED_SHUTDOWN; +} + static inline void irqd_clear(struct irq_data *d, unsigned int mask) { __irqd_to_state(d) &= ~mask;
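For orientation, a minimal sketch of how the new flag and accessors are meant to be used. The real wiring is added by the irq_startup() and CPU hotplug patches later in this series; the two helper names below are made up purely for illustration, and the irq_startup() arguments come from a later patch:

        /* Last CPU of the affinity mask goes down: park the interrupt */
        static void park_managed_irq(struct irq_desc *desc)
        {
                struct irq_data *d = irq_desc_get_irq_data(desc);

                irqd_set_managed_shutdown(d);
                irq_shutdown(desc);
        }

        /* A CPU of the affinity mask comes back: restart the interrupt */
        static void unpark_managed_irq(struct irq_desc *desc)
        {
                struct irq_data *d = irq_desc_get_irq_data(desc);

                if (irqd_is_managed_shutdown(d))
                        irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
        }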
[-- Attachment #1: genirq--Split-out-irq_startup---code.patch --] [-- Type: text/plain, Size: 1460 bytes --] Split out the inner workings of irq_startup() so it can be reused to handle managed interrupts gracefully. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/chip.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -195,6 +195,23 @@ static void irq_state_set_started(struct irqd_set(&desc->irq_data, IRQD_IRQ_STARTED); } +static int __irq_startup(struct irq_desc *desc) +{ + struct irq_data *d = irq_desc_get_irq_data(desc); + int ret = 0; + + irq_domain_activate_irq(d); + if (d->chip->irq_startup) { + ret = d->chip->irq_startup(d); + irq_state_clr_disabled(desc); + irq_state_clr_masked(desc); + } else { + irq_enable(desc); + } + irq_state_set_started(desc); + return ret; +} + int irq_startup(struct irq_desc *desc, bool resend) { int ret = 0; @@ -204,19 +221,9 @@ int irq_startup(struct irq_desc *desc, b if (irqd_is_started(&desc->irq_data)) { irq_enable(desc); } else { - irq_domain_activate_irq(&desc->irq_data); - if (desc->irq_data.chip->irq_startup) { - ret = desc->irq_data.chip->irq_startup(&desc->irq_data); - irq_state_clr_disabled(desc); - irq_state_clr_masked(desc); - } else { - irq_enable(desc); - } - irq_state_set_started(desc); - /* Set default affinity mask once everything is setup */ + ret = __irq_startup(desc); irq_setup_affinity(desc); } - if (resend) check_irq_resend(desc);
[-- Attachment #1: genirq--Add-force-argument-to-irq_startup--.patch --] [-- Type: text/plain, Size: 3468 bytes --] In order to handle managed interrupts gracefully on irq_startup() so they won't lose their assigned affinity, it's necessary to allow startups which keep the interrupts in managed shutdown state, if none of the assigned CPUs is online. This allows drivers to request interrupts w/o the CPUs being online, which avoids online/offline churn in drivers. Add a force argument which can override that decision and let only request_irq() and enable_irq() allow the managed shutdown handling. enable_irq() is required, because the interrupt might be requested with IRQF_NOAUTOEN and enable_irq() invokes irq_startup() which would then wreck the assignment again. All other callers force startup and potentially break the assigned affinity. No functional change as this only adds the function argument. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/autoprobe.c | 4 ++-- kernel/irq/chip.c | 4 ++-- kernel/irq/internals.h | 9 ++++++++- kernel/irq/manage.c | 4 ++-- 4 files changed, 14 insertions(+), 7 deletions(-) --- a/kernel/irq/autoprobe.c +++ b/kernel/irq/autoprobe.c @@ -53,7 +53,7 @@ unsigned long probe_irq_on(void) if (desc->irq_data.chip->irq_set_type) desc->irq_data.chip->irq_set_type(&desc->irq_data, IRQ_TYPE_PROBE); - irq_startup(desc, false); + irq_startup(desc, IRQ_NORESEND, IRQ_START_FORCE); } raw_spin_unlock_irq(&desc->lock); } @@ -70,7 +70,7 @@ unsigned long probe_irq_on(void) raw_spin_lock_irq(&desc->lock); if (!desc->action && irq_settings_can_probe(desc)) { desc->istate |= IRQS_AUTODETECT | IRQS_WAITING; - if (irq_startup(desc, false)) + if (irq_startup(desc, IRQ_NORESEND, IRQ_START_FORCE)) desc->istate |= IRQS_PENDING; } raw_spin_unlock_irq(&desc->lock); --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -212,7 +212,7 @@ static int __irq_startup(struct irq_desc return ret; } -int irq_startup(struct irq_desc *desc, bool resend) +int irq_startup(struct irq_desc *desc, bool resend, bool force) { int ret = 0; @@ -892,7 +892,7 @@ static void irq_settings_set_norequest(desc); irq_settings_set_nothread(desc); desc->action = &chained_action; - irq_startup(desc, true); + irq_startup(desc, IRQ_RESEND, IRQ_START_FORCE); } } --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -66,7 +66,14 @@ extern int __irq_set_trigger(struct irq_ extern void __disable_irq(struct irq_desc *desc); extern void __enable_irq(struct irq_desc *desc); -extern int irq_startup(struct irq_desc *desc, bool resend); +#define IRQ_RESEND true +#define IRQ_NORESEND false + +#define IRQ_START_FORCE true +#define IRQ_START_COND false + +extern int irq_startup(struct irq_desc *desc, bool resend, bool force); + extern void irq_shutdown(struct irq_desc *desc); extern void irq_enable(struct irq_desc *desc); extern void irq_disable(struct irq_desc *desc); --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -509,7 +509,7 @@ void __enable_irq(struct irq_desc *desc) * time. If it was already started up, then irq_startup() * will invoke irq_enable() under the hood. */ - irq_startup(desc, true); + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); break; } default: @@ -1304,7 +1304,7 @@ static int } if (irq_settings_can_autoenable(desc)) { - irq_startup(desc, true); + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); } else { /* * Shared interrupts do not go well with disabling
[-- Attachment #1: genirq--Handle-managed-irqs-gracefully-in-irq_startup--.patch --] [-- Type: text/plain, Size: 3553 bytes --] Affinity managed interrupts should keep their assigned affinity accross CPU hotplug. To avoid magic hackery in device drivers, the core code shall manage them transparently and set these interrupts into a managed shutdown state when the last CPU of the assigned affinity mask goes offline. The interrupt will be restarted when one of the CPUs in the assigned affinity mask comes back online. Add the necessary logic to irq_startup(). If an interrupt is requested and started up, the code checks whether it is affinity managed and if so, it checks whether a CPU in the interrupts affinity mask is online. If not, it puts the interrupt into managed shutdown state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irq.h | 2 - kernel/irq/chip.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 62 insertions(+), 4 deletions(-) --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -345,7 +345,7 @@ static inline bool irqd_is_started(struc return __irqd_to_state(d) & IRQD_IRQ_STARTED; } -static inline bool irqd_is_managed_shutdown(struct irq_data *d) +static inline bool irqd_is_managed_and_shutdown(struct irq_data *d) { return __irqd_to_state(d) & IRQD_MANAGED_SHUTDOWN; } --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -195,6 +195,52 @@ static void irq_state_set_started(struct irqd_set(&desc->irq_data, IRQD_IRQ_STARTED); } +enum { + IRQ_STARTUP_NORMAL, + IRQ_STARTUP_MANAGED, + IRQ_STARTUP_ABORT, +}; + +#ifdef CONFIG_SMP +static int +__irq_startup_managed(struct irq_desc *desc, struct cpumask *aff, bool force) +{ + struct irq_data *d = irq_desc_get_irq_data(desc); + + if (!irqd_affinity_is_managed(d)) + return IRQ_STARTUP_NORMAL; + + irqd_clr_managed_shutdown(d); + + if (cpumask_any_and(aff, cpu_online_mask) > nr_cpu_ids) { + /* + * Catch code which fiddles with enable_irq() on a managed + * and potentially shutdown IRQ. Chained interrupt + * installment or irq auto probing should not happen on + * managed irqs either. Emit a warning, break the affinity + * and start it up as a normal interrupt. + */ + if (WARN_ON_ONCE(force)) + return IRQ_STARTUP_NORMAL; + /* + * The interrupt was requested, but there is no online CPU + * in it's affinity mask. Put it into managed shutdown + * state and let the cpu hotplug mechanism start it up once + * a CPU in the mask becomes available. 
+ */ + irqd_set_managed_shutdown(d); + return IRQ_STARTUP_ABORT; + } + return IRQ_STARTUP_MANAGED; +} +#else +static int +__irq_startup_managed(struct irq_desc *desc, struct cpumask *aff, bool force) +{ + return IRQ_STARTUP_NORMAL; +} +#endif + static int __irq_startup(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); @@ -214,15 +260,27 @@ static int __irq_startup(struct irq_desc int irq_startup(struct irq_desc *desc, bool resend, bool force) { + struct irq_data *d = irq_desc_get_irq_data(desc); + struct cpumask *aff = irq_data_get_affinity_mask(d); int ret = 0; desc->depth = 0; - if (irqd_is_started(&desc->irq_data)) { + if (irqd_is_started(d)) { irq_enable(desc); } else { - ret = __irq_startup(desc); - irq_setup_affinity(desc); + switch (__irq_startup_managed(desc, aff, force)) { + case IRQ_STARTUP_NORMAL: + ret = __irq_startup(desc); + irq_setup_affinity(desc); + break; + case IRQ_STARTUP_MANAGED: + ret = __irq_startup(desc); + irq_set_affinity_locked(d, aff, false); + break; + case IRQ_STARTUP_ABORT: + return 0; + } } if (resend) check_irq_resend(desc);
[-- Attachment #1: genirq-cpuhotplug--Handle-managed-IRQs-on-CPU-hotplug.patch --] [-- Type: text/plain, Size: 4371 bytes --] If a CPU goes offline, interrupts affine to the CPU are moved away. If the outgoing CPU is the last CPU in the affinity mask the migration code breaks the affinity and sets it to all online CPUs. This is a problem for affinity managed interrupts as CPU hotplug is often used for power management purposes. If the affinity is broken, the interrupt is no longer affine to the CPUs to which it was allocated. The affinity spreading allows laying out multi queue devices in a way that they are assigned to a single CPU or a group of CPUs. If the last CPU goes offline, then the queue is no longer used, so the interrupt can be shut down gracefully and parked until one of the assigned CPUs comes online again. Add a graceful shutdown mechanism into the irq affinity breaking code path, mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified. In the online path, scan the active interrupts for managed interrupts and if the interrupt is functional and the newly online CPU is part of the affinity mask, restart the interrupt if it is marked MANAGED_SHUTDOWN or, if the interrupt is already started up, try to add the CPU back to the effective affinity mask. Originally-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/cpuhotplug.h | 1 + include/linux/irq.h | 5 +++++ kernel/cpu.c | 5 +++++ kernel/irq/cpuhotplug.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 56 insertions(+) --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -124,6 +124,7 @@ enum cpuhp_state { CPUHP_AP_ONLINE_IDLE, CPUHP_AP_SMPBOOT_THREADS, CPUHP_AP_X86_VDSO_VMA_ONLINE, + CPUHP_AP_IRQ_AFFINITY_ONLINE, CPUHP_AP_PERF_ONLINE, CPUHP_AP_PERF_X86_ONLINE, CPUHP_AP_PERF_X86_UNCORE_ONLINE, --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -499,7 +499,12 @@ extern int irq_set_affinity_locked(struc const struct cpumask *cpumask, bool force); extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info); +#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_IRQ_MIGRATION) extern void irq_migrate_all_off_this_cpu(void); +extern int irq_affinity_online_cpu(unsigned int cpu); +#else +# define irq_affinity_online_cpu NULL +#endif #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ) void irq_move_irq(struct irq_data *data); --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1252,6 +1252,11 @@ static struct cpuhp_step cpuhp_ap_states .startup.single = smpboot_unpark_threads, .teardown.single = NULL, }, + [CPUHP_AP_IRQ_AFFINITY_ONLINE] = { + .name = "irq/affinity:online", + .startup.single = irq_affinity_online_cpu, + .teardown.single = NULL, + }, [CPUHP_AP_PERF_ONLINE] = { .name = "perf:online", .startup.single = perf_event_init_cpu, --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -83,6 +83,15 @@ static bool migrate_one_irq(struct irq_d chip->irq_mask(d); if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { + /* + * If the interrupt is managed, then shut it down and leave + * the affinity untouched. 
+ */ + if (irqd_affinity_is_managed(d)) { + irqd_set_managed_shutdown(d); + irq_shutdown(desc); + return false; + } affinity = cpu_online_mask; brokeaff = true; } @@ -129,3 +138,39 @@ void irq_migrate_all_off_this_cpu(void) } } } + +static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) +{ + struct irq_data *data = irq_desc_get_irq_data(desc); + const struct cpumask *affinity = irq_data_get_affinity_mask(data); + + if (!irqd_affinity_is_managed(data) || !desc->action || + !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) + return; + + if (irqd_is_managed_and_shutdown(data)) + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); + else + irq_set_affinity_locked(data, affinity, false); +} + +/** + * irq_affinity_online_cpu - Restore affinity for managed interrupts + * @cpu: Upcoming CPU for which interrupts should be restored + */ +int irq_affinity_online_cpu(unsigned int cpu) +{ + struct irq_desc *desc; + unsigned int irq; + + irq_lock_sparse(); + for_each_active_irq(irq) { + desc = irq_to_desc(irq); + raw_spin_lock_irq(&desc->lock); + irq_restore_affinity_of_irq(desc, cpu); + raw_spin_unlock_irq(&desc->lock); + } + irq_unlock_sparse(); + + return 0; +}
[-- Attachment #1: genirq--Introduce-IRQD_SINGLE_TARGET-flag.patch --] [-- Type: text/plain, Size: 1983 bytes --] Many interrupt chips allow only a single CPU as interrupt target. The core code has no knowledge about that. That's unfortunate as it could avoid trying to readd a newly online CPU to the effective affinity mask. Add the status flag and the necessary accessors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irq.h | 16 ++++++++++++++++ kernel/irq/debugfs.c | 1 + 2 files changed, 17 insertions(+) --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -208,6 +208,7 @@ struct irq_data { * IRQD_IRQ_STARTED - Startup state of the interrupt * IRQD_MANAGED_SHUTDOWN - Interrupt was shutdown due to empty affinity * mask. Applies only to affinity managed irqs. + * IRQD_SINGLE_TARGET - IRQ allows only a single affinity target */ enum { IRQD_TRIGGER_MASK = 0xf, @@ -227,6 +228,7 @@ enum { IRQD_AFFINITY_MANAGED = (1 << 21), IRQD_IRQ_STARTED = (1 << 22), IRQD_MANAGED_SHUTDOWN = (1 << 23), + IRQD_SINGLE_TARGET = (1 << 24), }; #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) @@ -275,6 +277,20 @@ static inline bool irqd_is_level_type(st return __irqd_to_state(d) & IRQD_LEVEL; } +/* + * Must only be called of irqchip.irq_set_affinity() or low level + * hieararchy domain allocation functions. + */ +static inline void irqd_set_single_target(struct irq_data *d) +{ + __irqd_to_state(d) |= IRQD_SINGLE_TARGET; +} + +static inline bool irqd_is_single_target(struct irq_data *d) +{ + return __irqd_to_state(d) & IRQD_SINGLE_TARGET; +} + static inline bool irqd_is_wakeup_set(struct irq_data *d) { return __irqd_to_state(d) & IRQD_WAKEUP_STATE; --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -100,6 +100,7 @@ static const struct irq_bit_descr irqdat BIT_MASK_DESCR(IRQD_PER_CPU), BIT_MASK_DESCR(IRQD_NO_BALANCING), + BIT_MASK_DESCR(IRQD_SINGLE_TARGET), BIT_MASK_DESCR(IRQD_MOVE_PCNTXT), BIT_MASK_DESCR(IRQD_AFFINITY_SET), BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
[-- Attachment #1: genirq--Avoid-irq-affinity-setting-for-single-targets.patch --] [-- Type: text/plain, Size: 981 bytes --] Avoid trying to add a newly online CPU to the effective affinity mask of a started up interrupt. That interrupt will either stay on the already online CPU or move around for no value. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/cpuhotplug.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -148,9 +148,17 @@ static void irq_restore_affinity_of_irq( !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) return; - if (irqd_is_managed_and_shutdown(data)) + if (irqd_is_managed_and_shutdown(data)) { irq_startup(desc, IRQ_RESEND, IRQ_START_COND); - else + return; + } + + /* + * If the interrupt can only be directed to a single target + * CPU then it is already assigned to a CPU in the affinity + * mask. No point in trying to move it around. + */ + if (!irqd_is_single_target(data)) irq_set_affinity_locked(data, affinity, false);
[-- Attachment #1: x86-apic--Mark-single-target-interrupts.patch --] [-- Type: text/plain, Size: 822 bytes --] If the interrupt destination mode of the APIC is physical then the effective affinity is restricted to a single CPU. Mark the interrupt accordingly in the domain allocation code, so the core code can avoid pointless affinity setting attempts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/vector.c | 7 +++++++ 1 file changed, 7 insertions(+) --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -371,6 +371,13 @@ static int x86_vector_alloc_irqs(struct irq_data); if (err) goto error; + /* + * If the apic destination mode is physical, then the + * effective affinity is restricted to a single target + * CPU. Mark the interrupt accordingly. + */ + if (!apic->irq_dest_mode) + irqd_set_single_target(irq_data); } return 0;
[-- Attachment #1: genirqaffinity_Assign_vectors_to_all_present_CPUs.patch --] [-- Type: text/plain, Size: 5170 bytes --] From: Christoph Hellwig <hch@lst.de> Currently the irq vector spread algorithm is restricted to online CPUs, which ties the IRQ mapping to the currently online devices and doesn't deal nicely with the fact that CPUs could come and go rapidly due to e.g. power management. Instead assign vectors to all present CPUs to avoid this churn. Build a map of all possible CPUs for a given node, as the architectures only provide a map of all onlines CPUs. Do this dynamically on each call for the vector assingments, which is a bit suboptimal and could be optimized in the future by provinding a mapping from the arch code. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: linux-block@vger.kernel.org Cc: linux-nvme@lists.infradead.org Link: http://lkml.kernel.org/r/20170603140403.27379-5-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- kernel/irq/affinity.c | 76 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 63 insertions(+), 13 deletions(-) --- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c @@ -1,4 +1,7 @@ - +/* + * Copyright (C) 2016 Thomas Gleixner. + * Copyright (C) 2016-2017 Christoph Hellwig. + */ #include <linux/interrupt.h> #include <linux/kernel.h> #include <linux/slab.h> @@ -35,13 +38,54 @@ static void irq_spread_init_one(struct c } } -static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk) +static cpumask_var_t *alloc_node_to_present_cpumask(void) +{ + cpumask_var_t *masks; + int node; + + masks = kcalloc(nr_node_ids, sizeof(cpumask_var_t), GFP_KERNEL); + if (!masks) + return NULL; + + for (node = 0; node < nr_node_ids; node++) { + if (!zalloc_cpumask_var(&masks[node], GFP_KERNEL)) + goto out_unwind; + } + + return masks; + +out_unwind: + while (--node >= 0) + free_cpumask_var(masks[node]); + kfree(masks); + return NULL; +} + +static void free_node_to_present_cpumask(cpumask_var_t *masks) +{ + int node; + + for (node = 0; node < nr_node_ids; node++) + free_cpumask_var(masks[node]); + kfree(masks); +} + +static void build_node_to_present_cpumask(cpumask_var_t *masks) +{ + int cpu; + + for_each_present_cpu(cpu) + cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]); +} + +static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask, + const struct cpumask *mask, nodemask_t *nodemsk) { int n, nodes = 0; /* Calculate the number of nodes in the supplied affinity mask */ - for_each_online_node(n) { - if (cpumask_intersects(mask, cpumask_of_node(n))) { + for_each_node(n) { + if (cpumask_intersects(mask, node_to_present_cpumask[n])) { node_set(n, *nodemsk); nodes++; } @@ -64,7 +108,7 @@ irq_create_affinity_masks(int nvecs, con int last_affv = affv + affd->pre_vectors; nodemask_t nodemsk = NODE_MASK_NONE; struct cpumask *masks; - cpumask_var_t nmsk; + cpumask_var_t nmsk, *node_to_present_cpumask; if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL)) return NULL; @@ -73,13 +117,19 @@ irq_create_affinity_masks(int nvecs, con if (!masks) goto out; + node_to_present_cpumask = alloc_node_to_present_cpumask(); + if (!node_to_present_cpumask) + goto out; + /* Fill out vectors at the beginning that don't need affinity */ for (curvec = 0; curvec < affd->pre_vectors; curvec++) cpumask_copy(masks + curvec, irq_default_affinity); /* Stabilize the cpumasks */ get_online_cpus(); - nodes = get_nodes_in_cpumask(cpu_online_mask, 
&nodemsk); + build_node_to_present_cpumask(node_to_present_cpumask); + nodes = get_nodes_in_cpumask(node_to_present_cpumask, cpu_present_mask, + &nodemsk); /* * If the number of nodes in the mask is greater than or equal the @@ -87,7 +137,8 @@ irq_create_affinity_masks(int nvecs, con */ if (affv <= nodes) { for_each_node_mask(n, nodemsk) { - cpumask_copy(masks + curvec, cpumask_of_node(n)); + cpumask_copy(masks + curvec, + node_to_present_cpumask[n]); if (++curvec == last_affv) break; } @@ -101,7 +152,7 @@ irq_create_affinity_masks(int nvecs, con vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes; /* Get the cpus on this node which are in the mask */ - cpumask_and(nmsk, cpu_online_mask, cpumask_of_node(n)); + cpumask_and(nmsk, cpu_present_mask, node_to_present_cpumask[n]); /* Calculate the number of cpus per vector */ ncpus = cpumask_weight(nmsk); @@ -133,6 +184,7 @@ irq_create_affinity_masks(int nvecs, con /* Fill out vectors at the end that don't need affinity */ for (; curvec < nvecs; curvec++) cpumask_copy(masks + curvec, irq_default_affinity); + free_node_to_present_cpumask(node_to_present_cpumask); out: free_cpumask_var(nmsk); return masks; @@ -147,12 +199,10 @@ int irq_calc_affinity_vectors(int maxvec { int resv = affd->pre_vectors + affd->post_vectors; int vecs = maxvec - resv; - int cpus; + int ret; - /* Stabilize the cpumasks */ get_online_cpus(); - cpus = cpumask_weight(cpu_online_mask); + ret = min_t(int, cpumask_weight(cpu_present_mask), vecs) + resv; put_online_cpus(); - - return min(cpus, vecs) + resv; + return ret; }
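For context, the driver-facing side of this spreading code is unchanged: a multiqueue driver hands the core a struct irq_affinity and gets managed, pre-spread vectors back. A minimal sketch, assuming a hypothetical PCI driver that keeps one non-spread vector for its admin queue (the function name and queue count are made up; the calls are the existing PCI/genirq interfaces):

        #include <linux/interrupt.h>
        #include <linux/pci.h>

        static int foo_setup_vectors(struct pci_dev *pdev, unsigned int nr_queues)
        {
                /* Vector 0 stays out of the spread and keeps default affinity */
                struct irq_affinity affd = { .pre_vectors = 1 };

                /* The remaining vectors get managed affinities spread by the core */
                return pci_alloc_irq_vectors_affinity(pdev, 2, nr_queues + 1,
                                                      PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
                                                      &affd);
        }

pci_alloc_irq_vectors_affinity() returns the number of vectors actually allocated (or a negative error code), so callers typically size their queue count from the return value.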
Hi Thomas, At 06/20/2017 07:37 AM, Thomas Gleixner wrote: [...] > > +/** > + * irq_fixup_move_pending - Cleanup irq move pending from a dying CPU > + * @desc: Interrupt descpriptor to clean up > + * @force_clear: If set clear the move pending bit unconditionally. > + * If not set, clear it only when the dying CPU is the > + * last one in the pending mask. > + * > + * Returns true if the pending bit was set and the pending mask contains an > + * online CPU other than the dying CPU. > + */ > +bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear) > +{ > + struct irq_data *data = irq_desc_get_irq_data(desc); > + > + if (!irqd_is_setaffinity_pending(data)) > + return false; > + > + /* > + * The outgoing CPU might be the last online target in a pending > + * interrupt move. If that's the case clear the pending move bit. > + */ > + if (cpumask_any_and(desc->pending_mask, cpu_online_mask) > nr_cpu_ids) { Should we consider the case of "=nr_cpu_ids" here, like: cpumask_any_and(desc->pending_mask, cpu_online_mask) >= nr_cpu_ids Thanks. Dou > + irqd_clr_move_pending(data); > + return false; > + } > + if (force_clear) > + irqd_clr_move_pending(data); > + return true; > +} > + > void irq_move_masked_irq(struct irq_data *idata) > { > struct irq_desc *desc = irq_data_to_desc(idata);
On Tue, 20 Jun 2017, Dou Liyang wrote:
> At 06/20/2017 07:37 AM, Thomas Gleixner wrote:
> [...]
> >
> > +/**
> > + * irq_fixup_move_pending - Cleanup irq move pending from a dying CPU
> > + * @desc: Interrupt descpriptor to clean up
> > + * @force_clear: If set clear the move pending bit unconditionally.
> > + * If not set, clear it only when the dying CPU is the
> > + * last one in the pending mask.
> > + *
> > + * Returns true if the pending bit was set and the pending mask contains an
> > + * online CPU other than the dying CPU.
> > + */
> > +bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear)
> > +{
> > + struct irq_data *data = irq_desc_get_irq_data(desc);
> > +
> > + if (!irqd_is_setaffinity_pending(data))
> > + return false;
> > +
> > + /*
> > + * The outgoing CPU might be the last online target in a pending
> > + * interrupt move. If that's the case clear the pending move bit.
> > + */
> > + if (cpumask_any_and(desc->pending_mask, cpu_online_mask) > nr_cpu_ids)
> > {
>
> Should we consider the case of "=nr_cpu_ids" here, like:
>
> cpumask_any_and(desc->pending_mask, cpu_online_mask) >= nr_cpu_ids
Yes, indeed. '>' is wrong. Good catch!
Thanks,
tglx
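For reference, the cpumask helpers report an empty intersection by returning a value >= nr_cpu_ids rather than a valid CPU number, so the '>' test above misses the common empty case. A minimal sketch of the corrected check from the quoted hunk:

        /* The pending mask contains no online CPU: clear the stale pending move */
        if (cpumask_any_and(desc->pending_mask, cpu_online_mask) >= nr_cpu_ids) {
                irqd_clr_move_pending(data);
                return false;
        }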
On Tue, Jun 20, 2017 at 01:37:00AM +0200, Thomas Gleixner wrote:
> This started out with 5 patches from Christoph who wanted to add a
> mechanism for interrupts with managed affinities to spread them over all
> present CPUs and instead of migrating them, shut them down into managed
> shutdown state when the last CPU in the affinity set goes offline and then
> resume them when a CPU to which belongs to the affinity set comes online
> again. See:
>
> http://lkml.kernel.org/r/20170603140403.27379-1-hch@lst.de
FYI, this seems to work fine for me with the additional block patches
from that series applied.
On Tue, 20 Jun 2017, Keith Busch wrote:
> On Tue, Jun 20, 2017 at 01:37:15AM +0200, Thomas Gleixner wrote:
> > static int vmd_enable_domain(struct vmd_dev *vmd)
> > {
> > struct pci_sysdata *sd = &vmd->sysdata;
> > + struct fwnode_handle *fn;
> > struct resource *res;
> > u32 upper_bits;
> > unsigned long flags;
> > @@ -617,8 +618,13 @@ static int vmd_enable_domain(struct vmd_
> >
> > sd->node = pcibus_to_node(vmd->dev->bus);
> >
> > - vmd->irq_domain = pci_msi_create_irq_domain(NULL, &vmd_msi_domain_info,
> > + fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain);
> > + if (!fn)
> > + return -ENODEV;
> > +
> > + vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info,
> > x86_vector_domain);
> > + kfree(fn);
>
> If I'm following all this correctly, it looks like we need to use
> irq_domain_free_fwnode with irq_domain_alloc_named_id_fwnode instead of
> freeing 'fn' directly, otherwise we leak 'fwid->name'.
Yes, I'm a moron.
On Tue, Jun 20, 2017 at 01:37:15AM +0200, Thomas Gleixner wrote:
> static int vmd_enable_domain(struct vmd_dev *vmd)
> {
> struct pci_sysdata *sd = &vmd->sysdata;
> + struct fwnode_handle *fn;
> struct resource *res;
> u32 upper_bits;
> unsigned long flags;
> @@ -617,8 +618,13 @@ static int vmd_enable_domain(struct vmd_
>
> sd->node = pcibus_to_node(vmd->dev->bus);
>
> - vmd->irq_domain = pci_msi_create_irq_domain(NULL, &vmd_msi_domain_info,
> + fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain);
> + if (!fn)
> + return -ENODEV;
> +
> + vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info,
> x86_vector_domain);
> + kfree(fn);
If I'm following all this correctly, it looks like we need to use
irq_domain_free_fwnode with irq_domain_alloc_named_id_fwnode instead of
freeing 'fn' directly, otherwise we leak 'fwid->name'.
On Tue, 20 Jun 2017, Thomas Gleixner wrote:
> On Tue, 20 Jun 2017, Keith Busch wrote:
> > On Tue, Jun 20, 2017 at 01:37:15AM +0200, Thomas Gleixner wrote:
> > > static int vmd_enable_domain(struct vmd_dev *vmd)
> > > {
> > > struct pci_sysdata *sd = &vmd->sysdata;
> > > + struct fwnode_handle *fn;
> > > struct resource *res;
> > > u32 upper_bits;
> > > unsigned long flags;
> > > @@ -617,8 +618,13 @@ static int vmd_enable_domain(struct vmd_
> > >
> > > sd->node = pcibus_to_node(vmd->dev->bus);
> > >
> > > - vmd->irq_domain = pci_msi_create_irq_domain(NULL, &vmd_msi_domain_info,
> > > + fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain);
> > > + if (!fn)
> > > + return -ENODEV;
> > > +
> > > + vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info,
> > > x86_vector_domain);
> > > + kfree(fn);
> >
> > If I'm following all this correctly, it looks like we need to use
> > irq_domain_free_fwnode with irq_domain_alloc_named_id_fwnode instead of
> > freeing 'fn' directly, otherwise we leak 'fwid->name'.
>
> Yes, I'm a moron.
Fixed up the mess and updated the git branch.
Thanks for catching it.
tglx
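For reference, a minimal sketch of the corrected VMD hunk with the matching release helper, so the name string allocated behind the fwnode is freed as well (illustrative only, not necessarily the exact code that ended up in the branch):

        fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain);
        if (!fn)
                return -ENODEV;

        vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info,
                                                    x86_vector_domain);
        /* Free via the irqdomain helper; a plain kfree(fn) leaks fwid->name */
        irq_domain_free_fwnode(fn);
        if (!vmd->irq_domain)
                return -ENODEV;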
On Tue, 20 Jun 2017, Keith Busch wrote:
> On Tue, Jun 20, 2017 at 01:37:32AM +0200, Thomas Gleixner wrote:
> > @@ -441,18 +440,27 @@ void fixup_irqs(void)
> >
> > for_each_irq_desc(irq, desc) {
> > const struct cpumask *affinity;
> > - int break_affinity = 0;
> > - int set_affinity = 1;
> > + bool break_affinity = false;
> >
> > if (!desc)
> > continue;
> > - if (irq == 2)
> > - continue;
> >
> > /* interrupt's are disabled at this point */
> > raw_spin_lock(&desc->lock);
> >
> > data = irq_desc_get_irq_data(desc);
> > + chip = irq_data_get_irq_chip(data);
> > + /*
> > + * The interrupt descriptor might have been cleaned up
> > + * already, but it is not yet removed from the radix
> > + * tree. If the chip does not have an affinity setter,
> > + * nothing to do here.
> > + */
> > + if (!chip !chip->irq_set_affinity) {
> > + raw_spin_unlock(&desc->lock);
> > + continue;
> > + }
>
> A bit of a moot point since the very next patch deletes all of this,
> but found this broken 'if' condition when compiling one at a time,
> missing the '&&'.
Hmm, how did I fat-finger that one after booting it?
Yes, the patch is kinda moot, but I wanted to verify that shifting the
logic around does not break any of the hotplug stress tests, so that the
next step becomes less risky.
Thanks,
tglx
On Tue, Jun 20, 2017 at 01:37:32AM +0200, Thomas Gleixner wrote:
> @@ -441,18 +440,27 @@ void fixup_irqs(void)
>
> for_each_irq_desc(irq, desc) {
> const struct cpumask *affinity;
> - int break_affinity = 0;
> - int set_affinity = 1;
> + bool break_affinity = false;
>
> if (!desc)
> continue;
> - if (irq == 2)
> - continue;
>
> /* interrupt's are disabled at this point */
> raw_spin_lock(&desc->lock);
>
> data = irq_desc_get_irq_data(desc);
> + chip = irq_data_get_irq_chip(data);
> + /*
> + * The interrupt descriptor might have been cleaned up
> + * already, but it is not yet removed from the radix
> + * tree. If the chip does not have an affinity setter,
> + * nothing to do here.
> + */
> + if (!chip !chip->irq_set_affinity) {
> + raw_spin_unlock(&desc->lock);
> + continue;
> + }
A bit of a moot point since the very next patch deletes all of this,
but found this broken 'if' condition when compiling one at a time,
missing the '&&'.
On Tue, Jun 20, 2017 at 01:37:02AM +0200, Thomas Gleixner wrote:
> Add the missing name, so debugging will work proper.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
Acked-by: Joerg Roedel <jroedel@suse.de>
On Tue, Jun 20, 2017 at 01:37:03AM +0200, Thomas Gleixner wrote:
> Add the missing name, so debugging will work proper.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
Acked-by: Joerg Roedel <jroedel@suse.de>
On Tue, Jun 20, 2017 at 01:37:11AM +0200, Thomas Gleixner wrote:
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
Acked-by: Joerg Roedel <jroedel@suse.de>
On Tue, Jun 20, 2017 at 01:37:12AM +0200, Thomas Gleixner wrote:
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: iommu@lists.linux-foundation.org
Acked-by: Joerg Roedel <jroedel@suse.de>
Commit-ID: 8947dfb257eb91d7487e06b7d2a069d82e7c19a2 Gitweb: http://git.kernel.org/tip/8947dfb257eb91d7487e06b7d2a069d82e7c19a2 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:01 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:06 +0200 x86/apic: Add name to irq chip Add the missing name, so debugging will work proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.266561988@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/vector.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index f3557a1..6b21b9e 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -534,6 +534,7 @@ static int apic_set_affinity(struct irq_data *irq_data, } static struct irq_chip lapic_controller = { + .name = "APIC", .irq_ack = apic_ack_edge, .irq_set_affinity = apic_set_affinity, .irq_retrigger = apic_retrigger_irq,
Commit-ID: 290be194ba9d489e1857cc45d0dd24bf3429156b Gitweb: http://git.kernel.org/tip/290be194ba9d489e1857cc45d0dd24bf3429156b Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:02 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:07 +0200 iommu/amd: Add name to irq chip Add the missing name, so debugging will work proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.343236995@linutronix.de --- drivers/iommu/amd_iommu.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 63cacf5..590e1e8 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -4386,10 +4386,11 @@ static void ir_compose_msi_msg(struct irq_data *irq_data, struct msi_msg *msg) } static struct irq_chip amd_ir_chip = { - .irq_ack = ir_ack_apic_edge, - .irq_set_affinity = amd_ir_set_affinity, - .irq_set_vcpu_affinity = amd_ir_set_vcpu_affinity, - .irq_compose_msi_msg = ir_compose_msi_msg, + .name = "AMD-IR", + .irq_ack = ir_ack_apic_edge, + .irq_set_affinity = amd_ir_set_affinity, + .irq_set_vcpu_affinity = amd_ir_set_vcpu_affinity, + .irq_compose_msi_msg = ir_compose_msi_msg, }; int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
Commit-ID: 1bb3a5a76386ba2886ee44b903eeff5765bd71d4 Gitweb: http://git.kernel.org/tip/1bb3a5a76386ba2886ee44b903eeff5765bd71d4 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:03 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:07 +0200 iommu/vt-d: Add name to irq chip Add the missing name, so debugging will work proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.431939968@linutronix.de --- drivers/iommu/intel_irq_remapping.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c index a190cbd..ba5b580 100644 --- a/drivers/iommu/intel_irq_remapping.c +++ b/drivers/iommu/intel_irq_remapping.c @@ -1205,10 +1205,11 @@ static int intel_ir_set_vcpu_affinity(struct irq_data *data, void *info) } static struct irq_chip intel_ir_chip = { - .irq_ack = ir_ack_apic_edge, - .irq_set_affinity = intel_ir_set_affinity, - .irq_compose_msi_msg = intel_ir_compose_msi_msg, - .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity, + .name = "INTEL-IR", + .irq_ack = ir_ack_apic_edge, + .irq_set_affinity = intel_ir_set_affinity, + .irq_compose_msi_msg = intel_ir_compose_msi_msg, + .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity, }; static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
Commit-ID: 0165308a2f994939d2e1b36624f5a8f57746bc88 Gitweb: http://git.kernel.org/tip/0165308a2f994939d2e1b36624f5a8f57746bc88 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:04 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:08 +0200 genirq/msi: Prevent overwriting domain name Prevent overwriting an already assigned domain name. Remove the extra check for chip->name, because if domain->name is NULL overwriting it with NULL is not a problem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.510684976@linutronix.de --- kernel/irq/msi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index fe4d48e..9e3f185 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -274,7 +274,8 @@ struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode, domain = irq_domain_create_hierarchy(parent, IRQ_DOMAIN_FLAG_MSI, 0, fwnode, &msi_domain_ops, info); - if (domain && info->chip && info->chip->name) + + if (domain && !domain->name && info->chip) domain->name = info->chip->name; return domain;
Commit-ID: d59f6617eef0f76e34f7a9993f5645c5ef467e42 Gitweb: http://git.kernel.org/tip/d59f6617eef0f76e34f7a9993f5645c5ef467e42 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:05 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:08 +0200 genirq: Allow fwnode to carry name information only In order to provide proper debug interface it's required to have domain names available when the domain is added. Non fwnode based architectures like x86 have no way to do so. It's not possible to use domain ops or host data for this as domain ops might be the same for several instances, but the names have to be unique. Extend the irqchip fwnode to allow transporting the domain name. If no node is supplied, create a 'unknown-N' placeholder. Warn if an invalid node is supplied and treat it like no node. This happens e.g. with i2 devices on x86 which hand in an ACPI type node which has no interface for retrieving the name. [ Folded a fix from Marc to make DT name parsing work ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.588784933@linutronix.de --- include/linux/irqdomain.h | 31 +++++++++++++- kernel/irq/irqdomain.c | 105 ++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 122 insertions(+), 14 deletions(-) diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h index 9f36160..9cf32a2 100644 --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -189,6 +189,9 @@ enum { /* Irq domain implements MSI remapping */ IRQ_DOMAIN_FLAG_MSI_REMAP = (1 << 5), + /* Irq domain name was allocated in __irq_domain_add() */ + IRQ_DOMAIN_NAME_ALLOCATED = (1 << 6), + /* * Flags starting from IRQ_DOMAIN_FLAG_NONCORE are reserved * for implementation specific purposes and ignored by the @@ -203,7 +206,33 @@ static inline struct device_node *irq_domain_get_of_node(struct irq_domain *d) } #ifdef CONFIG_IRQ_DOMAIN -struct fwnode_handle *irq_domain_alloc_fwnode(void *data); +struct fwnode_handle *__irq_domain_alloc_fwnode(unsigned int type, int id, + const char *name, void *data); + +enum { + IRQCHIP_FWNODE_REAL, + IRQCHIP_FWNODE_NAMED, + IRQCHIP_FWNODE_NAMED_ID, +}; + +static inline +struct fwnode_handle *irq_domain_alloc_named_fwnode(const char *name) +{ + return __irq_domain_alloc_fwnode(IRQCHIP_FWNODE_NAMED, 0, name, NULL); +} + +static inline +struct fwnode_handle *irq_domain_alloc_named_id_fwnode(const char *name, int id) +{ + return __irq_domain_alloc_fwnode(IRQCHIP_FWNODE_NAMED_ID, id, name, + NULL); +} + +static inline struct fwnode_handle *irq_domain_alloc_fwnode(void *data) +{ + return __irq_domain_alloc_fwnode(IRQCHIP_FWNODE_REAL, 0, NULL, data); +} + void irq_domain_free_fwnode(struct fwnode_handle *fwnode); struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, irq_hw_number_t hwirq_max, int direct_max, diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c index 70b9da7..e1b925b 100644 --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -26,39 +26,61 @@ static struct irq_domain *irq_default_domain; static void irq_domain_check_hierarchy(struct irq_domain *domain); struct irqchip_fwid { - struct fwnode_handle fwnode; - char *name; - void *data; + struct fwnode_handle fwnode; + unsigned int 
type; + char *name; + void *data; }; /** * irq_domain_alloc_fwnode - Allocate a fwnode_handle suitable for * identifying an irq domain - * @data: optional user-provided data + * @type: Type of irqchip_fwnode. See linux/irqdomain.h + * @name: Optional user provided domain name + * @id: Optional user provided id if name != NULL + * @data: Optional user-provided data * - * Allocate a struct device_node, and return a poiner to the embedded + * Allocate a struct irqchip_fwid, and return a poiner to the embedded * fwnode_handle (or NULL on failure). + * + * Note: The types IRQCHIP_FWNODE_NAMED and IRQCHIP_FWNODE_NAMED_ID are + * solely to transport name information to irqdomain creation code. The + * node is not stored. For other types the pointer is kept in the irq + * domain struct. */ -struct fwnode_handle *irq_domain_alloc_fwnode(void *data) +struct fwnode_handle *__irq_domain_alloc_fwnode(unsigned int type, int id, + const char *name, void *data) { struct irqchip_fwid *fwid; - char *name; + char *n; fwid = kzalloc(sizeof(*fwid), GFP_KERNEL); - name = kasprintf(GFP_KERNEL, "irqchip@%p", data); - if (!fwid || !name) { + switch (type) { + case IRQCHIP_FWNODE_NAMED: + n = kasprintf(GFP_KERNEL, "%s", name); + break; + case IRQCHIP_FWNODE_NAMED_ID: + n = kasprintf(GFP_KERNEL, "%s-%d", name, id); + break; + default: + n = kasprintf(GFP_KERNEL, "irqchip@%p", data); + break; + } + + if (!fwid || !n) { kfree(fwid); - kfree(name); + kfree(n); return NULL; } - fwid->name = name; + fwid->type = type; + fwid->name = n; fwid->data = data; fwid->fwnode.type = FWNODE_IRQCHIP; return &fwid->fwnode; } -EXPORT_SYMBOL_GPL(irq_domain_alloc_fwnode); +EXPORT_SYMBOL_GPL(__irq_domain_alloc_fwnode); /** * irq_domain_free_fwnode - Free a non-OF-backed fwnode_handle @@ -97,20 +119,75 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, void *host_data) { struct device_node *of_node = to_of_node(fwnode); + struct irqchip_fwid *fwid; struct irq_domain *domain; + static atomic_t unknown_domains; + domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size), GFP_KERNEL, of_node_to_nid(of_node)); if (WARN_ON(!domain)) return NULL; + if (fwnode && is_fwnode_irqchip(fwnode)) { + fwid = container_of(fwnode, struct irqchip_fwid, fwnode); + + switch (fwid->type) { + case IRQCHIP_FWNODE_NAMED: + case IRQCHIP_FWNODE_NAMED_ID: + domain->name = kstrdup(fwid->name, GFP_KERNEL); + if (!domain->name) { + kfree(domain); + return NULL; + } + domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; + break; + default: + domain->fwnode = fwnode; + domain->name = fwid->name; + break; + } + } else if (of_node) { + char *name; + + /* + * DT paths contain '/', which debugfs is legitimately + * unhappy about. Replace them with ':', which does + * the trick and is not as offensive as '\'... 
+ */ + name = kstrdup(of_node_full_name(of_node), GFP_KERNEL); + if (!name) { + kfree(domain); + return NULL; + } + + strreplace(name, '/', ':'); + + domain->name = name; + domain->fwnode = fwnode; + domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; + } + + if (!domain->name) { + if (fwnode) { + pr_err("Invalid fwnode type (%d) for irqdomain\n", + fwnode->type); + } + domain->name = kasprintf(GFP_KERNEL, "unknown-%d", + atomic_inc_return(&unknown_domains)); + if (!domain->name) { + kfree(domain); + return NULL; + } + domain->flags |= IRQ_DOMAIN_NAME_ALLOCATED; + } + of_node_get(of_node); /* Fill structure */ INIT_RADIX_TREE(&domain->revmap_tree, GFP_KERNEL); domain->ops = ops; domain->host_data = host_data; - domain->fwnode = fwnode; domain->hwirq_max = hwirq_max; domain->revmap_size = size; domain->revmap_direct_max_irq = direct_max; @@ -152,6 +229,8 @@ void irq_domain_remove(struct irq_domain *domain) pr_debug("Removed domain %s\n", domain->name); of_node_put(irq_domain_get_of_node(domain)); + if (domain->flags & IRQ_DOMAIN_NAME_ALLOCATED) + kfree(domain->name); kfree(domain); } EXPORT_SYMBOL_GPL(irq_domain_remove);
Commit-ID: 9d35f859590efa48be51b8ccded6550e0440e2c7 Gitweb: http://git.kernel.org/tip/9d35f859590efa48be51b8ccded6550e0440e2c7 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:06 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:08 +0200 x86/vector: Create named irq domain Use the fwnode to create a named domain so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.673635238@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/vector.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 6b21b9e..47c5d01 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -429,11 +429,16 @@ static void init_legacy_irqs(void) { } int __init arch_early_irq_init(void) { + struct fwnode_handle *fn; + init_legacy_irqs(); - x86_vector_domain = irq_domain_add_tree(NULL, &x86_vector_domain_ops, - NULL); + fn = irq_domain_alloc_named_fwnode("VECTOR"); + BUG_ON(!fn); + x86_vector_domain = irq_domain_create_tree(fn, &x86_vector_domain_ops, + NULL); BUG_ON(x86_vector_domain == NULL); + irq_domain_free_fwnode(fn); irq_set_default_host(x86_vector_domain); arch_init_msi_domain(x86_vector_domain);
Commit-ID: 1b604745c8474c76e5fd1682ea5b7da0a1c6d440 Gitweb: http://git.kernel.org/tip/1b604745c8474c76e5fd1682ea5b7da0a1c6d440 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:07 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:09 +0200 x86/ioapic: Create named irq domain Use the fwnode to create a named domain so diagnosis works, but only when the the ioapic is not device tree based. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.752782603@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/io_apic.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index 347bb9f..444ae92 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2223,6 +2223,8 @@ static int mp_irqdomain_create(int ioapic) struct ioapic *ip = &ioapics[ioapic]; struct ioapic_domain_cfg *cfg = &ip->irqdomain_cfg; struct mp_ioapic_gsi *gsi_cfg = mp_ioapic_gsi_routing(ioapic); + struct fwnode_handle *fn; + char *name = "IO-APIC"; if (cfg->type == IOAPIC_DOMAIN_INVALID) return 0; @@ -2233,9 +2235,25 @@ static int mp_irqdomain_create(int ioapic) parent = irq_remapping_get_ir_irq_domain(&info); if (!parent) parent = x86_vector_domain; + else + name = "IO-APIC-IR"; + + /* Handle device tree enumerated APICs proper */ + if (cfg->dev) { + fn = of_node_to_fwnode(cfg->dev); + } else { + fn = irq_domain_alloc_named_id_fwnode(name, ioapic); + if (!fn) + return -ENOMEM; + } + + ip->irqdomain = irq_domain_create_linear(fn, hwirqs, cfg->ops, + (void *)(long)ioapic); + + /* Release fw handle if it was allocated above */ + if (!cfg->dev) + irq_domain_free_fwnode(fn); - ip->irqdomain = irq_domain_add_linear(cfg->dev, hwirqs, cfg->ops, - (void *)(long)ioapic); if (!ip->irqdomain) return -ENOMEM;
Commit-ID: 5f432711ba94400fb39e9be81913ced81c141758 Gitweb: http://git.kernel.org/tip/5f432711ba94400fb39e9be81913ced81c141758 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:08 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:09 +0200 x86/htirq: Create named domain Use the fwnode to create a named domain so diagnosis works. Mark the init function __init while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.829047007@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/apic/htirq.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/apic/htirq.c b/arch/x86/kernel/apic/htirq.c index ae50d34..56ccf93 100644 --- a/arch/x86/kernel/apic/htirq.c +++ b/arch/x86/kernel/apic/htirq.c @@ -150,16 +150,27 @@ static const struct irq_domain_ops htirq_domain_ops = { .deactivate = htirq_domain_deactivate, }; -void arch_init_htirq_domain(struct irq_domain *parent) +void __init arch_init_htirq_domain(struct irq_domain *parent) { + struct fwnode_handle *fn; + if (disable_apic) return; - htirq_domain = irq_domain_add_tree(NULL, &htirq_domain_ops, NULL); + fn = irq_domain_alloc_named_fwnode("PCI-HT"); + if (!fn) + goto warn; + + htirq_domain = irq_domain_create_tree(fn, &htirq_domain_ops, NULL); + irq_domain_free_fwnode(fn); if (!htirq_domain) - pr_warn("failed to initialize irqdomain for HTIRQ.\n"); - else - htirq_domain->parent = parent; + goto warn; + + htirq_domain->parent = parent; + return; + +warn: + pr_warn("Failed to initialize irqdomain for HTIRQ.\n"); } int arch_setup_ht_irq(int idx, int pos, struct pci_dev *dev,
Commit-ID: f8409a6a4bf86e2d90ec8460df2874e4e19ebb27 Gitweb: http://git.kernel.org/tip/f8409a6a4bf86e2d90ec8460df2874e4e19ebb27 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:09 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:10 +0200 x86/uv: Create named irq domain Use the fwnode to create a named domain so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.907511074@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/platform/uv/uv_irq.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/arch/x86/platform/uv/uv_irq.c b/arch/x86/platform/uv/uv_irq.c index 776c659..03fc397 100644 --- a/arch/x86/platform/uv/uv_irq.c +++ b/arch/x86/platform/uv/uv_irq.c @@ -160,13 +160,21 @@ static struct irq_domain *uv_get_irq_domain(void) { static struct irq_domain *uv_domain; static DEFINE_MUTEX(uv_lock); + struct fwnode_handle *fn; mutex_lock(&uv_lock); - if (uv_domain == NULL) { - uv_domain = irq_domain_add_tree(NULL, &uv_domain_ops, NULL); - if (uv_domain) - uv_domain->parent = x86_vector_domain; - } + if (uv_domain) + goto out; + + fn = irq_domain_alloc_named_fwnode("UV-CORE"); + if (!fn) + goto out; + + uv_domain = irq_domain_create_tree(fn, &uv_domain_ops, NULL); + irq_domain_free_fwnode(fn); + if (uv_domain) + uv_domain->parent = x86_vector_domain; +out: mutex_unlock(&uv_lock); return uv_domain;
Commit-ID: 667724c5a3109675cf3bfe7d75795b8608d1bcbe Gitweb: http://git.kernel.org/tip/667724c5a3109675cf3bfe7d75795b8608d1bcbe Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:10 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:10 +0200 x86/msi: Provide new iommu irqdomain interface Provide a new interface for creating the iommu remapping domains, so that the caller can supply a name and a id in order to create named irqdomains. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Joerg Roedel <joro@8bytes.org> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235443.986661206@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/irq_remapping.h | 2 ++ arch/x86/kernel/apic/msi.c | 15 +++++++++++++++ 2 files changed, 17 insertions(+) diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h index a210eba..0398675 100644 --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -56,6 +56,8 @@ irq_remapping_get_irq_domain(struct irq_alloc_info *info); /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */ extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent); +extern struct irq_domain * +arch_create_remap_msi_irq_domain(struct irq_domain *par, const char *n, int id); /* Get parent irqdomain for interrupt remapping irqdomain */ static inline struct irq_domain *arch_get_ir_parent_domain(void) diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c index c61aec7..0e6618e 100644 --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -167,10 +167,25 @@ static struct msi_domain_info pci_msi_ir_domain_info = { .handler_name = "edge", }; +struct irq_domain *arch_create_remap_msi_irq_domain(struct irq_domain *parent, + const char *name, int id) +{ + struct fwnode_handle *fn; + struct irq_domain *d; + + fn = irq_domain_alloc_named_id_fwnode(name, id); + if (!fn) + return NULL; + d = pci_msi_create_irq_domain(fn, &pci_msi_ir_domain_info, parent); + irq_domain_free_fwnode(fn); + return d; +} + struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent) { return pci_msi_create_irq_domain(NULL, &pci_msi_ir_domain_info, parent); } + #endif #ifdef CONFIG_DMAR_TABLE
Commit-ID: cea29b656a5e5f1a7b7de42795c3ae6fc417ab0b Gitweb: http://git.kernel.org/tip/cea29b656a5e5f1a7b7de42795c3ae6fc417ab0b Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:11 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:10 +0200 iommu/vt-d: Use named irq domain interface Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.063083997@linutronix.de --- drivers/iommu/intel_irq_remapping.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c index ba5b580..8fc641e 100644 --- a/drivers/iommu/intel_irq_remapping.c +++ b/drivers/iommu/intel_irq_remapping.c @@ -500,8 +500,9 @@ static void iommu_enable_irq_remapping(struct intel_iommu *iommu) static int intel_setup_irq_remapping(struct intel_iommu *iommu) { struct ir_table *ir_table; - struct page *pages; + struct fwnode_handle *fn; unsigned long *bitmap; + struct page *pages; if (iommu->ir_table) return 0; @@ -525,15 +526,24 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu) goto out_free_pages; } - iommu->ir_domain = irq_domain_add_hierarchy(arch_get_ir_parent_domain(), - 0, INTR_REMAP_TABLE_ENTRIES, - NULL, &intel_ir_domain_ops, - iommu); + fn = irq_domain_alloc_named_id_fwnode("INTEL-IR", iommu->seq_id); + if (!fn) + goto out_free_bitmap; + + iommu->ir_domain = + irq_domain_create_hierarchy(arch_get_ir_parent_domain(), + 0, INTR_REMAP_TABLE_ENTRIES, + fn, &intel_ir_domain_ops, + iommu); + irq_domain_free_fwnode(fn); if (!iommu->ir_domain) { pr_err("IR%d: failed to allocate irqdomain\n", iommu->seq_id); goto out_free_bitmap; } - iommu->ir_msi_domain = arch_create_msi_irq_domain(iommu->ir_domain); + iommu->ir_msi_domain = + arch_create_remap_msi_irq_domain(iommu->ir_domain, + "INTEL-IR-MSI", + iommu->seq_id); ir_table->base = page_address(pages); ir_table->bitmap = bitmap;
Commit-ID: 3e49a8182277ea57736285aede5f43bfa6aa11b1 Gitweb: http://git.kernel.org/tip/3e49a8182277ea57736285aede5f43bfa6aa11b1 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:12 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:11 +0200 iommu/amd: Use named irq domain interface Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.142270582@linutronix.de --- drivers/iommu/amd_iommu.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 590e1e8..503849d 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -4395,13 +4395,20 @@ static struct irq_chip amd_ir_chip = { int amd_iommu_create_irq_domain(struct amd_iommu *iommu) { - iommu->ir_domain = irq_domain_add_tree(NULL, &amd_ir_domain_ops, iommu); + struct fwnode_handle *fn; + + fn = irq_domain_alloc_named_id_fwnode("AMD-IR", iommu->index); + if (!fn) + return -ENOMEM; + iommu->ir_domain = irq_domain_create_tree(fn, &amd_ir_domain_ops, iommu); + irq_domain_free_fwnode(fn); if (!iommu->ir_domain) return -ENOMEM; iommu->ir_domain->parent = arch_get_ir_parent_domain(); - iommu->msi_domain = arch_create_msi_irq_domain(iommu->ir_domain); - + iommu->msi_domain = arch_create_remap_msi_irq_domain(iommu->ir_domain, + "AMD-IR-MSI", + iommu->index); return 0; }
Commit-ID: 0323b9690448e1d1ada91dac9d8fa62f7285751a Gitweb: http://git.kernel.org/tip/0323b9690448e1d1ada91dac9d8fa62f7285751a Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:13 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:11 +0200 x86/msi: Remove unused remap irq domain interface Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.221049665@linutronix.de --- arch/x86/include/asm/irq_remapping.h | 1 - arch/x86/kernel/apic/msi.c | 6 ------ 2 files changed, 7 deletions(-) diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h index 0398675..023b4a9 100644 --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -55,7 +55,6 @@ extern struct irq_domain * irq_remapping_get_irq_domain(struct irq_alloc_info *info); /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */ -extern struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent); extern struct irq_domain * arch_create_remap_msi_irq_domain(struct irq_domain *par, const char *n, int id); diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c index 0e6618e..d79dc2a 100644 --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -180,12 +180,6 @@ struct irq_domain *arch_create_remap_msi_irq_domain(struct irq_domain *parent, irq_domain_free_fwnode(fn); return d; } - -struct irq_domain *arch_create_msi_irq_domain(struct irq_domain *parent) -{ - return pci_msi_create_irq_domain(NULL, &pci_msi_ir_domain_info, parent); -} - #endif #ifdef CONFIG_DMAR_TABLE
Commit-ID: f8f37ca78915b51a73bf240409fcda30d811b76b Gitweb: http://git.kernel.org/tip/f8f37ca78915b51a73bf240409fcda30d811b76b Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:14 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:11 +0200 x86/msi: Create named irq domains Use the fwnode to create named irq domains so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.299024560@linutronix.de --- arch/x86/kernel/apic/msi.c | 42 +++++++++++++++++++++++++++++++++--------- 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c index d79dc2a..9b18be7 100644 --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -136,13 +136,20 @@ static struct msi_domain_info pci_msi_domain_info = { .handler_name = "edge", }; -void arch_init_msi_domain(struct irq_domain *parent) +void __init arch_init_msi_domain(struct irq_domain *parent) { + struct fwnode_handle *fn; + if (disable_apic) return; - msi_default_domain = pci_msi_create_irq_domain(NULL, - &pci_msi_domain_info, parent); + fn = irq_domain_alloc_named_fwnode("PCI-MSI"); + if (fn) { + msi_default_domain = + pci_msi_create_irq_domain(fn, &pci_msi_domain_info, + parent); + irq_domain_free_fwnode(fn); + } if (!msi_default_domain) pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n"); } @@ -230,13 +237,20 @@ static struct irq_domain *dmar_get_irq_domain(void) { static struct irq_domain *dmar_domain; static DEFINE_MUTEX(dmar_lock); + struct fwnode_handle *fn; mutex_lock(&dmar_lock); - if (dmar_domain == NULL) - dmar_domain = msi_create_irq_domain(NULL, &dmar_msi_domain_info, + if (dmar_domain) + goto out; + + fn = irq_domain_alloc_named_fwnode("DMAR-MSI"); + if (fn) { + dmar_domain = msi_create_irq_domain(fn, &dmar_msi_domain_info, x86_vector_domain); + irq_domain_free_fwnode(fn); + } +out: mutex_unlock(&dmar_lock); - return dmar_domain; } @@ -326,9 +340,10 @@ static struct msi_domain_info hpet_msi_domain_info = { struct irq_domain *hpet_create_irq_domain(int hpet_id) { - struct irq_domain *parent; - struct irq_alloc_info info; struct msi_domain_info *domain_info; + struct irq_domain *parent, *d; + struct irq_alloc_info info; + struct fwnode_handle *fn; if (x86_vector_domain == NULL) return NULL; @@ -349,7 +364,16 @@ struct irq_domain *hpet_create_irq_domain(int hpet_id) else hpet_msi_controller.name = "IR-HPET-MSI"; - return msi_create_irq_domain(NULL, domain_info, parent); + fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name, + hpet_id); + if (!fn) { + kfree(domain_info); + return NULL; + } + + d = msi_create_irq_domain(fn, domain_info, parent); + irq_domain_free_fwnode(fn); + return d; } int hpet_assign_irq(struct irq_domain *domain, struct hpet_dev *dev,
Commit-ID: ae904cafd59d7120ef2afb97b252eadeba45e95f Gitweb: http://git.kernel.org/tip/ae904cafd59d7120ef2afb97b252eadeba45e95f Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:15 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:12 +0200 PCI/vmd: Create named irq domain Use the fwnode to create a named domain so diagnosis works. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-pci@vger.kernel.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.379861978@linutronix.de --- drivers/pci/host/vmd.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/pci/host/vmd.c b/drivers/pci/host/vmd.c index e27ad2a..31203d6 100644 --- a/drivers/pci/host/vmd.c +++ b/drivers/pci/host/vmd.c @@ -554,6 +554,7 @@ static int vmd_find_free_domain(void) static int vmd_enable_domain(struct vmd_dev *vmd) { struct pci_sysdata *sd = &vmd->sysdata; + struct fwnode_handle *fn; struct resource *res; u32 upper_bits; unsigned long flags; @@ -617,8 +618,13 @@ static int vmd_enable_domain(struct vmd_dev *vmd) sd->node = pcibus_to_node(vmd->dev->bus); - vmd->irq_domain = pci_msi_create_irq_domain(NULL, &vmd_msi_domain_info, + fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain); + if (!fn) + return -ENODEV; + + vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info, x86_vector_domain); + irq_domain_free_fwnode(fn); if (!vmd->irq_domain) return -ENODEV;
Commit-ID: 9dc6be3d419398eae9a19cd09b7969ceff8eaf10 Gitweb: http://git.kernel.org/tip/9dc6be3d419398eae9a19cd09b7969ceff8eaf10 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:16 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:12 +0200 genirq/irqdomain: Add map counter Add a map counter instead of counting radix tree entries for diagnosis. That also gives correct information for linear domains. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.459397746@linutronix.de --- include/linux/irqdomain.h | 2 ++ kernel/irq/irqdomain.c | 4 ++++ 2 files changed, 6 insertions(+) diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h index 9cf32a2..17ccd54 100644 --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -130,6 +130,7 @@ struct irq_domain_chip_generic; * @host_data: private data pointer for use by owner. Not touched by irq_domain * core code. * @flags: host per irq_domain flags + * @mapcount: The number of mapped interrupts * * Optional elements * @of_node: Pointer to device tree nodes associated with the irq_domain. Used @@ -152,6 +153,7 @@ struct irq_domain { const struct irq_domain_ops *ops; void *host_data; unsigned int flags; + unsigned int mapcount; /* Optional data */ struct fwnode_handle *fwnode; diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c index e1b925b..8d5805c 100644 --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -423,6 +423,7 @@ void irq_domain_disassociate(struct irq_domain *domain, unsigned int irq) irq_data->domain = NULL; irq_data->hwirq = 0; + domain->mapcount--; /* Clear reverse map for this hwirq */ if (hwirq < domain->revmap_size) { @@ -474,6 +475,7 @@ int irq_domain_associate(struct irq_domain *domain, unsigned int virq, domain->name = irq_data->chip->name; } + domain->mapcount++; if (hwirq < domain->revmap_size) { domain->linear_revmap[hwirq] = virq; } else { @@ -1081,6 +1083,7 @@ static void irq_domain_insert_irq(int virq) struct irq_domain *domain = data->domain; irq_hw_number_t hwirq = data->hwirq; + domain->mapcount++; if (hwirq < domain->revmap_size) { domain->linear_revmap[hwirq] = virq; } else { @@ -1110,6 +1113,7 @@ static void irq_domain_remove_irq(int virq) struct irq_domain *domain = data->domain; irq_hw_number_t hwirq = data->hwirq; + domain->mapcount--; if (hwirq < domain->revmap_size) { domain->linear_revmap[hwirq] = 0; } else {
Commit-ID:  087cdfb662ae50e3826e7cd2e54b6519d07b60f0
Gitweb:     http://git.kernel.org/tip/087cdfb662ae50e3826e7cd2e54b6519d07b60f0
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 20 Jun 2017 01:37:17 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 22 Jun 2017 18:21:13 +0200

genirq/debugfs: Add proper debugfs interface

Debugging (hierarchical) interrupt domains is tedious as there is no
information about the hierarchy and no information about the states of
interrupts in the various domain levels.

Add a debugfs directory 'irq' and subdirectories 'domains' and 'irqs'.

The domains directory contains the domain files. The content is
information about the domain. If the domain is part of a hierarchy, then
the parent domains are printed as well.

# ls /sys/kernel/debug/irq/domains/
default     INTEL-IR-2      INTEL-IR-MSI-2  IO-APIC-IR-2  PCI-MSI
DMAR-MSI    INTEL-IR-3      INTEL-IR-MSI-3  IO-APIC-IR-3  unknown-1
INTEL-IR-0  INTEL-IR-MSI-0  IO-APIC-IR-0    IO-APIC-IR-4  VECTOR
INTEL-IR-1  INTEL-IR-MSI-1  IO-APIC-IR-1    PCI-HT

# cat /sys/kernel/debug/irq/domains/VECTOR
name:   VECTOR
 size:   0
 mapped: 216
 flags:  0x00000041

# cat /sys/kernel/debug/irq/domains/IO-APIC-IR-0
name:   IO-APIC-IR-0
 size:   24
 mapped: 19
 flags:  0x00000041
 parent: INTEL-IR-3
    name:   INTEL-IR-3
     size:   65536
     mapped: 167
     flags:  0x00000041
     parent: VECTOR
        name:   VECTOR
         size:   0
         mapped: 216
         flags:  0x00000041

Unfortunately there is no per-CPU information about the VECTOR domain (yet).

The irqs directory contains detailed information about mapped interrupts.

# cat /sys/kernel/debug/irq/irqs/3
handler:  handle_edge_irq
status:   0x00004000
istate:   0x00000000
ddepth:   1
wdepth:   0
dstate:   0x01018000
            IRQD_IRQ_DISABLED
            IRQD_SINGLE_TARGET
            IRQD_MOVE_PCNTXT
node:     0
affinity: 0-143
effectiv: 0
pending:
domain:  IO-APIC-IR-0
 hwirq:   0x3
 chip:    IR-IO-APIC
  flags:   0x10
             IRQCHIP_SKIP_SET_WAKE
 parent:
    domain:  INTEL-IR-3
     hwirq:   0x20000
     chip:    INTEL-IR
      flags:   0x0
     parent:
        domain:  VECTOR
         hwirq:   0x3
         chip:    APIC
          flags:   0x0

This was developed to simplify the debugging of the managed affinity
changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.537566163@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- include/linux/irqdesc.h | 4 + include/linux/irqdomain.h | 4 + kernel/irq/Kconfig | 11 +++ kernel/irq/Makefile | 1 + kernel/irq/debugfs.c | 215 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/irq/internals.h | 22 +++++ kernel/irq/irqdesc.c | 1 + kernel/irq/irqdomain.c | 87 ++++++++++++++++++- kernel/irq/manage.c | 1 + 9 files changed, 345 insertions(+), 1 deletion(-) diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h index c9be579..d425a3a 100644 --- a/include/linux/irqdesc.h +++ b/include/linux/irqdesc.h @@ -46,6 +46,7 @@ struct pt_regs; * @rcu: rcu head for delayed free * @kobj: kobject used to represent this struct in sysfs * @dir: /proc/irq/ procfs entry + * @debugfs_file: dentry for the debugfs file * @name: flow handler name for /proc/interrupts output */ struct irq_desc { @@ -88,6 +89,9 @@ struct irq_desc { #ifdef CONFIG_PROC_FS struct proc_dir_entry *dir; #endif +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS + struct dentry *debugfs_file; +#endif #ifdef CONFIG_SPARSE_IRQ struct rcu_head rcu; struct kobject kobj; diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h index 17ccd54..914b0c3 100644 --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -139,6 +139,7 @@ struct irq_domain_chip_generic; * setting up one or more generic chips for interrupt controllers * drivers using the generic chip library which uses this pointer. * @parent: Pointer to parent irq_domain to support hierarchy irq_domains + * @debugfs_file: dentry for the domain debugfs file * * Revmap data, used internally by irq_domain * @revmap_direct_max_irq: The largest hwirq that can be set for controllers that @@ -162,6 +163,9 @@ struct irq_domain { #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY struct irq_domain *parent; #endif +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS + struct dentry *debugfs_file; +#endif /* reverse map data. The linear map gets appended to the irq_domain */ irq_hw_number_t hwirq_max; diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 3bbfd6a..8d9498e 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -108,4 +108,15 @@ config SPARSE_IRQ If you don't know what to do here, say N. +config GENERIC_IRQ_DEBUGFS + bool "Expose irq internals in debugfs" + depends on DEBUG_FS + default n + ---help--- + + Exposes internal state information through debugfs. Mostly for + developers and debugging of hard to diagnose interrupt problems. + + If you don't know what to do here, say N. + endmenu diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile index 1d3ee31..c61fc9c 100644 --- a/kernel/irq/Makefile +++ b/kernel/irq/Makefile @@ -10,3 +10,4 @@ obj-$(CONFIG_PM_SLEEP) += pm.o obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o obj-$(CONFIG_GENERIC_IRQ_IPI) += ipi.o obj-$(CONFIG_SMP) += affinity.o +obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c new file mode 100644 index 0000000..50ee2f6 --- /dev/null +++ b/kernel/irq/debugfs.c @@ -0,0 +1,215 @@ +/* + * Copyright 2017 Thomas Gleixner <tglx@linutronix.de> + * + * This file is licensed under the GPL V2. 
+ */ +#include <linux/debugfs.h> +#include <linux/irqdomain.h> +#include <linux/irq.h> + +#include "internals.h" + +static struct dentry *irq_dir; + +struct irq_bit_descr { + unsigned int mask; + char *name; +}; +#define BIT_MASK_DESCR(m) { .mask = m, .name = #m } + +static void irq_debug_show_bits(struct seq_file *m, int ind, unsigned int state, + const struct irq_bit_descr *sd, int size) +{ + int i; + + for (i = 0; i < size; i++, sd++) { + if (state & sd->mask) + seq_printf(m, "%*s%s\n", ind + 12, "", sd->name); + } +} + +#ifdef CONFIG_SMP +static void irq_debug_show_masks(struct seq_file *m, struct irq_desc *desc) +{ + struct irq_data *data = irq_desc_get_irq_data(desc); + struct cpumask *msk; + + msk = irq_data_get_affinity_mask(data); + seq_printf(m, "affinity: %*pbl\n", cpumask_pr_args(msk)); +#ifdef CONFIG_GENERIC_PENDING_IRQ + msk = desc->pending_mask; + seq_printf(m, "pending: %*pbl\n", cpumask_pr_args(msk)); +#endif +} +#else +static void irq_debug_show_masks(struct seq_file *m, struct irq_desc *desc) { } +#endif + +static const struct irq_bit_descr irqchip_flags[] = { + BIT_MASK_DESCR(IRQCHIP_SET_TYPE_MASKED), + BIT_MASK_DESCR(IRQCHIP_EOI_IF_HANDLED), + BIT_MASK_DESCR(IRQCHIP_MASK_ON_SUSPEND), + BIT_MASK_DESCR(IRQCHIP_ONOFFLINE_ENABLED), + BIT_MASK_DESCR(IRQCHIP_SKIP_SET_WAKE), + BIT_MASK_DESCR(IRQCHIP_ONESHOT_SAFE), + BIT_MASK_DESCR(IRQCHIP_EOI_THREADED), +}; + +static void +irq_debug_show_chip(struct seq_file *m, struct irq_data *data, int ind) +{ + struct irq_chip *chip = data->chip; + + if (!chip) { + seq_printf(m, "chip: None\n"); + return; + } + seq_printf(m, "%*schip: %s\n", ind, "", chip->name); + seq_printf(m, "%*sflags: 0x%lx\n", ind + 1, "", chip->flags); + irq_debug_show_bits(m, ind, chip->flags, irqchip_flags, + ARRAY_SIZE(irqchip_flags)); +} + +static void +irq_debug_show_data(struct seq_file *m, struct irq_data *data, int ind) +{ + seq_printf(m, "%*sdomain: %s\n", ind, "", + data->domain ? 
data->domain->name : ""); + seq_printf(m, "%*shwirq: 0x%lx\n", ind + 1, "", data->hwirq); + irq_debug_show_chip(m, data, ind + 1); +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY + if (!data->parent_data) + return; + seq_printf(m, "%*sparent:\n", ind + 1, ""); + irq_debug_show_data(m, data->parent_data, ind + 4); +#endif +} + +static const struct irq_bit_descr irqdata_states[] = { + BIT_MASK_DESCR(IRQ_TYPE_EDGE_RISING), + BIT_MASK_DESCR(IRQ_TYPE_EDGE_FALLING), + BIT_MASK_DESCR(IRQ_TYPE_LEVEL_HIGH), + BIT_MASK_DESCR(IRQ_TYPE_LEVEL_LOW), + BIT_MASK_DESCR(IRQD_LEVEL), + + BIT_MASK_DESCR(IRQD_ACTIVATED), + BIT_MASK_DESCR(IRQD_IRQ_STARTED), + BIT_MASK_DESCR(IRQD_IRQ_DISABLED), + BIT_MASK_DESCR(IRQD_IRQ_MASKED), + BIT_MASK_DESCR(IRQD_IRQ_INPROGRESS), + + BIT_MASK_DESCR(IRQD_PER_CPU), + BIT_MASK_DESCR(IRQD_NO_BALANCING), + + BIT_MASK_DESCR(IRQD_MOVE_PCNTXT), + BIT_MASK_DESCR(IRQD_AFFINITY_SET), + BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING), + BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED), + BIT_MASK_DESCR(IRQD_MANAGED_SHUTDOWN), + + BIT_MASK_DESCR(IRQD_FORWARDED_TO_VCPU), + + BIT_MASK_DESCR(IRQD_WAKEUP_STATE), + BIT_MASK_DESCR(IRQD_WAKEUP_ARMED), +}; + +static const struct irq_bit_descr irqdesc_states[] = { + BIT_MASK_DESCR(_IRQ_NOPROBE), + BIT_MASK_DESCR(_IRQ_NOREQUEST), + BIT_MASK_DESCR(_IRQ_NOTHREAD), + BIT_MASK_DESCR(_IRQ_NOAUTOEN), + BIT_MASK_DESCR(_IRQ_NESTED_THREAD), + BIT_MASK_DESCR(_IRQ_PER_CPU_DEVID), + BIT_MASK_DESCR(_IRQ_IS_POLLED), + BIT_MASK_DESCR(_IRQ_DISABLE_UNLAZY), +}; + +static const struct irq_bit_descr irqdesc_istates[] = { + BIT_MASK_DESCR(IRQS_AUTODETECT), + BIT_MASK_DESCR(IRQS_SPURIOUS_DISABLED), + BIT_MASK_DESCR(IRQS_POLL_INPROGRESS), + BIT_MASK_DESCR(IRQS_ONESHOT), + BIT_MASK_DESCR(IRQS_REPLAY), + BIT_MASK_DESCR(IRQS_WAITING), + BIT_MASK_DESCR(IRQS_PENDING), + BIT_MASK_DESCR(IRQS_SUSPENDED), +}; + + +static int irq_debug_show(struct seq_file *m, void *p) +{ + struct irq_desc *desc = m->private; + struct irq_data *data; + + raw_spin_lock_irq(&desc->lock); + data = irq_desc_get_irq_data(desc); + seq_printf(m, "handler: %pf\n", desc->handle_irq); + seq_printf(m, "status: 0x%08x\n", desc->status_use_accessors); + irq_debug_show_bits(m, 0, desc->status_use_accessors, irqdesc_states, + ARRAY_SIZE(irqdesc_states)); + seq_printf(m, "istate: 0x%08x\n", desc->istate); + irq_debug_show_bits(m, 0, desc->istate, irqdesc_istates, + ARRAY_SIZE(irqdesc_istates)); + seq_printf(m, "ddepth: %u\n", desc->depth); + seq_printf(m, "wdepth: %u\n", desc->wake_depth); + seq_printf(m, "dstate: 0x%08x\n", irqd_get(data)); + irq_debug_show_bits(m, 0, irqd_get(data), irqdata_states, + ARRAY_SIZE(irqdata_states)); + seq_printf(m, "node: %d\n", irq_data_get_node(data)); + irq_debug_show_masks(m, desc); + irq_debug_show_data(m, data, 0); + raw_spin_unlock_irq(&desc->lock); + return 0; +} + +static int irq_debug_open(struct inode *inode, struct file *file) +{ + return single_open(file, irq_debug_show, inode->i_private); +} + +static const struct file_operations dfs_irq_ops = { + .open = irq_debug_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc) +{ + char name [10]; + + if (!irq_dir || !desc || desc->debugfs_file) + return; + + sprintf(name, "%d", irq); + desc->debugfs_file = debugfs_create_file(name, 0444, irq_dir, desc, + &dfs_irq_ops); +} + +void irq_remove_debugfs_entry(struct irq_desc *desc) +{ + if (desc->debugfs_file) + debugfs_remove(desc->debugfs_file); +} + +static int __init irq_debugfs_init(void) +{ + struct 
dentry *root_dir; + int irq; + + root_dir = debugfs_create_dir("irq", NULL); + if (!root_dir) + return -ENOMEM; + + irq_domain_debugfs_init(root_dir); + + irq_dir = debugfs_create_dir("irqs", root_dir); + + irq_lock_sparse(); + for_each_active_irq(irq) + irq_add_debugfs_entry(irq, irq_to_desc(irq)); + irq_unlock_sparse(); + + return 0; +} +__initcall(irq_debugfs_init); diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 921a241..094db5b 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -169,6 +169,11 @@ irq_put_desc_unlock(struct irq_desc *desc, unsigned long flags) #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) +static inline unsigned int irqd_get(struct irq_data *d) +{ + return __irqd_to_state(d); +} + /* * Manipulation functions for irq_data.state */ @@ -237,3 +242,20 @@ irq_init_generic_chip(struct irq_chip_generic *gc, const char *name, int num_ct, unsigned int irq_base, void __iomem *reg_base, irq_flow_handler_t handler) { } #endif /* CONFIG_GENERIC_IRQ_CHIP */ + +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS +void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc); +void irq_remove_debugfs_entry(struct irq_desc *desc); +# ifdef CONFIG_IRQ_DOMAIN +void irq_domain_debugfs_init(struct dentry *root); +# else +static inline void irq_domain_debugfs_init(struct dentry *root); +# endif +#else /* CONFIG_GENERIC_IRQ_DEBUGFS */ +static inline void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *d) +{ +} +static inline void irq_remove_debugfs_entry(struct irq_desc *d) +{ +} +#endif /* CONFIG_GENERIC_IRQ_DEBUGFS */ diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c index 09abce2..feade53 100644 --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -394,6 +394,7 @@ static void free_desc(unsigned int irq) { struct irq_desc *desc = irq_to_desc(irq); + irq_remove_debugfs_entry(desc); unregister_irq_proc(irq, desc); /* diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c index 8d5805c..75e1f08 100644 --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -29,9 +29,17 @@ struct irqchip_fwid { struct fwnode_handle fwnode; unsigned int type; char *name; - void *data; + void *data; }; +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS +static void debugfs_add_domain_dir(struct irq_domain *d); +static void debugfs_remove_domain_dir(struct irq_domain *d); +#else +static inline void debugfs_add_domain_dir(struct irq_domain *d) { } +static inline void debugfs_remove_domain_dir(struct irq_domain *d) { } +#endif + /** * irq_domain_alloc_fwnode - Allocate a fwnode_handle suitable for * identifying an irq domain @@ -194,6 +202,7 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size, irq_domain_check_hierarchy(domain); mutex_lock(&irq_domain_mutex); + debugfs_add_domain_dir(domain); list_add(&domain->link, &irq_domain_list); mutex_unlock(&irq_domain_mutex); @@ -213,6 +222,7 @@ EXPORT_SYMBOL_GPL(__irq_domain_add); void irq_domain_remove(struct irq_domain *domain) { mutex_lock(&irq_domain_mutex); + debugfs_remove_domain_dir(domain); WARN_ON(!radix_tree_empty(&domain->revmap_tree)); @@ -1599,3 +1609,78 @@ static void irq_domain_check_hierarchy(struct irq_domain *domain) { } #endif /* CONFIG_IRQ_DOMAIN_HIERARCHY */ + +#ifdef CONFIG_GENERIC_IRQ_DEBUGFS +static struct dentry *domain_dir; + +static void +irq_domain_debug_show_one(struct seq_file *m, struct irq_domain *d, int ind) +{ + seq_printf(m, "%*sname: %s\n", ind, "", d->name); + seq_printf(m, "%*ssize: %u\n", ind + 1, "", + d->revmap_size + 
d->revmap_direct_max_irq); + seq_printf(m, "%*smapped: %u\n", ind + 1, "", d->mapcount); + seq_printf(m, "%*sflags: 0x%08x\n", ind +1 , "", d->flags); +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY + if (!d->parent) + return; + seq_printf(m, "%*sparent: %s\n", ind + 1, "", d->parent->name); + irq_domain_debug_show_one(m, d->parent, ind + 4); +#endif +} + +static int irq_domain_debug_show(struct seq_file *m, void *p) +{ + struct irq_domain *d = m->private; + + /* Default domain? Might be NULL */ + if (!d) { + if (!irq_default_domain) + return 0; + d = irq_default_domain; + } + irq_domain_debug_show_one(m, d, 0); + return 0; +} + +static int irq_domain_debug_open(struct inode *inode, struct file *file) +{ + return single_open(file, irq_domain_debug_show, inode->i_private); +} + +static const struct file_operations dfs_domain_ops = { + .open = irq_domain_debug_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static void debugfs_add_domain_dir(struct irq_domain *d) +{ + if (!d->name || !domain_dir || d->debugfs_file) + return; + d->debugfs_file = debugfs_create_file(d->name, 0444, domain_dir, d, + &dfs_domain_ops); +} + +static void debugfs_remove_domain_dir(struct irq_domain *d) +{ + if (d->debugfs_file) + debugfs_remove(d->debugfs_file); +} + +void __init irq_domain_debugfs_init(struct dentry *root) +{ + struct irq_domain *d; + + domain_dir = debugfs_create_dir("domains", root); + if (!domain_dir) + return; + + debugfs_create_file("default", 0444, domain_dir, NULL, &dfs_domain_ops); + mutex_lock(&irq_domain_mutex); + list_for_each_entry(d, &irq_domain_list, link) + debugfs_add_domain_dir(d); + mutex_unlock(&irq_domain_mutex); +} +#endif diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 4c34696..284f4eb 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1398,6 +1398,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) wake_up_process(new->secondary->thread); register_irq_proc(irq, desc); + irq_add_debugfs_entry(irq, desc); new->dir = NULL; register_handler_proc(irq, new); free_cpumask_var(mask);
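As a usage note: the new files only exist when the kernel is built with CONFIG_DEBUG_FS and the new CONFIG_GENERIC_IRQ_DEBUGFS option, and debugfs has to be mounted. A typical session, assuming the standard mount point:

# mount -t debugfs none /sys/kernel/debug
# ls /sys/kernel/debug/irq
domains  irqs
# cat /sys/kernel/debug/irq/irqs/3

The 'mapped:' line in the domain files is the map counter introduced in the previous patch; that is why a purely tree-based domain like VECTOR can report 'size: 0' (no linear revmap) while still showing 216 mapped interrupts in the example above.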
Commit-ID: 1bb0401680da156ce1549e915e711bf5b2534cc5 Gitweb: http://git.kernel.org/tip/1bb0401680da156ce1549e915e711bf5b2534cc5 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:18 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:13 +0200 genirq: Add missing comment for IRQD_STARTED Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.614913014@linutronix.de --- include/linux/irq.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/irq.h b/include/linux/irq.h index d996314..7e62e10 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -200,6 +200,7 @@ struct irq_data { * IRQD_WAKEUP_ARMED - Wakeup mode armed * IRQD_FORWARDED_TO_VCPU - The interrupt is forwarded to a VCPU * IRQD_AFFINITY_MANAGED - Affinity is auto-managed by the kernel + * IRQD_IRQ_STARTED - Startup state of the interrupt */ enum { IRQD_TRIGGER_MASK = 0xf,
Commit-ID:  cdd16365b0bd7c0cd19e2cc768b6bdc8021f32c3
Gitweb:     http://git.kernel.org/tip/cdd16365b0bd7c0cd19e2cc768b6bdc8021f32c3
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 20 Jun 2017 01:37:19 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 22 Jun 2017 18:21:13 +0200

genirq: Provide irq_fixup_move_pending()

If a CPU goes offline, the interrupts are migrated away, but an eventually
pending interrupt move, which has not yet been made effective, is kept
pending even if the outgoing CPU is the sole target of the pending
affinity mask. Worse, the pending affinity mask is discarded even if it
would contain a valid subset of the online CPUs.

Implement a helper function which allows avoiding these issues.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.691345468@linutronix.de

---
 include/linux/irq.h    |  5 +++++
 kernel/irq/migration.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 7e62e10..d008065 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -491,9 +491,14 @@ extern void irq_migrate_all_off_this_cpu(void);
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
 void irq_move_irq(struct irq_data *data);
 void irq_move_masked_irq(struct irq_data *data);
+bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear);
 #else
 static inline void irq_move_irq(struct irq_data *data) { }
 static inline void irq_move_masked_irq(struct irq_data *data) { }
+static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear)
+{
+	return false;
+}
 #endif

 extern int no_irq_affinity;
diff --git a/kernel/irq/migration.c b/kernel/irq/migration.c
index 37ddb7b..6ca054a 100644
--- a/kernel/irq/migration.c
+++ b/kernel/irq/migration.c
@@ -4,6 +4,36 @@

 #include "internals.h"

+/**
+ * irq_fixup_move_pending - Cleanup irq move pending from a dying CPU
+ * @desc:		Interrupt descriptor to clean up
+ * @force_clear:	If set, clear the move pending bit unconditionally.
+ *			If not set, clear it only when the dying CPU is the
+ *			last one in the pending mask.
+ *
+ * Returns true if the pending bit was set and the pending mask contains an
+ * online CPU other than the dying CPU.
+ */
+bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear)
+{
+	struct irq_data *data = irq_desc_get_irq_data(desc);
+
+	if (!irqd_is_setaffinity_pending(data))
+		return false;
+
+	/*
+	 * The outgoing CPU might be the last online target in a pending
+	 * interrupt move. If that's the case clear the pending move bit.
+	 */
+	if (cpumask_any_and(desc->pending_mask, cpu_online_mask) >= nr_cpu_ids) {
+		irqd_clr_move_pending(data);
+		return false;
+	}
+	if (force_clear)
+		irqd_clr_move_pending(data);
+	return true;
+}
+
 void irq_move_masked_irq(struct irq_data *idata)
 {
 	struct irq_desc *desc = irq_data_to_desc(idata);
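Condensed, the helper is intended to be called in two modes from a CPU-offline fixup path, which is exactly how the next patch wires it into the x86 code (the surrounding fixup loop is elided here):

	/*
	 * Early exit path: the interrupt does not need to be migrated.
	 * force_clear=false only drops a pending move whose sole target
	 * was the dying CPU; a still valid pending mask is left alone.
	 */
	irq_fixup_move_pending(desc, false);

	/*
	 * Migration path: force_clear=true clears the pending bit and
	 * returns true when the pending mask still contains an online
	 * CPU, so the last requested affinity change is not lost.
	 */
	if (irq_fixup_move_pending(desc, true))
		affinity = desc->pending_mask;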
Commit-ID:  8e7b632237df8b17526411d1d98f838580bb6aa3
Gitweb:     http://git.kernel.org/tip/8e7b632237df8b17526411d1d98f838580bb6aa3
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 20 Jun 2017 01:37:20 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 22 Jun 2017 18:21:13 +0200

x86/irq: Cleanup pending irq move in fixup_irqs()

If a CPU goes offline, the interrupts are migrated away, but an eventually
pending interrupt move, which has not yet been made effective, is kept
pending even if the outgoing CPU is the sole target of the pending
affinity mask. Worse, the pending affinity mask is discarded even if it
would contain a valid subset of the online CPUs.

Use the newly introduced helper to:

 - Discard a pending move when the outgoing CPU is the only target in the
   pending mask.

 - Use the pending mask instead of the affinity mask to find a valid
   target for the interrupt if the pending mask intersects with the
   online CPUs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.774068557@linutronix.de

---
 arch/x86/kernel/irq.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index f34fe74..9696007d 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -440,9 +440,9 @@ void fixup_irqs(void)
 	int ret;

 	for_each_irq_desc(irq, desc) {
+		const struct cpumask *affinity;
 		int break_affinity = 0;
 		int set_affinity = 1;
-		const struct cpumask *affinity;

 		if (!desc)
 			continue;
@@ -454,19 +454,36 @@ void fixup_irqs(void)

 		data = irq_desc_get_irq_data(desc);
 		affinity = irq_data_get_affinity_mask(data);
+
 		if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
 		    cpumask_subset(affinity, cpu_online_mask)) {
+			irq_fixup_move_pending(desc, false);
 			raw_spin_unlock(&desc->lock);
 			continue;
 		}

 		/*
-		 * Complete the irq move. This cpu is going down and for
-		 * non intr-remapping case, we can't wait till this interrupt
-		 * arrives at this cpu before completing the irq move.
+		 * Complete an eventually pending irq move cleanup. If this
+		 * interrupt was moved in hard irq context, then the
+		 * vectors need to be cleaned up. It can't wait until this
+		 * interrupt actually happens and this CPU was involved.
		 */
 		irq_force_complete_move(desc);

+		/*
+		 * If there is a setaffinity pending, then try to reuse the
+		 * pending mask, so the last change of the affinity does
+		 * not get lost. If there is no move pending or the pending
+		 * mask does not contain any online CPU, use the current
+		 * affinity mask.
+		 */
+		if (irq_fixup_move_pending(desc, true))
+			affinity = desc->pending_mask;
+
+		/*
+		 * If the mask does not contain an online CPU, break
+		 * affinity and use cpu_online_mask as fall back.
+		 */
 		if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
 			break_affinity = 1;
 			affinity = cpu_online_mask;
Commit-ID: cba4235e6031e9318d68186f6d765c531cbea4e1 Gitweb: http://git.kernel.org/tip/cba4235e6031e9318d68186f6d765c531cbea4e1 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:21 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:14 +0200 genirq: Remove mask argument from setup_affinity() No point to have this alloc/free dance of cpumasks. Provide a static mask for setup_affinity() and protect it proper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.851571573@linutronix.de --- kernel/irq/internals.h | 2 +- kernel/irq/manage.c | 53 ++++++++++++++++++++++---------------------------- kernel/irq/proc.c | 8 +++++--- 3 files changed, 29 insertions(+), 34 deletions(-) diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 094db5b..33ca838 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -109,7 +109,7 @@ static inline void unregister_handler_proc(unsigned int irq, extern bool irq_can_set_affinity_usr(unsigned int irq); -extern int irq_select_affinity_usr(unsigned int irq, struct cpumask *mask); +extern int irq_select_affinity_usr(unsigned int irq); extern void irq_set_thread_affinity(struct irq_desc *desc); diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 284f4eb..e2f20d5 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -345,15 +345,18 @@ EXPORT_SYMBOL_GPL(irq_set_affinity_notifier); /* * Generic version of the affinity autoselector. */ -static int setup_affinity(struct irq_desc *desc, struct cpumask *mask) +static int irq_setup_affinity(struct irq_desc *desc) { struct cpumask *set = irq_default_affinity; - int node = irq_desc_get_node(desc); + int ret, node = irq_desc_get_node(desc); + static DEFINE_RAW_SPINLOCK(mask_lock); + static struct cpumask mask; /* Excludes PER_CPU and NO_BALANCE interrupts */ if (!__irq_can_set_affinity(desc)) return 0; + raw_spin_lock(&mask_lock); /* * Preserve the managed affinity setting and a userspace affinity * setup, but make sure that one of the targets is online. 
@@ -367,43 +370,42 @@ static int setup_affinity(struct irq_desc *desc, struct cpumask *mask) irqd_clear(&desc->irq_data, IRQD_AFFINITY_SET); } - cpumask_and(mask, cpu_online_mask, set); + cpumask_and(&mask, cpu_online_mask, set); if (node != NUMA_NO_NODE) { const struct cpumask *nodemask = cpumask_of_node(node); /* make sure at least one of the cpus in nodemask is online */ - if (cpumask_intersects(mask, nodemask)) - cpumask_and(mask, mask, nodemask); + if (cpumask_intersects(&mask, nodemask)) + cpumask_and(&mask, &mask, nodemask); } - irq_do_set_affinity(&desc->irq_data, mask, false); - return 0; + ret = irq_do_set_affinity(&desc->irq_data, &mask, false); + raw_spin_unlock(&mask_lock); + return ret; } #else /* Wrapper for ALPHA specific affinity selector magic */ -static inline int setup_affinity(struct irq_desc *d, struct cpumask *mask) +int irq_setup_affinity(struct irq_desc *desc) { - return irq_select_affinity(irq_desc_get_irq(d)); + return irq_select_affinity(irq_desc_get_irq(desc)); } #endif /* - * Called when affinity is set via /proc/irq + * Called when a bogus affinity is set via /proc/irq */ -int irq_select_affinity_usr(unsigned int irq, struct cpumask *mask) +int irq_select_affinity_usr(unsigned int irq) { struct irq_desc *desc = irq_to_desc(irq); unsigned long flags; int ret; raw_spin_lock_irqsave(&desc->lock, flags); - ret = setup_affinity(desc, mask); + ret = irq_setup_affinity(desc); raw_spin_unlock_irqrestore(&desc->lock, flags); return ret; } - #else -static inline int -setup_affinity(struct irq_desc *desc, struct cpumask *mask) +static inline int setup_affinity(struct irq_desc *desc) { return 0; } @@ -1128,7 +1130,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) struct irqaction *old, **old_ptr; unsigned long flags, thread_mask = 0; int ret, nested, shared = 0; - cpumask_var_t mask; if (!desc) return -EINVAL; @@ -1187,11 +1188,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) } } - if (!alloc_cpumask_var(&mask, GFP_KERNEL)) { - ret = -ENOMEM; - goto out_thread; - } - /* * Drivers are often written to work w/o knowledge about the * underlying irq chip implementation, so a request for a @@ -1256,7 +1252,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) */ if (thread_mask == ~0UL) { ret = -EBUSY; - goto out_mask; + goto out_unlock; } /* * The thread_mask for the action is or'ed to @@ -1300,7 +1296,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) pr_err("Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n", irq); ret = -EINVAL; - goto out_mask; + goto out_unlock; } if (!shared) { @@ -1308,7 +1304,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) if (ret) { pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n", new->name, irq, desc->irq_data.chip->name); - goto out_mask; + goto out_unlock; } init_waitqueue_head(&desc->wait_for_threads); @@ -1320,7 +1316,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) if (ret) { irq_release_resources(desc); - goto out_mask; + goto out_unlock; } } @@ -1357,7 +1353,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) } /* Set default affinity mask once everything is setup */ - setup_affinity(desc, mask); + irq_setup_affinity(desc); } else if (new->flags & IRQF_TRIGGER_MASK) { unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK; @@ -1401,8 +1397,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, 
struct irqaction *new) irq_add_debugfs_entry(irq, desc); new->dir = NULL; register_handler_proc(irq, new); - free_cpumask_var(mask); - return 0; mismatch: @@ -1415,9 +1409,8 @@ mismatch: } ret = -EBUSY; -out_mask: +out_unlock: raw_spin_unlock_irqrestore(&desc->lock, flags); - free_cpumask_var(mask); out_thread: if (new->thread) { diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c index c53edad..d35bb8d 100644 --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -120,9 +120,11 @@ static ssize_t write_irq_affinity(int type, struct file *file, * one online CPU still has to be targeted. */ if (!cpumask_intersects(new_value, cpu_online_mask)) { - /* Special case for empty set - allow the architecture - code to set default SMP affinity. */ - err = irq_select_affinity_usr(irq, new_value) ? -EINVAL : count; + /* + * Special case for empty set - allow the architecture code + * to set default SMP affinity. + */ + err = irq_select_affinity_usr(irq) ? -EINVAL : count; } else { irq_set_affinity(irq, new_value); err = count;
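The core of this change is a pattern worth spelling out once: one static scratch cpumask, serialized by a raw spinlock, replaces a cpumask_var_t that was allocated and freed on every call. A stripped-down sketch under the same assumption as the patch, namely that callers already hold the descriptor lock with interrupts disabled (the function name is illustrative):

static int apply_default_affinity(struct irq_data *data, const struct cpumask *pref)
{
	static DEFINE_RAW_SPINLOCK(mask_lock);
	static struct cpumask mask;
	int ret;

	/* Serialize use of the single static scratch mask */
	raw_spin_lock(&mask_lock);
	cpumask_and(&mask, pref, cpu_online_mask);
	ret = irq_do_set_affinity(data, &mask, false);
	raw_spin_unlock(&mask_lock);
	return ret;
}

The trade-off is one statically allocated cpumask versus getting rid of the GFP_KERNEL allocation and the -ENOMEM error path in __setup_irq().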
Commit-ID: 43564bd97d0e6182bbd43b51b33254c728832551 Gitweb: http://git.kernel.org/tip/43564bd97d0e6182bbd43b51b33254c728832551 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:22 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:14 +0200 genirq: Rename setup_affinity() to irq_setup_affinity() Rename it with a proper irq_ prefix and make it available for other files in the core code. Preparatory patch for moving the irq affinity setup around. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235444.928501004@linutronix.de --- kernel/irq/internals.h | 6 ++++++ kernel/irq/manage.c | 7 +------ 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 33ca838..2d7927d 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -116,6 +116,12 @@ extern void irq_set_thread_affinity(struct irq_desc *desc); extern int irq_do_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force); +#ifdef CONFIG_SMP +extern int irq_setup_affinity(struct irq_desc *desc); +#else +static inline int irq_setup_affinity(struct irq_desc *desc) { return 0; } +#endif + /* Inline functions for support of irq chips on slow busses */ static inline void chip_bus_lock(struct irq_desc *desc) { diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index e2f20d5..907fb79 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -345,7 +345,7 @@ EXPORT_SYMBOL_GPL(irq_set_affinity_notifier); /* * Generic version of the affinity autoselector. */ -static int irq_setup_affinity(struct irq_desc *desc) +int irq_setup_affinity(struct irq_desc *desc) { struct cpumask *set = irq_default_affinity; int ret, node = irq_desc_get_node(desc); @@ -404,11 +404,6 @@ int irq_select_affinity_usr(unsigned int irq) raw_spin_unlock_irqrestore(&desc->lock, flags); return ret; } -#else -static inline int setup_affinity(struct irq_desc *desc) -{ - return 0; -} #endif /**
Commit-ID: 2e051552df69af6d134c2592d0d6f1ac80f01190 Gitweb: http://git.kernel.org/tip/2e051552df69af6d134c2592d0d6f1ac80f01190 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:23 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:15 +0200 genirq: Move initial affinity setup to irq_startup() The startup vs. setaffinity ordering of interrupts depends on the IRQF_NOAUTOEN flag. Chained interrupts are not getting any affinity assignment at all. A regular interrupt is started up and then the affinity is set. A IRQF_NOAUTOEN marked interrupt is not started up, but the affinity is set nevertheless. Move the affinity setup to startup_irq() so the ordering is always the same and chained interrupts get the proper default affinity assigned as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.020534783@linutronix.de --- kernel/irq/chip.c | 2 ++ kernel/irq/manage.c | 15 ++++++--------- 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index bc1331f..e290d73 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -213,6 +213,8 @@ int irq_startup(struct irq_desc *desc, bool resend) irq_enable(desc); } irq_state_set_started(desc); + /* Set default affinity mask once everything is setup */ + irq_setup_affinity(desc); } if (resend) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 907fb79..1e28307 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1327,6 +1327,12 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) if (new->flags & IRQF_ONESHOT) desc->istate |= IRQS_ONESHOT; + /* Exclude IRQ from balancing if requested */ + if (new->flags & IRQF_NOBALANCING) { + irq_settings_set_no_balancing(desc); + irqd_set(&desc->irq_data, IRQD_NO_BALANCING); + } + if (irq_settings_can_autoenable(desc)) { irq_startup(desc, true); } else { @@ -1341,15 +1347,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) desc->depth = 1; } - /* Exclude IRQ from balancing if requested */ - if (new->flags & IRQF_NOBALANCING) { - irq_settings_set_no_balancing(desc); - irqd_set(&desc->irq_data, IRQD_NO_BALANCING); - } - - /* Set default affinity mask once everything is setup */ - irq_setup_affinity(desc); - } else if (new->flags & IRQF_TRIGGER_MASK) { unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK; unsigned int omsk = irqd_get_trigger_type(&desc->irq_data);
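To illustrate the ordering this establishes, here is a hypothetical driver using IRQF_NOAUTOEN; mydev_handler and the setup function are made up for the example and are not part of the patch:

static irqreturn_t mydev_handler(int irq, void *dev_id)
{
	return IRQ_HANDLED;
}

static int mydev_setup_irq(unsigned int irq, void *dev)
{
	int ret;

	/* Not started at request time, so no affinity is applied here */
	ret = request_irq(irq, mydev_handler, IRQF_NOAUTOEN, "mydev", dev);
	if (ret)
		return ret;

	/* ... finish device initialization ... */

	/* irq_startup() runs here and now also applies the default affinity */
	enable_irq(irq);
	return 0;
}

With the affinity setup done in irq_startup(), the auto-enabled case, the IRQF_NOAUTOEN case and chained interrupts all get their default affinity at the same point.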
Commit-ID: 137221df69c6f8a7002f82dc3d95052d34f5667e Gitweb: http://git.kernel.org/tip/137221df69c6f8a7002f82dc3d95052d34f5667e Author: Christoph Hellwig <hch@lst.de> AuthorDate: Tue, 20 Jun 2017 01:37:24 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:15 +0200 genirq: Move pending helpers to internal.h So that the affinity code can reuse them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170619235445.109426284@linutronix.de --- kernel/irq/internals.h | 38 ++++++++++++++++++++++++++++++++++++++ kernel/irq/manage.c | 28 ---------------------------- 2 files changed, 38 insertions(+), 28 deletions(-) diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 2d7927d..20b197f 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -249,6 +249,44 @@ irq_init_generic_chip(struct irq_chip_generic *gc, const char *name, void __iomem *reg_base, irq_flow_handler_t handler) { } #endif /* CONFIG_GENERIC_IRQ_CHIP */ +#ifdef CONFIG_GENERIC_PENDING_IRQ +static inline bool irq_can_move_pcntxt(struct irq_data *data) +{ + return irqd_can_move_in_process_context(data); +} +static inline bool irq_move_pending(struct irq_data *data) +{ + return irqd_is_setaffinity_pending(data); +} +static inline void +irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) +{ + cpumask_copy(desc->pending_mask, mask); +} +static inline void +irq_get_pending(struct cpumask *mask, struct irq_desc *desc) +{ + cpumask_copy(mask, desc->pending_mask); +} +#else /* CONFIG_GENERIC_PENDING_IRQ */ +static inline bool irq_can_move_pcntxt(struct irq_data *data) +{ + return true; +} +static inline bool irq_move_pending(struct irq_data *data) +{ + return false; +} +static inline void +irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) +{ +} +static inline void +irq_get_pending(struct cpumask *mask, struct irq_desc *desc) +{ +} +#endif /* CONFIG_GENERIC_PENDING_IRQ */ + #ifdef CONFIG_GENERIC_IRQ_DEBUGFS void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc); void irq_remove_debugfs_entry(struct irq_desc *desc); diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 1e28307..7dcf193 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -168,34 +168,6 @@ void irq_set_thread_affinity(struct irq_desc *desc) set_bit(IRQTF_AFFINITY, &action->thread_flags); } -#ifdef CONFIG_GENERIC_PENDING_IRQ -static inline bool irq_can_move_pcntxt(struct irq_data *data) -{ - return irqd_can_move_in_process_context(data); -} -static inline bool irq_move_pending(struct irq_data *data) -{ - return irqd_is_setaffinity_pending(data); -} -static inline void -irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) -{ - cpumask_copy(desc->pending_mask, mask); -} -static inline void -irq_get_pending(struct cpumask *mask, struct irq_desc *desc) -{ - cpumask_copy(mask, desc->pending_mask); -} -#else -static inline bool irq_can_move_pcntxt(struct irq_data *data) { return true; } -static inline bool irq_move_pending(struct irq_data *data) { return false; } -static inline void -irq_copy_pending(struct irq_desc *desc, const struct cpumask *mask) { } -static inline void -irq_get_pending(struct cpumask *mask, struct irq_desc *desc) { } -#endif - int irq_do_set_affinity(struct 
irq_data *data, const struct cpumask *mask, bool force) {
Commit-ID: 0dd945ff4647a1f29c6ae8f4f9a69c8f37c994cf Gitweb: http://git.kernel.org/tip/0dd945ff4647a1f29c6ae8f4f9a69c8f37c994cf Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:25 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:16 +0200 genirq/cpuhotplug: Remove irq disabling logic This is called from stop_machine() with interrupts disabled. No point in disabling them some more. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.198042748@linutronix.de --- kernel/irq/cpuhotplug.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 011f8c4..7051398 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -59,11 +59,8 @@ static bool migrate_one_irq(struct irq_desc *desc) */ void irq_migrate_all_off_this_cpu(void) { - unsigned int irq; struct irq_desc *desc; - unsigned long flags; - - local_irq_save(flags); + unsigned int irq; for_each_active_irq(irq) { bool affinity_broken; @@ -73,10 +70,9 @@ void irq_migrate_all_off_this_cpu(void) affinity_broken = migrate_one_irq(desc); raw_spin_unlock(&desc->lock); - if (affinity_broken) - pr_warn_ratelimited("IRQ%u no longer affine to CPU%u\n", + if (affinity_broken) { + pr_warn_ratelimited("IRQ %u: no longer affine to CPU%u\n", irq, smp_processor_id()); + } } - - local_irq_restore(flags); }
Commit-ID: 735c09524d3e7c92315e8e2699a1b9acb4fb415c Gitweb: http://git.kernel.org/tip/735c09524d3e7c92315e8e2699a1b9acb4fb415c Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:26 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:16 +0200 genirq/cpuhotplug: Dont claim success on error In case the affinity of an interrupt was broken, a printk is emitted. But if the affinity cannot be set at all due to a missing irq_set_affinity() callback or due to a failing callback, the message is still printed preceded by a warning/error. That makes no sense whatsoever. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.274852976@linutronix.de --- kernel/irq/cpuhotplug.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 7051398..9c5521b 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -37,11 +37,14 @@ static bool migrate_one_irq(struct irq_desc *desc) c = irq_data_get_irq_chip(d); if (!c->irq_set_affinity) { pr_debug("IRQ%u: unable to set affinity\n", d->irq); + ret = false; } else { int r = irq_do_set_affinity(d, affinity, false); - if (r) + if (r) { pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", d->irq, r); + ret = false; + } } return ret;
Commit-ID: e8a7035039306c90bcc99129ffc18e0be052bbb9 Gitweb: http://git.kernel.org/tip/e8a7035039306c90bcc99129ffc18e0be052bbb9 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:27 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:16 +0200 genirq/cpuhotplug: Reorder check logic Move the checks for a valid irq chip and the irq_set_affinity() callback right in front of the whole migration logic. No point in doing a gazillion of other things when the interrupt cannot be migrated at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.354181630@linutronix.de --- kernel/irq/cpuhotplug.c | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 9c5521b..41fe1e0 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -17,9 +17,20 @@ static bool migrate_one_irq(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); + struct irq_chip *chip = irq_data_get_irq_chip(d); const struct cpumask *affinity = d->common->affinity; - struct irq_chip *c; - bool ret = false; + bool brokeaff = false; + int err; + + /* + * IRQ chip might be already torn down, but the irq descriptor is + * still in the radix tree. Also if the chip has no affinity setter, + * nothing can be done here. + */ + if (!chip || !chip->irq_set_affinity) { + pr_debug("IRQ %u: Unable to migrate away\n", d->irq); + return false; + } /* * If this is a per-CPU interrupt, or the affinity does not @@ -31,23 +42,16 @@ static bool migrate_one_irq(struct irq_desc *desc) if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { affinity = cpu_online_mask; - ret = true; + brokeaff = true; } - c = irq_data_get_irq_chip(d); - if (!c->irq_set_affinity) { - pr_debug("IRQ%u: unable to set affinity\n", d->irq); - ret = false; - } else { - int r = irq_do_set_affinity(d, affinity, false); - if (r) { - pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", - d->irq, r); - ret = false; - } + err = irq_do_set_affinity(d, affinity, false); + if (err) { + pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", + d->irq, err); + return false; } - - return ret; + return brokeaff; } /**
Commit-ID: 91f26cb4cd3c22bd656ab46c49329aacaaab5504 Gitweb: http://git.kernel.org/tip/91f26cb4cd3c22bd656ab46c49329aacaaab5504 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:28 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:17 +0200 genirq/cpuhotplug: Do not migrated shutdown irqs Interrupts, which are shut down are tried to be migrated as well. That's pointless because the interrupt cannot fire and the next startup will move it to the proper place anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.447550992@linutronix.de --- kernel/irq/cpuhotplug.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 41fe1e0..09b20e1 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -33,10 +33,15 @@ static bool migrate_one_irq(struct irq_desc *desc) } /* - * If this is a per-CPU interrupt, or the affinity does not - * include this CPU, then we have nothing to do. + * No move required, if: + * - Interrupt is per cpu + * - Interrupt is not started + * - Affinity mask does not include this CPU. + * + * Note: Do not check desc->action as this might be a chained + * interrupt. */ - if (irqd_is_per_cpu(d) || + if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !cpumask_test_cpu(smp_processor_id(), affinity)) return false;
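For readers wondering why desc->action cannot serve as the "is this interrupt in use" test: a chained (demultiplexing) parent interrupt has a flow handler installed but no irqaction, so desc->action stays NULL even though the interrupt is started and still has to be moved off a dying CPU. A minimal, purely hypothetical sketch of such a setup (the my_ names are invented for illustration):

#include <linux/irq.h>
#include <linux/irqdesc.h>
#include <linux/irqchip/chained_irq.h>

static void my_gpio_bank_demux(struct irq_desc *desc)
{
        struct irq_chip *chip = irq_desc_get_chip(desc);

        chained_irq_enter(chip, desc);
        /* read the bank's pending register and generic_handle_irq() each child */
        chained_irq_exit(chip, desc);
}

static void my_gpio_wire_up_parent(unsigned int parent_irq, void *bank)
{
        /*
         * Installs a flow handler without an irqaction: desc->action stays
         * NULL for parent_irq even though the interrupt gets started and
         * must be migrated on CPU offline like any other interrupt.
         */
        irq_set_chained_handler_and_data(parent_irq, my_gpio_bank_demux, bank);
}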
Commit-ID: f0383c24b4855f6a4b5a358c7b2d2c16e0437e9b Gitweb: http://git.kernel.org/tip/f0383c24b4855f6a4b5a358c7b2d2c16e0437e9b Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:29 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:17 +0200 genirq/cpuhotplug: Add support for cleaning up move in progress In order to move x86 to the generic hotplug migration code, add support for cleaning up move in progress bits. On architectures which have this x86 specific (mis)feature not enabled, this is optimized out by the compiler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.525817311@linutronix.de --- arch/x86/include/asm/irq.h | 1 - include/linux/irq.h | 2 ++ kernel/irq/cpuhotplug.c | 28 ++++++++++++++++++++++++++-- kernel/irq/internals.h | 10 +++++++++- 4 files changed, 37 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h index 16d3fa2..668cca5 100644 --- a/arch/x86/include/asm/irq.h +++ b/arch/x86/include/asm/irq.h @@ -29,7 +29,6 @@ struct irq_desc; #include <linux/cpumask.h> extern int check_irq_vectors_for_cpu_disable(void); extern void fixup_irqs(void); -extern void irq_force_complete_move(struct irq_desc *desc); #endif #ifdef CONFIG_HAVE_KVM diff --git a/include/linux/irq.h b/include/linux/irq.h index d008065..299271a 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -491,10 +491,12 @@ extern void irq_migrate_all_off_this_cpu(void); #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ) void irq_move_irq(struct irq_data *data); void irq_move_masked_irq(struct irq_data *data); +void irq_force_complete_move(struct irq_desc *desc); bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear); #else static inline void irq_move_irq(struct irq_data *data) { } static inline void irq_move_masked_irq(struct irq_data *data) { } +static inline void irq_force_complete_move(struct irq_desc *desc) { } static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear) { return false; diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 09b20e1..4be4bd6 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -18,7 +18,7 @@ static bool migrate_one_irq(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); struct irq_chip *chip = irq_data_get_irq_chip(d); - const struct cpumask *affinity = d->common->affinity; + const struct cpumask *affinity; bool brokeaff = false; int err; @@ -41,9 +41,33 @@ static bool migrate_one_irq(struct irq_desc *desc) * Note: Do not check desc->action as this might be a chained * interrupt. */ + affinity = irq_data_get_affinity_mask(d); if (irqd_is_per_cpu(d) || !irqd_is_started(d) || - !cpumask_test_cpu(smp_processor_id(), affinity)) + !cpumask_test_cpu(smp_processor_id(), affinity)) { + /* + * If an irq move is pending, abort it if the dying CPU is + * the sole target. + */ + irq_fixup_move_pending(desc, false); return false; + } + + /* + * Complete an eventually pending irq move cleanup. If this + * interrupt was moved in hard irq context, then the vectors need + * to be cleaned up. It can't wait until this interrupt actually + * happens and this CPU was involved. 
+ */ + irq_force_complete_move(desc); + + /* + * If there is a setaffinity pending, then try to reuse the pending + * mask, so the last change of the affinity does not get lost. If + * there is no move pending or the pending mask does not contain + * any online CPU, use the current affinity mask. + */ + if (irq_fixup_move_pending(desc, true)) + affinity = irq_desc_get_pending_mask(desc); if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { affinity = cpu_online_mask; diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 20b197f..fd4fa83 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -268,6 +268,10 @@ irq_get_pending(struct cpumask *mask, struct irq_desc *desc) { cpumask_copy(mask, desc->pending_mask); } +static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc) +{ + return desc->pending_mask; +} #else /* CONFIG_GENERIC_PENDING_IRQ */ static inline bool irq_can_move_pcntxt(struct irq_data *data) { @@ -285,7 +289,11 @@ static inline void irq_get_pending(struct cpumask *mask, struct irq_desc *desc) { } -#endif /* CONFIG_GENERIC_PENDING_IRQ */ +static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc) +{ + return NULL; +} +#endif /* !CONFIG_GENERIC_PENDING_IRQ */ #ifdef CONFIG_GENERIC_IRQ_DEBUGFS void irq_add_debugfs_entry(unsigned int irq, struct irq_desc *desc);
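The two irq_fixup_move_pending() calls above (force_clear false versus true) rely on a helper which the series adds to kernel/irq/migration.c (visible in the diffstat of the cover letter, not quoted in this digest). A rough sketch of its intended semantics, reconstructed here for orientation only:

bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear)
{
        struct irq_data *data = irq_desc_get_irq_data(desc);

        if (!irqd_is_setaffinity_pending(data))
                return false;

        /*
         * The outgoing CPU might be the last online target of a pending
         * move. In that case the pending state is stale: clear it and
         * tell the caller there is nothing to reuse.
         */
        if (cpumask_any_and(desc->pending_mask, cpu_online_mask) >= nr_cpu_ids) {
                irqd_clr_move_pending(data);
                return false;
        }
        if (force_clear)
                irqd_clr_move_pending(data);

        return true;
}

So the first call (force_clear == false) merely drops a move whose sole target is going away, while the second (force_clear == true) additionally consumes the pending state so the pending mask can be used as the migration target.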
Commit-ID: 47a06d3a783217acae02976f15ca07ddc1ac024f Gitweb: http://git.kernel.org/tip/47a06d3a783217acae02976f15ca07ddc1ac024f Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:30 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:17 +0200 genirq/cpuhotplug: Add support for conditional masking Interrupts which cannot be migrated in process context, need to be masked before the affinity is changed forcefully. Add support for that. Will be compiled out for architectures which do not have this x86 specific issue. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.604565591@linutronix.de --- kernel/irq/cpuhotplug.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 4be4bd6..6f46587 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -18,6 +18,7 @@ static bool migrate_one_irq(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); struct irq_chip *chip = irq_data_get_irq_chip(d); + bool maskchip = !irq_can_move_pcntxt(d) && !irqd_irq_masked(d); const struct cpumask *affinity; bool brokeaff = false; int err; @@ -69,6 +70,10 @@ static bool migrate_one_irq(struct irq_desc *desc) if (irq_fixup_move_pending(desc, true)) affinity = irq_desc_get_pending_mask(desc); + /* Mask the chip for interrupts which cannot move in process context */ + if (maskchip && chip->irq_mask) + chip->irq_mask(d); + if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { affinity = cpu_online_mask; brokeaff = true; @@ -78,8 +83,12 @@ static bool migrate_one_irq(struct irq_desc *desc) if (err) { pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", d->irq, err); - return false; + brokeaff = false; } + + if (maskchip && chip->irq_unmask) + chip->irq_unmask(d); + return brokeaff; }
Commit-ID: 77f85e66aa8be563ae5804eebf74a78ec6ef5555 Gitweb: http://git.kernel.org/tip/77f85e66aa8be563ae5804eebf74a78ec6ef5555 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:31 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:18 +0200 genirq/cpuhotplug: Set force affinity flag on hotplug migration Set the force migration flag when migrating interrupts away from an outgoing CPU. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.681874648@linutronix.de --- kernel/irq/cpuhotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 6f46587..e09cb91 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -79,7 +79,7 @@ static bool migrate_one_irq(struct irq_desc *desc) brokeaff = true; } - err = irq_do_set_affinity(d, affinity, false); + err = irq_do_set_affinity(d, affinity, true); if (err) { pr_warn_ratelimited("IRQ%u: set affinity failed(%d).\n", d->irq, err);
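For reference, the flag changed here is simply passed through to the irq chip. irq_do_set_affinity() in kernel/irq/manage.c looks roughly like this (abridged sketch for orientation):

int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
                        bool force)
{
        struct irq_desc *desc = irq_data_to_desc(data);
        struct irq_chip *chip = irq_data_get_irq_chip(data);
        int ret;

        /* The core does not interpret @force, the chip callback does */
        ret = chip->irq_set_affinity(data, mask, force);
        switch (ret) {
        case IRQ_SET_MASK_OK:
        case IRQ_SET_MASK_OK_DONE:
                cpumask_copy(desc->irq_common_data.affinity, mask);
                /* fall through */
        case IRQ_SET_MASK_OK_NOCOPY:
                irq_set_thread_affinity(desc);
                ret = 0;
        }
        return ret;
}

Passing true lets a chip implementation distinguish a forced hotplug move from a regular affinity request, and it may use that to accept target selections it would otherwise decline.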
Commit-ID: 654abd0a7baf144998147787121da0f9422dafc8 Gitweb: http://git.kernel.org/tip/654abd0a7baf144998147787121da0f9422dafc8 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:32 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:18 +0200 x86/irq: Restructure fixup_irqs() Reorder fixup_irqs() so it matches the flow in the generic migration code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.774272454@linutronix.de --- arch/x86/kernel/irq.c | 46 ++++++++++++++++++++-------------------------- 1 file changed, 20 insertions(+), 26 deletions(-) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 9696007d..78bd2b8 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -433,7 +433,6 @@ int check_irq_vectors_for_cpu_disable(void) void fixup_irqs(void) { unsigned int irq, vector; - static int warned; struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; @@ -441,18 +440,27 @@ void fixup_irqs(void) for_each_irq_desc(irq, desc) { const struct cpumask *affinity; - int break_affinity = 0; - int set_affinity = 1; + bool broke_affinity = false; if (!desc) continue; - if (irq == 2) - continue; /* interrupt's are disabled at this point */ raw_spin_lock(&desc->lock); data = irq_desc_get_irq_data(desc); + chip = irq_data_get_irq_chip(data); + /* + * The interrupt descriptor might have been cleaned up + * already, but it is not yet removed from the radix + * tree. If the chip does not have an affinity setter, + * nothing to do here. + */ + if (!chip || !chip->irq_set_affinity) { + raw_spin_unlock(&desc->lock); + continue; + } + affinity = irq_data_get_affinity_mask(data); if (!irq_has_action(irq) || irqd_is_per_cpu(data) || @@ -485,30 +493,18 @@ void fixup_irqs(void) * affinity and use cpu_online_mask as fall back. */ if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { - break_affinity = 1; + broke_affinity = true; affinity = cpu_online_mask; } - chip = irq_data_get_irq_chip(data); - /* - * The interrupt descriptor might have been cleaned up - * already, but it is not yet removed from the radix tree - */ - if (!chip) { - raw_spin_unlock(&desc->lock); - continue; - } - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) chip->irq_mask(data); - if (chip->irq_set_affinity) { - ret = chip->irq_set_affinity(data, affinity, true); - if (ret == -ENOSPC) - pr_crit("IRQ %d set affinity failed because there are no available vectors. The device assigned to this IRQ is unstable.\n", irq); - } else { - if (!(warned++)) - set_affinity = 0; + ret = chip->irq_set_affinity(data, affinity, true); + if (ret) { + pr_crit("IRQ %u: Force affinity failed (%d)\n", + irq, ret); + broke_affinity = false; } /* @@ -522,10 +518,8 @@ void fixup_irqs(void) raw_spin_unlock(&desc->lock); - if (break_affinity && set_affinity) + if (broke_affinity) pr_notice("Broke affinity for irq %i\n", irq); - else if (!set_affinity) - pr_notice("Cannot set affinity for irq %i\n", irq); } /*
Commit-ID: ad7a929fa4bb1143357aa83043a149d5c27c68fd Gitweb: http://git.kernel.org/tip/ad7a929fa4bb1143357aa83043a149d5c27c68fd Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:33 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:18 +0200 x86/irq: Use irq_migrate_all_off_this_cpu() The generic migration code supports all the required features already. Remove the x86 specific implementation and use the generic one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.851311033@linutronix.de --- arch/x86/Kconfig | 1 + arch/x86/kernel/irq.c | 89 ++------------------------------------------------- 2 files changed, 3 insertions(+), 87 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 0efb4c9..fcf1dad 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -87,6 +87,7 @@ config X86 select GENERIC_EARLY_IOREMAP select GENERIC_FIND_FIRST_BIT select GENERIC_IOMAP + select GENERIC_IRQ_MIGRATION if SMP select GENERIC_IRQ_PROBE select GENERIC_IRQ_SHOW select GENERIC_PENDING_IRQ if SMP diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 78bd2b8..4aa03c5 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -432,95 +432,12 @@ int check_irq_vectors_for_cpu_disable(void) /* A cpu has been removed from cpu_online_mask. Reset irq affinities. */ void fixup_irqs(void) { - unsigned int irq, vector; + unsigned int irr, vector; struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; - int ret; - for_each_irq_desc(irq, desc) { - const struct cpumask *affinity; - bool broke_affinity = false; - - if (!desc) - continue; - - /* interrupt's are disabled at this point */ - raw_spin_lock(&desc->lock); - - data = irq_desc_get_irq_data(desc); - chip = irq_data_get_irq_chip(data); - /* - * The interrupt descriptor might have been cleaned up - * already, but it is not yet removed from the radix - * tree. If the chip does not have an affinity setter, - * nothing to do here. - */ - if (!chip || !chip->irq_set_affinity) { - raw_spin_unlock(&desc->lock); - continue; - } - - affinity = irq_data_get_affinity_mask(data); - - if (!irq_has_action(irq) || irqd_is_per_cpu(data) || - cpumask_subset(affinity, cpu_online_mask)) { - irq_fixup_move_pending(desc, false); - raw_spin_unlock(&desc->lock); - continue; - } - - /* - * Complete an eventually pending irq move cleanup. If this - * interrupt was moved in hard irq context, then the - * vectors need to be cleaned up. It can't wait until this - * interrupt actually happens and this CPU was involved. - */ - irq_force_complete_move(desc); - - /* - * If there is a setaffinity pending, then try to reuse the - * pending mask, so the last change of the affinity does - * not get lost. If there is no move pending or the pending - * mask does not contain any online CPU, use the current - * affinity mask. - */ - if (irq_fixup_move_pending(desc, true)) - affinity = desc->pending_mask; - - /* - * If the mask does not contain an offline CPU, break - * affinity and use cpu_online_mask as fall back.
- */ - if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { - broke_affinity = true; - affinity = cpu_online_mask; - } - - if (!irqd_can_move_in_process_context(data) && chip->irq_mask) - chip->irq_mask(data); - - ret = chip->irq_set_affinity(data, affinity, true); - if (ret) { - pr_crit("IRQ %u: Force affinity failed (%d)\n", - irq, ret); - broke_affinity = false; - } - - /* - * We unmask if the irq was not marked masked by the - * core code. That respects the lazy irq disable - * behaviour. - */ - if (!irqd_can_move_in_process_context(data) && - !irqd_irq_masked(data) && chip->irq_unmask) - chip->irq_unmask(data); - - raw_spin_unlock(&desc->lock); - - if (broke_affinity) - pr_notice("Broke affinity for irq %i\n", irq); - } + irq_migrate_all_off_this_cpu(); /* * We can remove mdelay() and then send spuriuous interrupts to @@ -539,8 +456,6 @@ void fixup_irqs(void) * nothing else will touch it. */ for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { - unsigned int irr; - if (IS_ERR_OR_NULL(__this_cpu_read(vector_irq[vector]))) continue;
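Since the removal is spread over two hunks above, here is the rough shape of fixup_irqs() after this patch, as a condensed sketch (the trailing IRR check and retrigger pass is the pre-existing code the patch keeps):

void fixup_irqs(void)
{
        unsigned int irr, vector;
        struct irq_desc *desc;
        struct irq_data *data;
        struct irq_chip *chip;

        irq_migrate_all_off_this_cpu();

        mdelay(1);

        for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
                if (IS_ERR_OR_NULL(__this_cpu_read(vector_irq[vector])))
                        continue;

                irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
                if (!(irr & (1 << (vector % 32))))
                        continue;

                desc = __this_cpu_read(vector_irq[vector]);
                raw_spin_lock(&desc->lock);
                data = irq_desc_get_irq_data(desc);
                chip = irq_data_get_irq_chip(data);
                if (chip->irq_retrigger) {
                        chip->irq_retrigger(data);
                        __this_cpu_write(vector_irq[vector], VECTOR_RETRIGGERED);
                }
                raw_spin_unlock(&desc->lock);
        }
}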
Commit-ID: 36d84fb45140f151fa4e145381dbce5e5ffed24d Gitweb: http://git.kernel.org/tip/36d84fb45140f151fa4e145381dbce5e5ffed24d Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:34 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:19 +0200 genirq: Move irq_fixup_move_pending() to core Now that x86 uses the generic code, the function declaration and inline stub can move to the core internal header. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235445.928156166@linutronix.de --- include/linux/irq.h | 5 ----- kernel/irq/internals.h | 5 +++++ 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/include/linux/irq.h b/include/linux/irq.h index 299271a..2b7e5a7 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -492,15 +492,10 @@ extern void irq_migrate_all_off_this_cpu(void); void irq_move_irq(struct irq_data *data); void irq_move_masked_irq(struct irq_data *data); void irq_force_complete_move(struct irq_desc *desc); -bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear); #else static inline void irq_move_irq(struct irq_data *data) { } static inline void irq_move_masked_irq(struct irq_data *data) { } static inline void irq_force_complete_move(struct irq_desc *desc) { } -static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear) -{ - return false; -} #endif extern int no_irq_affinity; diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index fd4fa83..040806f 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -272,6 +272,7 @@ static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc) { return desc->pending_mask; } +bool irq_fixup_move_pending(struct irq_desc *desc, bool force_clear); #else /* CONFIG_GENERIC_PENDING_IRQ */ static inline bool irq_can_move_pcntxt(struct irq_data *data) { @@ -293,6 +294,10 @@ static inline struct cpumask *irq_desc_get_pending_mask(struct irq_desc *desc) { return NULL; } +static inline bool irq_fixup_move_pending(struct irq_desc *desc, bool fclear) +{ + return false; +} #endif /* !CONFIG_GENERIC_PENDING_IRQ */ #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
Commit-ID: 047dc6331de58da51818582c0db0dbfcb837e614 Gitweb: http://git.kernel.org/tip/047dc6331de58da51818582c0db0dbfcb837e614 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:35 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:19 +0200 genirq: Remove pointless arg from show_irq_affinity The third argument of the internal helper function is unused. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.004958600@linutronix.de --- kernel/irq/proc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c index d35bb8d..eff7c0c 100644 --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -37,7 +37,7 @@ static struct proc_dir_entry *root_irq_dir; #ifdef CONFIG_SMP -static int show_irq_affinity(int type, struct seq_file *m, void *v) +static int show_irq_affinity(int type, struct seq_file *m) { struct irq_desc *desc = irq_to_desc((long)m->private); const struct cpumask *mask = desc->irq_common_data.affinity; @@ -80,12 +80,12 @@ static int irq_affinity_hint_proc_show(struct seq_file *m, void *v) int no_irq_affinity; static int irq_affinity_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(0, m, v); + return show_irq_affinity(0, m); } static int irq_affinity_list_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(1, m, v); + return show_irq_affinity(1, m); }
Commit-ID: 4ab764c336123157690ee0000a1dcf81851c58d1 Gitweb: http://git.kernel.org/tip/4ab764c336123157690ee0000a1dcf81851c58d1 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:36 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:19 +0200 genirq: Remove pointless gfp argument All callers hand in GPF_KERNEL. No point to have an extra argument for that. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.082544752@linutronix.de --- kernel/irq/irqdesc.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c index feade53..48d4f03 100644 --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -54,14 +54,14 @@ static void __init init_irq_default_affinity(void) #endif #ifdef CONFIG_SMP -static int alloc_masks(struct irq_desc *desc, gfp_t gfp, int node) +static int alloc_masks(struct irq_desc *desc, int node) { if (!zalloc_cpumask_var_node(&desc->irq_common_data.affinity, - gfp, node)) + GFP_KERNEL, node)) return -ENOMEM; #ifdef CONFIG_GENERIC_PENDING_IRQ - if (!zalloc_cpumask_var_node(&desc->pending_mask, gfp, node)) { + if (!zalloc_cpumask_var_node(&desc->pending_mask, GFP_KERNEL, node)) { free_cpumask_var(desc->irq_common_data.affinity); return -ENOMEM; } @@ -86,7 +86,7 @@ static void desc_smp_init(struct irq_desc *desc, int node, #else static inline int -alloc_masks(struct irq_desc *desc, gfp_t gfp, int node) { return 0; } +alloc_masks(struct irq_desc *desc, int node) { return 0; } static inline void desc_smp_init(struct irq_desc *desc, int node, const struct cpumask *affinity) { } #endif @@ -344,9 +344,8 @@ static struct irq_desc *alloc_desc(int irq, int node, unsigned int flags, struct module *owner) { struct irq_desc *desc; - gfp_t gfp = GFP_KERNEL; - desc = kzalloc_node(sizeof(*desc), gfp, node); + desc = kzalloc_node(sizeof(*desc), GFP_KERNEL, node); if (!desc) return NULL; /* allocate based on nr_cpu_ids */ @@ -354,7 +353,7 @@ static struct irq_desc *alloc_desc(int irq, int node, unsigned int flags, if (!desc->kstat_irqs) goto err_desc; - if (alloc_masks(desc, gfp, node)) + if (alloc_masks(desc, node)) goto err_kstat; raw_spin_lock_init(&desc->lock); @@ -525,7 +524,7 @@ int __init early_irq_init(void) for (i = 0; i < count; i++) { desc[i].kstat_irqs = alloc_percpu(unsigned int); - alloc_masks(&desc[i], GFP_KERNEL, node); + alloc_masks(&desc[i], node); raw_spin_lock_init(&desc[i].lock); lockdep_set_class(&desc[i].lock, &irq_desc_lock_class); desc_set_defaults(i, &desc[i], node, NULL, NULL);
Commit-ID: c1a80386965e9fa3c2f8d1d57966216fe02c9124 Gitweb: http://git.kernel.org/tip/c1a80386965e9fa3c2f8d1d57966216fe02c9124 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:37 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:20 +0200 genirq/proc: Replace ever repeating type cast The proc file setup repeats the same ugly type cast for the irq number over and over. Do it once and hand in the local void pointer. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.160866358@linutronix.de --- kernel/irq/proc.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c index eff7c0c..cbc4c5e 100644 --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -326,6 +326,7 @@ void register_handler_proc(unsigned int irq, struct irqaction *action) void register_irq_proc(unsigned int irq, struct irq_desc *desc) { static DEFINE_MUTEX(register_lock); + void __maybe_unused *irqp = (void *)(unsigned long) irq; char name [MAX_NAMELEN]; if (!root_irq_dir || (desc->irq_data.chip == &no_irq_chip)) @@ -351,20 +352,19 @@ void register_irq_proc(unsigned int irq, struct irq_desc *desc) #ifdef CONFIG_SMP /* create /proc/irq/<irq>/smp_affinity */ proc_create_data("smp_affinity", 0644, desc->dir, - &irq_affinity_proc_fops, (void *)(long)irq); + &irq_affinity_proc_fops, irqp); /* create /proc/irq/<irq>/affinity_hint */ proc_create_data("affinity_hint", 0444, desc->dir, - &irq_affinity_hint_proc_fops, (void *)(long)irq); + &irq_affinity_hint_proc_fops, irqp); /* create /proc/irq/<irq>/smp_affinity_list */ proc_create_data("smp_affinity_list", 0644, desc->dir, - &irq_affinity_list_proc_fops, (void *)(long)irq); + &irq_affinity_list_proc_fops, irqp); proc_create_data("node", 0444, desc->dir, - &irq_node_proc_fops, (void *)(long)irq); + &irq_node_proc_fops, irqp); #endif - proc_create_data("spurious", 0444, desc->dir, &irq_spurious_proc_fops, (void *)(long)irq);
Commit-ID: 0d3f54257dc300f2db480d6a46b34bdb87f18c1b Gitweb: http://git.kernel.org/tip/0d3f54257dc300f2db480d6a46b34bdb87f18c1b Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:38 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:20 +0200 genirq: Introduce effective affinity mask There is currently no way to evaluate the effective affinity mask of a given interrupt. Many irq chips allow only a single target CPU or a subset of CPUs in the affinity mask. Updating the mask at the time of setting the affinity to the subset would be counterproductive because information for cpu hotplug about assigned interrupt affinities gets lost. On CPU hotplug it's also pointless to force migrate an interrupt, which is not targeted at the CPU effectively. But currently the information is not available. Provide a separate mask to be updated by the irq_chip->irq_set_affinity() implementations. Implement the read only proc files so the user can see the effective mask as well w/o trying to deduce it from /proc/interrupts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.247834245@linutronix.de --- include/linux/irq.h | 29 +++++++++++++++++ kernel/irq/Kconfig | 4 +++ kernel/irq/debugfs.c | 4 +++ kernel/irq/irqdesc.c | 14 ++++++++ kernel/irq/proc.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++---- 5 files changed, 134 insertions(+), 7 deletions(-) diff --git a/include/linux/irq.h b/include/linux/irq.h index 2b7e5a7..4087ef2 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -137,6 +137,9 @@ struct irq_domain; * @affinity: IRQ affinity on SMP. If this is an IPI * related irq, then this is the mask of the * CPUs to which an IPI can be sent. + * @effective_affinity: The effective IRQ affinity on SMP as some irq + * chips do not allow multi CPU destinations. + * A subset of @affinity. * @msi_desc: MSI descriptor * @ipi_offset: Offset of first IPI target cpu in @affinity. Optional.
*/ @@ -148,6 +151,9 @@ struct irq_common_data { void *handler_data; struct msi_desc *msi_desc; cpumask_var_t affinity; +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + cpumask_var_t effective_affinity; +#endif #ifdef CONFIG_GENERIC_IRQ_IPI unsigned int ipi_offset; #endif @@ -737,6 +743,29 @@ static inline struct cpumask *irq_data_get_affinity_mask(struct irq_data *d) return d->common->affinity; } +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK +static inline +struct cpumask *irq_data_get_effective_affinity_mask(struct irq_data *d) +{ + return d->common->effective_affinity; +} +static inline void irq_data_update_effective_affinity(struct irq_data *d, + const struct cpumask *m) +{ + cpumask_copy(d->common->effective_affinity, m); +} +#else +static inline void irq_data_update_effective_affinity(struct irq_data *d, + const struct cpumask *m) +{ +} +static inline +struct cpumask *irq_data_get_effective_affinity_mask(struct irq_data *d) +{ + return d->common->affinity; +} +#endif + unsigned int arch_dynirq_lower_bound(unsigned int from); int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node, diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig index 8d9498e..fcbb1d6 100644 --- a/kernel/irq/Kconfig +++ b/kernel/irq/Kconfig @@ -21,6 +21,10 @@ config GENERIC_IRQ_SHOW config GENERIC_IRQ_SHOW_LEVEL bool +# Supports effective affinity mask +config GENERIC_IRQ_EFFECTIVE_AFF_MASK + bool + # Facility to allocate a hardware interrupt. This is legacy support # and should not be used in new code. Use irq domains instead. config GENERIC_IRQ_LEGACY_ALLOC_HWIRQ diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c index 50ee2f6..edbef25 100644 --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -36,6 +36,10 @@ static void irq_debug_show_masks(struct seq_file *m, struct irq_desc *desc) msk = irq_data_get_affinity_mask(data); seq_printf(m, "affinity: %*pbl\n", cpumask_pr_args(msk)); +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + msk = irq_data_get_effective_affinity_mask(data); + seq_printf(m, "effectiv: %*pbl\n", cpumask_pr_args(msk)); +#endif #ifdef CONFIG_GENERIC_PENDING_IRQ msk = desc->pending_mask; seq_printf(m, "pending: %*pbl\n", cpumask_pr_args(msk)); diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c index 48d4f03..35a95fa 100644 --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -60,8 +60,19 @@ static int alloc_masks(struct irq_desc *desc, int node) GFP_KERNEL, node)) return -ENOMEM; +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + if (!zalloc_cpumask_var_node(&desc->irq_common_data.effective_affinity, + GFP_KERNEL, node)) { + free_cpumask_var(desc->irq_common_data.affinity); + return -ENOMEM; + } +#endif + #ifdef CONFIG_GENERIC_PENDING_IRQ if (!zalloc_cpumask_var_node(&desc->pending_mask, GFP_KERNEL, node)) { +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + free_cpumask_var(desc->irq_common_data.effective_affinity); +#endif free_cpumask_var(desc->irq_common_data.affinity); return -ENOMEM; } @@ -324,6 +335,9 @@ static void free_masks(struct irq_desc *desc) free_cpumask_var(desc->pending_mask); #endif free_cpumask_var(desc->irq_common_data.affinity); +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + free_cpumask_var(desc->irq_common_data.effective_affinity); +#endif } #else static inline void free_masks(struct irq_desc *desc) { } diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c index cbc4c5e..7f9642a 100644 --- a/kernel/irq/proc.c +++ b/kernel/irq/proc.c @@ -37,19 +37,47 @@ static struct proc_dir_entry *root_irq_dir; #ifdef CONFIG_SMP +enum { + AFFINITY, + 
AFFINITY_LIST, + EFFECTIVE, + EFFECTIVE_LIST, +}; + static int show_irq_affinity(int type, struct seq_file *m) { struct irq_desc *desc = irq_to_desc((long)m->private); - const struct cpumask *mask = desc->irq_common_data.affinity; + const struct cpumask *mask; + switch (type) { + case AFFINITY: + case AFFINITY_LIST: + mask = desc->irq_common_data.affinity; #ifdef CONFIG_GENERIC_PENDING_IRQ - if (irqd_is_setaffinity_pending(&desc->irq_data)) - mask = desc->pending_mask; + if (irqd_is_setaffinity_pending(&desc->irq_data)) + mask = desc->pending_mask; #endif - if (type) + break; + case EFFECTIVE: + case EFFECTIVE_LIST: +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + mask = desc->irq_common_data.effective_affinity; + break; +#else + return -EINVAL; +#endif + }; + + switch (type) { + case AFFINITY_LIST: + case EFFECTIVE_LIST: seq_printf(m, "%*pbl\n", cpumask_pr_args(mask)); - else + break; + case AFFINITY: + case EFFECTIVE: seq_printf(m, "%*pb\n", cpumask_pr_args(mask)); + break; + } return 0; } @@ -80,12 +108,12 @@ static int irq_affinity_hint_proc_show(struct seq_file *m, void *v) int no_irq_affinity; static int irq_affinity_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(0, m); + return show_irq_affinity(AFFINITY, m); } static int irq_affinity_list_proc_show(struct seq_file *m, void *v) { - return show_irq_affinity(1, m); + return show_irq_affinity(AFFINITY_LIST, m); } @@ -185,6 +213,44 @@ static const struct file_operations irq_affinity_list_proc_fops = { .write = irq_affinity_list_proc_write, }; +#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK +static int irq_effective_aff_proc_show(struct seq_file *m, void *v) +{ + return show_irq_affinity(EFFECTIVE, m); +} + +static int irq_effective_aff_list_proc_show(struct seq_file *m, void *v) +{ + return show_irq_affinity(EFFECTIVE_LIST, m); +} + +static int irq_effective_aff_proc_open(struct inode *inode, struct file *file) +{ + return single_open(file, irq_effective_aff_proc_show, PDE_DATA(inode)); +} + +static int irq_effective_aff_list_proc_open(struct inode *inode, + struct file *file) +{ + return single_open(file, irq_effective_aff_list_proc_show, + PDE_DATA(inode)); +} + +static const struct file_operations irq_effective_aff_proc_fops = { + .open = irq_effective_aff_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; + +static const struct file_operations irq_effective_aff_list_proc_fops = { + .open = irq_effective_aff_list_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, +}; +#endif + static int default_affinity_show(struct seq_file *m, void *v) { seq_printf(m, "%*pb\n", cpumask_pr_args(irq_default_affinity)); @@ -364,6 +430,12 @@ void register_irq_proc(unsigned int irq, struct irq_desc *desc) proc_create_data("node", 0444, desc->dir, &irq_node_proc_fops, irqp); +# ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + proc_create_data("effective_affinity", 0444, desc->dir, + &irq_effective_aff_proc_fops, irqp); + proc_create_data("effective_affinity_list", 0444, desc->dir, + &irq_effective_aff_list_proc_fops, irqp); +# endif #endif proc_create_data("spurious", 0444, desc->dir, &irq_spurious_proc_fops, (void *)(long)irq); @@ -383,6 +455,10 @@ void unregister_irq_proc(unsigned int irq, struct irq_desc *desc) remove_proc_entry("affinity_hint", desc->dir); remove_proc_entry("smp_affinity_list", desc->dir); remove_proc_entry("node", desc->dir); +# ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK + remove_proc_entry("effective_affinity", desc->dir); + 
remove_proc_entry("effective_affinity_list", desc->dir); +# endif #endif remove_proc_entry("spurious", desc->dir);
Commit-ID: 415fcf1a2293046e0c1f4ab8558a87bad66652b1 Gitweb: http://git.kernel.org/tip/415fcf1a2293046e0c1f4ab8558a87bad66652b1 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:39 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:21 +0200 genirq/cpuhotplug: Use effective affinity mask If the architecture supports the effective affinity mask, migrating interrupts away which are not targeted by the effective mask is pointless. They can stay in the user or system supplied affinity mask, but won't be targeted at any given point as the affinity setter functions need to validate against the online cpu mask anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.328488490@linutronix.de --- kernel/irq/cpuhotplug.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index e09cb91..0b093db 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -14,6 +14,14 @@ #include "internals.h" +/* For !GENERIC_IRQ_EFFECTIVE_AFF_MASK this looks at general affinity mask */ +static inline bool irq_needs_fixup(struct irq_data *d) +{ + const struct cpumask *m = irq_data_get_effective_affinity_mask(d); + + return cpumask_test_cpu(smp_processor_id(), m); +} + static bool migrate_one_irq(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); @@ -42,9 +50,7 @@ static bool migrate_one_irq(struct irq_desc *desc) * Note: Do not check desc->action as this might be a chained * interrupt. */ - affinity = irq_data_get_affinity_mask(d); - if (irqd_is_per_cpu(d) || !irqd_is_started(d) || - !cpumask_test_cpu(smp_processor_id(), affinity)) { + if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !irq_needs_fixup(d)) { /* * If an irq move is pending, abort it if the dying CPU is * the sole target. @@ -69,6 +75,8 @@ static bool migrate_one_irq(struct irq_desc *desc) */ if (irq_fixup_move_pending(desc, true)) affinity = irq_desc_get_pending_mask(desc); + else + affinity = irq_data_get_affinity_mask(d); /* Mask the chip for interrupts which cannot move in process context */ if (maskchip && chip->irq_mask) chip->irq_mask(d);
Commit-ID: ad95212ee6e0b62f38b287b40c9ab6a1ba3e892b Gitweb: http://git.kernel.org/tip/ad95212ee6e0b62f38b287b40c9ab6a1ba3e892b Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:40 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:21 +0200 x86/apic: Move flat_cpu_mask_to_apicid_and() into C source No point in having inlines assigned to function pointers at multiple places. Just bloats the text. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.405975721@linutronix.de --- arch/x86/include/asm/apic.h | 28 ++++++---------------------- arch/x86/kernel/apic/apic.c | 16 ++++++++++++++++ 2 files changed, 22 insertions(+), 22 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index bdffcd9..a86be0a 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -540,28 +540,12 @@ static inline int default_phys_pkg_id(int cpuid_apic, int index_msb) #endif -static inline int -flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) -{ - unsigned long cpu_mask = cpumask_bits(cpumask)[0] & - cpumask_bits(andmask)[0] & - cpumask_bits(cpu_online_mask)[0] & - APIC_ALL_CPUS; - - if (likely(cpu_mask)) { - *apicid = (unsigned int)cpu_mask; - return 0; - } else { - return -EINVAL; - } -} - -extern int -default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); +extern int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, + const struct cpumask *andmask, + unsigned int *apicid); +extern int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, + const struct cpumask *andmask, + unsigned int *apicid); static inline void flat_vector_allocation_domain(int cpu, struct cpumask *retmask, diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 2d75faf..e9b322f 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2220,6 +2220,22 @@ int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, return -EINVAL; } +int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, + const struct cpumask *andmask, + unsigned int *apicid) +{ + unsigned long cpu_mask = cpumask_bits(cpumask)[0] & + cpumask_bits(andmask)[0] & + cpumask_bits(cpu_online_mask)[0] & + APIC_ALL_CPUS; + + if (likely(cpu_mask)) { + *apicid = (unsigned int)cpu_mask; + return 0; + } + return -EINVAL; +} + /* * Override the generic EOI implementation with an optimized version. * Only called during early boot when only one CPU is active and with
Commit-ID: bbcf9574bc6fb85d22f2718d48da7f98830a7870 Gitweb: http://git.kernel.org/tip/bbcf9574bc6fb85d22f2718d48da7f98830a7870 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:41 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:21 +0200 x86/uv: Use default_cpu_mask_to_apicid_and() Same functionality except the extra bits ored on the apicid. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.482841015@linutronix.de --- arch/x86/kernel/apic/x2apic_uv_x.c | 19 ++++--------------- 1 file changed, 4 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index b487b3a..fd5bb20 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -530,23 +530,12 @@ uv_cpu_mask_to_apicid_and(const struct cpumask *cpumask, const struct cpumask *andmask, unsigned int *apicid) { - int unsigned cpu; + int ret = default_cpu_mask_to_apicid_and(cpumask, andmask, apicid); - /* - * We're using fixed IRQ delivery, can only return one phys APIC ID. - * May as well be the first. - */ - for_each_cpu_and(cpu, cpumask, andmask) { - if (cpumask_test_cpu(cpu, cpu_online_mask)) - break; - } - - if (likely(cpu < nr_cpu_ids)) { - *apicid = per_cpu(x86_cpu_to_apicid, cpu) | uv_apicid_hibits; - return 0; - } + if (!ret) + *apicid |= uv_apicid_hibits; - return -EINVAL; + return ret; } static unsigned int x2apic_get_apic_id(unsigned long x)
Commit-ID: 52b166af40faec9813cd5ac26d6ba9adec2e3a9d Gitweb: http://git.kernel.org/tip/52b166af40faec9813cd5ac26d6ba9adec2e3a9d Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:42 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:21 +0200 x86/apic: Move online masking to core code All implementations of apic->cpu_mask_to_apicid_and() mask out the offline cpus. The callsite already has a mask available, which has the offline CPUs removed. Use that and remove the extra bits. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.560868224@linutronix.de --- arch/x86/kernel/apic/apic.c | 27 +++++++++------------------ arch/x86/kernel/apic/vector.c | 5 ++++- arch/x86/kernel/apic/x2apic_cluster.c | 25 +++++++++---------------- 3 files changed, 22 insertions(+), 35 deletions(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index e9b322f..8a0bde3 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2205,19 +2205,12 @@ int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, const struct cpumask *andmask, unsigned int *apicid) { - unsigned int cpu; + unsigned int cpu = cpumask_first_and(cpumask, andmask); - for_each_cpu_and(cpu, cpumask, andmask) { - if (cpumask_test_cpu(cpu, cpu_online_mask)) - break; - } - - if (likely(cpu < nr_cpu_ids)) { - *apicid = per_cpu(x86_cpu_to_apicid, cpu); - return 0; - } - - return -EINVAL; + if (cpu >= nr_cpu_ids) + return -EINVAL; + *apicid = per_cpu(x86_cpu_to_apicid, cpu); + return 0; } int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, @@ -2226,14 +2219,12 @@ int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, { unsigned long cpu_mask = cpumask_bits(cpumask)[0] & cpumask_bits(andmask)[0] & - cpumask_bits(cpu_online_mask)[0] & APIC_ALL_CPUS; - if (likely(cpu_mask)) { - *apicid = (unsigned int)cpu_mask; - return 0; - } - return -EINVAL; + if (!cpu_mask) + return -EINVAL; + *apicid = (unsigned int)cpu_mask; + return 0; } /* diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 47c5d01..0f94ddb 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -221,8 +221,11 @@ success: * Cache destination APIC IDs into cfg->dest_apicid. This cannot fail * as we already established, that mask & d->domain & cpu_online_mask * is not empty. + * + * vector_searchmask is a subset of d->domain and has the offline + * cpus masked out. 
*/ - BUG_ON(apic->cpu_mask_to_apicid_and(mask, d->domain, + BUG_ON(apic->cpu_mask_to_apicid_and(mask, vector_searchmask, &d->cfg.dest_apicid)); return 0; } diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index 5a35f20..d73baa8 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -108,31 +108,24 @@ x2apic_cpu_mask_to_apicid_and(const struct cpumask *cpumask, const struct cpumask *andmask, unsigned int *apicid) { + unsigned int cpu; u32 dest = 0; u16 cluster; - int i; - - for_each_cpu_and(i, cpumask, andmask) { - if (!cpumask_test_cpu(i, cpu_online_mask)) - continue; - dest = per_cpu(x86_cpu_to_logical_apicid, i); - cluster = x2apic_cluster(i); - break; - } - if (!dest) + cpu = cpumask_first_and(cpumask, andmask); + if (cpu >= nr_cpu_ids) return -EINVAL; - for_each_cpu_and(i, cpumask, andmask) { - if (!cpumask_test_cpu(i, cpu_online_mask)) - continue; - if (cluster != x2apic_cluster(i)) + dest = per_cpu(x86_cpu_to_logical_apicid, cpu); + cluster = x2apic_cluster(cpu); + + for_each_cpu_and(cpu, cpumask, andmask) { + if (cluster != x2apic_cluster(cpu)) continue; - dest |= per_cpu(x86_cpu_to_logical_apicid, i); + dest |= per_cpu(x86_cpu_to_logical_apicid, cpu); } *apicid = dest; - return 0; }
Commit-ID: 91cd9cb7ee1c081304d0e61f09e9faccb33d3df7 Gitweb: http://git.kernel.org/tip/91cd9cb7ee1c081304d0e61f09e9faccb33d3df7 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:43 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:22 +0200 x86/apic: Move cpumask and to core code All implementations of apic->cpu_mask_to_apicid_and() and the two incoming cpumasks to search for the target. Move that operation to the call site and rename it to cpu_mask_to_apicid() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.641575516@linutronix.de --- arch/x86/include/asm/apic.h | 15 ++++++--------- arch/x86/kernel/apic/apic.c | 14 ++++---------- arch/x86/kernel/apic/apic_flat_64.c | 4 ++-- arch/x86/kernel/apic/apic_noop.c | 2 +- arch/x86/kernel/apic/apic_numachip.c | 4 ++-- arch/x86/kernel/apic/bigsmp_32.c | 2 +- arch/x86/kernel/apic/probe_32.c | 2 +- arch/x86/kernel/apic/vector.c | 6 +++--- arch/x86/kernel/apic/x2apic_cluster.c | 10 ++++------ arch/x86/kernel/apic/x2apic_phys.c | 2 +- arch/x86/kernel/apic/x2apic_uv_x.c | 8 +++----- arch/x86/xen/apic.c | 2 +- 12 files changed, 29 insertions(+), 42 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index a86be0a..3e64e99 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -296,9 +296,8 @@ struct apic { /* Can't be NULL on 64-bit */ unsigned long (*set_apic_id)(unsigned int id); - int (*cpu_mask_to_apicid_and)(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); + int (*cpu_mask_to_apicid)(const struct cpumask *cpumask, + unsigned int *apicid); /* ipi */ void (*send_IPI)(int cpu, int vector); @@ -540,12 +539,10 @@ static inline int default_phys_pkg_id(int cpuid_apic, int index_msb) #endif -extern int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); -extern int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid); +extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask, + unsigned int *apicid); +extern int default_cpu_mask_to_apicid(const struct cpumask *cpumask, + unsigned int *apicid); static inline void flat_vector_allocation_domain(int cpu, struct cpumask *retmask, diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 8a0bde3..169dd42 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2201,11 +2201,9 @@ void default_init_apic_ldr(void) apic_write(APIC_LDR, val); } -int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +int default_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { - unsigned int cpu = cpumask_first_and(cpumask, andmask); + unsigned int cpu = cpumask_first(mask); if (cpu >= nr_cpu_ids) return -EINVAL; @@ -2213,13 +2211,9 @@ int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask, return 0; } -int flat_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +int flat_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { - unsigned long cpu_mask 
= cpumask_bits(cpumask)[0] & - cpumask_bits(andmask)[0] & - APIC_ALL_CPUS; + unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS; if (!cpu_mask) return -EINVAL; diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c index a4d7ff2..dedd5a4 100644 --- a/arch/x86/kernel/apic/apic_flat_64.c +++ b/arch/x86/kernel/apic/apic_flat_64.c @@ -172,7 +172,7 @@ static struct apic apic_flat __ro_after_init = { .get_apic_id = flat_get_apic_id, .set_apic_id = set_apic_id, - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single, .send_IPI_mask = flat_send_IPI_mask, @@ -268,7 +268,7 @@ static struct apic apic_physflat __ro_after_init = { .get_apic_id = flat_get_apic_id, .set_apic_id = set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single_phys, .send_IPI_mask = default_send_IPI_mask_sequence_phys, diff --git a/arch/x86/kernel/apic/apic_noop.c b/arch/x86/kernel/apic/apic_noop.c index 2262eb6..6599f43 100644 --- a/arch/x86/kernel/apic/apic_noop.c +++ b/arch/x86/kernel/apic/apic_noop.c @@ -141,7 +141,7 @@ struct apic apic_noop __ro_after_init = { .get_apic_id = noop_get_apic_id, .set_apic_id = NULL, - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, .send_IPI = noop_send_IPI, .send_IPI_mask = noop_send_IPI_mask, diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c index e08fe2c..2fda912 100644 --- a/arch/x86/kernel/apic/apic_numachip.c +++ b/arch/x86/kernel/apic/apic_numachip.c @@ -267,7 +267,7 @@ static const struct apic apic_numachip1 __refconst = { .get_apic_id = numachip1_get_apic_id, .set_apic_id = numachip1_set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = numachip_send_IPI_one, .send_IPI_mask = numachip_send_IPI_mask, @@ -318,7 +318,7 @@ static const struct apic apic_numachip2 __refconst = { .get_apic_id = numachip2_get_apic_id, .set_apic_id = numachip2_set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = numachip_send_IPI_one, .send_IPI_mask = numachip_send_IPI_mask, diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c index 5601201..456e45e 100644 --- a/arch/x86/kernel/apic/bigsmp_32.c +++ b/arch/x86/kernel/apic/bigsmp_32.c @@ -172,7 +172,7 @@ static struct apic apic_bigsmp __ro_after_init = { .get_apic_id = bigsmp_get_apic_id, .set_apic_id = NULL, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single_phys, .send_IPI_mask = default_send_IPI_mask_sequence_phys, diff --git a/arch/x86/kernel/apic/probe_32.c b/arch/x86/kernel/apic/probe_32.c index 2e8f7f0..6328765 100644 --- a/arch/x86/kernel/apic/probe_32.c +++ b/arch/x86/kernel/apic/probe_32.c @@ -102,7 +102,7 @@ static struct apic apic_default __ro_after_init = { .get_apic_id = default_get_apic_id, .set_apic_id = NULL, - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, .send_IPI = default_send_IPI_single, .send_IPI_mask = default_send_IPI_mask_logical, diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 0f94ddb..1f57f5a 100644 --- a/arch/x86/kernel/apic/vector.c 
+++ b/arch/x86/kernel/apic/vector.c @@ -141,7 +141,7 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d, /* * Clear the offline cpus from @vector_cpumask for searching * and verify whether the result overlaps with @mask. If true, - * then the call to apic->cpu_mask_to_apicid_and() will + * then the call to apic->cpu_mask_to_apicid() will * succeed as well. If not, no point in trying to find a * vector in this mask. */ @@ -225,8 +225,8 @@ success: * vector_searchmask is a subset of d->domain and has the offline * cpus masked out. */ - BUG_ON(apic->cpu_mask_to_apicid_and(mask, vector_searchmask, - &d->cfg.dest_apicid)); + cpumask_and(vector_searchmask, vector_searchmask, mask); + BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, &d->cfg.dest_apicid)); return 0; } diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index d73baa8..6147425 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -104,22 +104,20 @@ static void x2apic_send_IPI_all(int vector) } static int -x2apic_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +x2apic_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { unsigned int cpu; u32 dest = 0; u16 cluster; - cpu = cpumask_first_and(cpumask, andmask); + cpu = cpumask_first(mask); if (cpu >= nr_cpu_ids) return -EINVAL; dest = per_cpu(x86_cpu_to_logical_apicid, cpu); cluster = x2apic_cluster(cpu); - for_each_cpu_and(cpu, cpumask, andmask) { + for_each_cpu(cpu, mask) { if (cluster != x2apic_cluster(cpu)) continue; dest |= per_cpu(x86_cpu_to_logical_apicid, cpu); @@ -249,7 +247,7 @@ static struct apic apic_x2apic_cluster __ro_after_init = { .get_apic_id = x2apic_get_apic_id, .set_apic_id = x2apic_set_apic_id, - .cpu_mask_to_apicid_and = x2apic_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = x2apic_cpu_mask_to_apicid, .send_IPI = x2apic_send_IPI, .send_IPI_mask = x2apic_send_IPI_mask, diff --git a/arch/x86/kernel/apic/x2apic_phys.c b/arch/x86/kernel/apic/x2apic_phys.c index ff111f0..3baf0c3 100644 --- a/arch/x86/kernel/apic/x2apic_phys.c +++ b/arch/x86/kernel/apic/x2apic_phys.c @@ -127,7 +127,7 @@ static struct apic apic_x2apic_phys __ro_after_init = { .get_apic_id = x2apic_get_apic_id, .set_apic_id = x2apic_set_apic_id, - .cpu_mask_to_apicid_and = default_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = default_cpu_mask_to_apicid, .send_IPI = x2apic_send_IPI, .send_IPI_mask = x2apic_send_IPI_mask, diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index fd5bb20..ad0223f 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -526,11 +526,9 @@ static void uv_init_apic_ldr(void) } static int -uv_cpu_mask_to_apicid_and(const struct cpumask *cpumask, - const struct cpumask *andmask, - unsigned int *apicid) +uv_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { - int ret = default_cpu_mask_to_apicid_and(cpumask, andmask, apicid); + int ret = default_cpu_mask_to_apicid(mask, apicid); if (!ret) *apicid |= uv_apicid_hibits; @@ -603,7 +601,7 @@ static struct apic apic_x2apic_uv_x __ro_after_init = { .get_apic_id = x2apic_get_apic_id, .set_apic_id = set_apic_id, - .cpu_mask_to_apicid_and = uv_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = uv_cpu_mask_to_apicid, .send_IPI = uv_send_IPI_one, .send_IPI_mask = uv_send_IPI_mask, diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c index bcea81f..b5e48da 100644 --- 
a/arch/x86/xen/apic.c +++ b/arch/x86/xen/apic.c @@ -178,7 +178,7 @@ static struct apic xen_pv_apic = { .get_apic_id = xen_get_apic_id, .set_apic_id = xen_set_apic_id, /* Can be NULL on 32-bit. */ - .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and, + .cpu_mask_to_apicid = flat_cpu_mask_to_apicid, #ifdef CONFIG_SMP .send_IPI_mask = xen_send_IPI_mask,
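In other words, every cpu_mask_to_apicid_and() implementation began by ANDing the two cpumasks it was handed before picking the target CPU. After the conversion the callback receives a single, already intersected mask and the lone call site performs the AND itself. A minimal sketch of the resulting convention; the function and mask names are placeholders, not taken from the patch, and <linux/cpumask.h> plus <asm/apic.h> are assumed:

static int example_pick_dest(struct cpumask *searchmask,
			     const struct cpumask *reqmask,
			     unsigned int *dest)
{
	/* Intersect once at the call site ... */
	cpumask_and(searchmask, searchmask, reqmask);
	/* ... the callback only has to select the destination APIC ID */
	return apic->cpu_mask_to_apicid(searchmask, dest);
}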
Commit-ID: 0e24f7c9f67e218546ad44160d2a12d9d8be0171 Gitweb: http://git.kernel.org/tip/0e24f7c9f67e218546ad44160d2a12d9d8be0171 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:44 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:22 +0200 x86/apic: Add irq_data argument to apic->cpu_mask_to_apicid() The decision to which CPUs an interrupt is effectively routed happens in the various apic->cpu_mask_to_apicid() implementations To support effective affinity masks this information needs to be updated in irq_data. Add a pointer to irq_data to the callbacks and feed it through the call chain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.720739075@linutronix.de --- arch/x86/include/asm/apic.h | 5 +++++ arch/x86/kernel/apic/apic.c | 9 +++++++-- arch/x86/kernel/apic/vector.c | 25 +++++++++++++++---------- arch/x86/kernel/apic/x2apic_cluster.c | 3 ++- arch/x86/kernel/apic/x2apic_uv_x.c | 5 +++-- 5 files changed, 32 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 3e64e99..5f01671 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -252,6 +252,8 @@ static inline int x2apic_enabled(void) { return 0; } #define x2apic_supported() (0) #endif /* !CONFIG_X86_X2APIC */ +struct irq_data; + /* * Copyright 2004 James Cleverdon, IBM. * Subject to the GNU Public License, v.2 @@ -297,6 +299,7 @@ struct apic { unsigned long (*set_apic_id)(unsigned int id); int (*cpu_mask_to_apicid)(const struct cpumask *cpumask, + struct irq_data *irqdata, unsigned int *apicid); /* ipi */ @@ -540,8 +543,10 @@ static inline int default_phys_pkg_id(int cpuid_apic, int index_msb) #endif extern int flat_cpu_mask_to_apicid(const struct cpumask *cpumask, + struct irq_data *irqdata, unsigned int *apicid); extern int default_cpu_mask_to_apicid(const struct cpumask *cpumask, + struct irq_data *irqdata, unsigned int *apicid); static inline void diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 169dd42..14e5a47 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2201,7 +2201,9 @@ void default_init_apic_ldr(void) apic_write(APIC_LDR, val); } -int default_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +int default_cpu_mask_to_apicid(const struct cpumask *mask, + struct irq_data *irqdata, + unsigned int *apicid) { unsigned int cpu = cpumask_first(mask); @@ -2211,7 +2213,10 @@ int default_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) return 0; } -int flat_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +int flat_cpu_mask_to_apicid(const struct cpumask *mask, + struct irq_data *irqdata, + unsigned int *apicid) + { unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS; diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 1f57f5a..b270a76 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -103,7 +103,8 @@ static void free_apic_chip_data(struct apic_chip_data *data) } static int __assign_irq_vector(int irq, struct apic_chip_data *d, - const struct cpumask *mask) + const struct cpumask *mask, + struct irq_data *irqdata) { /* * NOTE! 
The local APIC isn't very good at handling @@ -226,32 +227,35 @@ success: * cpus masked out. */ cpumask_and(vector_searchmask, vector_searchmask, mask); - BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, &d->cfg.dest_apicid)); + BUG_ON(apic->cpu_mask_to_apicid(vector_searchmask, irqdata, + &d->cfg.dest_apicid)); return 0; } static int assign_irq_vector(int irq, struct apic_chip_data *data, - const struct cpumask *mask) + const struct cpumask *mask, + struct irq_data *irqdata) { int err; unsigned long flags; raw_spin_lock_irqsave(&vector_lock, flags); - err = __assign_irq_vector(irq, data, mask); + err = __assign_irq_vector(irq, data, mask, irqdata); raw_spin_unlock_irqrestore(&vector_lock, flags); return err; } static int assign_irq_vector_policy(int irq, int node, struct apic_chip_data *data, - struct irq_alloc_info *info) + struct irq_alloc_info *info, + struct irq_data *irqdata) { if (info && info->mask) - return assign_irq_vector(irq, data, info->mask); + return assign_irq_vector(irq, data, info->mask, irqdata); if (node != NUMA_NO_NODE && - assign_irq_vector(irq, data, cpumask_of_node(node)) == 0) + assign_irq_vector(irq, data, cpumask_of_node(node), irqdata) == 0) return 0; - return assign_irq_vector(irq, data, apic->target_cpus()); + return assign_irq_vector(irq, data, apic->target_cpus(), irqdata); } static void clear_irq_vector(int irq, struct apic_chip_data *data) @@ -363,7 +367,8 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq, irq_data->chip = &lapic_controller; irq_data->chip_data = data; irq_data->hwirq = virq + i; - err = assign_irq_vector_policy(virq + i, node, data, info); + err = assign_irq_vector_policy(virq + i, node, data, info, + irq_data); if (err) goto error; } @@ -537,7 +542,7 @@ static int apic_set_affinity(struct irq_data *irq_data, if (!cpumask_intersects(dest, cpu_online_mask)) return -EINVAL; - err = assign_irq_vector(irq, data, dest); + err = assign_irq_vector(irq, data, dest, irq_data); return err ? err : IRQ_SET_MASK_OK; } diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index 6147425..305031e 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -104,7 +104,8 @@ static void x2apic_send_IPI_all(int vector) } static int -x2apic_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +x2apic_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, + unsigned int *apicid) { unsigned int cpu; u32 dest = 0; diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index ad0223f..0d57bb9 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -526,9 +526,10 @@ static void uv_init_apic_ldr(void) } static int -uv_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) +uv_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, + unsigned int *apicid) { - int ret = default_cpu_mask_to_apicid(mask, apicid); + int ret = default_cpu_mask_to_apicid(mask, irqdata, apicid); if (!ret) *apicid |= uv_apicid_hibits;
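For reference, the irq_data pointer is only threaded through the vector allocation path here; nothing dereferences it until the next patch fills in the effective affinity. The resulting call chain, as visible in the hunks above:

    x86_vector_alloc_irqs()
      -> assign_irq_vector_policy(virq + i, node, data, info, irq_data)
         -> assign_irq_vector(irq, data, mask, irqdata)
            -> __assign_irq_vector(irq, data, mask, irqdata)
               -> apic->cpu_mask_to_apicid(vector_searchmask, irqdata,
                                           &d->cfg.dest_apicid)

    apic_set_affinity()
      -> assign_irq_vector(irq, data, dest, irq_data)   /* same tail as above */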
Commit-ID: ef1c2cc88531a967fa97d1ac1f3f8a64ee6910b4 Gitweb: http://git.kernel.org/tip/ef1c2cc88531a967fa97d1ac1f3f8a64ee6910b4 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:45 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:23 +0200 xen/events: Add support for effective affinity mask Update the effective affinity mask when an interrupt was successfully targeted to a CPU. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.799944725@linutronix.de --- drivers/xen/events/events_base.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index b52852f..2e567d8 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1343,8 +1343,12 @@ static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest, bool force) { unsigned tcpu = cpumask_first_and(dest, cpu_online_mask); + int ret = rebind_irq_to_cpu(data->irq, tcpu); - return rebind_irq_to_cpu(data->irq, tcpu); + if (!ret) + irq_data_update_effective_affinity(data, cpumask_of(tcpu)); + + return ret; } static void enable_dynirq(struct irq_data *data)
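The same pattern applies to any irqchip which effectively targets a single CPU: once the hardware has been retargeted successfully, report the result back with irq_data_update_effective_affinity(). A hedged sketch of such a set_affinity callback; example_hw_retarget() is a hypothetical hardware-specific helper, not an existing kernel function:

static int example_set_affinity(struct irq_data *data,
				const struct cpumask *dest, bool force)
{
	unsigned int cpu = cpumask_first_and(dest, cpu_online_mask);
	int ret;

	if (cpu >= nr_cpu_ids)
		return -EINVAL;

	ret = example_hw_retarget(data->irq, cpu);	/* hypothetical */
	if (!ret)
		irq_data_update_effective_affinity(data, cpumask_of(cpu));

	return ret ? ret : IRQ_SET_MASK_OK;
}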
Commit-ID: c7d6c9dd871f42c4e0ce5563d2f684e78ea673cf Gitweb: http://git.kernel.org/tip/c7d6c9dd871f42c4e0ce5563d2f684e78ea673cf Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:46 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:23 +0200 x86/apic: Implement effective irq mask update Add the effective irq mask update to the apic implementations and enable effective irq masks for x86. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.878370703@linutronix.de --- arch/x86/Kconfig | 1 + arch/x86/kernel/apic/apic.c | 3 +++ arch/x86/kernel/apic/x2apic_cluster.c | 4 ++++ 3 files changed, 8 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index fcf1dad..0172c0b 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -87,6 +87,7 @@ config X86 select GENERIC_EARLY_IOREMAP select GENERIC_FIND_FIRST_BIT select GENERIC_IOMAP + select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP select GENERIC_IRQ_MIGRATION if SMP select GENERIC_IRQ_PROBE select GENERIC_IRQ_SHOW diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 14e5a47..e740946 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2210,6 +2210,7 @@ int default_cpu_mask_to_apicid(const struct cpumask *mask, if (cpu >= nr_cpu_ids) return -EINVAL; *apicid = per_cpu(x86_cpu_to_apicid, cpu); + irq_data_update_effective_affinity(irqdata, cpumask_of(cpu)); return 0; } @@ -2218,11 +2219,13 @@ int flat_cpu_mask_to_apicid(const struct cpumask *mask, unsigned int *apicid) { + struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata); unsigned long cpu_mask = cpumask_bits(mask)[0] & APIC_ALL_CPUS; if (!cpu_mask) return -EINVAL; *apicid = (unsigned int)cpu_mask; + cpumask_bits(effmsk)[0] = cpu_mask; return 0; } diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c index 305031e..481237c 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -4,6 +4,7 @@ #include <linux/kernel.h> #include <linux/ctype.h> #include <linux/dmar.h> +#include <linux/irq.h> #include <linux/cpu.h> #include <asm/smp.h> @@ -107,6 +108,7 @@ static int x2apic_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, unsigned int *apicid) { + struct cpumask *effmsk = irq_data_get_effective_affinity_mask(irqdata); unsigned int cpu; u32 dest = 0; u16 cluster; @@ -118,10 +120,12 @@ x2apic_cpu_mask_to_apicid(const struct cpumask *mask, struct irq_data *irqdata, dest = per_cpu(x86_cpu_to_logical_apicid, cpu); cluster = x2apic_cluster(cpu); + cpumask_clear(effmsk); for_each_cpu(cpu, mask) { if (cluster != x2apic_cluster(cpu)) continue; dest |= per_cpu(x86_cpu_to_logical_apicid, cpu); + cpumask_set_cpu(cpu, effmsk); } *apicid = dest;
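With GENERIC_IRQ_EFFECTIVE_AFF_MASK selected, the mask filled in by these callbacks can be inspected by core and driver code; it is the same information the series exposes through debugfs and /proc. A small fragment, assuming an unsigned int irq in scope; on kernels without the config option the accessor falls back to the plain affinity mask:

	struct irq_data *d = irq_get_irq_data(irq);

	if (d)
		pr_info("irq %u effectively routed to CPUs %*pbl\n", irq,
			cpumask_pr_args(irq_data_get_effective_affinity_mask(d)));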
Commit-ID: 54fdf6a0875ca380647ac1cc9b5b8f2dbbbfa131 Gitweb: http://git.kernel.org/tip/54fdf6a0875ca380647ac1cc9b5b8f2dbbbfa131 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:47 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:23 +0200 genirq: Introduce IRQD_MANAGED_SHUTDOWN Affinity managed interrupts should keep their assigned affinity accross CPU hotplug. To avoid magic hackery in device drivers, the core code shall manage them transparently. This will set these interrupts into a managed shutdown state when the last CPU of the assigned affinity mask goes offline. The interrupt will be restarted when one of the CPUs in the assigned affinity mask comes back online. Introduce the necessary state flag and the accessor functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.954523476@linutronix.de --- include/linux/irq.h | 8 ++++++++ kernel/irq/internals.h | 10 ++++++++++ 2 files changed, 18 insertions(+) diff --git a/include/linux/irq.h b/include/linux/irq.h index 4087ef2..0e37276 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -207,6 +207,8 @@ struct irq_data { * IRQD_FORWARDED_TO_VCPU - The interrupt is forwarded to a VCPU * IRQD_AFFINITY_MANAGED - Affinity is auto-managed by the kernel * IRQD_IRQ_STARTED - Startup state of the interrupt + * IRQD_MANAGED_SHUTDOWN - Interrupt was shutdown due to empty affinity + * mask. Applies only to affinity managed irqs. */ enum { IRQD_TRIGGER_MASK = 0xf, @@ -225,6 +227,7 @@ enum { IRQD_FORWARDED_TO_VCPU = (1 << 20), IRQD_AFFINITY_MANAGED = (1 << 21), IRQD_IRQ_STARTED = (1 << 22), + IRQD_MANAGED_SHUTDOWN = (1 << 23), }; #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) @@ -343,6 +346,11 @@ static inline bool irqd_is_started(struct irq_data *d) return __irqd_to_state(d) & IRQD_IRQ_STARTED; } +static inline bool irqd_is_managed_shutdown(struct irq_data *d) +{ + return __irqd_to_state(d) & IRQD_MANAGED_SHUTDOWN; +} + #undef __irqd_to_state static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d) diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 040806f..ca4666b 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -193,6 +193,16 @@ static inline void irqd_clr_move_pending(struct irq_data *d) __irqd_to_state(d) &= ~IRQD_SETAFFINITY_PENDING; } +static inline void irqd_set_managed_shutdown(struct irq_data *d) +{ + __irqd_to_state(d) |= IRQD_MANAGED_SHUTDOWN; +} + +static inline void irqd_clr_managed_shutdown(struct irq_data *d) +{ + __irqd_to_state(d) &= ~IRQD_MANAGED_SHUTDOWN; +} + static inline void irqd_clear(struct irq_data *d, unsigned int mask) { __irqd_to_state(d) &= ~mask;
Commit-ID: 708d174b6c32bffc5d73793bc7a267bcafeb6558 Gitweb: http://git.kernel.org/tip/708d174b6c32bffc5d73793bc7a267bcafeb6558 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:48 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:24 +0200 genirq: Split out irq_startup() code Split out the inner workings of irq_startup() so it can be reused to handle managed interrupts gracefully. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235447.033235144@linutronix.de --- kernel/irq/chip.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index e290d73..1163089 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -195,6 +195,23 @@ static void irq_state_set_started(struct irq_desc *desc) irqd_set(&desc->irq_data, IRQD_IRQ_STARTED); } +static int __irq_startup(struct irq_desc *desc) +{ + struct irq_data *d = irq_desc_get_irq_data(desc); + int ret = 0; + + irq_domain_activate_irq(d); + if (d->chip->irq_startup) { + ret = d->chip->irq_startup(d); + irq_state_clr_disabled(desc); + irq_state_clr_masked(desc); + } else { + irq_enable(desc); + } + irq_state_set_started(desc); + return ret; +} + int irq_startup(struct irq_desc *desc, bool resend) { int ret = 0; @@ -204,19 +221,9 @@ int irq_startup(struct irq_desc *desc, bool resend) if (irqd_is_started(&desc->irq_data)) { irq_enable(desc); } else { - irq_domain_activate_irq(&desc->irq_data); - if (desc->irq_data.chip->irq_startup) { - ret = desc->irq_data.chip->irq_startup(&desc->irq_data); - irq_state_clr_disabled(desc); - irq_state_clr_masked(desc); - } else { - irq_enable(desc); - } - irq_state_set_started(desc); - /* Set default affinity mask once everything is setup */ + ret = __irq_startup(desc); irq_setup_affinity(desc); } - if (resend) check_irq_resend(desc);
Commit-ID: 4cde9c6b826834b861a2b58653ab33150f562064 Gitweb: http://git.kernel.org/tip/4cde9c6b826834b861a2b58653ab33150f562064 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:49 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:24 +0200 genirq: Add force argument to irq_startup() In order to handle managed interrupts gracefully on irq_startup() so they won't lose their assigned affinity, it's necessary to allow startups which keep the interrupts in managed shutdown state, if none of the assigend CPUs is online. This allows drivers to request interrupts w/o the CPUs being online, which avoid online/offline churn in drivers. Add a force argument which can override that decision and let only request_irq() and enable_irq() allow the managed shutdown handling. enable_irq() is required, because the interrupt might be requested with IRQF_NOAUTOEN and enable_irq() invokes irq_startup() which would then wreckage the assignment again. All other callers force startup and potentially break the assigned affinity. No functional change as this only adds the function argument. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235447.112094565@linutronix.de --- kernel/irq/autoprobe.c | 4 ++-- kernel/irq/chip.c | 4 ++-- kernel/irq/internals.h | 9 ++++++++- kernel/irq/manage.c | 4 ++-- 4 files changed, 14 insertions(+), 7 deletions(-) diff --git a/kernel/irq/autoprobe.c b/kernel/irq/autoprobe.c index 0119b9d..d30a0dd 100644 --- a/kernel/irq/autoprobe.c +++ b/kernel/irq/autoprobe.c @@ -53,7 +53,7 @@ unsigned long probe_irq_on(void) if (desc->irq_data.chip->irq_set_type) desc->irq_data.chip->irq_set_type(&desc->irq_data, IRQ_TYPE_PROBE); - irq_startup(desc, false); + irq_startup(desc, IRQ_NORESEND, IRQ_START_FORCE); } raw_spin_unlock_irq(&desc->lock); } @@ -70,7 +70,7 @@ unsigned long probe_irq_on(void) raw_spin_lock_irq(&desc->lock); if (!desc->action && irq_settings_can_probe(desc)) { desc->istate |= IRQS_AUTODETECT | IRQS_WAITING; - if (irq_startup(desc, false)) + if (irq_startup(desc, IRQ_NORESEND, IRQ_START_FORCE)) desc->istate |= IRQS_PENDING; } raw_spin_unlock_irq(&desc->lock); diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 1163089..b7599e9 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -212,7 +212,7 @@ static int __irq_startup(struct irq_desc *desc) return ret; } -int irq_startup(struct irq_desc *desc, bool resend) +int irq_startup(struct irq_desc *desc, bool resend, bool force) { int ret = 0; @@ -892,7 +892,7 @@ __irq_do_set_handler(struct irq_desc *desc, irq_flow_handler_t handle, irq_settings_set_norequest(desc); irq_settings_set_nothread(desc); desc->action = &chained_action; - irq_startup(desc, true); + irq_startup(desc, IRQ_RESEND, IRQ_START_FORCE); } } diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index ca4666b..5fd105e 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -66,7 +66,14 @@ extern int __irq_set_trigger(struct irq_desc *desc, unsigned long flags); extern void __disable_irq(struct irq_desc *desc); extern void __enable_irq(struct irq_desc *desc); -extern int irq_startup(struct irq_desc *desc, bool resend); +#define IRQ_RESEND true +#define IRQ_NORESEND false + +#define IRQ_START_FORCE true 
+#define IRQ_START_COND false + +extern int irq_startup(struct irq_desc *desc, bool resend, bool force); + extern void irq_shutdown(struct irq_desc *desc); extern void irq_enable(struct irq_desc *desc); extern void irq_disable(struct irq_desc *desc); diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 7dcf193..3577c09 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -509,7 +509,7 @@ void __enable_irq(struct irq_desc *desc) * time. If it was already started up, then irq_startup() * will invoke irq_enable() under the hood. */ - irq_startup(desc, true); + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); break; } default: @@ -1306,7 +1306,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) } if (irq_settings_can_autoenable(desc)) { - irq_startup(desc, true); + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); } else { /* * Shared interrupts do not go well with disabling
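For quick reference, the touched call sites end up as follows; the distinction only becomes meaningful once the managed shutdown handling is added in the next patch:

	probe_irq_on()           irq_startup(desc, IRQ_NORESEND, IRQ_START_FORCE)
	__irq_do_set_handler()   irq_startup(desc, IRQ_RESEND,   IRQ_START_FORCE)
	__enable_irq()           irq_startup(desc, IRQ_RESEND,   IRQ_START_COND)
	__setup_irq()            irq_startup(desc, IRQ_RESEND,   IRQ_START_COND)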
Commit-ID: 761ea388e8c4e3ac883a94e16bcc8c51fa419d4f Gitweb: http://git.kernel.org/tip/761ea388e8c4e3ac883a94e16bcc8c51fa419d4f Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:50 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:24 +0200 genirq: Handle managed irqs gracefully in irq_startup() Affinity managed interrupts should keep their assigned affinity accross CPU hotplug. To avoid magic hackery in device drivers, the core code shall manage them transparently and set these interrupts into a managed shutdown state when the last CPU of the assigned affinity mask goes offline. The interrupt will be restarted when one of the CPUs in the assigned affinity mask comes back online. Add the necessary logic to irq_startup(). If an interrupt is requested and started up, the code checks whether it is affinity managed and if so, it checks whether a CPU in the interrupts affinity mask is online. If not, it puts the interrupt into managed shutdown state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235447.189851170@linutronix.de --- include/linux/irq.h | 2 +- kernel/irq/chip.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 62 insertions(+), 4 deletions(-) diff --git a/include/linux/irq.h b/include/linux/irq.h index 0e37276..807042b 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -346,7 +346,7 @@ static inline bool irqd_is_started(struct irq_data *d) return __irqd_to_state(d) & IRQD_IRQ_STARTED; } -static inline bool irqd_is_managed_shutdown(struct irq_data *d) +static inline bool irqd_is_managed_and_shutdown(struct irq_data *d) { return __irqd_to_state(d) & IRQD_MANAGED_SHUTDOWN; } diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index b7599e9..fc89eeb 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -195,6 +195,52 @@ static void irq_state_set_started(struct irq_desc *desc) irqd_set(&desc->irq_data, IRQD_IRQ_STARTED); } +enum { + IRQ_STARTUP_NORMAL, + IRQ_STARTUP_MANAGED, + IRQ_STARTUP_ABORT, +}; + +#ifdef CONFIG_SMP +static int +__irq_startup_managed(struct irq_desc *desc, struct cpumask *aff, bool force) +{ + struct irq_data *d = irq_desc_get_irq_data(desc); + + if (!irqd_affinity_is_managed(d)) + return IRQ_STARTUP_NORMAL; + + irqd_clr_managed_shutdown(d); + + if (cpumask_any_and(aff, cpu_online_mask) > nr_cpu_ids) { + /* + * Catch code which fiddles with enable_irq() on a managed + * and potentially shutdown IRQ. Chained interrupt + * installment or irq auto probing should not happen on + * managed irqs either. Emit a warning, break the affinity + * and start it up as a normal interrupt. + */ + if (WARN_ON_ONCE(force)) + return IRQ_STARTUP_NORMAL; + /* + * The interrupt was requested, but there is no online CPU + * in it's affinity mask. Put it into managed shutdown + * state and let the cpu hotplug mechanism start it up once + * a CPU in the mask becomes available. 
+ */ + irqd_set_managed_shutdown(d); + return IRQ_STARTUP_ABORT; + } + return IRQ_STARTUP_MANAGED; +} +#else +static int +__irq_startup_managed(struct irq_desc *desc, struct cpumask *aff, bool force) +{ + return IRQ_STARTUP_NORMAL; +} +#endif + static int __irq_startup(struct irq_desc *desc) { struct irq_data *d = irq_desc_get_irq_data(desc); @@ -214,15 +260,27 @@ static int __irq_startup(struct irq_desc *desc) int irq_startup(struct irq_desc *desc, bool resend, bool force) { + struct irq_data *d = irq_desc_get_irq_data(desc); + struct cpumask *aff = irq_data_get_affinity_mask(d); int ret = 0; desc->depth = 0; - if (irqd_is_started(&desc->irq_data)) { + if (irqd_is_started(d)) { irq_enable(desc); } else { - ret = __irq_startup(desc); - irq_setup_affinity(desc); + switch (__irq_startup_managed(desc, aff, force)) { + case IRQ_STARTUP_NORMAL: + ret = __irq_startup(desc); + irq_setup_affinity(desc); + break; + case IRQ_STARTUP_MANAGED: + ret = __irq_startup(desc); + irq_set_affinity_locked(d, aff, false); + break; + case IRQ_STARTUP_ABORT: + return 0; + } } if (resend) check_irq_resend(desc);
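For context, drivers reach this path through managed interrupt allocation; the driver side is not part of this patch. If one of the resulting vectors has its whole CPU set offline when request_irq() runs, irq_startup() now parks it in managed shutdown state instead of breaking the pre-computed spread. A hedged sketch with illustrative names (pdev, nr_queues, example_handler, queues; error handling omitted):

	int i, nvec;

	nvec = pci_alloc_irq_vectors(pdev, 1, nr_queues,
				     PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);

	for (i = 0; i < nvec; i++)
		request_irq(pci_irq_vector(pdev, i), example_handler, 0,
			    "example-queue", &queues[i]);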
Commit-ID: c5cb83bb337c25caae995d992d1cdf9b317f83de Gitweb: http://git.kernel.org/tip/c5cb83bb337c25caae995d992d1cdf9b317f83de Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:51 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:25 +0200 genirq/cpuhotplug: Handle managed IRQs on CPU hotplug If a CPU goes offline, interrupts affine to the CPU are moved away. If the outgoing CPU is the last CPU in the affinity mask the migration code breaks the affinity and sets it it all online cpus. This is a problem for affinity managed interrupts as CPU hotplug is often used for power management purposes. If the affinity is broken, the interrupt is not longer affine to the CPUs to which it was allocated. The affinity spreading allows to lay out multi queue devices in a way that they are assigned to a single CPU or a group of CPUs. If the last CPU goes offline, then the queue is not longer used, so the interrupt can be shutdown gracefully and parked until one of the assigned CPUs comes online again. Add a graceful shutdown mechanism into the irq affinity breaking code path, mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified. In the online path, scan the active interrupts for managed interrupts and if the interrupt is functional and the newly online CPU is part of the affinity mask, restart the interrupt if it is marked MANAGED_SHUTDOWN or if the interrupts is started up, try to add the CPU back to the effective affinity mask. Originally-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de --- include/linux/cpuhotplug.h | 1 + include/linux/irq.h | 5 +++++ kernel/cpu.c | 5 +++++ kernel/irq/cpuhotplug.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 56 insertions(+) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 0f2a803..c15f22c 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -124,6 +124,7 @@ enum cpuhp_state { CPUHP_AP_ONLINE_IDLE, CPUHP_AP_SMPBOOT_THREADS, CPUHP_AP_X86_VDSO_VMA_ONLINE, + CPUHP_AP_IRQ_AFFINITY_ONLINE, CPUHP_AP_PERF_ONLINE, CPUHP_AP_PERF_X86_ONLINE, CPUHP_AP_PERF_X86_UNCORE_ONLINE, diff --git a/include/linux/irq.h b/include/linux/irq.h index 807042b..19cea63 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -500,7 +500,12 @@ extern int irq_set_affinity_locked(struct irq_data *data, const struct cpumask *cpumask, bool force); extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info); +#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_IRQ_MIGRATION) extern void irq_migrate_all_off_this_cpu(void); +extern int irq_affinity_online_cpu(unsigned int cpu); +#else +# define irq_affinity_online_cpu NULL +#endif #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ) void irq_move_irq(struct irq_data *data); diff --git a/kernel/cpu.c b/kernel/cpu.c index cb51034..b86b32e 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1252,6 +1252,11 @@ static struct cpuhp_step cpuhp_ap_states[] = { .startup.single = smpboot_unpark_threads, .teardown.single = NULL, }, + [CPUHP_AP_IRQ_AFFINITY_ONLINE] = { + .name = "irq/affinity:online", + .startup.single = irq_affinity_online_cpu, + .teardown.single = NULL, + }, 
[CPUHP_AP_PERF_ONLINE] = { .name = "perf:online", .startup.single = perf_event_init_cpu, diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index 0b093db..b7964e7 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -83,6 +83,15 @@ static bool migrate_one_irq(struct irq_desc *desc) chip->irq_mask(d); if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { + /* + * If the interrupt is managed, then shut it down and leave + * the affinity untouched. + */ + if (irqd_affinity_is_managed(d)) { + irqd_set_managed_shutdown(d); + irq_shutdown(desc); + return false; + } affinity = cpu_online_mask; brokeaff = true; } @@ -129,3 +138,39 @@ void irq_migrate_all_off_this_cpu(void) } } } + +static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) +{ + struct irq_data *data = irq_desc_get_irq_data(desc); + const struct cpumask *affinity = irq_data_get_affinity_mask(data); + + if (!irqd_affinity_is_managed(data) || !desc->action || + !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) + return; + + if (irqd_is_managed_and_shutdown(data)) + irq_startup(desc, IRQ_RESEND, IRQ_START_COND); + else + irq_set_affinity_locked(data, affinity, false); +} + +/** + * irq_affinity_online_cpu - Restore affinity for managed interrupts + * @cpu: Upcoming CPU for which interrupts should be restored + */ +int irq_affinity_online_cpu(unsigned int cpu) +{ + struct irq_desc *desc; + unsigned int irq; + + irq_lock_sparse(); + for_each_active_irq(irq) { + desc = irq_to_desc(irq); + raw_spin_lock_irq(&desc->lock); + irq_restore_affinity_of_irq(desc, cpu); + raw_spin_unlock_irq(&desc->lock); + } + irq_unlock_sparse(); + + return 0; +}
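The resulting life cycle of a managed interrupt across hotplug, as implemented above:

	CPU offline, migrate_one_irq():
	    no online CPU left in the affinity mask
	        -> irqd_set_managed_shutdown(d), irq_shutdown(desc),
	           affinity mask left untouched

	CPU online, irq_affinity_online_cpu() (CPUHP_AP_IRQ_AFFINITY_ONLINE):
	    CPU is in the affinity mask and the interrupt is requested
	        -> if shut down: irq_startup(desc, IRQ_RESEND, IRQ_START_COND)
	        -> if started:   irq_set_affinity_locked(data, affinity, false)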
Commit-ID: d52dd44175bd27ad9d8e34a994fb80877c1f6d61 Gitweb: http://git.kernel.org/tip/d52dd44175bd27ad9d8e34a994fb80877c1f6d61 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:52 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:25 +0200 genirq: Introduce IRQD_SINGLE_TARGET flag Many interrupt chips allow only a single CPU as interrupt target. The core code has no knowledge about that. That's unfortunate as it could avoid trying to readd a newly online CPU to the effective affinity mask. Add the status flag and the necessary accessors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235447.352343969@linutronix.de --- include/linux/irq.h | 16 ++++++++++++++++ kernel/irq/debugfs.c | 1 + 2 files changed, 17 insertions(+) diff --git a/include/linux/irq.h b/include/linux/irq.h index 19cea63..00db35b 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -209,6 +209,7 @@ struct irq_data { * IRQD_IRQ_STARTED - Startup state of the interrupt * IRQD_MANAGED_SHUTDOWN - Interrupt was shutdown due to empty affinity * mask. Applies only to affinity managed irqs. + * IRQD_SINGLE_TARGET - IRQ allows only a single affinity target */ enum { IRQD_TRIGGER_MASK = 0xf, @@ -228,6 +229,7 @@ enum { IRQD_AFFINITY_MANAGED = (1 << 21), IRQD_IRQ_STARTED = (1 << 22), IRQD_MANAGED_SHUTDOWN = (1 << 23), + IRQD_SINGLE_TARGET = (1 << 24), }; #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) @@ -276,6 +278,20 @@ static inline bool irqd_is_level_type(struct irq_data *d) return __irqd_to_state(d) & IRQD_LEVEL; } +/* + * Must only be called of irqchip.irq_set_affinity() or low level + * hieararchy domain allocation functions. + */ +static inline void irqd_set_single_target(struct irq_data *d) +{ + __irqd_to_state(d) |= IRQD_SINGLE_TARGET; +} + +static inline bool irqd_is_single_target(struct irq_data *d) +{ + return __irqd_to_state(d) & IRQD_SINGLE_TARGET; +} + static inline bool irqd_is_wakeup_set(struct irq_data *d) { return __irqd_to_state(d) & IRQD_WAKEUP_STATE; diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c index edbef25..dbd6e78 100644 --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -105,6 +105,7 @@ static const struct irq_bit_descr irqdata_states[] = { BIT_MASK_DESCR(IRQD_PER_CPU), BIT_MASK_DESCR(IRQD_NO_BALANCING), + BIT_MASK_DESCR(IRQD_SINGLE_TARGET), BIT_MASK_DESCR(IRQD_MOVE_PCNTXT), BIT_MASK_DESCR(IRQD_AFFINITY_SET), BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
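A hedged sketch of how an irqchip would use the new flag; whether a given driver wants this depends on its hardware, and the call below is illustrative rather than taken from an existing user. The x86 vector domain conversion at the end of the series does the equivalent directly on the irq_data it just allocated:

	/* in the domain's .alloc() callback, for each allocated interrupt */
	irqd_set_single_target(irq_domain_get_irq_data(domain, virq + i));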
Commit-ID: 8f31a9845db348f5781df47ce04c79e4cfe90016 Gitweb: http://git.kernel.org/tip/8f31a9845db348f5781df47ce04c79e4cfe90016 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:53 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:25 +0200 genirq/cpuhotplug: Avoid irq affinity setting for single targets Avoid trying to add a newly online CPU to the effective affinity mask of an started up interrupt. That interrupt will either stay on the already online CPU or move around for no value. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235447.431321047@linutronix.de --- kernel/irq/cpuhotplug.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c index b7964e7..aee8f7e 100644 --- a/kernel/irq/cpuhotplug.c +++ b/kernel/irq/cpuhotplug.c @@ -148,9 +148,17 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu) !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity)) return; - if (irqd_is_managed_and_shutdown(data)) + if (irqd_is_managed_and_shutdown(data)) { irq_startup(desc, IRQ_RESEND, IRQ_START_COND); - else + return; + } + + /* + * If the interrupt can only be directed to a single target + * CPU then it is already assigned to a CPU in the affinity + * mask. No point in trying to move it around. + */ + if (!irqd_is_single_target(data)) irq_set_affinity_locked(data, affinity, false); }
Commit-ID: 3ca57222c36ba31b80aa25de313f3c8ab26a8102 Gitweb: http://git.kernel.org/tip/3ca57222c36ba31b80aa25de313f3c8ab26a8102 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Tue, 20 Jun 2017 01:37:54 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Thu, 22 Jun 2017 18:21:26 +0200 x86/apic: Mark single target interrupts If the interrupt destination mode of the APIC is physical then the effective affinity is restricted to a single CPU. Mark the interrupt accordingly in the domain allocation code, so the core code can avoid pointless affinity setting attempts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235447.508846202@linutronix.de --- arch/x86/kernel/apic/vector.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index b270a76..2567dc0 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -371,6 +371,13 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq, irq_data); if (err) goto error; + /* + * If the apic destination mode is physical, then the + * effective affinity is restricted to a single target + * CPU. Mark the interrupt accordingly. + */ + if (!apic->irq_dest_mode) + irqd_set_single_target(irq_data); } return 0;