* [RFC PATCH 1/5] irq: Always enable parent interrupt tracking
2023-05-30 21:45 [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Radu Rendec
@ 2023-05-30 21:45 ` Radu Rendec
2023-05-31 13:10 ` Thomas Gleixner
2023-05-30 21:45 ` [RFC PATCH 2/5] irq: Show the parent chained interrupt in debugfs Radu Rendec
` (4 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Radu Rendec @ 2023-05-30 21:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Marc Zyngier, Thomas Gleixner
The kernel already has some support for tracking the parent interrupt in
the case of chained interrupts, but it is currently used only by the
IRQ-resend code.
This patch enables the parent interrupt tracking code unconditionally.
The IRQ-resend code still depends on CONFIG_HARDIRQS_SW_RESEND.
The intention is to (re)use the existing parent interrupt tracking
support for different purposes, more specifically to expose chained
interrupt topology to userspace. That, in turn, makes it possible to
control the SMP affinity of chained interrupts in a way that does not
break the existing interface and the promises it makes.
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
include/linux/irq.h | 7 -------
kernel/irq/manage.c | 2 --
2 files changed, 9 deletions(-)
diff --git a/include/linux/irq.h b/include/linux/irq.h
index b1b28affb32a7..7710f157e12de 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -641,14 +641,7 @@ static inline void irq_force_complete_move(struct irq_desc *desc) { }
extern int no_irq_affinity;
-#ifdef CONFIG_HARDIRQS_SW_RESEND
int irq_set_parent(int irq, int parent_irq);
-#else
-static inline int irq_set_parent(int irq, int parent_irq)
-{
- return 0;
-}
-#endif
/*
* Built-in IRQ handlers for various IRQ types,
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index eb862b5f91c42..49683e55261eb 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1004,7 +1004,6 @@ int __irq_set_trigger(struct irq_desc *desc, unsigned long flags)
return ret;
}
-#ifdef CONFIG_HARDIRQS_SW_RESEND
int irq_set_parent(int irq, int parent_irq)
{
unsigned long flags;
@@ -1019,7 +1018,6 @@ int irq_set_parent(int irq, int parent_irq)
return 0;
}
EXPORT_SYMBOL_GPL(irq_set_parent);
-#endif
/*
* Default primary interrupt handler for threaded interrupts. Is
--
2.40.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [RFC PATCH 2/5] irq: Show the parent chained interrupt in debugfs
2023-05-30 21:45 [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Radu Rendec
2023-05-30 21:45 ` [RFC PATCH 1/5] irq: Always enable parent interrupt tracking Radu Rendec
@ 2023-05-30 21:45 ` Radu Rendec
2023-05-30 21:45 ` [RFC PATCH 3/5] irq: Expose chained interrupt parents in sysfs Radu Rendec
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Radu Rendec @ 2023-05-30 21:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Marc Zyngier, Thomas Gleixner
This is a trivial change to expose the parent chained interrupt. The
intention is to make it easier to debug chained interrupts, particularly
in the context of setting the SMP affinity.
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
kernel/irq/debugfs.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index bbcaac64038ef..3ada976df8612 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -177,6 +177,7 @@ static int irq_debug_show(struct seq_file *m, void *p)
ARRAY_SIZE(irqdesc_istates));
seq_printf(m, "ddepth: %u\n", desc->depth);
seq_printf(m, "wdepth: %u\n", desc->wake_depth);
+ seq_printf(m, "parent: %d\n", desc->parent_irq);
seq_printf(m, "dstate: 0x%08x\n", irqd_get(data));
irq_debug_show_bits(m, 0, irqd_get(data), irqdata_states,
ARRAY_SIZE(irqdata_states));
--
2.40.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [RFC PATCH 3/5] irq: Expose chained interrupt parents in sysfs
2023-05-30 21:45 [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Radu Rendec
2023-05-30 21:45 ` [RFC PATCH 1/5] irq: Always enable parent interrupt tracking Radu Rendec
2023-05-30 21:45 ` [RFC PATCH 2/5] irq: Show the parent chained interrupt in debugfs Radu Rendec
@ 2023-05-30 21:45 ` Radu Rendec
2023-05-30 21:45 ` [RFC PATCH 4/5] irq: Move SMP affinity write handler out of proc.c Radu Rendec
` (2 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Radu Rendec @ 2023-05-30 21:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Marc Zyngier, Thomas Gleixner
This patch extends irq_set_parent() by adding the ability to show the
direct parent of a chained interrupt as a symlink in sysfs. In addition,
the root level interrupt (in a multi-level chained interrupt topology)
will also show all child interrupts. Example:
$ ls -l /sys/kernel/irq/38/muxed_irqs/
total 0
lrwxrwxrwx 1 root root 0 May 24 16:00 41 -> ../../41
lrwxrwxrwx 1 root root 0 May 24 16:00 42 -> ../../42
$ ls -l /sys/kernel/irq/41/parent
lrwxrwxrwx 1 root root 0 May 24 16:00 /sys/kernel/irq/41/parent -> ../38
$ ls -l /sys/kernel/irq/42/parent
lrwxrwxrwx 1 root root 0 May 24 16:18 /sys/kernel/irq/42/parent -> ../38
Chained IRQ chip drivers are expected to call irq_set_parent() in their
.map domain op handler. A few already do (for the purpose of enabling
IRQ-resend support).
An IRQ chip driver may also implement the newer hierarchical domain API
and still use chained interrupts. In that case, irq_set_parent() would
be called in the .alloc domain op handler.
Since most legacy drivers do not implement the .unmap domain op and most
hierarchical drivers use irq_domain_free_irqs_common() as their .free
domain op, the irqdomain core is modified to remove an existing parent
mapping automatically when the interrupt is disassociated with the
domain.
The whole purpose of exposing interrupt topology in sysfs is to be able
to control the SMP affinity of chained interrupts. Since chained IRQ
chips can be stacked and the affinity can be typically controlled only
at the root level, irq_set_parent() will always populate the muxed_irqs
directory of the root interrupt, not the one of the direct parent.
However, the "parent" symlink will always point to the direct parent.
Notes (RFC/draft patch):
* The muxed_irqs directory may not be necessary to control the SMP
affinity. See the cover letter for details.
* Synchronization around muxed_irqs is likely broken. But since we may
decide to get rid of muxed_irqs altogether, the implementation here is
provided as a proof of concept and ground for discussion.
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
include/linux/irqdesc.h | 1 +
kernel/irq/internals.h | 8 ++++
kernel/irq/irqdesc.c | 84 +++++++++++++++++++++++++++++++++++------
kernel/irq/irqdomain.c | 15 ++++++++
kernel/irq/manage.c | 16 ++++++--
5 files changed, 110 insertions(+), 14 deletions(-)
diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index 844a8e30e6de5..b57450745857f 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -97,6 +97,7 @@ struct irq_desc {
#ifdef CONFIG_SPARSE_IRQ
struct rcu_head rcu;
struct kobject kobj;
+ struct kobject *muxed_irqs;
#endif
struct mutex request_mutex;
int parent_irq;
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 5fdc0b5575797..c75cd836155c9 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -101,6 +101,14 @@ static inline void irq_mark_irq(unsigned int irq) { }
extern void irq_mark_irq(unsigned int irq);
#endif
+#if defined(CONFIG_SPARSE_IRQ) && defined(CONFIG_SYSFS)
+extern void irq_sysfs_update_parent(struct irq_desc *desc,
+ struct irq_desc *parent);
+#else
+static inline void irq_sysfs_update_parent(struct irq_desc *desc,
+ struct irq_desc *parent) { }
+#endif
+
extern int __irq_get_irqchip_state(struct irq_data *data,
enum irqchip_irq_state which,
bool *state);
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index d45c163dbb749..ec52b8b41002e 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -285,17 +285,27 @@ static const struct kobj_type irq_kobj_type = {
static void irq_sysfs_add(int irq, struct irq_desc *desc)
{
- if (irq_kobj_base) {
- /*
- * Continue even in case of failure as this is nothing
- * crucial and failures in the late irq_sysfs_init()
- * cannot be rolled back.
- */
- if (kobject_add(&desc->kobj, irq_kobj_base, "%d", irq))
- pr_warn("Failed to add kobject for irq %d\n", irq);
- else
- desc->istate |= IRQS_SYSFS;
+ if (!irq_kobj_base)
+ return;
+
+ /*
+ * Continue even in case of failure as this is nothing
+ * crucial and failures in the late irq_sysfs_init()
+ * cannot be rolled back.
+ */
+ if (kobject_add(&desc->kobj, irq_kobj_base, "%d", irq)) {
+ pr_warn("Failed to add kobject for irq %d\n", irq);
+ return;
+ }
+
+ desc->muxed_irqs = kobject_create_and_add("muxed_irqs", &desc->kobj);
+ if (!desc->muxed_irqs) {
+ pr_warn("Failed to add mux kobject for irq %d\n", irq);
+ kobject_del(&desc->kobj);
+ return;
}
+
+ desc->istate |= IRQS_SYSFS;
}
static void irq_sysfs_del(struct irq_desc *desc)
@@ -306,8 +316,10 @@ static void irq_sysfs_del(struct irq_desc *desc)
* sysfs is not initialized yet, and the case of a failed
* kobject_add() invocation.
*/
- if (desc->istate & IRQS_SYSFS)
+ if (desc->istate & IRQS_SYSFS) {
+ kobject_del(desc->muxed_irqs);
kobject_del(&desc->kobj);
+ }
}
static int __init irq_sysfs_init(void)
@@ -333,6 +345,55 @@ static int __init irq_sysfs_init(void)
}
postcore_initcall(irq_sysfs_init);
+static inline struct irq_desc *irq_find_root_desc(struct irq_desc *desc)
+{
+ while (desc && desc->parent_irq)
+ desc = irq_to_desc(desc->parent_irq);
+
+ return desc;
+}
+
+/*
+ * Update sysfs entries to show that parent_desc is the new parent of desc
+ *
+ * At this point, desc->parent_irq is the old parent, and parent_desc is the
+ * new parent. If no parent has been set yet, desc->parent is 0. If the parent
+ * is being removed, parent_desc is NULL.
+ *
+ * Called with desc->request_mutex locked.
+ */
+void irq_sysfs_update_parent(struct irq_desc *desc, struct irq_desc *parent_desc)
+{
+ unsigned int irq = desc->irq_data.irq;
+ struct irq_desc *root_desc;
+ int err;
+
+ /* First, remove all references to the old parent (if any). */
+ if (desc->parent_irq) {
+ sysfs_remove_link(&desc->kobj, "parent");
+ root_desc = irq_find_root_desc(irq_to_desc(desc->parent_irq));
+ if (!WARN_ON(!root_desc))
+ sysfs_remove_link(root_desc->muxed_irqs, desc->kobj.name);
+ }
+
+ /* Next, create references to the new parent (if any). */
+ if (!parent_desc)
+ return;
+
+ err = sysfs_create_link(&desc->kobj, &parent_desc->kobj, "parent");
+ if (err)
+ pr_warn("Failed to link irq %u parent: %d\n", irq, err);
+
+ root_desc = irq_find_root_desc(parent_desc);
+ if (WARN_ON(!root_desc))
+ return;
+
+ err = sysfs_create_link(root_desc->muxed_irqs, &desc->kobj,
+ desc->kobj.name);
+ if (err)
+ pr_warn("Failed to link irq %u root: %d\n", irq, err);
+}
+
#else /* !CONFIG_SYSFS */
static const struct kobj_type irq_kobj_type = {
@@ -438,6 +499,7 @@ static void delayed_free_desc(struct rcu_head *rhp)
{
struct irq_desc *desc = container_of(rhp, struct irq_desc, rcu);
+ kobject_put(desc->muxed_irqs);
kobject_put(&desc->kobj);
}
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index f34760a1e2226..fad1559587c3e 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -553,6 +553,14 @@ static void irq_domain_disassociate(struct irq_domain *domain, unsigned int irq)
mutex_lock(&domain->root->mutex);
+ /*
+ * For domains that support it, irq_set_parent() is called in the map()
+ * op. The reverse operation of removing the parent mapping should be
+ * done in the unmap() op which is called below. Since most domains do
+ * not implement unmap(), remove the parent mapping here.
+ */
+ irq_set_parent(irq, 0);
+
irq_set_status_flags(irq, IRQ_NOREQUEST);
/* remove chip and handler */
@@ -1748,6 +1756,13 @@ void irq_domain_free_irqs(unsigned int virq, unsigned int nr_irqs)
domain = data->domain;
mutex_lock(&domain->root->mutex);
+ /*
+ * Typically irq_set_parent() would be called in the alloc() op. The
+ * reverse operation of removing the parent mapping should be done in
+ * the free() op which is called indirectly below. Since most domains
+ * use irq_domain_free_irqs_common(), remove the parent mapping here.
+ */
+ irq_set_parent(virq, 0);
for (i = 0; i < nr_irqs; i++)
irq_domain_remove_irq(virq + i);
irq_domain_free_irqs_hierarchy(domain, virq, nr_irqs);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 49683e55261eb..eec9b94747439 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1007,14 +1007,24 @@ int __irq_set_trigger(struct irq_desc *desc, unsigned long flags)
int irq_set_parent(int irq, int parent_irq)
{
unsigned long flags;
- struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
+ struct irq_desc *desc = irq_to_desc(irq);
+ struct irq_desc *parent_desc = parent_irq ?
+ irq_to_desc(parent_irq) : NULL;
- if (!desc)
+ if (!desc || (parent_irq && !parent_desc))
return -EINVAL;
+ mutex_lock(&desc->request_mutex);
+
+ if (desc->parent_irq != parent_irq)
+ irq_sysfs_update_parent(desc, parent_desc);
+
+ raw_spin_lock_irqsave(&desc->lock, flags);
desc->parent_irq = parent_irq;
+ raw_spin_unlock_irqrestore(&desc->lock, flags);
+
+ mutex_unlock(&desc->request_mutex);
- irq_put_desc_unlock(desc, flags);
return 0;
}
EXPORT_SYMBOL_GPL(irq_set_parent);
--
2.40.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [RFC PATCH 4/5] irq: Move SMP affinity write handler out of proc.c
2023-05-30 21:45 [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Radu Rendec
` (2 preceding siblings ...)
2023-05-30 21:45 ` [RFC PATCH 3/5] irq: Expose chained interrupt parents in sysfs Radu Rendec
@ 2023-05-30 21:45 ` Radu Rendec
2023-06-01 17:57 ` kernel test robot
2023-05-30 21:45 ` [RFC PATCH 5/5] irq: Add smp_affinity/list attributes to sysfs Radu Rendec
2023-05-31 13:09 ` [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Thomas Gleixner
5 siblings, 1 reply; 13+ messages in thread
From: Radu Rendec @ 2023-05-30 21:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Marc Zyngier, Thomas Gleixner
This patch prepares the ground for setting the SMP affinity from sysfs.
The bulk of the code is identical for procfs and sysfs, except for the
cpumask parsing functions, where procfs requires the _user variants.
Summary of changes:
- irq_select_affinity_usr() and write_irq_affinity() are moved from
from proc.c to irqdesc.c
- write_irq_affinity() is slightly modified to allow using the other
variant of cpumask parsing functions
- the definition of no_irq_affinity is moved from proc.c to manage.c
and available only when CONFIG_SMP is enabled
- the declaration of no_irq_affinity is available only when CONFIG_SMP
is enabled
Note that all existing use cases of no_irq_affinity were already
confined within CONFIG_SMP preprocessor conditionals.
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
include/linux/irq.h | 2 ++
kernel/irq/internals.h | 2 ++
kernel/irq/irqdesc.c | 67 +++++++++++++++++++++++++++++++++++++++
kernel/irq/manage.c | 2 ++
kernel/irq/proc.c | 72 +++---------------------------------------
5 files changed, 78 insertions(+), 67 deletions(-)
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 7710f157e12de..0393fc02cfd46 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -639,7 +639,9 @@ static inline void irq_move_masked_irq(struct irq_data *data) { }
static inline void irq_force_complete_move(struct irq_desc *desc) { }
#endif
+#ifdef CONFIG_SMP
extern int no_irq_affinity;
+#endif
int irq_set_parent(int irq, int parent_irq);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index c75cd836155c9..381a0b4c1d381 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -147,6 +147,8 @@ extern int irq_do_set_affinity(struct irq_data *data,
#ifdef CONFIG_SMP
extern int irq_setup_affinity(struct irq_desc *desc);
+extern ssize_t write_irq_affinity(unsigned int irq, const char __user *buffer,
+ size_t count, bool is_list, bool is_user);
#else
static inline int irq_setup_affinity(struct irq_desc *desc) { return 0; }
#endif
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index ec52b8b41002e..a46a76c29b8d1 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -133,6 +133,73 @@ EXPORT_SYMBOL_GPL(nr_irqs);
static DEFINE_MUTEX(sparse_irq_lock);
static DECLARE_BITMAP(allocated_irqs, IRQ_BITMAP_BITS);
+#ifdef CONFIG_SMP
+
+#ifndef CONFIG_AUTO_IRQ_AFFINITY
+static inline int irq_select_affinity_usr(unsigned int irq)
+{
+ /*
+ * If the interrupt is started up already then this fails. The
+ * interrupt is assigned to an online CPU already. There is no
+ * point to move it around randomly. Tell user space that the
+ * selected mask is bogus.
+ *
+ * If not then any change to the affinity is pointless because the
+ * startup code invokes irq_setup_affinity() which will select
+ * a online CPU anyway.
+ */
+ return -EINVAL;
+}
+#else
+/* ALPHA magic affinity auto selector. Keep it for historical reasons. */
+static inline int irq_select_affinity_usr(unsigned int irq)
+{
+ return irq_select_affinity(irq);
+}
+#endif
+
+ssize_t write_irq_affinity(unsigned int irq, const char __user *buffer,
+ size_t count, bool is_list, bool is_user)
+{
+ cpumask_var_t mask;
+ int err;
+
+ if (!irq_can_set_affinity_usr(irq) || no_irq_affinity)
+ return -EIO;
+
+ if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
+ return -ENOMEM;
+
+ if (is_user)
+ err = is_list ? cpumask_parselist_user(buffer, count, mask) :
+ cpumask_parse_user(buffer, count, mask);
+ else
+ err = is_list ? cpulist_parse(buffer, mask) :
+ cpumask_parse(buffer, mask);
+ if (err)
+ goto free_cpumask;
+
+ /*
+ * Do not allow disabling IRQs completely - it's a too easy
+ * way to make the system unusable accidentally :-) At least
+ * one online CPU still has to be targeted.
+ */
+ if (!cpumask_intersects(mask, cpu_online_mask)) {
+ /*
+ * Special case for empty set - allow the architecture code
+ * to set default SMP affinity.
+ */
+ err = irq_select_affinity_usr(irq) ? -EINVAL : count;
+ } else {
+ err = irq_set_affinity(irq, mask) ?: count;
+ }
+
+free_cpumask:
+ free_cpumask_var(mask);
+ return err;
+}
+#endif
+
#ifdef CONFIG_SPARSE_IRQ
static void irq_kobj_release(struct kobject *kobj);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index eec9b94747439..91cee7270d221 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -143,6 +143,8 @@ EXPORT_SYMBOL(synchronize_irq);
#ifdef CONFIG_SMP
cpumask_var_t irq_default_affinity;
+int no_irq_affinity;
+
static bool __irq_can_set_affinity(struct irq_desc *desc)
{
if (!desc || !irqd_can_balance(&desc->irq_data) ||
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 623b8136e9af3..76f0dda1f26b8 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -100,7 +100,6 @@ static int irq_affinity_hint_proc_show(struct seq_file *m, void *v)
return 0;
}
-int no_irq_affinity;
static int irq_affinity_proc_show(struct seq_file *m, void *v)
{
return show_irq_affinity(AFFINITY, m);
@@ -111,81 +110,20 @@ static int irq_affinity_list_proc_show(struct seq_file *m, void *v)
return show_irq_affinity(AFFINITY_LIST, m);
}
-#ifndef CONFIG_AUTO_IRQ_AFFINITY
-static inline int irq_select_affinity_usr(unsigned int irq)
-{
- /*
- * If the interrupt is started up already then this fails. The
- * interrupt is assigned to an online CPU already. There is no
- * point to move it around randomly. Tell user space that the
- * selected mask is bogus.
- *
- * If not then any change to the affinity is pointless because the
- * startup code invokes irq_setup_affinity() which will select
- * a online CPU anyway.
- */
- return -EINVAL;
-}
-#else
-/* ALPHA magic affinity auto selector. Keep it for historical reasons. */
-static inline int irq_select_affinity_usr(unsigned int irq)
-{
- return irq_select_affinity(irq);
-}
-#endif
-
-static ssize_t write_irq_affinity(int type, struct file *file,
+static ssize_t irq_affinity_proc_write(struct file *file,
const char __user *buffer, size_t count, loff_t *pos)
{
unsigned int irq = (int)(long)pde_data(file_inode(file));
- cpumask_var_t new_value;
- int err;
-
- if (!irq_can_set_affinity_usr(irq) || no_irq_affinity)
- return -EIO;
-
- if (!zalloc_cpumask_var(&new_value, GFP_KERNEL))
- return -ENOMEM;
-
- if (type)
- err = cpumask_parselist_user(buffer, count, new_value);
- else
- err = cpumask_parse_user(buffer, count, new_value);
- if (err)
- goto free_cpumask;
- /*
- * Do not allow disabling IRQs completely - it's a too easy
- * way to make the system unusable accidentally :-) At least
- * one online CPU still has to be targeted.
- */
- if (!cpumask_intersects(new_value, cpu_online_mask)) {
- /*
- * Special case for empty set - allow the architecture code
- * to set default SMP affinity.
- */
- err = irq_select_affinity_usr(irq) ? -EINVAL : count;
- } else {
- err = irq_set_affinity(irq, new_value);
- if (!err)
- err = count;
- }
-
-free_cpumask:
- free_cpumask_var(new_value);
- return err;
-}
-
-static ssize_t irq_affinity_proc_write(struct file *file,
- const char __user *buffer, size_t count, loff_t *pos)
-{
- return write_irq_affinity(0, file, buffer, count, pos);
+ return write_irq_affinity(irq, buffer, count, false, true);
}
static ssize_t irq_affinity_list_proc_write(struct file *file,
const char __user *buffer, size_t count, loff_t *pos)
{
- return write_irq_affinity(1, file, buffer, count, pos);
+ unsigned int irq = (int)(long)pde_data(file_inode(file));
+
+ return write_irq_affinity(irq, buffer, count, true, true);
}
static int irq_affinity_proc_open(struct inode *inode, struct file *file)
--
2.40.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 4/5] irq: Move SMP affinity write handler out of proc.c
2023-05-30 21:45 ` [RFC PATCH 4/5] irq: Move SMP affinity write handler out of proc.c Radu Rendec
@ 2023-06-01 17:57 ` kernel test robot
0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2023-06-01 17:57 UTC (permalink / raw)
To: Radu Rendec; +Cc: oe-kbuild-all
Hi Radu,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:
[auto build test WARNING on linus/master]
[also build test WARNING on v6.4-rc4 next-20230601]
[cannot apply to tip/irq/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Radu-Rendec/irq-Always-enable-parent-interrupt-tracking/20230531-054837
base: linus/master
patch link: https://lore.kernel.org/r/20230530214550.864894-5-rrendec%40redhat.com
patch subject: [RFC PATCH 4/5] irq: Move SMP affinity write handler out of proc.c
config: arm64-randconfig-s041-20230531 (https://download.01.org/0day-ci/archive/20230602/202306020130.qPJz16Yw-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.3.0
reproduce:
mkdir -p ~/bin
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.4-39-gce1a6720-dirty
# https://github.com/intel-lab-lkp/linux/commit/de57fb00123349265cf26d7e81b4715d052d9451
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Radu-Rendec/irq-Always-enable-parent-interrupt-tracking/20230531-054837
git checkout de57fb00123349265cf26d7e81b4715d052d9451
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 SHELL=/bin/bash kernel/irq/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306020130.qPJz16Yw-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
>> kernel/irq/irqdesc.c:178:47: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const *buf @@ got char const [noderef] __user *buffer @@
kernel/irq/irqdesc.c:178:47: sparse: expected char const *buf
kernel/irq/irqdesc.c:178:47: sparse: got char const [noderef] __user *buffer
kernel/irq/irqdesc.c:177:47: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const *buf @@ got char const [noderef] __user *buffer @@
kernel/irq/irqdesc.c:177:47: sparse: expected char const *buf
kernel/irq/irqdesc.c:177:47: sparse: got char const [noderef] __user *buffer
kernel/irq/irqdesc.c:971:17: sparse: sparse: context imbalance in '__irq_get_desc_lock' - wrong count at exit
vim +178 kernel/irq/irqdesc.c
160
161 ssize_t write_irq_affinity(unsigned int irq, const char __user *buffer,
162 size_t count, bool is_list, bool is_user)
163 {
164 cpumask_var_t mask;
165 int err;
166
167 if (!irq_can_set_affinity_usr(irq) || no_irq_affinity)
168 return -EIO;
169
170 if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
171 return -ENOMEM;
172
173 if (is_user)
174 err = is_list ? cpumask_parselist_user(buffer, count, mask) :
175 cpumask_parse_user(buffer, count, mask);
176 else
177 err = is_list ? cpulist_parse(buffer, mask) :
> 178 cpumask_parse(buffer, mask);
179 if (err)
180 goto free_cpumask;
181
182 /*
183 * Do not allow disabling IRQs completely - it's a too easy
184 * way to make the system unusable accidentally :-) At least
185 * one online CPU still has to be targeted.
186 */
187 if (!cpumask_intersects(mask, cpu_online_mask)) {
188 /*
189 * Special case for empty set - allow the architecture code
190 * to set default SMP affinity.
191 */
192 err = irq_select_affinity_usr(irq) ? -EINVAL : count;
193 } else {
194 err = irq_set_affinity(irq, mask) ?: count;
195 }
196
197 free_cpumask:
198 free_cpumask_var(mask);
199 return err;
200 }
201 #endif
202
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC PATCH 5/5] irq: Add smp_affinity/list attributes to sysfs
2023-05-30 21:45 [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Radu Rendec
` (3 preceding siblings ...)
2023-05-30 21:45 ` [RFC PATCH 4/5] irq: Move SMP affinity write handler out of proc.c Radu Rendec
@ 2023-05-30 21:45 ` Radu Rendec
2023-06-01 23:22 ` kernel test robot
2023-05-31 13:09 ` [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Thomas Gleixner
5 siblings, 1 reply; 13+ messages in thread
From: Radu Rendec @ 2023-05-30 21:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Marc Zyngier, Thomas Gleixner
This patch adds the smp_affinity and smp_affinity_list attributes to the
sysfs interrupt interface. The implementation is identical to procfs,
and the attributes are visible only when CONFIG_SMP is enabled.
The intention is to allow SMP affinity to be controlled for chained
interrupt parents, which are typically not visible in procfs because
they are not requested through request_irq().
Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
kernel/irq/irqdesc.c | 55 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index a46a76c29b8d1..5b014df9fd730 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -210,6 +210,9 @@ static struct kobject *irq_kobj_base;
#define IRQ_ATTR_RO(_name) \
static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
+#define IRQ_ATTR_RW(_name) \
+static struct kobj_attribute _name##_attr = __ATTR_RW(_name)
+
static ssize_t per_cpu_count_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
@@ -332,6 +335,54 @@ static ssize_t actions_show(struct kobject *kobj,
}
IRQ_ATTR_RO(actions);
+#ifdef CONFIG_SMP
+static ssize_t smp_affinity_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
+ const struct cpumask *mask = desc->irq_common_data.affinity;
+
+#ifdef CONFIG_GENERIC_PENDING_IRQ
+ if (irqd_is_setaffinity_pending(&desc->irq_data))
+ mask = desc->pending_mask;
+#endif
+
+ return scnprintf(buf, PAGE_SIZE, "%*pb\n", cpumask_pr_args(mask));
+}
+static ssize_t smp_affinity_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
+
+ return write_irq_affinity(desc->irq_data.irq, buf, count, false, false);
+}
+IRQ_ATTR_RW(smp_affinity);
+
+static ssize_t smp_affinity_list_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
+ const struct cpumask *mask = desc->irq_common_data.affinity;
+
+#ifdef CONFIG_GENERIC_PENDING_IRQ
+ if (irqd_is_setaffinity_pending(&desc->irq_data))
+ mask = desc->pending_mask;
+#endif
+
+ return scnprintf(buf, PAGE_SIZE, "%*pbl\n", cpumask_pr_args(mask));
+}
+static ssize_t smp_affinity_list_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
+
+ return write_irq_affinity(desc->irq_data.irq, buf, count, true, false);
+}
+IRQ_ATTR_RW(smp_affinity_list);
+#endif
+
static struct attribute *irq_attrs[] = {
&per_cpu_count_attr.attr,
&chip_name_attr.attr,
@@ -340,6 +391,10 @@ static struct attribute *irq_attrs[] = {
&wakeup_attr.attr,
&name_attr.attr,
&actions_attr.attr,
+#ifdef CONFIG_SMP
+ &smp_affinity_attr.attr,
+ &smp_affinity_list_attr.attr,
+#endif
NULL
};
ATTRIBUTE_GROUPS(irq);
--
2.40.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 5/5] irq: Add smp_affinity/list attributes to sysfs
2023-05-30 21:45 ` [RFC PATCH 5/5] irq: Add smp_affinity/list attributes to sysfs Radu Rendec
@ 2023-06-01 23:22 ` kernel test robot
0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2023-06-01 23:22 UTC (permalink / raw)
To: Radu Rendec; +Cc: oe-kbuild-all
Hi Radu,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:
[auto build test WARNING on linus/master]
[also build test WARNING on v6.4-rc4 next-20230601]
[cannot apply to tip/irq/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Radu-Rendec/irq-Always-enable-parent-interrupt-tracking/20230531-054837
base: linus/master
patch link: https://lore.kernel.org/r/20230530214550.864894-6-rrendec%40redhat.com
patch subject: [RFC PATCH 5/5] irq: Add smp_affinity/list attributes to sysfs
config: arm64-randconfig-s041-20230531 (https://download.01.org/0day-ci/archive/20230602/202306020717.zwWktkxV-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.3.0
reproduce:
mkdir -p ~/bin
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.4-39-gce1a6720-dirty
# https://github.com/intel-lab-lkp/linux/commit/88907e76f490cd936da59266176bc9423b52a069
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Radu-Rendec/irq-Always-enable-parent-interrupt-tracking/20230531-054837
git checkout 88907e76f490cd936da59266176bc9423b52a069
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 SHELL=/bin/bash kernel/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202306020717.zwWktkxV-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
kernel/irq/irqdesc.c:178:47: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const *buf @@ got char const [noderef] __user *buffer @@
kernel/irq/irqdesc.c:178:47: sparse: expected char const *buf
kernel/irq/irqdesc.c:178:47: sparse: got char const [noderef] __user *buffer
kernel/irq/irqdesc.c:177:47: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const *buf @@ got char const [noderef] __user *buffer @@
kernel/irq/irqdesc.c:177:47: sparse: expected char const *buf
kernel/irq/irqdesc.c:177:47: sparse: got char const [noderef] __user *buffer
>> kernel/irq/irqdesc.c:358:55: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected char const [noderef] __user *buffer @@ got char const *buf @@
kernel/irq/irqdesc.c:358:55: sparse: expected char const [noderef] __user *buffer
kernel/irq/irqdesc.c:358:55: sparse: got char const *buf
kernel/irq/irqdesc.c:381:55: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected char const [noderef] __user *buffer @@ got char const *buf @@
kernel/irq/irqdesc.c:381:55: sparse: expected char const [noderef] __user *buffer
kernel/irq/irqdesc.c:381:55: sparse: got char const *buf
kernel/irq/irqdesc.c:1026:17: sparse: sparse: context imbalance in '__irq_get_desc_lock' - wrong count at exit
vim +358 kernel/irq/irqdesc.c
349
350 return scnprintf(buf, PAGE_SIZE, "%*pb\n", cpumask_pr_args(mask));
351 }
352 static ssize_t smp_affinity_store(struct kobject *kobj,
353 struct kobj_attribute *attr,
354 const char *buf, size_t count)
355 {
356 struct irq_desc *desc = container_of(kobj, struct irq_desc, kobj);
357
> 358 return write_irq_affinity(desc->irq_data.irq, buf, count, false, false);
359 }
360 IRQ_ATTR_RW(smp_affinity);
361
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control
2023-05-30 21:45 [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Radu Rendec
` (4 preceding siblings ...)
2023-05-30 21:45 ` [RFC PATCH 5/5] irq: Add smp_affinity/list attributes to sysfs Radu Rendec
@ 2023-05-31 13:09 ` Thomas Gleixner
2023-05-31 22:54 ` Radu Rendec
2023-06-20 15:58 ` Radu Rendec
5 siblings, 2 replies; 13+ messages in thread
From: Thomas Gleixner @ 2023-05-31 13:09 UTC (permalink / raw)
To: Radu Rendec, linux-kernel; +Cc: Marc Zyngier
On Tue, May 30 2023 at 17:45, Radu Rendec wrote:
> Note: I still need to update the Documentation/ directory for the new
> sysfs interface, and I will address that in a future version.
> At this point, I just want to get feedback about the current
> approach.
From a conceptual POV I understand why this is required, which makes me
hate this chained mechanism even more.
Aside of having no visibility (counters, affinity, etc) the worst thing
about these chained hidden interrupts is:
There is no control of runaway interrupts as they circumvent the core,
which has caused hard to diagnose system hangups in the past. See
ba714a9c1dea ("pinctrl/amd: Use regular interrupt instead of chained")
for demonstration.
The argument I heard for this chained interrupt muck is that it's so
much more performant than using regular interrupt handlers via
request_irq(). It's obviously less overhead, but whether it matters and
most importantly whether it justifies the downsides is a different
question.
There is also the argument about double accounting. Right now the
chained handler is not accounted and only the childs are.
Though that is inconsistent with other demultiplex handlers which _must_
use regular interrupt handlers (threaded) to demultiplex interrupt chips
which sit behind SPI/I2C...
The sum of child interrupts is also not necessarily the number of parent
interrupts, unless there is always exactly only one child handler to
invoke.
Quite some of those demux handlers are also not RT compatible.
AFAICT, there is no real technical reason at least not for the vast
majority of usage sites why the demultiplex handler cannot run inside a
regular interrupt handler context.
So I personally would prefer to get rid of this chained oddball and just
have consistent mechanisms for dealing with interrupts, which would just
avoid exposing the affinity files in two different places.
Providing information about child/parent relationship is an orthogonal
issue.
If there is some good reason (aside of the chained muck) to have sysfs
based affinity management, then I'm not objecting as long as the
functionality is the same, i.e. effective affinity needs be exposed too.
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control
2023-05-31 13:09 ` [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Thomas Gleixner
@ 2023-05-31 22:54 ` Radu Rendec
2023-06-20 15:58 ` Radu Rendec
1 sibling, 0 replies; 13+ messages in thread
From: Radu Rendec @ 2023-05-31 22:54 UTC (permalink / raw)
To: Thomas Gleixner, linux-kernel; +Cc: Marc Zyngier
On Wed, 2023-05-31 at 15:09 +0200, Thomas Gleixner wrote:
> From a conceptual POV I understand why this is required, which makes me
> hate this chained mechanism even more.
>
> [cut]
>
> AFAICT, there is no real technical reason at least not for the vast
> majority of usage sites why the demultiplex handler cannot run inside a
> regular interrupt handler context.
Thanks for taking the time to explain everything in such detail.
Much appreciated!
> So I personally would prefer to get rid of this chained oddball and just
> have consistent mechanisms for dealing with interrupts, which would just
> avoid exposing the affinity files in two different places.
Does this mean that if I came across a chained driver that lacked
affinity support, then changing it to use regular interrupts via
request_irq() would be a viable solution to enable affinity support
and you would consider accepting such a patch?
Taking a step back, in the case of hierarchical domains where *no*
multiplexing is involved, do you consider setting .irq_set_affinity()
to irq_chip_set_affinity_parent() a good way to enable affinity support
(assuming, of course, the driver lacks such support originally)?
At the end of the day, what I'm trying to do is find a way to enable
affinity support for a few specific drivers where it's lacking. My
initial impression, after reading Marc's message[1], was that the fix
lay at the system level, at least for multiplexing drivers. Hence my
naive attempt at a system-level fix. It is now becoming clear that the
fix will have to be evaluated/addressed at the driver level, on a case
by case basis.
As a side note, one thing I particularly like about your approach is
that it doesn't require any changes to irqbalance.
> Providing information about child/parent relationship is an orthogonal
> issue.
Agreed. Do you see any value in doing that? And, if yes, is reusing
(struct irq_desc).parent_irq and irq_set_parent() a good way of doing
it? FWIW, a multiplexing driver could do that regardless of how it
registers a handler for the parent interrupt (chained/regular).
> If there is some good reason (aside of the chained muck) to have sysfs
> based affinity management, then I'm not objecting as long as the
> functionality is the same, i.e. effective affinity needs be exposed too.
I can't think of any other reason. AFAICT, chained interrupts are the
only interrupts that are active but not visible in procfs. For any
other purpose, the procfs interface is good as it is.
Thanks,
Radu
[1] https://lore.kernel.org/all/87fslr7ygl.wl-maz@kernel.org/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control
2023-05-31 13:09 ` [RFC PATCH 0/5] irq: sysfs interface improvements for SMP affinity control Thomas Gleixner
2023-05-31 22:54 ` Radu Rendec
@ 2023-06-20 15:58 ` Radu Rendec
1 sibling, 0 replies; 13+ messages in thread
From: Radu Rendec @ 2023-06-20 15:58 UTC (permalink / raw)
To: Thomas Gleixner, Marc Zyngier; +Cc: linux-kernel
On Wed, 2023-05-31 at 15:09 +0200, Thomas Gleixner wrote:
> On Tue, May 30 2023 at 17:45, Radu Rendec wrote:
> > Note: I still need to update the Documentation/ directory for the new
> > sysfs interface, and I will address that in a future version.
> > At this point, I just want to get feedback about the current
> > approach.
>
> From a conceptual POV I understand why this is required, which makes me
> hate this chained mechanism even more.
>
> Aside of having no visibility (counters, affinity, etc) the worst thing
> about these chained hidden interrupts is:
>
> There is no control of runaway interrupts as they circumvent the core,
> which has caused hard to diagnose system hangups in the past. See
>
> ba714a9c1dea ("pinctrl/amd: Use regular interrupt instead of chained")
>
> for demonstration.
>
> The argument I heard for this chained interrupt muck is that it's so
> much more performant than using regular interrupt handlers via
> request_irq(). It's obviously less overhead, but whether it matters and
> most importantly whether it justifies the downsides is a different
> question.
>
> There is also the argument about double accounting. Right now the
> chained handler is not accounted and only the childs are.
>
> Though that is inconsistent with other demultiplex handlers which _must_
> use regular interrupt handlers (threaded) to demultiplex interrupt chips
> which sit behind SPI/I2C...
>
> The sum of child interrupts is also not necessarily the number of parent
> interrupts, unless there is always exactly only one child handler to
> invoke.
>
> Quite some of those demux handlers are also not RT compatible.
>
> AFAICT, there is no real technical reason at least not for the vast
> majority of usage sites why the demultiplex handler cannot run inside a
> regular interrupt handler context.
I was about to send an RFC patch that converts a multiplexing IRQ chip
driver from chained to regular interrupts, when I realized I had come
full circle. Marc rejected a similar patch in the past [1] and the main
argument was that exposing the parent interrupt in procfs is backwards
incompatible. Quote:
The problem of changing affinities for chained (or multiplexing)
interrupts is, to make it short, that it breaks the existing userspace
ABI that a change in affinity affects only the interrupt userspace
acts upon, and no other. Which is why we don't expose any affinity
setting for such an interrupt, as by definition changing its affinity
affects all the interrupts that are muxed onto it.
So, there are two contradictory arguments here:
* Chained interrupts are "muck", mainly because they circumvent the
interrupt core and prevent the interrupt storm detector from kicking
in when hardware misbehaves (and for other reasons as well).
* Exposing the parent interrupt affinity control in procfs breaks the
userspace ABI because all child interrupts inherently share the same
affinity settings. This implies regular interrupts cannot be used.
Meanwhile, both interrupt storms and the lack of affinity control
support for multiplexing drivers are real problems, and my team and I
came across both. FWIW, the interrupt storm is the one we found more
recently, and it's the reason why I wanted to send that RFC patch I
mentioned before.
Is Marc's argument still relevant? Perhaps with both arguments in the
same email it becomes clearer that there needs to be some alignment at
the maintainer level. I would be more than happy to send patches and
help fix both issues. But I think that's not possible until you come up
with a strategy that you *both* agree to.
[1] https://lore.kernel.org/all/874k0bf7f7.wl-maz@kernel.org/
Thanks,
Radu
> So I personally would prefer to get rid of this chained oddball and just
> have consistent mechanisms for dealing with interrupts, which would just
> avoid exposing the affinity files in two different places.
>
> Providing information about child/parent relationship is an orthogonal
> issue.
>
> If there is some good reason (aside of the chained muck) to have sysfs
> based affinity management, then I'm not objecting as long as the
> functionality is the same, i.e. effective affinity needs be exposed too.
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 13+ messages in thread