All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] genirq: don't leak handler procfs entries
@ 2023-08-14  9:36 Bartosz Golaszewski
  2023-08-14  9:36 ` [PATCH 1/2] genirq: proc: drop unused argument from unregister_handler_proc() Bartosz Golaszewski
  2023-08-14  9:36 ` [PATCH 2/2] genirq: proc: fix a procfs entry leak Bartosz Golaszewski
  0 siblings, 2 replies; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-14  9:36 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier; +Cc: linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

When we remove the procfs entry for an irq desc that's still in use, we
leak the procfs entries created per handler. We need to go through the
irqaction chain and remove all entries before finally removing the irq's
top procfs directory.

First patch drops an unused argument from unregister_handler_proc(), the
second fixes the actual leak.

Bartosz Golaszewski (2):
  genirq: proc: drop unused argument from unregister_handler_proc()
  genirq: proc: fix a procfs entry leak

 kernel/irq/internals.h |  5 ++---
 kernel/irq/manage.c    |  6 +++---
 kernel/irq/proc.c      | 17 ++++++++++++++++-
 3 files changed, 21 insertions(+), 7 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2] genirq: proc: drop unused argument from unregister_handler_proc()
  2023-08-14  9:36 [PATCH 0/2] genirq: don't leak handler procfs entries Bartosz Golaszewski
@ 2023-08-14  9:36 ` Bartosz Golaszewski
  2023-08-14  9:36 ` [PATCH 2/2] genirq: proc: fix a procfs entry leak Bartosz Golaszewski
  1 sibling, 0 replies; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-14  9:36 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier; +Cc: linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

The irq argument is unused. Drop it.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 kernel/irq/internals.h | 5 ++---
 kernel/irq/manage.c    | 6 +++---
 kernel/irq/proc.c      | 2 +-
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index bdd35bb9c735..eee0e27e6750 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -125,14 +125,13 @@ void __irq_wake_thread(struct irq_desc *desc, struct irqaction *action);
 extern void register_irq_proc(unsigned int irq, struct irq_desc *desc);
 extern void unregister_irq_proc(unsigned int irq, struct irq_desc *desc);
 extern void register_handler_proc(unsigned int irq, struct irqaction *action);
-extern void unregister_handler_proc(unsigned int irq, struct irqaction *action);
+extern void unregister_handler_proc(struct irqaction *action);
 #else
 static inline void register_irq_proc(unsigned int irq, struct irq_desc *desc) { }
 static inline void unregister_irq_proc(unsigned int irq, struct irq_desc *desc) { }
 static inline void register_handler_proc(unsigned int irq,
 					 struct irqaction *action) { }
-static inline void unregister_handler_proc(unsigned int irq,
-					   struct irqaction *action) { }
+static inline void unregister_handler_proc(struct irqaction *action) { }
 #endif
 
 extern bool irq_can_set_affinity_usr(unsigned int irq);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index d2742af0f0fd..7ed8a151ded8 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1937,7 +1937,7 @@ static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id)
 	 */
 	chip_bus_sync_unlock(desc);
 
-	unregister_handler_proc(irq, action);
+	unregister_handler_proc(action);
 
 	/*
 	 * Make sure it's not being used on another CPU and if the chip
@@ -2056,7 +2056,7 @@ static const void *__cleanup_nmi(unsigned int irq, struct irq_desc *desc)
 	if (!WARN_ON(desc->action == NULL)) {
 		irq_pm_remove_action(desc, desc->action);
 		devname = desc->action->name;
-		unregister_handler_proc(irq, desc->action);
+		unregister_handler_proc(desc->action);
 
 		kfree(desc->action);
 		desc->action = NULL;
@@ -2487,7 +2487,7 @@ static struct irqaction *__free_percpu_irq(unsigned int irq, void __percpu *dev_
 
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
 
-	unregister_handler_proc(irq, action);
+	unregister_handler_proc(action);
 
 	irq_chip_pm_put(&desc->irq_data);
 	module_put(desc->owner);
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 623b8136e9af..83ed403991c6 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -414,7 +414,7 @@ void unregister_irq_proc(unsigned int irq, struct irq_desc *desc)
 
 #undef MAX_NAMELEN
 
-void unregister_handler_proc(unsigned int irq, struct irqaction *action)
+void unregister_handler_proc(struct irqaction *action)
 {
 	proc_remove(action->dir);
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-14  9:36 [PATCH 0/2] genirq: don't leak handler procfs entries Bartosz Golaszewski
  2023-08-14  9:36 ` [PATCH 1/2] genirq: proc: drop unused argument from unregister_handler_proc() Bartosz Golaszewski
@ 2023-08-14  9:36 ` Bartosz Golaszewski
  2023-08-24 20:12   ` Thomas Gleixner
  1 sibling, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-14  9:36 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier; +Cc: linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

When removing the proc entry for a desc that still has active users, we
will leak the irqaction entries. Let's remove them in
unregister_irq_proc().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 kernel/irq/proc.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 83ed403991c6..b284604a091a 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -390,6 +390,15 @@ void register_irq_proc(unsigned int irq, struct irq_desc *desc)
 	mutex_unlock(&register_lock);
 }
 
+static void unregister_action_proc(struct irqaction *action)
+{
+	if (!action)
+		return;
+
+	unregister_action_proc(action->secondary);
+	unregister_handler_proc(action);
+}
+
 void unregister_irq_proc(unsigned int irq, struct irq_desc *desc)
 {
 	char name [MAX_NAMELEN];
@@ -408,6 +417,12 @@ void unregister_irq_proc(unsigned int irq, struct irq_desc *desc)
 #endif
 	remove_proc_entry("spurious", desc->dir);
 
+	/*
+	 * If at this point, this irq desc is still requested, we need to
+	 * remove the proc handler entries or we'll leak them.
+	 */
+	unregister_action_proc(desc->action);
+
 	sprintf(name, "%u", irq);
 	remove_proc_entry(name, root_irq_dir);
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-14  9:36 ` [PATCH 2/2] genirq: proc: fix a procfs entry leak Bartosz Golaszewski
@ 2023-08-24 20:12   ` Thomas Gleixner
  2023-08-25  7:36     ` brgl
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-24 20:12 UTC (permalink / raw)
  To: Bartosz Golaszewski, Marc Zyngier; +Cc: linux-kernel, Bartosz Golaszewski

On Mon, Aug 14 2023 at 11:36, Bartosz Golaszewski wrote:

Please read and follow the guidelines of the tip tree:

  https://www.kernel.org/doc/html/latest/process/maintainer-tip.html

> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> When removing the proc entry for a desc that still has active users, we
> will leak the irqaction entries. Let's remove them in
> unregister_irq_proc().

How exactly is an interrupt descriptor freed which has still active
users?

If that happens then the procfs entry leak is the least of your worries.

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-24 20:12   ` Thomas Gleixner
@ 2023-08-25  7:36     ` brgl
  2023-08-25  8:11       ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: brgl @ 2023-08-25  7:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Bartosz Golaszewski, Marc Zyngier

On Thu, 24 Aug 2023 22:12:41 +0200, Thomas Gleixner <tglx@linutronix.de> said:
> On Mon, Aug 14 2023 at 11:36, Bartosz Golaszewski wrote:
>
> Please read and follow the guidelines of the tip tree:
>
>   https://www.kernel.org/doc/html/latest/process/maintainer-tip.html
>

Gah! And I sent a second series about the same time for genirq with a correct
subject line...

>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>
>> When removing the proc entry for a desc that still has active users, we
>> will leak the irqaction entries. Let's remove them in
>> unregister_irq_proc().
>
> How exactly is an interrupt descriptor freed which has still active
> users?
>
> If that happens then the procfs entry leak is the least of your worries.
>

Let's consider the following stack trace

------------[ cut here ]------------
remove_proc_entry: removing non-empty directory 'irq/30', leaking at
least 'reset'
WARNING: CPU: 5 PID: 261 at fs/proc/generic.c:717 remove_proc_entry+0x27a/0x290
Modules linked in: gpio_sim gpio_consumer cfg80211 parport_pc parport
nfsd sch_fq_codel fuse cons
CPU: 5 PID: 261 Comm: test.sh Not tainted
6.5.0-rc7-next-20230825-yocto-standard+ #125
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.16.2-0-gea1b7a073390-prebuilt.qem4
RIP: 0010:remove_proc_entry+0x27a/0x290
Code: bd a0 00 00 00 e8 46 30 ee ff 49 8b 95 a0 00 00 00 4d 89 f8 4c
89 f1 48 c7 c6 40 2d b4 bc 0
RSP: 0018:ffffc90000e5f530 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 1ffff920001cbea6 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffffbaf38a18 RDI: ffffffffbe1c40a0
RBP: ffffc90000e5f5d8 R08: 0000000000000001 R09: ffffed100dad4eb9
R10: ffff88806d6a75cb R11: 0000000000000001 R12: ffff88800b086100
R13: ffff888001a20200 R14: ffff88800b0861ac R15: ffff88800b0869ac
FS:  00007f66796fe740(0000) GS:ffff88806d680000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055eb5b851090 CR3: 000000000b182000 CR4: 00000000001506e0
Call Trace:
 <TASK>
 ? show_regs+0x65/0x70
 ? __warn+0xa1/0x1c0
 ? remove_proc_entry+0x27a/0x290
 ? report_bug+0x202/0x230
 ? handle_bug+0x79/0xa0
 ? exc_invalid_op+0x18/0x50
 ? asm_exc_invalid_op+0x1b/0x20
 ? preempt_count_sub+0x18/0xc0
 ? remove_proc_entry+0x27a/0x290
 ? __pfx_remove_proc_entry+0x10/0x10
 ? mtree_load+0x3b9/0x460
 unregister_irq_proc+0x16f/0x1b0
 ? __pfx_unregister_irq_proc+0x10/0x10
 ? __kmem_cache_free+0x183/0x250
 free_desc+0x92/0x160
 ? __pfx_free_desc+0x10/0x10
 ? __kasan_check_write+0x14/0x20
 ? mutex_lock+0x86/0xe0
 ? __pfx_mutex_lock+0x10/0x10
 ? __kasan_check_write+0x14/0x20
 ? mutex_unlock+0x81/0xd0
 ? __pfx_mutex_unlock+0x10/0x10
 irq_free_descs+0x4c/0x70
 irq_dispose_mapping+0x93/0x1f0
 gpio_sim_dispose_mappings+0xd9/0x140 [gpio_sim]
 ? __pfx_gpio_sim_dispose_mappings+0x10/0x10 [gpio_sim]
 ? __kmem_cache_free+0x183/0x250
 ? __pfx_gpio_sim_dispose_mappings+0x10/0x10 [gpio_sim]
 devm_action_release+0x2d/0x40
 release_nodes+0x8f/0x170
 devres_release_all+0xef/0x140
 ? __pfx_devres_release_all+0x10/0x10
 ? dev_pm_domain_detach+0x20/0x60
 device_unbind_cleanup+0x14/0xd0
 device_release_driver_internal+0x2af/0x300
 device_release_driver+0x12/0x20
 bus_remove_device+0x12d/0x180
 device_del+0x277/0x720
 ? __pfx_device_del+0x10/0x10
 ? __kasan_check_write+0x14/0x20
 platform_device_del.part.0+0x1e/0xe0
 platform_device_unregister+0x20/0x40
 gpio_sim_device_config_live_store+0x3a4/0xa10 [gpio_sim]
 ? __kasan_check_write+0x14/0x20
 ? copyin+0x40/0x60
 ? __pfx_gpio_sim_device_config_live_store+0x10/0x10 [gpio_sim]
 ? preempt_count_sub+0x18/0xc0
 ? down_read+0xc6/0x200
 ? __pfx_down_read+0x10/0x10
 ? __pfx_mutex_lock+0x10/0x10
 ? __pfx_locks_remove_posix+0x10/0x10
 ? __pfx_gpio_sim_device_config_live_store+0x10/0x10 [gpio_sim]
 configfs_write_iter+0x16a/0x1f0 [configfs]
 vfs_write+0x408/0x750
 ? __pfx_vfs_write+0x10/0x10
 ksys_write+0xd9/0x180
 ? __pfx_ksys_write+0x10/0x10
 ? debug_smp_processor_id+0x17/0x20
 __x64_sys_write+0x42/0x50
 do_syscall_64+0x43/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f66797f21a4
Code: 15 79 7c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f
00 f3 0f 1e fa 80 3d 3d 0c 8
RSP: 002b:00007fffa35ef228 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f66797f21a4
RDX: 0000000000000002 RSI: 000055eb5b864240 RDI: 0000000000000001
RBP: 00007f66798cb5a0 R08: 0000000000000000 R09: 0000000000000000
R10: 000055eb5b864080 R11: 0000000000000202 R12: 0000000000000002
R13: 000055eb5b864240 R14: 0000000000000002 R15: 00007f66798cb7a0
 </TASK>
---[ end trace 0000000000000000 ]---

Here a GPIO device - that is also an irq chip - is unbound (this is a testing
module unbound during a test-case but it can be anything else, like an I2C
expander for which the driver is unloaded) while some users called
request_irq() on its interrupts (this is orthogonal to gpiod_get() and doesn't
take a reference to the module, so nothing is stopping us from unloading it)
and didn't call free_irq() before the device went down (they don't have to,
they can't know about the unbinding).

The irq chip disposes of all mappings held by its domain and then tears down
the domain. This looks fine from the interrupt subsystem point of view. The
mappings are gone, so there's nothing in the maple tree, when the users
eventually do call free_irq(), they look up the mapping, get a NULL and return.

KASAN doesn't report any bugs, kmemleak doesn't show any other leaks, the only
problem here seems to be the procfs leak which this patch addresses, unless I'm
missing something - my knowledge of this part is limited.

I hope it makes sense, if it does, I can resend the patch with a more detailed
message.

Bartosz

> Thanks,
>
>         tglx
>
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-25  7:36     ` brgl
@ 2023-08-25  8:11       ` Thomas Gleixner
  2023-08-25 11:01         ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-25  8:11 UTC (permalink / raw)
  To: brgl; +Cc: linux-kernel, Bartosz Golaszewski, Bartosz Golaszewski, Marc Zyngier

On Fri, Aug 25 2023 at 00:36, brgl@bgdev.pl wrote:
> On Thu, 24 Aug 2023 22:12:41 +0200, Thomas Gleixner <tglx@linutronix.de> said:
> Here a GPIO device - that is also an irq chip - is unbound (this is a testing
> module unbound during a test-case but it can be anything else, like an I2C
> expander for which the driver is unloaded) while some users called
> request_irq() on its interrupts (this is orthogonal to gpiod_get() and doesn't
> take a reference to the module, so nothing is stopping us from
> unloading it)

You just described the real problem in this sentence. So why are you
trying to cure a symptom?

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-25  8:11       ` Thomas Gleixner
@ 2023-08-25 11:01         ` Bartosz Golaszewski
  2023-08-25 14:58           ` Thomas Gleixner
  2023-08-25 17:08           ` Thomas Gleixner
  0 siblings, 2 replies; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-25 11:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Fri, Aug 25, 2023 at 10:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, Aug 25 2023 at 00:36, brgl@bgdev.pl wrote:
> > On Thu, 24 Aug 2023 22:12:41 +0200, Thomas Gleixner <tglx@linutronix.de> said:
> > Here a GPIO device - that is also an irq chip - is unbound (this is a testing
> > module unbound during a test-case but it can be anything else, like an I2C
> > expander for which the driver is unloaded) while some users called
> > request_irq() on its interrupts (this is orthogonal to gpiod_get() and doesn't
> > take a reference to the module, so nothing is stopping us from
> > unloading it)
>
> You just described the real problem in this sentence. So why are you
> trying to cure a symptom?
>

Cc: Linus Walleij

Honestly I'm not following. Even if we did have a way of taking the
reference to the underlying GPIO module (I'm 99% percent sure, it's
not possible: we're not using any of the GPIO interfaces for that,
just platform_get_irq() or similar and all the GPIO subsystem sees is
the call to .irq_map() but that doesn't look like a reliable place to
take that reference) - that wouldn't stop anyone from unbinding the
device elsewhere: from user-space over sysfs or maybe it's a GPIO
expander over USB that gets unplugged (I know that it would only be
described in DT if it was hard-wired but it's still within the realm
of possibility).

Because this situation (removing a device while it still has users) is
possible, we have a way of handling that in the GPIO subsystem, where
if a device you're using goes from under you, the GPIO descriptor (the
only interface consumers have to that device) is "numbed down" and no
longer functional but doesn't leak resources or crash.

I was under the impression that the whole irqnum-to-irq_desc mapping
was designed to handle this situation on purpose, hence a check for
!desc and a silent return in free_irq(). If a missing mapping was a
bug, then it would warrant at least a warning, right?

Bartosz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-25 11:01         ` Bartosz Golaszewski
@ 2023-08-25 14:58           ` Thomas Gleixner
  2023-08-25 17:08           ` Thomas Gleixner
  1 sibling, 0 replies; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-25 14:58 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Fri, Aug 25 2023 at 13:01, Bartosz Golaszewski wrote:
> On Fri, Aug 25, 2023 at 10:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> I was under the impression that the whole irqnum-to-irq_desc mapping
> was designed to handle this situation on purpose, hence a check for
> !desc and a silent return in free_irq(). If a missing mapping was a
> bug, then it would warrant at least a warning, right?

The check for !desc has nothing to do with the problem you are trying to
solve. There is obviously a valid interrupt descriptor so desc != NULL,
otherwise there would be no procfs entries and no actions, no?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-25 11:01         ` Bartosz Golaszewski
  2023-08-25 14:58           ` Thomas Gleixner
@ 2023-08-25 17:08           ` Thomas Gleixner
  2023-08-25 20:07             ` Bartosz Golaszewski
  1 sibling, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-25 17:08 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Fri, Aug 25 2023 at 13:01, Bartosz Golaszewski wrote:
> On Fri, Aug 25, 2023 at 10:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Fri, Aug 25 2023 at 00:36, brgl@bgdev.pl wrote:
>> > On Thu, 24 Aug 2023 22:12:41 +0200, Thomas Gleixner <tglx@linutronix.de> said:
>> > Here a GPIO device - that is also an irq chip - is unbound (this is a testing
>> > module unbound during a test-case but it can be anything else, like an I2C
>> > expander for which the driver is unloaded) while some users called
>> > request_irq() on its interrupts (this is orthogonal to gpiod_get() and doesn't
>> > take a reference to the module, so nothing is stopping us from
>> > unloading it)
>>
>> You just described the real problem in this sentence. So why are you
>> trying to cure a symptom?
>
> Honestly I'm not following. Even if we did have a way of taking the
> reference to the underlying GPIO module (I'm 99% percent sure, it's
> not possible: we're not using any of the GPIO interfaces for that,

The point is that something frees an in-use interrupt descriptor, which
is obviously wrong to begin with.

Now you go and cure the symptom of a stale procfs file, but as I said
before this is the least of the worries.

Care to look at what free_irq() does and you might figure out why this
stale file is just an easy to observe symptom, but obviously not the
real problem.

This leaves an activated interrupt around with stale pointers to the
descriptor. The interrupt could be on the flight. The associated thread
could be running. There can be resources claimed via request_irq() which
will not be freed either. There can be management operations on the
interrupt in parallel.

A driver which successfully requests an interrupt can rightfully expect
that any management operation on the interrupt, e.g. disable(),
enable(), set_*() is valid until the point it invokes free_irq().

IOW, the descriptor including the associated interrupt chips (software
representation) in the hierarchy must be at least in a consistent state
and accessible. If the underlying hardware vanished, e.g. USB
disconnect, then still the software side must be intact. Of course an
disable_irq() won't reach the hardware anymore, but that's something the
relevant driver has to handle correctly.

So just claiming that it's fine to free in-use interrupts and everything
is greate by removing the procfs entries is just wishful thinking.

You simply cannot free an in-use interrupt descriptor and I'm going to
add a big fat warning into that code to that effect.

So if it turns out that this is a general problem, then we have to sit
down and solve it from ground up.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-25 17:08           ` Thomas Gleixner
@ 2023-08-25 20:07             ` Bartosz Golaszewski
  2023-08-26 15:08               ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-25 20:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Fri, Aug 25, 2023 at 7:08 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, Aug 25 2023 at 13:01, Bartosz Golaszewski wrote:
> > On Fri, Aug 25, 2023 at 10:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> On Fri, Aug 25 2023 at 00:36, brgl@bgdev.pl wrote:
> >> > On Thu, 24 Aug 2023 22:12:41 +0200, Thomas Gleixner <tglx@linutronix.de> said:
> >> > Here a GPIO device - that is also an irq chip - is unbound (this is a testing
> >> > module unbound during a test-case but it can be anything else, like an I2C
> >> > expander for which the driver is unloaded) while some users called
> >> > request_irq() on its interrupts (this is orthogonal to gpiod_get() and doesn't
> >> > take a reference to the module, so nothing is stopping us from
> >> > unloading it)
> >>
> >> You just described the real problem in this sentence. So why are you
> >> trying to cure a symptom?
> >
> > Honestly I'm not following. Even if we did have a way of taking the
> > reference to the underlying GPIO module (I'm 99% percent sure, it's
> > not possible: we're not using any of the GPIO interfaces for that,
>
> The point is that something frees an in-use interrupt descriptor, which
> is obviously wrong to begin with.
>

It happens in irq_dispose_mapping() when the domain is not
hierarchical and irq_free_desc() is called in the if branch.

> Now you go and cure the symptom of a stale procfs file, but as I said
> before this is the least of the worries.
>
> Care to look at what free_irq() does and you might figure out why this
> stale file is just an easy to observe symptom, but obviously not the
> real problem.
>
> This leaves an activated interrupt around with stale pointers to the
> descriptor. The interrupt could be on the flight. The associated thread
> could be running. There can be resources claimed via request_irq() which
> will not be freed either. There can be management operations on the
> interrupt in parallel.
>
> A driver which successfully requests an interrupt can rightfully expect
> that any management operation on the interrupt, e.g. disable(),
> enable(), set_*() is valid until the point it invokes free_irq().
>

Whether by "valid" you mean "always succeeds" or merely just "doesn't
crash the kernel", it basically means that the consumer that requested
the resource (irq, GPIO or otherwise) can always call any of the
consumer-facing calls and the handle it got from the provider (GPIO
descriptor, irq number) remains valid even after the device backing
that handle is gone - in which case the call either returns 0 and acts
like nothing happened or returns an appropriate error code (most
likely -ENODEV).

Or even a combination of both, like we do in the GPIO subsystem where
we emit a warning to kernel log but we still return 0.

> IOW, the descriptor including the associated interrupt chips (software
> representation) in the hierarchy must be at least in a consistent state
> and accessible. If the underlying hardware vanished, e.g. USB
> disconnect, then still the software side must be intact. Of course an
> disable_irq() won't reach the hardware anymore, but that's something the
> relevant driver has to handle correctly.
>

Absolutely agree! This is precisely what we do in the GPIO subsystem.
Except that we migrated from using a GPIO numberspace similar to the
irq numbers in favor of using GPIO descriptors which are simple
structs containing some meta-data about the GPIO request and - most
importantly - a reference to the GPIO device backing the descriptor.

That GPIO device (still talking about the software representation) has
two layers and is composed of a struct that embeds struct device and
is reference counted via the kobject refs inside struct device as well
as a second struct called the GPIO chip which on the other hand can be
destroyed at any time (most likely when the provider device is
unbound). struct gpio_device is guaranteed to exist for as long as
anyone references it even after struct gpio_chip was freed. Whenever
the user calls any of the consumer-facing APIs, we dereference the
descriptor, take the gpio_device pointer and see if the chip is still
there. If it's not - we return 0 as mentioned above.

Now for irqs, the consumer-facing "handle" is the interrupt number. I
don't know what the rationale is for that as it forces us to convert
it to a descriptor internally everytime we use it (tree lookup!) but
from what you're saying, if the domain backing it is destroyed and
with it, the interrupt descriptors, then management calls may end up
triggering use-after-free bugs. It seems about right as not all of
them check if the mapping can be found. I may have been lucky to never
trigger it as I was only using request_irq() and free_irq().

As I explained before in this thread - it's perfectly normal for the
provider of most resources in the kernel to be gone with users still
holding the respective handles. Maybe functions in linux/interrupt.h
could use some audit in order to make sure they can handle this. It
seems to me that the current infrastructure is ready as all it takes
is checking if irq_to_desc() returns a non-NULL value. Or I may be
getting it wrong and it's much more complex than that.

> So just claiming that it's fine to free in-use interrupts and everything
> is greate by removing the procfs entries is just wishful thinking.
>
> You simply cannot free an in-use interrupt descriptor and I'm going to
> add a big fat warning into that code to that effect.
>

With the above example, if our GPIO desc is analogous to the interrupt
number in the irq world - I'm not really sure what the role of the
irq_desc is. Should it be the *handle* that users get when they
request an irq? Maybe it is what the GPIO device is for us? Should it
be reference counted then?

> So if it turns out that this is a general problem, then we have to sit
> down and solve it from ground up.
>

It may very well be. I guess it will require a more detailed
investigation. I'd still say this patch is correct though, as the
self-contained function removing a procfs hierarchy should remove all
sub-directories and not just leak them. They don't hold any irq
resources anyway.

Are you fine with the first, cleanup patch in this series BTW?

Bartosz

> Thanks,
>
>         tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-25 20:07             ` Bartosz Golaszewski
@ 2023-08-26 15:08               ` Thomas Gleixner
  2023-08-28 10:06                 ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-26 15:08 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Fri, Aug 25 2023 at 22:07, Bartosz Golaszewski wrote:
> On Fri, Aug 25, 2023 at 7:08 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> The point is that something frees an in-use interrupt descriptor, which
>> is obviously wrong to begin with.
>>
> It happens in irq_dispose_mapping() when the domain is not
> hierarchical and irq_free_desc() is called in the if branch.

Again. You are explaining what the callchain is. That's not interesting
at all (I can find the places which invoke irq_free_descs() with grep).

The question is WHY is irq_dispose_mapping() called when the interrupt
in question _IS_ in use, i.e. requested and not yet released via
free_irq().

That's the underlying problem which needs to be addressed.

> Now for irqs, the consumer-facing "handle" is the interrupt number. I
> don't know what the rationale is for that as it forces us to convert
> it to a descriptor internally everytime we use it (tree lookup!) but

Perhaps because operating on an integer is not really giving you access
to the underlying mechanisms. The tree lookup is really not an issue and
it's a mechanism which is used all over the place in the kernel to
translate a cookie or identifier to the internal data representation of
a subsystem.

> As I explained before in this thread - it's perfectly normal for the
> provider of most resources in the kernel to be gone with users still
> holding the respective handles.

No. It's not most. It's only the case when the subsystem explicitely
documents that it can handle it, which the interrupt subsystem does not.

> Maybe functions in linux/interrupt.h could use some audit in order to
> make sure they can handle this.

Maybe you stop claiming that it's perfectly legit to free resources
which are in use. It's not and the interrupt subsystem never was making
this guarantee and never was designed for it.

> It seems to me that the current infrastructure is ready as all it
> takes is checking if irq_to_desc() returns a non-NULL value. Or I may
> be getting it wrong and it's much more complex than that.

Again:

It's not just NULL checks, which exist already. It's not just a stale
procfs file which needs to be removed. Did you actually look what
free_irq() does?

Obviously not, otherwise you might have noticed that removing the
resources leaks:

   1) The interrupt action itself
   2) Any interrupt threads associated with the action
   3) The procfs entry
   4) Any resource which was allocated during request_irq()
   5) Interrupt descriptor pointers stored separately for low level
      interrupt handling code

Further it might:

   1) leave hardware in an undefined state
   2) race against an interrupt being processed concurrently

How do you address this with slapping some NULL checks around? The only
way to clean this up is invoking free_irq().

>> So just claiming that it's fine to free in-use interrupts and everything
>> is greate by removing the procfs entries is just wishful thinking.
>>
>> You simply cannot free an in-use interrupt descriptor and I'm going to
>> add a big fat warning into that code to that effect.
>>
> With the above example, if our GPIO desc is analogous to the interrupt
> number in the irq world - I'm not really sure what the role of the
> irq_desc is. Should it be the *handle* that users get when they
> request an irq? Maybe it is what the GPIO device is for us? Should it
> be reference counted then?

It's an internal data structure which is not accessible outside of the
interrupt core and architecture low level code. The number is the
identifier for the consumers to interact with the core code. That
concept is called abstraction. What's so hard about that?

>> So if it turns out that this is a general problem, then we have to sit
>> down and solve it from ground up.
>
> It may very well be. I guess it will require a more detailed
> investigation. I'd still say this patch is correct though, as the
> self-contained function removing a procfs hierarchy should remove all
> sub-directories and not just leak them. They don't hold any irq
> resources anyway.

This patch is not correct at all. Once again:

 The interrupt subsystem is not designed to have its underlying
 resources be freed when an interrupt is in-use.

It's that simple. And no, your patch is not changing this. It tries to
paper over the violation of the rule of this subsystem.

The below is going to be applied ASAP to make this entirely clear. And
yes, that's going to leak a bit more than just the procfs entry.

The proper solution to this is to take a reference count on the module
which provides the interrupt chip and allocates the interrupt domain.
The core code has a mechanism for that. See request/free_irq().

Unfortunately it is not properly utilized by the irqdomain
infrastructure today. But that's a fixable problem.

Thanks,

        tglx
---
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 27ca1c866f29..5382fd4e7588 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -466,6 +466,9 @@ static void free_desc(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
 
+	if (WARN_ON_ONCE(desc->action))
+		return;
+
 	irq_remove_debugfs_entry(desc);
 	unregister_irq_proc(irq, desc);
 


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-26 15:08               ` Thomas Gleixner
@ 2023-08-28 10:06                 ` Bartosz Golaszewski
  2023-08-28 12:41                   ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-28 10:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Sat, Aug 26, 2023 at 5:08 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, Aug 25 2023 at 22:07, Bartosz Golaszewski wrote:
> > On Fri, Aug 25, 2023 at 7:08 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> The point is that something frees an in-use interrupt descriptor, which
> >> is obviously wrong to begin with.
> >>
> > It happens in irq_dispose_mapping() when the domain is not
> > hierarchical and irq_free_desc() is called in the if branch.
>
> Again. You are explaining what the callchain is. That's not interesting
> at all (I can find the places which invoke irq_free_descs() with grep).
>
> The question is WHY is irq_dispose_mapping() called when the interrupt
> in question _IS_ in use, i.e. requested and not yet released via
> free_irq().
>
> That's the underlying problem which needs to be addressed.
>
> > Now for irqs, the consumer-facing "handle" is the interrupt number. I
> > don't know what the rationale is for that as it forces us to convert
> > it to a descriptor internally everytime we use it (tree lookup!) but
>
> Perhaps because operating on an integer is not really giving you access
> to the underlying mechanisms. The tree lookup is really not an issue and
> it's a mechanism which is used all over the place in the kernel to
> translate a cookie or identifier to the internal data representation of
> a subsystem.
>
> > As I explained before in this thread - it's perfectly normal for the
> > provider of most resources in the kernel to be gone with users still
> > holding the respective handles.
>
> No. It's not most. It's only the case when the subsystem explicitely
> documents that it can handle it, which the interrupt subsystem does not.
>
> > Maybe functions in linux/interrupt.h could use some audit in order to
> > make sure they can handle this.
>
> Maybe you stop claiming that it's perfectly legit to free resources
> which are in use. It's not and the interrupt subsystem never was making
> this guarantee and never was designed for it.
>
> > It seems to me that the current infrastructure is ready as all it
> > takes is checking if irq_to_desc() returns a non-NULL value. Or I may
> > be getting it wrong and it's much more complex than that.
>
> Again:
>
> It's not just NULL checks, which exist already. It's not just a stale
> procfs file which needs to be removed. Did you actually look what
> free_irq() does?
>
> Obviously not, otherwise you might have noticed that removing the
> resources leaks:
>
>    1) The interrupt action itself
>    2) Any interrupt threads associated with the action
>    3) The procfs entry
>    4) Any resource which was allocated during request_irq()
>    5) Interrupt descriptor pointers stored separately for low level
>       interrupt handling code
>
> Further it might:
>
>    1) leave hardware in an undefined state
>    2) race against an interrupt being processed concurrently
>
> How do you address this with slapping some NULL checks around? The only
> way to clean this up is invoking free_irq().
>

This is not what I meant, let me rephrase the question:

Is there any reason why whatever operations irq_free() performs could
not be done by the irqchip when it knows it will be destroyed with
irqs still in use? And then any new management operation that would be
called by the now orphaned user would just bail out? Maybe not, I'm
asking this question because I don't know and it's not obvious from
the code.

> >> So just claiming that it's fine to free in-use interrupts and everything
> >> is greate by removing the procfs entries is just wishful thinking.
> >>
> >> You simply cannot free an in-use interrupt descriptor and I'm going to
> >> add a big fat warning into that code to that effect.
> >>
> > With the above example, if our GPIO desc is analogous to the interrupt
> > number in the irq world - I'm not really sure what the role of the
> > irq_desc is. Should it be the *handle* that users get when they
> > request an irq? Maybe it is what the GPIO device is for us? Should it
> > be reference counted then?
>
> It's an internal data structure which is not accessible outside of the
> interrupt core and architecture low level code. The number is the
> identifier for the consumers to interact with the core code. That
> concept is called abstraction. What's so hard about that?
>

No need to be snarky. I do know what abstraction is. I also know that
a pointer to an opaque structure fulfills the same role without the
translation step and this is what irq_desc could be. But I get it, the
numbers have been in use for years, it's hard to change it, I don't
have an issue with that. Let's not continue this.

> >> So if it turns out that this is a general problem, then we have to sit
> >> down and solve it from ground up.
> >
> > It may very well be. I guess it will require a more detailed
> > investigation. I'd still say this patch is correct though, as the
> > self-contained function removing a procfs hierarchy should remove all
> > sub-directories and not just leak them. They don't hold any irq
> > resources anyway.
>
> This patch is not correct at all. Once again:
>
>  The interrupt subsystem is not designed to have its underlying
>  resources be freed when an interrupt is in-use.
>
> It's that simple. And no, your patch is not changing this. It tries to
> paper over the violation of the rule of this subsystem.
>
> The below is going to be applied ASAP to make this entirely clear. And
> yes, that's going to leak a bit more than just the procfs entry.
>
> The proper solution to this is to take a reference count on the module
> which provides the interrupt chip and allocates the interrupt domain.
> The core code has a mechanism for that. See request/free_irq().

I guess you're referring to irq_alloc_descs()? Anyway, here's a
real-life example: we have the hid-cp2112 module which drives a
GPIO-and-I2C-expander-on-a-USB-stick. I plug it in and have a driver
that requests one of its GPIOs as interrupt. Now I unplug it. How has
taking the reference to the hid-cp2112 module protected me from
freeing an irq domain with interrupts in use?

Let me focus on a possible solution then, because as it is, we have
the GPIO subsystem which claims it can survive a hot-unplug with
active users (it's not fully done yet for all corner-cases but that's
a different story...) but a GPIO chip can also act as an interrupt
controller whereas the caller can bypass most of the core GPIO code.
And now it's clear that the interrupt subsystem does not support hot
unplugging at all. And please, don't yell at me for this as this
design predates my involvement with the subsystem.

I'm thinking about two possible solutions that could be contained
within the GPIO core.

One would be to keep track of requested irqs in the GPIO-to-interrupt
glue code (where the domain is managed) and then when the GPIO chip
device is unbound, we could just go over the list of active ones and
call irq_free() on all of them. Does it even make any sense? If so: is
there any way at the moment to be notified about an irq being
requested/freed for given domain? I don't think map() and unmap()
callbacks are the right place as they are called once per mapping.
Would you be ok with adding this functionality?

Second possibility would be to keep the domain in refcounted struct
gpio_device and not struct gpio_chip where it can be freed on device
unbind. Except unlike requesting a GPIO descriptor, requesting the
interrupt directly does not increase the gpio_device's reference count
so we're back to the former. We could move the domain to the
refcounted struct and with a proper on-request notification, could
increase its refcount just like we do with the GPIO device on GPIO
request.

Do you maybe have any other suggestions by chance?

Bartosz

>
> Unfortunately it is not properly utilized by the irqdomain
> infrastructure today. But that's a fixable problem.
>
> Thanks,
>
>         tglx
> ---
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 27ca1c866f29..5382fd4e7588 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -466,6 +466,9 @@ static void free_desc(unsigned int irq)
>  {
>         struct irq_desc *desc = irq_to_desc(irq);
>
> +       if (WARN_ON_ONCE(desc->action))
> +               return;
> +
>         irq_remove_debugfs_entry(desc);
>         unregister_irq_proc(irq, desc);
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-28 10:06                 ` Bartosz Golaszewski
@ 2023-08-28 12:41                   ` Thomas Gleixner
  2023-08-28 19:03                     ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-28 12:41 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Mon, Aug 28 2023 at 12:06, Bartosz Golaszewski wrote:
> On Sat, Aug 26, 2023 at 5:08 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> How do you address this with slapping some NULL checks around? The only
>> way to clean this up is invoking free_irq().
>>
>
> This is not what I meant, let me rephrase the question:
>
> Is there any reason why whatever operations irq_free() performs could
> not be done by the irqchip when it knows it will be destroyed with
> irqs still in use? And then any new management operation that would be
> called by the now orphaned user would just bail out? Maybe not, I'm
> asking this question because I don't know and it's not obvious from
> the code.

The irqchip can do nothing and it is not directly involved in freeing
the interrupt descriptor. The entity, which allocated the interrupt
descriptors is responsible for that. That's in most modern cases the
interrupt domain.

It might be possible to free the actions in a teardown operation, but
that needs a lot of thoughts vs. concurrency and life time rules. A
simple 'let's invoke free_irq()' does not cut it.

>> The proper solution to this is to take a reference count on the module
>> which provides the interrupt chip and allocates the interrupt domain.
>> The core code has a mechanism for that. See request/free_irq().
>
> I guess you're referring to irq_alloc_descs()? Anyway, here's a
> real-life example: we have the hid-cp2112 module which drives a
> GPIO-and-I2C-expander-on-a-USB-stick. I plug it in and have a driver
> that requests one of its GPIOs as interrupt. Now I unplug it. How has
> taking the reference to the hid-cp2112 module protected me from
> freeing an irq domain with interrupts in use?

request_irq() does not care which module request the interrupt. It
always takes a refcount on irq_desc::owner. That points to the module
which created the interrupt domain and/or allocated the descriptors.

IOW, this needs a mechanism to store the module which creates the
interrupt domain somewhere in the domain itself and use it when
allocating interrupt descriptors. So in your case this would take a
refcount on the GPIO module.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-28 12:41                   ` Thomas Gleixner
@ 2023-08-28 19:03                     ` Bartosz Golaszewski
  2023-08-28 21:44                       ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-28 19:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Mon, Aug 28, 2023 at 2:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Aug 28 2023 at 12:06, Bartosz Golaszewski wrote:
> > On Sat, Aug 26, 2023 at 5:08 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> How do you address this with slapping some NULL checks around? The only
> >> way to clean this up is invoking free_irq().
> >>
> >
> > This is not what I meant, let me rephrase the question:
> >
> > Is there any reason why whatever operations irq_free() performs could
> > not be done by the irqchip when it knows it will be destroyed with
> > irqs still in use? And then any new management operation that would be
> > called by the now orphaned user would just bail out? Maybe not, I'm
> > asking this question because I don't know and it's not obvious from
> > the code.
>
> The irqchip can do nothing and it is not directly involved in freeing
> the interrupt descriptor. The entity, which allocated the interrupt
> descriptors is responsible for that. That's in most modern cases the
> interrupt domain.
>
> It might be possible to free the actions in a teardown operation, but
> that needs a lot of thoughts vs. concurrency and life time rules. A
> simple 'let's invoke free_irq()' does not cut it.
>
> >> The proper solution to this is to take a reference count on the module
> >> which provides the interrupt chip and allocates the interrupt domain.
> >> The core code has a mechanism for that. See request/free_irq().
> >
> > I guess you're referring to irq_alloc_descs()? Anyway, here's a
> > real-life example: we have the hid-cp2112 module which drives a
> > GPIO-and-I2C-expander-on-a-USB-stick. I plug it in and have a driver
> > that requests one of its GPIOs as interrupt. Now I unplug it. How has
> > taking the reference to the hid-cp2112 module protected me from
> > freeing an irq domain with interrupts in use?
>
> request_irq() does not care which module request the interrupt. It
> always takes a refcount on irq_desc::owner. That points to the module
> which created the interrupt domain and/or allocated the descriptors.
>
> IOW, this needs a mechanism to store the module which creates the
> interrupt domain somewhere in the domain itself and use it when
> allocating interrupt descriptors. So in your case this would take a
> refcount on the GPIO module.
>

This is still not complete. In the above example, the USB bus can
still unbind the GPIO device that created the domain on hot-unplug,
triggering its cleanup routines (.remove(), devres chain) and
destroying the domain and keeping the reference to the hid-cp2112
module will not help it. This is why I suggested tracking the irq
requests and freeing them in said cleanup path.

Bartosz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-28 19:03                     ` Bartosz Golaszewski
@ 2023-08-28 21:44                       ` Thomas Gleixner
  2023-08-29  6:26                         ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-28 21:44 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Mon, Aug 28 2023 at 21:03, Bartosz Golaszewski wrote:
> On Mon, Aug 28, 2023 at 2:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> > I guess you're referring to irq_alloc_descs()? Anyway, here's a
>> > real-life example: we have the hid-cp2112 module which drives a
>> > GPIO-and-I2C-expander-on-a-USB-stick. I plug it in and have a driver
>> > that requests one of its GPIOs as interrupt. Now I unplug it. How has
>> > taking the reference to the hid-cp2112 module protected me from
>> > freeing an irq domain with interrupts in use?
>>
>> request_irq() does not care which module request the interrupt. It
>> always takes a refcount on irq_desc::owner. That points to the module
>> which created the interrupt domain and/or allocated the descriptors.
>>
>> IOW, this needs a mechanism to store the module which creates the
>> interrupt domain somewhere in the domain itself and use it when
>> allocating interrupt descriptors. So in your case this would take a
>> refcount on the GPIO module.
>>
> This is still not complete. In the above example, the USB bus can
> still unbind the GPIO device that created the domain on hot-unplug,
> triggering its cleanup routines (.remove(), devres chain) and
> destroying the domain and keeping the reference to the hid-cp2112
> module will not help it. This is why I suggested tracking the irq
> requests and freeing them in said cleanup path.

Are you actually reading what I write?

>> So in your case this would take a refcount on the GPIO module.

That's the module which provides the interrupt domain and hid-whatever
is the one which requests the interrupt, no?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-28 21:44                       ` Thomas Gleixner
@ 2023-08-29  6:26                         ` Bartosz Golaszewski
  2023-08-29  9:11                           ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-29  6:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij

On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Aug 28 2023 at 21:03, Bartosz Golaszewski wrote:
> > On Mon, Aug 28, 2023 at 2:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> > I guess you're referring to irq_alloc_descs()? Anyway, here's a
> >> > real-life example: we have the hid-cp2112 module which drives a
> >> > GPIO-and-I2C-expander-on-a-USB-stick. I plug it in and have a driver
> >> > that requests one of its GPIOs as interrupt. Now I unplug it. How has
> >> > taking the reference to the hid-cp2112 module protected me from
> >> > freeing an irq domain with interrupts in use?
> >>
> >> request_irq() does not care which module request the interrupt. It
> >> always takes a refcount on irq_desc::owner. That points to the module
> >> which created the interrupt domain and/or allocated the descriptors.
> >>
> >> IOW, this needs a mechanism to store the module which creates the
> >> interrupt domain somewhere in the domain itself and use it when
> >> allocating interrupt descriptors. So in your case this would take a
> >> refcount on the GPIO module.
> >>
> > This is still not complete. In the above example, the USB bus can
> > still unbind the GPIO device that created the domain on hot-unplug,
> > triggering its cleanup routines (.remove(), devres chain) and
> > destroying the domain and keeping the reference to the hid-cp2112
> > module will not help it. This is why I suggested tracking the irq
> > requests and freeing them in said cleanup path.
>
> Are you actually reading what I write?
>
> >> So in your case this would take a refcount on the GPIO module.
>
> That's the module which provides the interrupt domain and hid-whatever
> is the one which requests the interrupt, no?
>

Not at all! This is what I said: "we have the hid-cp2112 module which
drives a GPIO-and-I2C-expander-on-a-USB-stick". Meaning: the
hid-cp2112 module registers a USB driver for a GPIO expander on a
stick. This GPIO chip is the interrupt controller here. It's the USB
stick that provides interrupts for whatever else needs them (in real
life: it can be an IIO device on the I2C bus which signals some events
over the GPIOs). The user can get the interrupt number using the
gpiod_to_irq() function. It can be unplugged at any moment and module
references will not stop the USB bus from unbinding it.

Bartosz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-29  6:26                         ` Bartosz Golaszewski
@ 2023-08-29  9:11                           ` Thomas Gleixner
  2023-08-29 12:24                             ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-29  9:11 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman

On Tue, Aug 29 2023 at 08:26, Bartosz Golaszewski wrote:
> On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> That's the module which provides the interrupt domain and hid-whatever
>> is the one which requests the interrupt, no?
>>
> Not at all! This is what I said: "we have the hid-cp2112 module which
> drives a GPIO-and-I2C-expander-on-a-USB-stick". Meaning: the
> hid-cp2112 module registers a USB driver for a GPIO expander on a
> stick. This GPIO chip is the interrupt controller here. It's the USB
> stick that provides interrupts for whatever else needs them (in real
> life: it can be an IIO device on the I2C bus which signals some events
> over the GPIOs). The user can get the interrupt number using the
> gpiod_to_irq() function. It can be unplugged at any moment and module
> references will not stop the USB bus from unbinding it.

Sorry for my confusion, but this all is confusing at best.

So what you are saying is that the GPIO driver, which creates the
interrupt domain is unbound and that unbind destroys the interrupt
domain, right? IOW, the wonderful world of plug and pray.

Let's look at the full picture again.

   USB -> USB-device
          |----------- GPIO
          |------------I2C  ---------- I2C-device
                 (hid-cp2112 driver)   (i2c-xx-driver)

i2x-xx-driver is the one which requests the interrupt from
hid-cp2112-GPIO, right?

So when the USB device disconnects, then something needs to tell the
i2c-xx-driver that the I2C interface is not longer available, right?

IOW, the unbind operation needs the following notification and teardown
order:

   1) USB core notifies hid-cp2112

   2) hid-cp2112 notifies i2c-xx-driver

   3) i2c-xx-driver mops up and invokes free_irq()

   4) hid-cp2112 removes the interrupt domain

But it seems that you end up with a situation where the notification of
the i2c-xx-driver is either not happening or the driver in question is
simply failing to mop up and free the requested interrupt.

As a consequence you want to work around it by mopping up the requested
interrupts somewhere else.

It's not clear to me what you are trying to achieve here.

   If this is meant as a hardening effort to catch such issues and let
   the kernel gracefully "survive", then I'm all ears.

   If this is just meant to paper over the shortcomings of subsystems or
   driver implemtations then I'm not interested at all.

Thanks,

        tglx





^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-29  9:11                           ` Thomas Gleixner
@ 2023-08-29 12:24                             ` Bartosz Golaszewski
  2023-08-29 20:18                               ` Greg Kroah-Hartman
  2023-08-29 22:29                               ` Thomas Gleixner
  0 siblings, 2 replies; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-08-29 12:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman

On Tue, Aug 29, 2023 at 11:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, Aug 29 2023 at 08:26, Bartosz Golaszewski wrote:
> > On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> That's the module which provides the interrupt domain and hid-whatever
> >> is the one which requests the interrupt, no?
> >>
> > Not at all! This is what I said: "we have the hid-cp2112 module which
> > drives a GPIO-and-I2C-expander-on-a-USB-stick". Meaning: the
> > hid-cp2112 module registers a USB driver for a GPIO expander on a
> > stick. This GPIO chip is the interrupt controller here. It's the USB
> > stick that provides interrupts for whatever else needs them (in real
> > life: it can be an IIO device on the I2C bus which signals some events
> > over the GPIOs). The user can get the interrupt number using the
> > gpiod_to_irq() function. It can be unplugged at any moment and module
> > references will not stop the USB bus from unbinding it.
>
> Sorry for my confusion, but this all is confusing at best.
>
> So what you are saying is that the GPIO driver, which creates the
> interrupt domain is unbound and that unbind destroys the interrupt
> domain, right? IOW, the wonderful world of plug and pray.
>
> Let's look at the full picture again.
>
>    USB -> USB-device
>           |----------- GPIO
>           |------------I2C  ---------- I2C-device
>                  (hid-cp2112 driver)   (i2c-xx-driver)
>
> i2x-xx-driver is the one which requests the interrupt from
> hid-cp2112-GPIO, right?
>

Yes! Sorry if I was not being clear about it.

> So when the USB device disconnects, then something needs to tell the
> i2c-xx-driver that the I2C interface is not longer available, right?
>
> IOW, the unbind operation needs the following notification and teardown
> order:
>
>    1) USB core notifies hid-cp2112
>
>    2) hid-cp2112 notifies i2c-xx-driver
>
>    3) i2c-xx-driver mops up and invokes free_irq()
>
>    4) hid-cp2112 removes the interrupt domain
>
> But it seems that you end up with a situation where the notification of
> the i2c-xx-driver is either not happening or the driver in question is
> simply failing to mop up and free the requested interrupt.
>

Yes, there's no notification of any kind. It's a common problem
unfortunately across different subsystems. We have hot-unpluggable
consumers using resources that don't support it (like interrupts in
this example).

> As a consequence you want to work around it by mopping up the requested
> interrupts somewhere else.
>

The approach I'm proposing - and that we implement in GPIO - is
treating the "handle" to the resource as what's often called in
programming - a weak reference. The resource itself is released not by
the consumer, but the provider. The consumer in turn can get the weak
reference from the provider and has to have some way of converting it
to a strong one for the duration of any of the API calls. It can be
implemented internally with a mutex, spinlock, an RCU read section or
otherwise (in GPIO we're using rw_semaphores but I'm working on
migrating to SRCU in order to protect the functions called from
interrupt context too which is missing ATM). If for any reason the
provider vanishes, then the next API call will fail. If it vanishes
during a call, then we'll wait for the call to exit before freeing the
resources, even if the underlying HW is already gone (the call in
progress may fail, that's alright).

For interrupts it would mean that when the consumer calls
request_irq(), the number it gets is a weak reference to the irq_desc.
For any management operation we lock irq_desc. If the domain is
destroyed, irq_descs get destroyed with it (after all users leave the
critical section). Next call to any of the functions looks up the irq
number and sees it's gone. It fails or silently returns depending on
the function (e.g. irq_free() would have to ignore the missing
lookup).

But I'm just floating ideas here.

> It's not clear to me what you are trying to achieve here.
>
>    If this is meant as a hardening effort to catch such issues and let
>    the kernel gracefully "survive", then I'm all ears.
>

IMO if more subsystems adapted the above approach then we'd have less
surprises when their resources are used in an unexpected way. I'd call
it "part of a broader hardening effort". :)

Bart

>    If this is just meant to paper over the shortcomings of subsystems or
>    driver implemtations then I'm not interested at all.
>
> Thanks,
>
>         tglx
>
>
>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-29 12:24                             ` Bartosz Golaszewski
@ 2023-08-29 20:18                               ` Greg Kroah-Hartman
  2023-08-29 22:29                               ` Thomas Gleixner
  1 sibling, 0 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2023-08-29 20:18 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Thomas Gleixner, linux-kernel, Bartosz Golaszewski, Marc Zyngier,
	Linus Walleij

On Tue, Aug 29, 2023 at 02:24:21PM +0200, Bartosz Golaszewski wrote:
> On Tue, Aug 29, 2023 at 11:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Tue, Aug 29 2023 at 08:26, Bartosz Golaszewski wrote:
> > > On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> > >> That's the module which provides the interrupt domain and hid-whatever
> > >> is the one which requests the interrupt, no?
> > >>
> > > Not at all! This is what I said: "we have the hid-cp2112 module which
> > > drives a GPIO-and-I2C-expander-on-a-USB-stick". Meaning: the
> > > hid-cp2112 module registers a USB driver for a GPIO expander on a
> > > stick. This GPIO chip is the interrupt controller here. It's the USB
> > > stick that provides interrupts for whatever else needs them (in real
> > > life: it can be an IIO device on the I2C bus which signals some events
> > > over the GPIOs). The user can get the interrupt number using the
> > > gpiod_to_irq() function. It can be unplugged at any moment and module
> > > references will not stop the USB bus from unbinding it.
> >
> > Sorry for my confusion, but this all is confusing at best.
> >
> > So what you are saying is that the GPIO driver, which creates the
> > interrupt domain is unbound and that unbind destroys the interrupt
> > domain, right? IOW, the wonderful world of plug and pray.
> >
> > Let's look at the full picture again.
> >
> >    USB -> USB-device
> >           |----------- GPIO
> >           |------------I2C  ---------- I2C-device
> >                  (hid-cp2112 driver)   (i2c-xx-driver)
> >
> > i2x-xx-driver is the one which requests the interrupt from
> > hid-cp2112-GPIO, right?
> >
> 
> Yes! Sorry if I was not being clear about it.
> 
> > So when the USB device disconnects, then something needs to tell the
> > i2c-xx-driver that the I2C interface is not longer available, right?
> >
> > IOW, the unbind operation needs the following notification and teardown
> > order:
> >
> >    1) USB core notifies hid-cp2112
> >
> >    2) hid-cp2112 notifies i2c-xx-driver
> >
> >    3) i2c-xx-driver mops up and invokes free_irq()
> >
> >    4) hid-cp2112 removes the interrupt domain
> >
> > But it seems that you end up with a situation where the notification of
> > the i2c-xx-driver is either not happening or the driver in question is
> > simply failing to mop up and free the requested interrupt.
> >
> 
> Yes, there's no notification of any kind.

Why not fix that?

> It's a common problem unfortunately across different subsystems. We
> have hot-unpluggable consumers using resources that don't support it
> (like interrupts in this example).

Then the driver for the controller of that hot-pluggable irq controller
should be fixed.

> > As a consequence you want to work around it by mopping up the requested
> > interrupts somewhere else.
> >
> 
> The approach I'm proposing - and that we implement in GPIO - is
> treating the "handle" to the resource as what's often called in
> programming - a weak reference. The resource itself is released not by
> the consumer, but the provider. The consumer in turn can get the weak
> reference from the provider and has to have some way of converting it
> to a strong one for the duration of any of the API calls. It can be
> implemented internally with a mutex, spinlock, an RCU read section or
> otherwise (in GPIO we're using rw_semaphores but I'm working on
> migrating to SRCU in order to protect the functions called from
> interrupt context too which is missing ATM). If for any reason the
> provider vanishes, then the next API call will fail. If it vanishes
> during a call, then we'll wait for the call to exit before freeing the
> resources, even if the underlying HW is already gone (the call in
> progress may fail, that's alright).
> 
> For interrupts it would mean that when the consumer calls
> request_irq(), the number it gets is a weak reference to the irq_desc.
> For any management operation we lock irq_desc. If the domain is
> destroyed, irq_descs get destroyed with it (after all users leave the
> critical section). Next call to any of the functions looks up the irq
> number and sees it's gone. It fails or silently returns depending on
> the function (e.g. irq_free() would have to ignore the missing
> lookup).
> 
> But I'm just floating ideas here.

That's a nice idea, but a lot of work implementing this.  Good luck!

Fixing the driver might be simpler :)

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-29 12:24                             ` Bartosz Golaszewski
  2023-08-29 20:18                               ` Greg Kroah-Hartman
@ 2023-08-29 22:29                               ` Thomas Gleixner
  2023-09-06 14:54                                 ` Bartosz Golaszewski
  1 sibling, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-08-29 22:29 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman

On Tue, Aug 29 2023 at 14:24, Bartosz Golaszewski wrote:
> On Tue, Aug 29, 2023 at 11:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> > On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> So when the USB device disconnects, then something needs to tell the
>> i2c-xx-driver that the I2C interface is not longer available, right?
>>
>> IOW, the unbind operation needs the following notification and teardown
>> order:
>>
>>    1) USB core notifies hid-cp2112
>>
>>    2) hid-cp2112 notifies i2c-xx-driver
>>
>>    3) i2c-xx-driver mops up and invokes free_irq()
>>
>>    4) hid-cp2112 removes the interrupt domain
>>
>> But it seems that you end up with a situation where the notification of
>> the i2c-xx-driver is either not happening or the driver in question is
>> simply failing to mop up and free the requested interrupt.
>>
>
> Yes, there's no notification of any kind.

I'm not buying that.

  usb disconnect
    ...
      cp2112_remove()
        i2c_del_adapter()
          i2c_unregister_device(client)
            ...
            device_unregister()
              device_del()
                bus_notify()            // Mechanism #1
                  i2c_device_remove()
                    if (dev->remove)
                       dev->remove()
                ...
                device_unbind_cleanup()
                  devres_release_all()  // Mechanism #2

        gpiochip_remove()

There are very well notifications to the drivers about unplug of a
device. Otherwise this would end up in a complete disaster and a lot
more stale data and state than just a procfs file or a requested
interrupt.

So the mechanisms are there, no?

If this is just about the problem that some device driver writers fail
to implement them correctly, then yes it makes sense to have a last
resort fallback which cleans them up and emits a big fat warning.

Making this a programming model would be beyond wrong.

> It's a common problem unfortunately across different subsystems. We
> have hot-unpluggable consumers using resources that don't support it
> (like interrupts in this example).

All hotpluggable consumers have at least one mechanism to mop up the
resources they allocated. There are a lot of resources in the kernel
which do not clean themself up magically.

So what's so special about interrupts? They are not any different from a
pointer which is registered at some entity and the device driver writer
forgets to unregister it, but the underlying resource is freed. That's
even worse than the leaked interrupt and cannot be magically undone at
all.

Whatever you try, you can't turn driver programming into a task which
can be accomplished w/o brains.

> For interrupts it would mean that when the consumer calls
> request_irq(), the number it gets is a weak reference to the irq_desc.

Interrupt numbers are weak references by definition. request_irq() does
not return an interrupt number, it returns success or fail. The
interrupt number is handed to request_irq(), no?

The entities which hand out the interrupt number are a complete
different story. But that number is from their perspective a weak
reference too.

> For any management operation we lock irq_desc.

That's required anyway, but irq_desc::lock is not a sufficient
protection against a teardown race.

> If the domain is destroyed, irq_descs get destroyed with it

Interrupts consist of more than just an interrupt descriptor. If you
care to look at the internals, then the descriptor is the last entity
which goes away simply because all other related resources hang off the
interrupt descriptor.

So they obviously need to be mopped up first and trying to mop up
requested interrupts at the point where the interrupt descriptor is
freed is way too late.

In fact they need to mopped up first, which is obvious from the setup
ordering as everything underneath must be in place already before
request_irq() can succeed. The only sensible ordering for teardown is
obviously the reverse of setup, but that's not specific to interrupts at
all.

> (after all users leave the critical section).

Which critical section? Interrupts have more than just the
irq_desc::lock critical section which needs to be left.

> Next call to any of the functions looks up the irq number and sees
> it's gone. It fails or silently returns depending on the function
> (e.g. irq_free() would have to ignore the missing lookup).

That's what happens today already.

The missing bit is the magic function which mops up when the driver
writer blew it up. But that's far from a 'put a trivial free_irq() call
somewhere'.

First of all we have too many interrupt mechanisms ranging from the
linux 0.1 model with static interrupt spaces to hierarchical interrupt
domains.

  - Almost everything which is interrupt domain based (hierarchical or
    not) can probably be "addressed" by some definition of "addressed"
    because there are only a few places in the core code which are
    involved in releasing individual or domain wide resources.

    But see below.

  - The incarnations which do not use interrupt domains are way harder
    because the teardown of the interrupt resources is not happening in
    the core code. It's happening at the driver side and the core is
    only involved to release the interrupt descriptor. But that's too
    late.

    The good news about those is that probably the vast majority of the
    instances is built in, mostly legacy and not pluggable.

So lets assume that anything which is not interrupt domain based is not
relevant, which reduces the problem space significantly. But the
remaining space is still non-trivial.

  1) Life-time

     Interrupt descriptors are RCU freed but that's mostly for external
     usage like /proc/interrupts

     Still most of the core code relies on the proper setup/teardown
     order, so there would be quite some work to do vs. validating that
     an interrupt descriptor is consistent at all hierarchy levels.

     That's going to be interesting vs. interfaces which can be invoked
     from pretty much any context (except NMI).

     irq_desc::lock is not the panacea here because it obviously can
     only be held for real short truly atomic sections. But quite some
     of the setup/teardown at all levels requires sleepable context.

  2) Concurrency

     Protection against concurrent interrupt handler execution is
     covered in free_irq() as it fully synchronizes.

     That's the least of my worries.

     Whatever the invalidation mechanism might be, pretty much every
     usage site of irq_to_desc() and any usage site of interrupt
     descriptors which are stored outside of the maple_tree for
     performance reasons need to be audited.

     The validation has to take into account whether that's an requested
     descriptor or an unused one.
  
So no, neither removing some procfs entry at the wrong point nor
slapping some variant of free_irq() into random places is going to cut
it.

Your idea of tracking request_irq()/free_irq() at some subsystem level
is not going to work either simply because it requires that such muck is
sprinkled all over the place.

I completely agree that the interrupt core code does not have enough
places which emit a prominent warning when the rules are violated.

So sure, in some way or the other a "fix" by some definition of "fix"
could be implemented, but I'm really not convinced that's the right way
to waste^Wspend our time with.

Especially not because interrupt handling is a hot-path and any
life-time/conncurrency validation mechanism will introduce overhead no
matter what.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-08-29 22:29                               ` Thomas Gleixner
@ 2023-09-06 14:54                                 ` Bartosz Golaszewski
  2023-09-12 18:16                                   ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-09-06 14:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman

On Wed, Aug 30, 2023 at 12:29 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, Aug 29 2023 at 14:24, Bartosz Golaszewski wrote:
> > On Tue, Aug 29, 2023 at 11:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> > On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> So when the USB device disconnects, then something needs to tell the
> >> i2c-xx-driver that the I2C interface is not longer available, right?
> >>
> >> IOW, the unbind operation needs the following notification and teardown
> >> order:
> >>
> >>    1) USB core notifies hid-cp2112
> >>
> >>    2) hid-cp2112 notifies i2c-xx-driver
> >>
> >>    3) i2c-xx-driver mops up and invokes free_irq()
> >>
> >>    4) hid-cp2112 removes the interrupt domain
> >>
> >> But it seems that you end up with a situation where the notification of
> >> the i2c-xx-driver is either not happening or the driver in question is
> >> simply failing to mop up and free the requested interrupt.
> >>
> >
> > Yes, there's no notification of any kind.
>
> I'm not buying that.

Gah, sorry, it took me a while to get back to this thread...

>
>   usb disconnect
>     ...
>       cp2112_remove()
>         i2c_del_adapter()
>           i2c_unregister_device(client)
>             ...
>             device_unregister()
>               device_del()
>                 bus_notify()            // Mechanism #1
>                   i2c_device_remove()
>                     if (dev->remove)
>                        dev->remove()
>                 ...
>                 device_unbind_cleanup()
>                   devres_release_all()  // Mechanism #2
>
>         gpiochip_remove()
>
> There are very well notifications to the drivers about unplug of a
> device. Otherwise this would end up in a complete disaster and a lot
> more stale data and state than just a procfs file or a requested
> interrupt.

I'm not sure how either of the two helps here. #2 just releases
managed resources owned by cp2112. It can remove the domain with an
appropriate devm action but it won't do anything for the users of
interrupts. #1 is a bus notification emitted when the I2C adapter
exposed by cp2112 has been deleted. This one in particular doesn't
help us, the domain is long gone by now but if I get what you mean
correctly, you'd want the driver to call request_irq() and then set up
a notifier block for the BUS_NOTIFY_UNBIND_DRIVER notification of the
provider of that interrupt? Doesn't that break like half a dozen of
abstraction layers? Because now the device driver which is the GPIO
consumer needs to know where it gets its interrupts from?

>
> So the mechanisms are there, no?
>

Technically you could duct tape something up. But this is far from
being generic and it would be a per-driver solution at best, not even
per-subsystem. Because this is just interrupts. What about other
providers? Setup a notification for every single one? The resources a
device needs are normally acquired in probe() and released in
remove(). Possibly there are per-systems setup/teardown, open/close
etc. callbacks but it's not much different. What would you do in this
notification callback? Release the resource and keep going? In what
state would that device be now?

You would think that plug-and-play works well in the kernel and it's
true for certain parts but it really isn't the case for subsystems
that were not considered as very plug-and-play until people started
putting them on a stick. Some devices that are not typically
hot-pluggable - like serial - have been used with USB for so long that
they do handle unplugging very well. But as soon as you put i2c on
USB, you'll see what a mess it is. If you have an I2C device on a USB
I2C expander and it's being used, when you pull the plug, you'll see
that the kernel thread removing the device will block on a call to
wait_for_completion() until all users release their i2c adapter
references. They (the users) are not however notified in any generic
way about the provider of their resources being gone.

And even if they were, what would they do? "Unbind themselves" so that
their remove paths be triggered where they would free resources? That
may be a solution I suppose - a device whose non-optional resource
provider is gone, should go down as well I suppose. This would be a
huge development however and a more realistic approach is to make
provider APIs resistant to the backing device being removed.

> If this is just about the problem that some device driver writers fail
> to implement them correctly, then yes it makes sense to have a last
> resort fallback which cleans them up and emits a big fat warning.
>
> Making this a programming model would be beyond wrong.
>
> > It's a common problem unfortunately across different subsystems. We
> > have hot-unpluggable consumers using resources that don't support it
> > (like interrupts in this example).
>
> All hotpluggable consumers have at least one mechanism to mop up the
> resources they allocated. There are a lot of resources in the kernel
> which do not clean themself up magically.
>

Yeah, hotpluggable consumers are fine. The problem here is
hotpluggable *providers* with consumers who don't know that.

> So what's so special about interrupts? They are not any different from a
> pointer which is registered at some entity and the device driver writer
> forgets to unregister it, but the underlying resource is freed. That's
> even worse than the leaked interrupt and cannot be magically undone at
> all.

But that's precisely my point. It's not only interrupts. There *IS* a
wider problem in the kernel with resources that are provided by
devices that can disappear at any moment. My argument is that the
provider API should handle that and not rely on the consumer to
release those resources in time. Serial or GPIO handle that by having
a reference counted outer layer and an inner implementation that can
go from under it. The outer layer then simply returns an appropriate
error to the consumer. During any of the provider calls, the
lower-level object is locked and protected from being destroyed.

>
> Whatever you try, you can't turn driver programming into a task which
> can be accomplished w/o brains.
>

I merely want to reduce the number of obvious traps. :)

> > For interrupts it would mean that when the consumer calls
> > request_irq(), the number it gets is a weak reference to the irq_desc.
>
> Interrupt numbers are weak references by definition. request_irq() does
> not return an interrupt number, it returns success or fail. The
> interrupt number is handed to request_irq(), no?
>
> The entities which hand out the interrupt number are a complete
> different story. But that number is from their perspective a weak
> reference too.
>
> > For any management operation we lock irq_desc.
>
> That's required anyway, but irq_desc::lock is not a sufficient
> protection against a teardown race.
>
> > If the domain is destroyed, irq_descs get destroyed with it
>
> Interrupts consist of more than just an interrupt descriptor. If you
> care to look at the internals, then the descriptor is the last entity
> which goes away simply because all other related resources hang off the
> interrupt descriptor.
>
> So they obviously need to be mopped up first and trying to mop up
> requested interrupts at the point where the interrupt descriptor is
> freed is way too late.
>
> In fact they need to mopped up first, which is obvious from the setup
> ordering as everything underneath must be in place already before
> request_irq() can succeed. The only sensible ordering for teardown is
> obviously the reverse of setup, but that's not specific to interrupts at
> all.
>
> > (after all users leave the critical section).
>
> Which critical section? Interrupts have more than just the
> irq_desc::lock critical section which needs to be left.
>
> > Next call to any of the functions looks up the irq number and sees
> > it's gone. It fails or silently returns depending on the function
> > (e.g. irq_free() would have to ignore the missing lookup).
>
> That's what happens today already.
>
> The missing bit is the magic function which mops up when the driver
> writer blew it up. But that's far from a 'put a trivial free_irq() call
> somewhere'.
>
> First of all we have too many interrupt mechanisms ranging from the
> linux 0.1 model with static interrupt spaces to hierarchical interrupt
> domains.
>
>   - Almost everything which is interrupt domain based (hierarchical or
>     not) can probably be "addressed" by some definition of "addressed"
>     because there are only a few places in the core code which are
>     involved in releasing individual or domain wide resources.
>
>     But see below.
>
>   - The incarnations which do not use interrupt domains are way harder
>     because the teardown of the interrupt resources is not happening in
>     the core code. It's happening at the driver side and the core is
>     only involved to release the interrupt descriptor. But that's too
>     late.
>
>     The good news about those is that probably the vast majority of the
>     instances is built in, mostly legacy and not pluggable.
>
> So lets assume that anything which is not interrupt domain based is not
> relevant, which reduces the problem space significantly. But the
> remaining space is still non-trivial.
>
>   1) Life-time
>
>      Interrupt descriptors are RCU freed but that's mostly for external
>      usage like /proc/interrupts
>
>      Still most of the core code relies on the proper setup/teardown
>      order, so there would be quite some work to do vs. validating that
>      an interrupt descriptor is consistent at all hierarchy levels.
>
>      That's going to be interesting vs. interfaces which can be invoked
>      from pretty much any context (except NMI).
>
>      irq_desc::lock is not the panacea here because it obviously can
>      only be held for real short truly atomic sections. But quite some
>      of the setup/teardown at all levels requires sleepable context.
>
>   2) Concurrency
>
>      Protection against concurrent interrupt handler execution is
>      covered in free_irq() as it fully synchronizes.
>
>      That's the least of my worries.
>
>      Whatever the invalidation mechanism might be, pretty much every
>      usage site of irq_to_desc() and any usage site of interrupt
>      descriptors which are stored outside of the maple_tree for
>      performance reasons need to be audited.
>
>      The validation has to take into account whether that's an requested
>      descriptor or an unused one.
>
> So no, neither removing some procfs entry at the wrong point nor
> slapping some variant of free_irq() into random places is going to cut
> it.
>
> Your idea of tracking request_irq()/free_irq() at some subsystem level
> is not going to work either simply because it requires that such muck is
> sprinkled all over the place.
>

I was thinking more about tracking it at the irq domain level so that
when a domain is destroyed with interrupts requested, these interrupts
are freed. I admit I still don't have enough in-depth knowledge about
linux interrupts to understand why it  can't work, I need to spend
more time on this. I'll be back.

> I completely agree that the interrupt core code does not have enough
> places which emit a prominent warning when the rules are violated.
>
> So sure, in some way or the other a "fix" by some definition of "fix"
> could be implemented, but I'm really not convinced that's the right way
> to waste^Wspend our time with.
>
> Especially not because interrupt handling is a hot-path and any
> life-time/conncurrency validation mechanism will introduce overhead no
> matter what.
>

Thanks for the detailed explanation.

Bartosz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-09-06 14:54                                 ` Bartosz Golaszewski
@ 2023-09-12 18:16                                   ` Thomas Gleixner
  2023-09-15 19:50                                     ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Gleixner @ 2023-09-12 18:16 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman

On Wed, Sep 06 2023 at 16:54, Bartosz Golaszewski wrote:
> On Wed, Aug 30, 2023 at 12:29 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>   usb disconnect
>>     ...
>>       cp2112_remove()
>>         i2c_del_adapter()
>>           i2c_unregister_device(client)
>>             ...
>>             device_unregister()
>>               device_del()
>>                 bus_notify()            // Mechanism #1
>>                   i2c_device_remove()
>>                     if (dev->remove)
>>                        dev->remove()
>>                 ...
>>                 device_unbind_cleanup()
>>                   devres_release_all()  // Mechanism #2
>>
>>         gpiochip_remove()
>>
>> There are very well notifications to the drivers about unplug of a
>> device. Otherwise this would end up in a complete disaster and a lot
>> more stale data and state than just a procfs file or a requested
>> interrupt.
>
> I'm not sure how either of the two helps here. #2 just releases
> managed resources owned by cp2112. It can remove the domain with an
> appropriate devm action but it won't do anything for the users of
> interrupts. #1 is a bus notification emitted when the I2C adapter
> exposed by cp2112 has been deleted.

No. The domain is not yet gone at the point where the I2C bus
notification happens. Look at the above invocation chain.

The removal of the attached I2C devices happens _before_ the domain is
removed. Anything else does not make sense at all.

So the cleanup of those devices should free the interrupt, in the same
way it frees other resources, no?

i2c_device_remove()
  if (driver->remove)
    driver->remove()    // Driver specific cleanup

  // Devres cleanup operating on the to be removed I2C device
  devres_release_group(&client->dev, client->devres_group_id);

So again:

       cp2112_remove()
         i2c_del_adapter()      // Cleans up all I2C users

       gpiochip_remove()        // Removes the interrupt domain.

So you do not need any magic bus notififications and whatever. It's all
there already.

> This one in particular doesn't help us, the domain is long gone by now
> but if I get what you mean correctly, you'd want the driver to call
> request_irq() and then set up a notifier block for the
> BUS_NOTIFY_UNBIND_DRIVER notification of the provider of that
> interrupt? Doesn't that break like half a dozen of abstraction layers?
> Because now the device driver which is the GPIO consumer needs to know
> where it gets its interrupts from?

Again. It does not. The point is that the device is removed in the
hotplug event chain, which cleans up the associated resources.
devm_request_irq() already takes care of that.

> You would think that plug-and-play works well in the kernel and it's
> true for certain parts but it really isn't the case for subsystems
> that were not considered as very plug-and-play until people started
> putting them on a stick. Some devices that are not typically
> hot-pluggable - like serial - have been used with USB for so long that
> they do handle unplugging very well. But as soon as you put i2c on
> USB, you'll see what a mess it is. If you have an I2C device on a USB
> I2C expander and it's being used, when you pull the plug, you'll see
> that the kernel thread removing the device will block on a call to
> wait_for_completion() until all users release their i2c adapter
> references. They (the users) are not however notified in any generic
> way about the provider of their resources being gone.

So why aren't you fixing this and instead trying to implement force
unplug mechanisms which require a pile of unholy hacks all over the
place?

>> All hotpluggable consumers have at least one mechanism to mop up the
>> resources they allocated. There are a lot of resources in the kernel
>> which do not clean themself up magically.
>>
>
> Yeah, hotpluggable consumers are fine. The problem here is
> hotpluggable *providers* with consumers who don't know that.

Then these consumers have to be fixed and made aware of the new world order
of hotplug, no?

>> Your idea of tracking request_irq()/free_irq() at some subsystem level
>> is not going to work either simply because it requires that such muck is
>> sprinkled all over the place.
>>
> I was thinking more about tracking it at the irq domain level so that
> when a domain is destroyed with interrupts requested, these interrupts
> are freed. I admit I still don't have enough in-depth knowledge about
> linux interrupts to understand why it  can't work, I need to spend
> more time on this. I'll be back.

There is no need for special tracking. The core can figure out today
whether an interrupt which is mapped by the domain is requested or
not. That's not the problem at all.

The problems are the life time rules, references, concurrency etc. They
are not magically going away by some new form of tracking.

It's amazing that you insist on solving the problem at the wrong end.

The real problem is that there are device drivers and subsystems which
are not prepared for hotplug, right?

As interrupts are only a small part of the overall problem, I'm
absolutely not seeing how adding heuristics all over the place is a
sensible design principle.

What's so problematic about teaching the affected subsystems and drivers
that hotplug exists?

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-09-12 18:16                                   ` Thomas Gleixner
@ 2023-09-15 19:50                                     ` Bartosz Golaszewski
  2023-11-13 20:53                                       ` Bartosz Golaszewski
  0 siblings, 1 reply; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-09-15 19:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman

On Tue, Sep 12, 2023 at 8:17 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Sep 06 2023 at 16:54, Bartosz Golaszewski wrote:
> > On Wed, Aug 30, 2023 at 12:29 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>   usb disconnect
> >>     ...
> >>       cp2112_remove()
> >>         i2c_del_adapter()
> >>           i2c_unregister_device(client)
> >>             ...
> >>             device_unregister()
> >>               device_del()
> >>                 bus_notify()            // Mechanism #1
> >>                   i2c_device_remove()
> >>                     if (dev->remove)
> >>                        dev->remove()
> >>                 ...
> >>                 device_unbind_cleanup()
> >>                   devres_release_all()  // Mechanism #2
> >>
> >>         gpiochip_remove()
> >>
> >> There are very well notifications to the drivers about unplug of a
> >> device. Otherwise this would end up in a complete disaster and a lot
> >> more stale data and state than just a procfs file or a requested
> >> interrupt.
> >
> > I'm not sure how either of the two helps here. #2 just releases
> > managed resources owned by cp2112. It can remove the domain with an
> > appropriate devm action but it won't do anything for the users of
> > interrupts. #1 is a bus notification emitted when the I2C adapter
> > exposed by cp2112 has been deleted.
>
> No. The domain is not yet gone at the point where the I2C bus
> notification happens. Look at the above invocation chain.
>
> The removal of the attached I2C devices happens _before_ the domain is
> removed. Anything else does not make sense at all.
>
> So the cleanup of those devices should free the interrupt, in the same
> way it frees other resources, no?
>
> i2c_device_remove()
>   if (driver->remove)
>     driver->remove()    // Driver specific cleanup
>
>   // Devres cleanup operating on the to be removed I2C device
>   devres_release_group(&client->dev, client->devres_group_id);
>
> So again:
>
>        cp2112_remove()
>          i2c_del_adapter()      // Cleans up all I2C users
>
>        gpiochip_remove()        // Removes the interrupt domain.
>
> So you do not need any magic bus notififications and whatever. It's all
> there already.
>

You're only talking about a situation in which the users of the
interrupts from the cp2112 GPIO chip's are I2C devices on its I2C
adapter. We can have consumers of those interrupts elsewhere. We can
have user-space watching interrupts on GPIOs (see below). They won't
get removed before the cp2112 GPIO chip, so their remove paths freeing
interrupts will not be triggered as you describe it.

> > This one in particular doesn't help us, the domain is long gone by now
> > but if I get what you mean correctly, you'd want the driver to call
> > request_irq() and then set up a notifier block for the
> > BUS_NOTIFY_UNBIND_DRIVER notification of the provider of that
> > interrupt? Doesn't that break like half a dozen of abstraction layers?
> > Because now the device driver which is the GPIO consumer needs to know
> > where it gets its interrupts from?
>
> Again. It does not. The point is that the device is removed in the
> hotplug event chain, which cleans up the associated resources.
> devm_request_irq() already takes care of that.
>

That's not always necessary. Imagine you have an interrupt handler set
up on a GPIO that is now gone. Your driver may do lots of other things
and remain functional even though this interrupt will never fire.

> > You would think that plug-and-play works well in the kernel and it's
> > true for certain parts but it really isn't the case for subsystems
> > that were not considered as very plug-and-play until people started
> > putting them on a stick. Some devices that are not typically
> > hot-pluggable - like serial - have been used with USB for so long that
> > they do handle unplugging very well. But as soon as you put i2c on
> > USB, you'll see what a mess it is. If you have an I2C device on a USB
> > I2C expander and it's being used, when you pull the plug, you'll see
> > that the kernel thread removing the device will block on a call to
> > wait_for_completion() until all users release their i2c adapter
> > references. They (the users) are not however notified in any generic
> > way about the provider of their resources being gone.
>
> So why aren't you fixing this and instead trying to implement force
> unplug mechanisms which require a pile of unholy hacks all over the
> place?
>

That's not what I'm suggesting. I've described the general model I'm
postulating. If this patch is an unholy hack, it's because I didn't
know better. Now I do, I've abandoned it two weeks ago.

> >> All hotpluggable consumers have at least one mechanism to mop up the
> >> resources they allocated. There are a lot of resources in the kernel
> >> which do not clean themself up magically.
> >>
> >
> > Yeah, hotpluggable consumers are fine. The problem here is
> > hotpluggable *providers* with consumers who don't know that.
>
> Then these consumers have to be fixed and made aware of the new world order
> of hotplug, no?
>

I've asked that question in my previous email. What do you think we
should do when a provider of a resource we're using in a driver is
gone? Let's assume, the consumer device will not get removed in the
hot-unplug chain - which is perfectly reasonable. I'm arguing that it
should receive an error code on the next call to that provider. The
alternatives I see are: force-unbind the device or notify it by some
other unspecified means and have it do what exactly?

> >> Your idea of tracking request_irq()/free_irq() at some subsystem level
> >> is not going to work either simply because it requires that such muck is
> >> sprinkled all over the place.
> >>
> > I was thinking more about tracking it at the irq domain level so that
> > when a domain is destroyed with interrupts requested, these interrupts
> > are freed. I admit I still don't have enough in-depth knowledge about
> > linux interrupts to understand why it  can't work, I need to spend
> > more time on this. I'll be back.
>
> There is no need for special tracking. The core can figure out today
> whether an interrupt which is mapped by the domain is requested or
> not. That's not the problem at all.
>
> The problems are the life time rules, references, concurrency etc. They
> are not magically going away by some new form of tracking.
>
> It's amazing that you insist on solving the problem at the wrong end.
>

Is it really the wrong end though? Let me give you an analogy with a
driver consuming a resource replaced by a user-space process. Let's
take a user process requesting some kernel resource by opening a
character device file. The handle the process gets will now be the
file descriptor number. The resource can be a GPIO (or even an
interrupt on that GPIO - as user-space can watch interrupts via the
GPIO character device).

Let's now assume the GPIO is on a USB expander. We now unplug it.
Should the user-space get informed about this fact with some
side-channel other than the descriptor? Or sent a signal/killed
(analogy to the removal of the device in the hot-unplug path)? Should
we set up some entirely different notification mechanism? No, the only
sane thing to do is: next time the process calls into the kernel via a
system call referencing that descriptor, it should return an error -
typically -ENODEV. If a poll() is in process, it should be woken up
with EPOLLERR. The process should then call close() on that
descriptor, putting its reference to the resource. If it doesn't, then
all it'll see will be errors. The process however can keep on living
and doing other stuff.

What should happen in the kernel is: on the unplug event we clean
everything up, leaving just the user-facing, reference-counted outer
shell. Once the last reference to struct file is put, it'll be
released. Of course not everyone does it and so user-space can crash
the kernel just by opening a character device exposed by vulnerable
subsystems, unbinding the device over sysfs and calling ioctl() or
otherwise.

My point is: the same rule should apply to in-kernel consumers. When
they request a resource, they get a reference to it. The resource is
managed by its provider. If the provider is going down, it frees the
resource. The consumer tries to use it -> it gets an error. I'm not
convinced by the life-time rules argument. The consumer is not
CREATING a resource. It's REQUESTING it for usage. IMO this means it
REFERENCES it, not OWNS it. And so is only responsible for putting the
reference.

Bartosz

> The real problem is that there are device drivers and subsystems which
> are not prepared for hotplug, right?
>
> As interrupts are only a small part of the overall problem, I'm
> absolutely not seeing how adding heuristics all over the place is a
> sensible design principle.
>
> What's so problematic about teaching the affected subsystems and drivers
> that hotplug exists?
>
> Thanks,
>
>         tglx
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-09-15 19:50                                     ` Bartosz Golaszewski
@ 2023-11-13 20:53                                       ` Bartosz Golaszewski
  2023-11-14 12:27                                         ` Greg Kroah-Hartman
  2023-11-14 13:25                                         ` Linus Walleij
  0 siblings, 2 replies; 26+ messages in thread
From: Bartosz Golaszewski @ 2023-11-13 20:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Bartosz Golaszewski, Marc Zyngier, Linus Walleij,
	Greg Kroah-Hartman, Wolfram Sang

On Fri, Sep 15, 2023 at 9:50 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>

[snip]

>
> My point is: the same rule should apply to in-kernel consumers. When
> they request a resource, they get a reference to it. The resource is
> managed by its provider. If the provider is going down, it frees the
> resource. The consumer tries to use it -> it gets an error. I'm not
> convinced by the life-time rules argument. The consumer is not
> CREATING a resource. It's REQUESTING it for usage. IMO this means it
> REFERENCES it, not OWNS it. And so is only responsible for putting the
> reference.
>
> Bartosz
>

Hi Thomas, Greg et al,

I am at LPC and will present a talk on Wednesday 5:15pm at the kernel
summit about object life-time issues. I'll reference this problem
among others. Please consider it in your schedules, I think it'll be
useful to discuss it in person as it's a generic problem in many
driver subsystems.

Bartosz

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-11-13 20:53                                       ` Bartosz Golaszewski
@ 2023-11-14 12:27                                         ` Greg Kroah-Hartman
  2023-11-14 13:25                                         ` Linus Walleij
  1 sibling, 0 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2023-11-14 12:27 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Thomas Gleixner, linux-kernel, Bartosz Golaszewski, Marc Zyngier,
	Linus Walleij, Wolfram Sang

On Mon, Nov 13, 2023 at 09:53:47PM +0100, Bartosz Golaszewski wrote:
> On Fri, Sep 15, 2023 at 9:50 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> >
> 
> [snip]
> 
> >
> > My point is: the same rule should apply to in-kernel consumers. When
> > they request a resource, they get a reference to it. The resource is
> > managed by its provider. If the provider is going down, it frees the
> > resource. The consumer tries to use it -> it gets an error. I'm not
> > convinced by the life-time rules argument. The consumer is not
> > CREATING a resource. It's REQUESTING it for usage. IMO this means it
> > REFERENCES it, not OWNS it. And so is only responsible for putting the
> > reference.
> >
> > Bartosz
> >
> 
> Hi Thomas, Greg et al,
> 
> I am at LPC and will present a talk on Wednesday 5:15pm at the kernel
> summit about object life-time issues. I'll reference this problem
> among others. Please consider it in your schedules, I think it'll be
> useful to discuss it in person as it's a generic problem in many
> driver subsystems.

Sounds great, I'll try to make it there!

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
  2023-11-13 20:53                                       ` Bartosz Golaszewski
  2023-11-14 12:27                                         ` Greg Kroah-Hartman
@ 2023-11-14 13:25                                         ` Linus Walleij
  1 sibling, 0 replies; 26+ messages in thread
From: Linus Walleij @ 2023-11-14 13:25 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Thomas Gleixner, linux-kernel, Bartosz Golaszewski, Marc Zyngier,
	Greg Kroah-Hartman, Wolfram Sang

On Mon, Nov 13, 2023 at 9:53 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:

> I am at LPC and will present a talk on Wednesday 5:15pm at the kernel
> summit about object life-time issues. I'll reference this problem
> among others. Please consider it in your schedules, I think it'll be
> useful to discuss it in person as it's a generic problem in many
> driver subsystems.

I'm sadly not at LPC, but this sounds very relevant and I'd love to know
what you conclude (if we're lucky maybe LWN catches it in a writeup).

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-11-14 13:25 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-14  9:36 [PATCH 0/2] genirq: don't leak handler procfs entries Bartosz Golaszewski
2023-08-14  9:36 ` [PATCH 1/2] genirq: proc: drop unused argument from unregister_handler_proc() Bartosz Golaszewski
2023-08-14  9:36 ` [PATCH 2/2] genirq: proc: fix a procfs entry leak Bartosz Golaszewski
2023-08-24 20:12   ` Thomas Gleixner
2023-08-25  7:36     ` brgl
2023-08-25  8:11       ` Thomas Gleixner
2023-08-25 11:01         ` Bartosz Golaszewski
2023-08-25 14:58           ` Thomas Gleixner
2023-08-25 17:08           ` Thomas Gleixner
2023-08-25 20:07             ` Bartosz Golaszewski
2023-08-26 15:08               ` Thomas Gleixner
2023-08-28 10:06                 ` Bartosz Golaszewski
2023-08-28 12:41                   ` Thomas Gleixner
2023-08-28 19:03                     ` Bartosz Golaszewski
2023-08-28 21:44                       ` Thomas Gleixner
2023-08-29  6:26                         ` Bartosz Golaszewski
2023-08-29  9:11                           ` Thomas Gleixner
2023-08-29 12:24                             ` Bartosz Golaszewski
2023-08-29 20:18                               ` Greg Kroah-Hartman
2023-08-29 22:29                               ` Thomas Gleixner
2023-09-06 14:54                                 ` Bartosz Golaszewski
2023-09-12 18:16                                   ` Thomas Gleixner
2023-09-15 19:50                                     ` Bartosz Golaszewski
2023-11-13 20:53                                       ` Bartosz Golaszewski
2023-11-14 12:27                                         ` Greg Kroah-Hartman
2023-11-14 13:25                                         ` Linus Walleij

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.