* [PATCH] net: dsa: mv88e6xxx: fix races between lock and irq freeing
From: Uwe Kleine-König @ 2018-07-20 9:53 UTC (permalink / raw)
To: Andrew Lunn, Vivien Didelot
Cc: Florian Fainelli, David S. Miller, netdev, kernel
free_irq() waits until all handlers for this IRQ have completed. As the
relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock,
free_irq() might never return if the thread calling it holds this lock.

For the same reason kthread_cancel_delayed_work_sync() in the polling case
must not be called with this lock held.

Also, free the irq (or stop the worker, respectively) first, so that
mv88e6xxx_g1_irq_thread_work() can no longer run by the time the irq
mappings are dropped in mv88e6xxx_g1_irq_free_common(). This prevents the
worker thread from calling handle_nested_irq(0), which results in a
NULL-pointer exception.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
drivers/net/dsa/mv88e6xxx/chip.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 437cd6eb4faa..9ef07a06aceb 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -343,6 +343,7 @@ static const struct irq_domain_ops mv88e6xxx_g1_irq_domain_ops = {
 	.xlate = irq_domain_xlate_twocell,
 };
 
+/* To be called with reg_lock held */
 static void mv88e6xxx_g1_irq_free_common(struct mv88e6xxx_chip *chip)
 {
 	int irq, virq;
@@ -362,9 +363,15 @@ static void mv88e6xxx_g1_irq_free_common(struct mv88e6xxx_chip *chip)
 
 static void mv88e6xxx_g1_irq_free(struct mv88e6xxx_chip *chip)
 {
-	mv88e6xxx_g1_irq_free_common(chip);
-
+	/*
+	 * free_irq must be called without reg_lock taken because the irq
+	 * handler takes this lock, too.
+	 */
 	free_irq(chip->irq, chip);
+
+	mutex_lock(&chip->reg_lock);
+	mv88e6xxx_g1_irq_free_common(chip);
+	mutex_unlock(&chip->reg_lock);
 }
 
 static int mv88e6xxx_g1_irq_setup_common(struct mv88e6xxx_chip *chip)
@@ -469,10 +476,12 @@ static int mv88e6xxx_irq_poll_setup(struct mv88e6xxx_chip *chip)
 
 static void mv88e6xxx_irq_poll_free(struct mv88e6xxx_chip *chip)
 {
-	mv88e6xxx_g1_irq_free_common(chip);
-
 	kthread_cancel_delayed_work_sync(&chip->irq_poll_work);
 	kthread_destroy_worker(chip->kworker);
+
+	mutex_lock(&chip->reg_lock);
+	mv88e6xxx_g1_irq_free_common(chip);
+	mutex_unlock(&chip->reg_lock);
 }
 
 int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int addr, int reg, u16 mask)
@@ -4506,12 +4515,10 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 	if (chip->info->g2_irqs > 0)
 		mv88e6xxx_g2_irq_free(chip);
 out_g1_irq:
-	mutex_lock(&chip->reg_lock);
 	if (chip->irq > 0)
 		mv88e6xxx_g1_irq_free(chip);
 	else
 		mv88e6xxx_irq_poll_free(chip);
-	mutex_unlock(&chip->reg_lock);
 out:
 	if (pdata)
 		dev_put(pdata->netdev);
@@ -4539,12 +4546,10 @@ static void mv88e6xxx_remove(struct mdio_device *mdiodev)
 	if (chip->info->g2_irqs > 0)
 		mv88e6xxx_g2_irq_free(chip);
 
-	mutex_lock(&chip->reg_lock);
 	if (chip->irq > 0)
 		mv88e6xxx_g1_irq_free(chip);
 	else
 		mv88e6xxx_irq_poll_free(chip);
-	mutex_unlock(&chip->reg_lock);
 }
 
 static const struct of_device_id mv88e6xxx_of_match[] = {
--
2.18.0
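
The ordering rule described in the commit message can be sketched in
userspace. This is only an illustrative model, not the driver code: a
pthread mutex stands in for reg_lock, a joinable thread stands in for the
threaded irq handler, and pthread_join() stands in for free_irq() in the
sense that it waits for the handler to finish. All demo_* names are
hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mv88e6xxx_chip (illustration only). */
struct demo_chip {
	pthread_mutex_t reg_lock;	/* plays the role of chip->reg_lock */
	pthread_t handler;		/* plays the role of the irq thread */
	atomic_bool stop;		/* set when the "irq" is being freed */
	int mappings;			/* nonzero while "irq mappings" exist */
	int bad_dispatches;		/* handler passes that found no mapping */
};

/* Like the real handler, this takes reg_lock on every pass. */
static void *demo_handler(void *arg)
{
	struct demo_chip *chip = arg;

	while (!atomic_load(&chip->stop)) {
		pthread_mutex_lock(&chip->reg_lock);
		if (chip->mappings == 0)
			chip->bad_dispatches++;	/* models handle_nested_irq(0) */
		pthread_mutex_unlock(&chip->reg_lock);
		sched_yield();
	}
	return NULL;
}

/* Returns 0 on success, like pthread_create(). */
static int demo_chip_setup(struct demo_chip *chip)
{
	chip->mappings = 1;
	chip->bad_dispatches = 0;
	atomic_init(&chip->stop, false);
	pthread_mutex_init(&chip->reg_lock, NULL);
	return pthread_create(&chip->handler, NULL, demo_handler, chip);
}

static void demo_chip_teardown(struct demo_chip *chip)
{
	/*
	 * Step 1: wait for the handler WITHOUT holding reg_lock. If the
	 * lock were held here, the handler could block on it forever and
	 * pthread_join() would never return.
	 */
	atomic_store(&chip->stop, true);
	pthread_join(chip->handler, NULL);

	/* Step 2: only now drop the mappings, with reg_lock held. */
	pthread_mutex_lock(&chip->reg_lock);
	chip->mappings = 0;
	pthread_mutex_unlock(&chip->reg_lock);
	pthread_mutex_destroy(&chip->reg_lock);
}
```

Calling demo_chip_setup() followed by demo_chip_teardown() from main()
should leave bad_dispatches at 0, because the handler has exited before
the mappings are cleared. Swapping the two steps in the teardown is the
modelled equivalent of the race fixed by the patch above.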
* Re: [PATCH] net: dsa: mv88e6xxx: fix races between lock and irq freeing
From: David Miller @ 2018-07-22 5:44 UTC (permalink / raw)
To: u.kleine-koenig; +Cc: andrew, vivien.didelot, f.fainelli, netdev, kernel
From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Date: Fri, 20 Jul 2018 11:53:15 +0200
> free_irq() waits until all handlers for this IRQ have completed. As the
> relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock
> it might never return if the thread calling free_irq() holds this lock.
>
> For the same reason kthread_cancel_delayed_work_sync() in the polling case
> must not hold this lock.
>
> Also first free the irq (or stop the worker respectively) such that
> mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq
> mappings are dropped in mv88e6xxx_g1_irq_free_common() to prevent the
> worker thread from calling handle_nested_irq(0), which results in a NULL-pointer
> exception.
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Looks good.
Note that the IRQ domain unmapping will do a synchronize_irq() which
should cause the same deadlock as free_irq() will with the reg_lock
held.
Note also that g2 IRQ freeing gets the ordering right, and doesn't need
a lock because it doesn't program any registers when tearing down its
IRQ.
Applied and queued up for -stable, thanks.
* Re: [PATCH] net: dsa: mv88e6xxx: fix races between lock and irq freeing
From: Uwe Kleine-König @ 2018-07-22 19:00 UTC (permalink / raw)
To: David Miller; +Cc: andrew, vivien.didelot, f.fainelli, netdev, kernel
Hello,
On Sat, Jul 21, 2018 at 10:44:09PM -0700, David Miller wrote:
> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Date: Fri, 20 Jul 2018 11:53:15 +0200
>
> > free_irq() waits until all handlers for this IRQ have completed. As the
> > relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock
> > it might never return if the thread calling free_irq() holds this lock.
> >
> > For the same reason kthread_cancel_delayed_work_sync() in the polling case
> > must not hold this lock.
> >
> > Also first free the irq (or stop the worker respectively) such that
> > mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq
> > mappings are dropped in mv88e6xxx_g1_irq_free_common() to prevent the
> > worker thread from calling handle_nested_irq(0), which results in a NULL-pointer
> > exception.
> >
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>
> Looks good.
>
> Note that the IRQ domain unmapping will do a synchronize_irq() which
> should cause the same deadlock as free_irq() will with the reg_lock
> held.
Do you think that there is still a problem? When free_irq() for the
externally visible irq returns, the muxed irqs should all be gone, too, so
this should not trigger, should it?
> Note also that g2 IRQ freeing gets the ordering right, and doesn't need
> a lock because it doesn't program any registers when tearing down its
> IRQ.
Yes.
> Applied and queued up for -stable, thanks.
Fine, thanks
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
* Re: [PATCH] net: dsa: mv88e6xxx: fix races between lock and irq freeing
From: David Miller @ 2018-07-22 20:04 UTC (permalink / raw)
To: u.kleine-koenig; +Cc: andrew, vivien.didelot, f.fainelli, netdev, kernel
From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Date: Sun, 22 Jul 2018 21:00:35 +0200
> On Sat, Jul 21, 2018 at 10:44:09PM -0700, David Miller wrote:
>> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>> Date: Fri, 20 Jul 2018 11:53:15 +0200
>>
>> > free_irq() waits until all handlers for this IRQ have completed. As the
>> > relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock
>> > it might never return if the thread calling free_irq() holds this lock.
>> >
>> > For the same reason kthread_cancel_delayed_work_sync() in the polling case
>> > must not hold this lock.
>> >
>> > Also first free the irq (or stop the worker respectively) such that
>> > mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq
>> > mappings are dropped in mv88e6xxx_g1_irq_free_common() to prevent the
>> > worker thread from calling handle_nested_irq(0), which results in a NULL-pointer
>> > exception.
>> >
>> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>>
>> Looks good.
>>
>> Note that the IRQ domain unmapping will do a synchronize_irq() which
>> should cause the same deadlock as free_irq() will with the reg_lock
>> held.
>
> Do you think that there is still a problem? When free_irq() for the
> externally visible irq returns, the muxed irqs should all be gone, too, so
> this should not trigger, should it?
It shouldn't be a problem after your changes.
I'm just saying that I'm surprised that, in the original code, you see
the deadlock in free_irq(), since the synchronize_irq() done by the
IRQ domain code should have happened first.
* Re: [PATCH] net: dsa: mv88e6xxx: fix races between lock and irq freeing
From: Uwe Kleine-König @ 2018-07-22 20:38 UTC (permalink / raw)
To: David Miller; +Cc: andrew, vivien.didelot, f.fainelli, netdev, kernel
On Sun, Jul 22, 2018 at 01:04:11PM -0700, David Miller wrote:
> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Date: Sun, 22 Jul 2018 21:00:35 +0200
>
> > On Sat, Jul 21, 2018 at 10:44:09PM -0700, David Miller wrote:
> >> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >> Date: Fri, 20 Jul 2018 11:53:15 +0200
> >>
> >> > free_irq() waits until all handlers for this IRQ have completed. As the
> >> > relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock
> >> > it might never return if the thread calling free_irq() holds this lock.
> >> >
> >> > For the same reason kthread_cancel_delayed_work_sync() in the polling case
> >> > must not hold this lock.
> >> >
> >> > Also first free the irq (or stop the worker respectively) such that
> >> > mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq
> >> > mappings are dropped in mv88e6xxx_g1_irq_free_common() to prevent the
> >> > worker thread from calling handle_nested_irq(0), which results in a NULL-pointer
> >> > exception.
> >> >
> >> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >>
> >> Looks good.
> >>
> >> Note that the IRQ domain unmapping will do a synchronize_irq() which
> >> should cause the same deadlock as free_irq() will with the reg_lock
> >> held.
> >
> > Do you think that there is still a problem? When free_irq() for the
> > externally visible irq returns, the muxed irqs should all be gone, too, so
> > this should not trigger, should it?
>
> It shouldn't be a problem after your changes.
>
> I'm just saying that I'm surprised that, in the original code, you see
> the deadlock in free_irq(), since the synchronize_irq() done by the
> IRQ domain code should have happened first.
Ah, I see. This didn't happen because I added an msleep() to
mv88e6xxx_g1_irq_thread_work() before the lock is taken to widen the
race window for a different problem. So the sub-irqs were not active
when mv88e6xxx_g1_irq_free() ran, only the mux-irq was. When
irq_dispose_mapping() is called for a sub-irq there is no problem, as
this results in synchronize_irq() for the sub-irq, not the mux-irq.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |