LKML Archive on lore.kernel.org
* [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock
@ 2020-09-23 16:19 Serge Semin
  2020-09-23 16:19 ` [PATCH 1/3] serial: 8250: Discard RTS/DTS setting from clock update method Serge Semin
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Serge Semin @ 2020-09-23 16:19 UTC
  To: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Hans de Goede, Alexey Malahov,
	Pavel Parkhomenko, Andy Shevchenko, Maxime Ripard, Will Deacon,
	Russell King, linux-arm-kernel, linux-serial, linux-kernel

Hans has discovered that there is a potential deadlock between the ref
clock change notifier and the port suspension procedures (see the link at
the bottom of the letter). Indeed, the deadlock is possible if the port
suspension is initiated during the ref clock rate change:

    CPU0 (suspend CPU/UART)   CPU1 (update clock)
             ----                    ----
    lock(&port->mutex);
                              lock((work_completion)(&data->clk_work));
                              lock(&port->mutex);
    lock((work_completion)(&data->clk_work));

    *** DEADLOCK ***

So the CPU performing the UART port shutdown procedure will wait until the
ref clock change notifier is finished (worker is flushed), while the latter
will wait for the port mutex to be released.

A possible solution to bypass the deadlock is to move the worker flush out
of the critical section protected by the TTY port mutex. For instance, we
can register and de-register the clock change notifier in the port probe
and remove methods instead of having them called from the port
startup/shutdown callbacks. But in order to do that we need to make sure
that the serial8250_update_uartclk() method is safe to use while the
port is shut down. Alas, the current implementation doesn't provide that
safety. The solution described above is implemented in this patchset.
See individual patches for details.
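
The ordering argument above can be illustrated with a toy, single-threaded
userspace model. This is only a sketch under stated assumptions: it is not
kernel code, and every name in it (struct model, flush_would_complete(),
and so on) is hypothetical. It encodes just the wait-for relation: the
clock work needs the port mutex, so a flush issued while the mutex is held
can never finish, whereas a flush issued with the mutex released can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the wait-for relation. All names are hypothetical. */
struct model {
	bool port_mutex_held;	/* held by the startup/shutdown path */
	bool clk_work_pending;	/* clock-change work queued, not yet run */
};

/* The work item must take the port mutex before touching the port. */
static bool clk_work_can_run(const struct model *m)
{
	return !m->port_mutex_held;
}

/*
 * flush_work() analogue: returns true if the wait would finish, false if
 * the waiter would block forever.
 */
static bool flush_would_complete(struct model *m)
{
	assert(m != NULL);

	if (!m->clk_work_pending)
		return true;
	if (!clk_work_can_run(m))
		return false;	/* worker waits for the mutex we hold */

	m->clk_work_pending = false;
	return true;
}

/* Old ordering: flush from the shutdown callback, under the port mutex. */
bool shutdown_flush_under_mutex(struct model *m)
{
	bool done;

	m->port_mutex_held = true;
	done = flush_would_complete(m);
	m->port_mutex_held = false;

	return done;
}

/* New ordering: flush from remove(), where the port mutex is not held. */
bool remove_flush_outside_mutex(struct model *m)
{
	m->port_mutex_held = false;

	return flush_would_complete(m);
}
```

With a work item pending, shutdown_flush_under_mutex() reports that the
flush cannot complete, while remove_flush_outside_mutex() reports that it
can. The model ignores real scheduling and only captures who waits for
whom.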

Link: https://lore.kernel.org/linux-serial/f1cd5c75-9cda-6896-a4e2-42c5bfc3f5c3@redhat.com

Hans, could you test the patchset on your Cherry Trail (x86)-based
devices? After that we can merge it into the kernels 5.8 and 5.9 if
there are no objections to the fix.

Note, in order to have the fix working on older kernels, all of the
patches need to be backported.

Fixes: cc816969d7b5 ("serial: 8250_dw: Fix common clocks usage race condition")
Fixes: 868f3ee6e452 ("serial: 8250: Add 8250 port clock update method")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (3):
  serial: 8250: Discard RTS/DTS setting from clock update method
  serial: 8250: Skip uninitialized TTY port baud rate update
  serial: 8250_dw: Fix clk-notifier/port suspend deadlock

 drivers/tty/serial/8250/8250_dw.c   | 54 ++++++++++-------------------
 drivers/tty/serial/8250/8250_port.c |  5 ++-
 2 files changed, 23 insertions(+), 36 deletions(-)

-- 
2.27.0



* [PATCH 1/3] serial: 8250: Discard RTS/DTS setting from clock update method
  2020-09-23 16:19 [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
@ 2020-09-23 16:19 ` Serge Semin
  2020-09-23 16:19 ` [PATCH 2/3] serial: 8250: Skip uninitialized TTY port baud rate update Serge Semin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Serge Semin @ 2020-09-23 16:19 UTC
  To: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Hans de Goede, Alexey Malahov,
	Pavel Parkhomenko, Andy Shevchenko, Maxime Ripard, Will Deacon,
	Russell King, linux-arm-kernel, linux-serial, linux-kernel

It was a mistake to add the MCR register RTS/DTR fields setting in
the generic method of the UART reference clock update. There is no point
in asserting these lines in that procedure. So just discard the
serial8250_out_MCR() method invocation from there.

Fixes: 868f3ee6e452 ("serial: 8250: Add 8250 port clock update method")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
---
 drivers/tty/serial/8250/8250_port.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index c71d647eb87a..1259fb6b66b3 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2665,7 +2665,6 @@ void serial8250_update_uartclk(struct uart_port *port, unsigned int uartclk)
 
 	serial8250_set_divisor(port, baud, quot, frac);
 	serial_port_out(port, UART_LCR, up->lcr);
-	serial8250_out_MCR(up, UART_MCR_DTR | UART_MCR_RTS);
 
 	spin_unlock_irqrestore(&port->lock, flags);
 	serial8250_rpm_put(up);
-- 
2.27.0



* [PATCH 2/3] serial: 8250: Skip uninitialized TTY port baud rate update
  2020-09-23 16:19 [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
  2020-09-23 16:19 ` [PATCH 1/3] serial: 8250: Discard RTS/DTS setting from clock update method Serge Semin
@ 2020-09-23 16:19 ` Serge Semin
  2020-09-23 16:19 ` [PATCH 3/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
  2020-09-27 15:01 ` [PATCH 0/3] " Hans de Goede
  3 siblings, 0 replies; 7+ messages in thread
From: Serge Semin @ 2020-09-23 16:19 UTC
  To: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Hans de Goede, Alexey Malahov,
	Pavel Parkhomenko, Andy Shevchenko, Maxime Ripard, Will Deacon,
	Russell King, linux-arm-kernel, linux-serial, linux-kernel

It is erroneous to update the TTY port baud rate if it hasn't been
initialized yet, because in that case the TTY struct isn't set. So there
is no termios structure to get and re-calculate the baud if the current
baud can't be reached. Let's skip the baud rate update then until the port
is fully initialized.

Note that the UART clock update method still sets the uartclk member to
the new ref clock value even if the port is turned off. The new UART ref
clock rate will be used later, during the port startup procedure.
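
The intended behaviour can be sketched with a small userspace model. This
is a simplification under stated assumptions, not the driver code: the
struct and function names are hypothetical, the baud rate is stored in the
model instead of being derived from termios, and the divisor uses the
usual 8250 relation of uartclk / (16 * baud), rounded to the closest
integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the port state relevant to this patch. */
struct model_port {
	unsigned int uartclk;	/* reference clock rate, Hz */
	unsigned int baud;	/* requested baud rate */
	unsigned int divisor;	/* value programmed into the divisor latch */
	bool initialized;	/* models tty_port_initialized() */
};

static unsigned int div_round_closest(unsigned int n, unsigned int d)
{
	return (n + d / 2) / d;
}

/* Always record the new clock; touch the divisor only on an open port. */
void model_update_uartclk(struct model_port *p, unsigned int uartclk)
{
	assert(p != NULL);

	p->uartclk = uartclk;

	if (!p->initialized)
		return;		/* no termios yet, nothing to recalculate */

	assert(p->baud != 0);
	p->divisor = div_round_closest(p->uartclk, 16 * p->baud);
}
```

For a closed port the call only stores the new rate. For an open port at
115200 baud, a change from 1843200 Hz to 24000000 Hz moves the divisor
from 1 to 13.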

Fixes: 868f3ee6e452 ("serial: 8250: Add 8250 port clock update method")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
---
 drivers/tty/serial/8250/8250_port.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 1259fb6b66b3..b0af13074cd3 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2653,6 +2653,10 @@ void serial8250_update_uartclk(struct uart_port *port, unsigned int uartclk)
 		goto out_lock;
 
 	port->uartclk = uartclk;
+
+	if (!tty_port_initialized(&port->state->port))
+		goto out_lock;
+
 	termios = &port->state->port.tty->termios;
 
 	baud = serial8250_get_baud_rate(port, termios, NULL);
-- 
2.27.0



* [PATCH 3/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock
  2020-09-23 16:19 [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
  2020-09-23 16:19 ` [PATCH 1/3] serial: 8250: Discard RTS/DTS setting from clock update method Serge Semin
  2020-09-23 16:19 ` [PATCH 2/3] serial: 8250: Skip uninitialized TTY port baud rate update Serge Semin
@ 2020-09-23 16:19 ` Serge Semin
  2020-10-18  0:32   ` Jonathan Liu
  2020-09-27 15:01 ` [PATCH 0/3] " Hans de Goede
  3 siblings, 1 reply; 7+ messages in thread
From: Serge Semin @ 2020-09-23 16:19 UTC
  To: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Hans de Goede, Alexey Malahov,
	Pavel Parkhomenko, Andy Shevchenko, Maxime Ripard, Will Deacon,
	Russell King, linux-arm-kernel, linux-serial, linux-kernel

It has been discovered that there is a potential deadlock between
the clock-change-notifier thread and the UART port suspending one:

   CPU0 (suspend CPU/UART)   CPU1 (update clock)
            ----                    ----
   lock(&port->mutex);
                             lock((work_completion)(&data->clk_work));
                             lock(&port->mutex);
   lock((work_completion)(&data->clk_work));

   *** DEADLOCK ***

The best way to fix this is to eliminate the CPU0
port->mutex/work-completion scenario. So we suggest registering and
unregistering the clock-notifier during the DW APB UART port probe/remove
procedures, instead of doing that at the points of the port
startup/shutdown.

Link: https://lore.kernel.org/linux-serial/f1cd5c75-9cda-6896-a4e2-42c5bfc3f5c3@redhat.com

Fixes: cc816969d7b5 ("serial: 8250_dw: Fix common clocks usage race condition")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
---
 drivers/tty/serial/8250/8250_dw.c | 54 +++++++++++--------------------
 1 file changed, 19 insertions(+), 35 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index 87f450b7c177..9e204f9b799a 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -373,39 +373,6 @@ static void dw8250_set_ldisc(struct uart_port *p, struct ktermios *termios)
 	serial8250_do_set_ldisc(p, termios);
 }
 
-static int dw8250_startup(struct uart_port *p)
-{
-	struct dw8250_data *d = to_dw8250_data(p->private_data);
-	int ret;
-
-	/*
-	 * Some platforms may provide a reference clock shared between several
-	 * devices. In this case before using the serial port first we have to
-	 * make sure that any clock state change is known to the UART port at
-	 * least post factum.
-	 */
-	if (d->clk) {
-		ret = clk_notifier_register(d->clk, &d->clk_notifier);
-		if (ret)
-			dev_warn(p->dev, "Failed to set the clock notifier\n");
-	}
-
-	return serial8250_do_startup(p);
-}
-
-static void dw8250_shutdown(struct uart_port *p)
-{
-	struct dw8250_data *d = to_dw8250_data(p->private_data);
-
-	serial8250_do_shutdown(p);
-
-	if (d->clk) {
-		clk_notifier_unregister(d->clk, &d->clk_notifier);
-
-		flush_work(&d->clk_work);
-	}
-}
-
 /*
  * dw8250_fallback_dma_filter will prevent the UART from getting just any free
  * channel on platforms that have DMA engines, but don't have any channels
@@ -501,8 +468,6 @@ static int dw8250_probe(struct platform_device *pdev)
 	p->serial_out	= dw8250_serial_out;
 	p->set_ldisc	= dw8250_set_ldisc;
 	p->set_termios	= dw8250_set_termios;
-	p->startup	= dw8250_startup;
-	p->shutdown	= dw8250_shutdown;
 
 	p->membase = devm_ioremap(dev, regs->start, resource_size(regs));
 	if (!p->membase)
@@ -622,6 +587,19 @@ static int dw8250_probe(struct platform_device *pdev)
 		goto err_reset;
 	}
 
+	/*
+	 * Some platforms may provide a reference clock shared between several
+	 * devices. In this case any clock state change must be known to the
+	 * UART port at least post factum.
+	 */
+	if (data->clk) {
+		err = clk_notifier_register(data->clk, &data->clk_notifier);
+		if (err)
+			dev_warn(p->dev, "Failed to set the clock notifier\n");
+		else
+			queue_work(system_unbound_wq, &data->clk_work);
+	}
+
 	platform_set_drvdata(pdev, data);
 
 	pm_runtime_set_active(dev);
@@ -648,6 +626,12 @@ static int dw8250_remove(struct platform_device *pdev)
 
 	pm_runtime_get_sync(dev);
 
+	if (data->clk) {
+		clk_notifier_unregister(data->clk, &data->clk_notifier);
+
+		flush_work(&data->clk_work);
+	}
+
 	serial8250_unregister_port(data->data.line);
 
 	reset_control_assert(data->rst);
-- 
2.27.0



* Re: [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock
  2020-09-23 16:19 [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
                   ` (2 preceding siblings ...)
  2020-09-23 16:19 ` [PATCH 3/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
@ 2020-09-27 15:01 ` Hans de Goede
  2020-09-29 20:51   ` Serge Semin
  3 siblings, 1 reply; 7+ messages in thread
From: Hans de Goede @ 2020-09-27 15:01 UTC
  To: Serge Semin, Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko
  Cc: Serge Semin, Alexey Malahov, Pavel Parkhomenko, Andy Shevchenko,
	Maxime Ripard, Will Deacon, Russell King, linux-arm-kernel,
	linux-serial, linux-kernel

Hi,

On 9/23/20 6:19 PM, Serge Semin wrote:
> Hans has discovered that there is a potential deadlock between the ref
> clock change notifier and the port suspension procedures (see the link at
> the bottom of the letter). Indeed, the deadlock is possible if the port
> suspension is initiated during the ref clock rate change:
> 
>      CPU0 (suspend CPU/UART)   CPU1 (update clock)
>               ----                    ----
>      lock(&port->mutex);
>                                lock((work_completion)(&data->clk_work));
>                                lock(&port->mutex);
>      lock((work_completion)(&data->clk_work));
> 
>      *** DEADLOCK ***
> 
> So the CPU performing the UART port shutdown procedure will wait until the
> ref clock change notifier is finished (worker is flushed), while the latter
> will wait for the port mutex to be released.
> 
> A possible solution to bypass the deadlock is to move the worker flush out
> of the critical section protected by the TTY port mutex. For instance, we
> can register and de-register the clock change notifier in the port probe
> and remove methods instead of having them called from the port
> startup/shutdown callbacks. But in order to do that we need to make sure
> that the serial8250_update_uartclk() method is safe to use while the
> port is shut down. Alas, the current implementation doesn't provide that
> safety. The solution described above is implemented in this patchset.
> See individual patches for details.
> 
> Link: https://lore.kernel.org/linux-serial/f1cd5c75-9cda-6896-a4e2-42c5bfc3f5c3@redhat.com
> 
> Hans, could you test the patchset on your Cherry Trail (x86)-based
> devices? After that we can merge it into the kernels 5.8 and 5.9 if
> there are no objections to the fix.

Done, I can confirm that this fixes the lockdep issue for me, so you
can add my:

Tested-by: Hans de Goede <hdegoede@redhat.com>

To the entire series.

Regards,

Hans



* Re: [PATCH 0/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock
  2020-09-27 15:01 ` [PATCH 0/3] " Hans de Goede
@ 2020-09-29 20:51   ` Serge Semin
  0 siblings, 0 replies; 7+ messages in thread
From: Serge Semin @ 2020-09-29 20:51 UTC
  To: Hans de Goede
  Cc: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko, Alexey Malahov,
	Pavel Parkhomenko, Andy Shevchenko, Maxime Ripard, Will Deacon,
	Russell King, linux-arm-kernel, linux-serial, linux-kernel

Hello,

On Sun, Sep 27, 2020 at 05:01:52PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 9/23/20 6:19 PM, Serge Semin wrote:
> > Hans has discovered that there is a potential deadlock between the ref
> > clock change notifier and the port suspension procedures (see the link at
> > the bottom of the letter). Indeed, the deadlock is possible if the port
> > suspension is initiated during the ref clock rate change:
> > 
> >      CPU0 (suspend CPU/UART)   CPU1 (update clock)
> >               ----                    ----
> >      lock(&port->mutex);
> >                                lock((work_completion)(&data->clk_work));
> >                                lock(&port->mutex);
> >      lock((work_completion)(&data->clk_work));
> > 
> >      *** DEADLOCK ***
> > 
> > So the CPU performing the UART port shutdown procedure will wait until the
> > ref clock change notifier is finished (worker is flushed), while the latter
> > will wait for the port mutex to be released.
> > 
> > A possible solution to bypass the deadlock is to move the worker flush out
> > of the critical section protected by the TTY port mutex. For instance, we
> > can register and de-register the clock change notifier in the port probe
> > and remove methods instead of having them called from the port
> > startup/shutdown callbacks. But in order to do that we need to make sure
> > that the serial8250_update_uartclk() method is safe to use while the
> > port is shut down. Alas, the current implementation doesn't provide that
> > safety. The solution described above is implemented in this patchset.
> > See individual patches for details.
> > 
> > Link: https://lore.kernel.org/linux-serial/f1cd5c75-9cda-6896-a4e2-42c5bfc3f5c3@redhat.com
> > 
> > Hans, could you test the patchset on your Cherry Trail (x86)-based
> > devices? After that we can merge it into the kernels 5.8 and 5.9 if
> > there are no objections to the fix.
> 
> Done, I can confirm that this fixes the lockdep issue for me, so you
> can add my:
> 
> Tested-by: Hans de Goede <hdegoede@redhat.com>

Great! Thank you very much.

Greg, could you merge the series if you have no objection to the
solution design? Since the bug was introduced together with the
original series, which was integrated in kernel 5.9, the fix provided by
this patchset is only needed in 5.9.

-Sergey

> 
> To the entire series.
> 
> Regards,
> 
> Hans
> 


* Re: [PATCH 3/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock
  2020-09-23 16:19 ` [PATCH 3/3] serial: 8250_dw: Fix clk-notifier/port suspend deadlock Serge Semin
@ 2020-10-18  0:32   ` Jonathan Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Jonathan Liu @ 2020-10-18  0:32 UTC
  To: Serge Semin
  Cc: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko, Maxime Ripard,
	linux-kernel, Russell King, Serge Semin, Alexey Malahov,
	Hans de Goede, Andy Shevchenko, Pavel Parkhomenko, linux-serial,
	Will Deacon, linux-arm-kernel

On Wed, 23 Sep 2020 at 16:19, Serge Semin
<Sergey.Semin@baikalelectronics.ru> wrote:
>
> It has been discovered that there is a potential deadlock between
> the clock-change-notifier thread and the UART port suspending one:
>
>    CPU0 (suspend CPU/UART)   CPU1 (update clock)
>             ----                    ----
>    lock(&port->mutex);
>                              lock((work_completion)(&data->clk_work));
>                              lock(&port->mutex);
>    lock((work_completion)(&data->clk_work));
>
>    *** DEADLOCK ***
>
> The best way to fix this is to eliminate the CPU0
> port->mutex/work-completion scenario. So we suggest registering and
> unregistering the clock-notifier during the DW APB UART port probe/remove
> procedures, instead of doing that at the points of the port
> startup/shutdown.
>
> Link: https://lore.kernel.org/linux-serial/f1cd5c75-9cda-6896-a4e2-42c5bfc3f5c3@redhat.com
>
> Fixes: cc816969d7b5 ("serial: 8250_dw: Fix common clocks usage race condition")
> Reported-by: Hans de Goede <hdegoede@redhat.com>
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>

Tested-by: Jonathan Liu <net147@gmail.com>

Fixes hang while closing the serial port on RK3399 that I was
experiencing often with Linux 5.9.
After applying this patch, it no longer hangs while closing the serial port.
No problems while rebooting either.

Thanks.

Regards,
Jonathan
