All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] serial_core: Get a reference for port->tty in uart_remove_one_port()
@ 2014-03-17 13:10 Geert Uytterhoeven
  2014-03-17 13:10 ` [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close() Geert Uytterhoeven
  0 siblings, 1 reply; 6+ messages in thread
From: Geert Uytterhoeven @ 2014-03-17 13:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Peter Hurley
  Cc: linux-serial, linux-kernel, Geert Uytterhoeven

From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>

Suggested-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
---
 drivers/tty/serial/serial_core.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 8ee90f731f31..21084f0b8ea4 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -2647,6 +2647,7 @@ int uart_remove_one_port(struct uart_driver *drv, struct uart_port *uport)
 {
 	struct uart_state *state = drv->state + uport->line;
 	struct tty_port *port = &state->port;
+	struct tty_struct *tty;
 	int ret = 0;
 
 	BUG_ON(in_interrupt());
@@ -2675,8 +2676,11 @@ int uart_remove_one_port(struct uart_driver *drv, struct uart_port *uport)
 	 */
 	tty_unregister_device(drv->tty_driver, uport->line);
 
-	if (port->tty)
+	tty = tty_port_tty_get(port);
+	if (tty) {
 		tty_vhangup(port->tty);
+		tty_kref_put(tty);
+	}
 
 	/*
 	 * If the port is used as a console, unregister it, and power it down
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close()
  2014-03-17 13:10 [PATCH 1/2] serial_core: Get a reference for port->tty in uart_remove_one_port() Geert Uytterhoeven
@ 2014-03-17 13:10 ` Geert Uytterhoeven
  2014-03-21 20:29   ` Peter Hurley
  0 siblings, 1 reply; 6+ messages in thread
From: Geert Uytterhoeven @ 2014-03-17 13:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Peter Hurley
  Cc: linux-serial, linux-kernel, Geert Uytterhoeven

From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>

When unbinding a serial driver that's being used as a serial console,
the kernel may crash with a NULL pointer dereference in a uart_*() function
called from uart_close () (e.g. uart_flush_buffer() or
uart_chars_in_buffer()).

To fix this, let uart_close() check for port->count == 0. If this is the
case, bail out early. Else tty_port_close_start() will make the port
counts inconsistent, printing out warnings like

    tty_port_close_start: tty->count = 1 port count = 0.

and

    tty_port_close_start: count = -1

and once uport == NULL, it will also crash.

Also fix the related crash in pr_debug() by checking for a non-NULL uport
first.

Detailed description:

On driver unbind, uart_remove_one_port() is called. Basically it;
  - marks the port dead,
  - calls tty_vhangup(),
  - sets state->uart_port = NULL.

What will happen depends on whether the port is just in use by e.g. getty,
or was also opened as a console.

A. If the tty was not opened as a console:

  - tty_vhangup() will (in __tty_hangup()):
      - mark all file descriptors for this tty hung up by pointing them to
	hung_up_tty_fops,
      - call uart_hangup(), which sets port->count to 0.

  - A subsequent uart_open() (this may be through /dev/ttyS*, or through
    /dev/console if this is a serial console) will fail with -ENXIO as the
    port was marked dead,
  - uart_close() after the failed uart_open() will return early, as
    tty_hung_up_p() (called from tty_port_close_start()) will notice it was
    hung up.

B. If the tty was also opened as a console:

  - tty_vhangup() will (in __tty_hangup()):
      - mark non-console file descriptors for this tty hung up by pointing
	them to hung_up_tty_fops,
      - NOT call uart_hangup(), but instead call uart_close() for every
        non-console file descriptor, so port->count will still have a
	non-zero value afterwards.

  - A subsequent uart_open() will fail with -ENXIO as the port was
    marked dead,
  - uart_close() after the failed uart_open() starts to misbehave:
      - tty_hung_up_p() will not notice it was hung up,
      - As port->count is non-zero, tty_port_close_start() will decrease
        port->count, making the tty and port counts inconsistent. Later,
	warnings like these will be printed:

	    tty_port_close_start: tty->count = 1 port count = 0.

	and
	    tty_port_close_start: count = -1

      - If all of this happens after state->uart_port was set to zero, a
        NULL pointer dereference will happen.

Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
---
This still doesn't fix everything, but avoids the NULL pointer dereference.

In case B, port->count is still non-zero after tty_vhangup().  Hence when
uart_close() is called after a failed uart_open(), it will decrement
port->count, and, on reaching zero, it will close the port for real (incl.
calling uart_shutdown()).  This doesn't seem to cause any harm, though.
Besides, if uart_close() wouldn't do that, who else would shut down the
port?

As tty_open() always calls tty_release() on failure of ->open(),
->close() is always called.
How can uart_close() after a failed uart_open() know it should not do
anything? Set a (new) flag in its port structure?
---
 drivers/tty/serial/serial_core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 21084f0b8ea4..56dda84f82a5 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -1319,9 +1319,9 @@ static void uart_close(struct tty_struct *tty, struct file *filp)
 	uport = state->uart_port;
 	port = &state->port;
 
-	pr_debug("uart_close(%d) called\n", uport->line);
+	pr_debug("uart_close(%d) called\n", uport ? uport->line : -1);
 
-	if (tty_port_close_start(port, tty, filp) == 0)
+	if (!port->count || tty_port_close_start(port, tty, filp) == 0)
 		return;
 
 	/*
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close()
  2014-03-17 13:10 ` [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close() Geert Uytterhoeven
@ 2014-03-21 20:29   ` Peter Hurley
  2014-03-26 18:58     ` Geert Uytterhoeven
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Hurley @ 2014-03-21 20:29 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Geert Uytterhoeven

On 03/17/2014 09:10 AM, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>
> When unbinding a serial driver that's being used as a serial console,
> the kernel may crash with a NULL pointer dereference in a uart_*() function
> called from uart_close () (e.g. uart_flush_buffer() or
> uart_chars_in_buffer()).
>
> To fix this, let uart_close() check for port->count == 0. If this is the
> case, bail out early. Else tty_port_close_start() will make the port
> counts inconsistent, printing out warnings like
>
>      tty_port_close_start: tty->count = 1 port count = 0.
>
> and
>
>      tty_port_close_start: count = -1

As you noted in the patch comments below, the tty core always closes
a failed open.

So the reason for the port->count mismatch is because tty_port_close_start()
only handles the tty hangup condition -- all other failed opens are assumed
to carry a port->count.

A similar mismatch will occur if the mutex_lock in uart_open() is
interrupted.

This means that the port->count mismatch can occur even if port->count != 0;
so the bug here is that uart_open() and uart_close() don't agree on
who does what cleanup under what error conditions.

So with respect to the port count mismatches, the conditions need careful
auditing and fixing, separate from the tty console teardown problem.

> and once uport == NULL, it will also crash.

Ok, but so will basically every serial core function after
uart_remove_one_port() assigns state->uart_port = NULL

IOW, uart_close() is not the only serial core function that could be
using state->uart_port after uart_remove_one_port() finishes.

> Also fix the related crash in pr_debug() by checking for a non-NULL uport
> first.

So my point is that checking for NULL state->uart_port is simply not the fix,
here or anywhere else.

AFAICT, either reference-counting the uart_port structure or RCU
are the only viable solutions to safe ll driver uart_port removal.

Regards,
Peter Hurley

> Detailed description:
>
> On driver unbind, uart_remove_one_port() is called. Basically it;
>    - marks the port dead,
>    - calls tty_vhangup(),
>    - sets state->uart_port = NULL.
>
> What will happen depends on whether the port is just in use by e.g. getty,
> or was also opened as a console.
>
> A. If the tty was not opened as a console:
>
>    - tty_vhangup() will (in __tty_hangup()):
>        - mark all file descriptors for this tty hung up by pointing them to
> 	hung_up_tty_fops,
>        - call uart_hangup(), which sets port->count to 0.
>
>    - A subsequent uart_open() (this may be through /dev/ttyS*, or through
>      /dev/console if this is a serial console) will fail with -ENXIO as the
>      port was marked dead,
>    - uart_close() after the failed uart_open() will return early, as
>      tty_hung_up_p() (called from tty_port_close_start()) will notice it was
>      hung up.
>
> B. If the tty was also opened as a console:
>
>    - tty_vhangup() will (in __tty_hangup()):
>        - mark non-console file descriptors for this tty hung up by pointing
> 	them to hung_up_tty_fops,
>        - NOT call uart_hangup(), but instead call uart_close() for every
>          non-console file descriptor, so port->count will still have a
> 	non-zero value afterwards.
>
>    - A subsequent uart_open() will fail with -ENXIO as the port was
>      marked dead,
>    - uart_close() after the failed uart_open() starts to misbehave:
>        - tty_hung_up_p() will not notice it was hung up,
>        - As port->count is non-zero, tty_port_close_start() will decrease
>          port->count, making the tty and port counts inconsistent. Later,
> 	warnings like these will be printed:
>
> 	    tty_port_close_start: tty->count = 1 port count = 0.
>
> 	and
> 	    tty_port_close_start: count = -1
>
>        - If all of this happens after state->uart_port was set to zero, a
>          NULL pointer dereference will happen.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
> ---
> This still doesn't fix everything, but avoids the NULL pointer dereference.
>
> In case B, port->count is still non-zero after tty_vhangup().  Hence when
> uart_close() is called after a failed uart_open(), it will decrement
> port->count, and, on reaching zero, it will close the port for real (incl.
> calling uart_shutdown()).  This doesn't seem to cause any harm, though.
> Besides, if uart_close() wouldn't do that, who else would shut down the
> port?
>
> As tty_open() always calls tty_release() on failure of ->open(),
> ->close() is always called.
> How can uart_close() after a failed uart_open() know it should not do
> anything? Set a (new) flag in its port structure?
> ---
>   drivers/tty/serial/serial_core.c |    4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 21084f0b8ea4..56dda84f82a5 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -1319,9 +1319,9 @@ static void uart_close(struct tty_struct *tty, struct file *filp)
>   	uport = state->uart_port;
>   	port = &state->port;
>
> -	pr_debug("uart_close(%d) called\n", uport->line);
> +	pr_debug("uart_close(%d) called\n", uport ? uport->line : -1);
>
> -	if (tty_port_close_start(port, tty, filp) == 0)
> +	if (!port->count || tty_port_close_start(port, tty, filp) == 0)
>   		return;
>
>   	/*
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close()
  2014-03-21 20:29   ` Peter Hurley
@ 2014-03-26 18:58     ` Geert Uytterhoeven
  2014-03-26 20:10       ` Peter Hurley
  0 siblings, 1 reply; 6+ messages in thread
From: Geert Uytterhoeven @ 2014-03-26 18:58 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Geert Uytterhoeven

Hi Peter,

Thanks for your comments!

On Fri, Mar 21, 2014 at 9:29 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 03/17/2014 09:10 AM, Geert Uytterhoeven wrote:
>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>> When unbinding a serial driver that's being used as a serial console,
>> the kernel may crash with a NULL pointer dereference in a uart_*()
>> function
>> called from uart_close () (e.g. uart_flush_buffer() or
>> uart_chars_in_buffer()).
>>
>> To fix this, let uart_close() check for port->count == 0. If this is the
>> case, bail out early. Else tty_port_close_start() will make the port
>> counts inconsistent, printing out warnings like
>>
>>      tty_port_close_start: tty->count = 1 port count = 0.
>>
>> and
>>
>>      tty_port_close_start: count = -1
>
> As you noted in the patch comments below, the tty core always closes
> a failed open.
>
> So the reason for the port->count mismatch is because tty_port_close_start()
> only handles the tty hangup condition -- all other failed opens are assumed
> to carry a port->count.
>
> A similar mismatch will occur if the mutex_lock in uart_open() is
> interrupted.
>
> This means that the port->count mismatch can occur even if port->count != 0;
> so the bug here is that uart_open() and uart_close() don't agree on
> who does what cleanup under what error conditions.
>
> So with respect to the port count mismatches, the conditions need careful
> auditing and fixing, separate from the tty console teardown problem.

Indeed. Currently uart_open() always decrements port->count again
in any error condition, which is clearly wrong.

>> and once uport == NULL, it will also crash.
>
> Ok, but so will basically every serial core function after
> uart_remove_one_port() assigns state->uart_port = NULL
>
> IOW, uart_close() is not the only serial core function that could be
> using state->uart_port after uart_remove_one_port() finishes.
>
>> Also fix the related crash in pr_debug() by checking for a non-NULL uport
>> first.
>
> So my point is that checking for NULL state->uart_port is simply not the
> fix,
> here or anywhere else.
>
> AFAICT, either reference-counting the uart_port structure or RCU
> are the only viable solutions to safe ll driver uart_port removal.

I gave it a quick try, but failed. It seems the driver needs some big surgery
to fix this.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close()
  2014-03-26 18:58     ` Geert Uytterhoeven
@ 2014-03-26 20:10       ` Peter Hurley
  2014-03-27  9:34         ` Geert Uytterhoeven
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Hurley @ 2014-03-26 20:10 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Geert Uytterhoeven

On 03/26/2014 02:58 PM, Geert Uytterhoeven wrote:
> Hi Peter,
>
> Thanks for your comments!

Not a problem; just wanted to save you some time and frustration :)

> On Fri, Mar 21, 2014 at 9:29 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 03/17/2014 09:10 AM, Geert Uytterhoeven wrote:
>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>> When unbinding a serial driver that's being used as a serial console,
>>> the kernel may crash with a NULL pointer dereference in a uart_*()
>>> function
>>> called from uart_close () (e.g. uart_flush_buffer() or
>>> uart_chars_in_buffer()).
>>>
>>> To fix this, let uart_close() check for port->count == 0. If this is the
>>> case, bail out early. Else tty_port_close_start() will make the port
>>> counts inconsistent, printing out warnings like
>>>
>>>       tty_port_close_start: tty->count = 1 port count = 0.
>>>
>>> and
>>>
>>>       tty_port_close_start: count = -1
>>
>> As you noted in the patch comments below, the tty core always closes
>> a failed open.
>>
>> So the reason for the port->count mismatch is because tty_port_close_start()
>> only handles the tty hangup condition -- all other failed opens are assumed
>> to carry a port->count.
>>
>> A similar mismatch will occur if the mutex_lock in uart_open() is
>> interrupted.
>>
>> This means that the port->count mismatch can occur even if port->count != 0;
>> so the bug here is that uart_open() and uart_close() don't agree on
>> who does what cleanup under what error conditions.
>>
>> So with respect to the port count mismatches, the conditions need careful
>> auditing and fixing, separate from the tty console teardown problem.
>
> Indeed. Currently uart_open() always decrements port->count again
> in any error condition, which is clearly wrong.

I started looking at this problem only to realize that the
tty_hung_up_p() condition in uart_open() can't actually happen.

Which has lead me to a bunch of cleanups and fixes that I'm still
working on. It's just slow going because tty code audit takes
forever with legacy intentions that no longer apply and some of
the bit-rotting tty drivers that I doubt even run.

>>> and once uport == NULL, it will also crash.
>>
>> Ok, but so will basically every serial core function after
>> uart_remove_one_port() assigns state->uart_port = NULL
>>
>> IOW, uart_close() is not the only serial core function that could be
>> using state->uart_port after uart_remove_one_port() finishes.
>>
>>> Also fix the related crash in pr_debug() by checking for a non-NULL uport
>>> first.
>>
>> So my point is that checking for NULL state->uart_port is simply not the
>> fix,
>> here or anywhere else.
>>
>> AFAICT, either reference-counting the uart_port structure or RCU
>> are the only viable solutions to safe ll driver uart_port removal.
>
> I gave it a quick try, but failed. It seems the driver needs some big surgery
> to fix this.

I figured that would be true.
What are the circumstances of device removal in your case?

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close()
  2014-03-26 20:10       ` Peter Hurley
@ 2014-03-27  9:34         ` Geert Uytterhoeven
  0 siblings, 0 replies; 6+ messages in thread
From: Geert Uytterhoeven @ 2014-03-27  9:34 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Greg Kroah-Hartman, Jiri Slaby, linux-serial, linux-kernel,
	Geert Uytterhoeven

Hi Peter,

On Wed, Mar 26, 2014 at 9:10 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 03/26/2014 02:58 PM, Geert Uytterhoeven wrote:
>> Thanks for your comments!
>
> Not a problem; just wanted to save you some time and frustration :)

Much appreciated!

>> On Fri, Mar 21, 2014 at 9:29 PM, Peter Hurley <peter@hurleysoftware.com>
>> wrote:
>>> On 03/17/2014 09:10 AM, Geert Uytterhoeven wrote:
>>>> From: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
>>>> When unbinding a serial driver that's being used as a serial console,
>>>> the kernel may crash with a NULL pointer dereference in a uart_*()
>>>> function
>>>> called from uart_close () (e.g. uart_flush_buffer() or
>>>> uart_chars_in_buffer()).
>>>>
>>>> To fix this, let uart_close() check for port->count == 0. If this is the
>>>> case, bail out early. Else tty_port_close_start() will make the port
>>>> counts inconsistent, printing out warnings like
>>>>
>>>>       tty_port_close_start: tty->count = 1 port count = 0.
>>>>
>>>> and
>>>>
>>>>       tty_port_close_start: count = -1
>>>
>>> As you noted in the patch comments below, the tty core always closes
>>> a failed open.
>>>
>>> So the reason for the port->count mismatch is because
>>> tty_port_close_start()
>>> only handles the tty hangup condition -- all other failed opens are
>>> assumed
>>> to carry a port->count.
>>>
>>> A similar mismatch will occur if the mutex_lock in uart_open() is
>>> interrupted.
>>>
>>> This means that the port->count mismatch can occur even if port->count !=
>>> 0;
>>> so the bug here is that uart_open() and uart_close() don't agree on
>>> who does what cleanup under what error conditions.
>>>
>>> So with respect to the port count mismatches, the conditions need careful
>>> auditing and fixing, separate from the tty console teardown problem.
>>
>> Indeed. Currently uart_open() always decrements port->count again
>> in any error condition, which is clearly wrong.
>
> I started looking at this problem only to realize that the
> tty_hung_up_p() condition in uart_open() can't actually happen.

BTW, generic tty_port_open(), as used by new serial port drivers, also checks
for this condition.

> Which has lead me to a bunch of cleanups and fixes that I'm still
> working on. It's just slow going because tty code audit takes
> forever with legacy intentions that no longer apply and some of
> the bit-rotting tty drivers that I doubt even run.

Sure, I understand.

> What are the circumstances of device removal in your case?

I'm unbinding the driver using:

    echo sh-sci.6 > /sys/bus/platform/drivers/sh-sci/unbind

As long as the serial port is not opened as a console at the time
of unbind, everything is reasonably well. But if it's open as a console,
uart_hangup() is no longer called, and proper cleanup never happens.

I started looking into this when I wanted to verify that the serial hardware's
clock is properly disabled when the hardware is not in use (e.g. on driver
shutdown).

Note that Greg has applied this patch to linux-next, so you may want to
revert it for your investigations (and fix ;-).

Thanks again!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-03-27  9:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-17 13:10 [PATCH 1/2] serial_core: Get a reference for port->tty in uart_remove_one_port() Geert Uytterhoeven
2014-03-17 13:10 ` [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference in uart_close() Geert Uytterhoeven
2014-03-21 20:29   ` Peter Hurley
2014-03-26 18:58     ` Geert Uytterhoeven
2014-03-26 20:10       ` Peter Hurley
2014-03-27  9:34         ` Geert Uytterhoeven

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.