linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
@ 2020-04-28 12:48 Markus Elfring
  0 siblings, 0 replies; 18+ messages in thread
From: Markus Elfring @ 2020-04-28 12:48 UTC (permalink / raw)
  To: Raghavendra Rao Ananta, linuxppc-dev
  Cc: Greg Kroah-Hartman, Andrew Melnychenko, linux-kernel, Jiri Slaby

> Hence, serialize hvc_open and check if tty->private_data is NULL before
> proceeding ahead.

How do you think about to add the tag “Fixes” because of adjustments
for the data synchronisation?


…
> +++ b/drivers/tty/hvc/hvc_console.c
…
@@ -384,6 +394,8 @@ static int hvc_open(struct tty_struct *tty, struct file * filp)
…
> +out:
> +	mutex_unlock(&hvc_open_mutex);
>  	return rc;
>  }

I suggest to use the label “unlock” instead.

Regards,
Markus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-20  9:38                     ` Jiri Slaby
@ 2020-05-20 13:49                       ` rananta
  0 siblings, 0 replies; 18+ messages in thread
From: rananta @ 2020-05-20 13:49 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Greg KH, andrew, linuxppc-dev, linux-kernel

On 2020-05-20 02:38, Jiri Slaby wrote:
> On 15. 05. 20, 1:22, rananta@codeaurora.org wrote:
>> On 2020-05-13 00:04, Greg KH wrote:
>>> On Tue, May 12, 2020 at 02:39:50PM -0700, rananta@codeaurora.org 
>>> wrote:
>>>> On 2020-05-12 01:25, Greg KH wrote:
>>>> > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
>>>> > > commit bdb498c20040616e94b05c31a0ceb3e134b7e829
>>>> > > Author: Jiri Slaby <jslaby@suse.cz>
>>>> > > Date:   Tue Aug 7 21:48:04 2012 +0200
>>>> > >
>>>> > >     TTY: hvc_console, add tty install
>>>> > >
>>>> > > added hvc_install but did not move 'tty->driver_data = NULL;' from
>>>> > > hvc_open's fail path to hvc_cleanup.
>>>> > >
>>>> > > IOW hvc_open now NULLs tty->driver_data even for another task which
>>>> > > opened the tty earlier. The same holds for
>>>> > > "tty_port_tty_set(&hp->port,
>>>> > > NULL);" there. And actually "tty_port_put(&hp->port);" is also
>>>> > > incorrect
>>>> > > for the 2nd task opening the tty.
>>>> > >
> 
> ...
> 
>> These are the traces you get when the issue happens:
>> [  154.212291] hvc_install called for pid: 666
>> [  154.216705] hvc_open called for pid: 666
>> [  154.233657] hvc_open: request_irq failed with rc -22.
>> [  154.238934] hvc_open called for pid: 678
>> [  154.243012] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000000000000c4
>> # hvc_install isn't called for pid: 678 as the file wasn't closed yet.
> 
> Nice. Does the attached help?
Yeah, it looks good. I think it also eliminates the port.count reference
issue discussed on the v2 patch. Are you planning to mainline this?
> 
> I wonder how comes the tty_port_put in hvc_open does not cause a UAF? I
> would say hvc_open fails, tty_port_put is called. It decrements the
> reference taken in hvc_install. So far so good.
> 
> Now, this should happen IMO:
> tty_open
>   -> hvc_open (fails)
>     -> tty_port_put
hvc_console driver defines port->ops->destruct(). Upon tty_port_put(), 
the
tty_port_destructor() calls port->ops->destruct(), rather than 
kfree(port).
>   -> tty_release
>     -> tty_release_struct
>       -> tty_kref_put
>         -> queue_release_one_tty
> SCHEDULED WORKQUEUE
> release_one_tty
>   -> hvc_cleanup
>     -> tty_port_put (should die terrible death now)
Since port is not free'd, I think we should be good.
> 
> What am I missing?
> 
> thanks,

Thank you.
Raghavendra

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-14 23:22                   ` rananta
  2020-05-15  7:30                     ` Greg KH
@ 2020-05-20  9:38                     ` Jiri Slaby
  2020-05-20 13:49                       ` rananta
  1 sibling, 1 reply; 18+ messages in thread
From: Jiri Slaby @ 2020-05-20  9:38 UTC (permalink / raw)
  To: rananta, Greg KH; +Cc: andrew, linuxppc-dev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1897 bytes --]

On 15. 05. 20, 1:22, rananta@codeaurora.org wrote:
> On 2020-05-13 00:04, Greg KH wrote:
>> On Tue, May 12, 2020 at 02:39:50PM -0700, rananta@codeaurora.org wrote:
>>> On 2020-05-12 01:25, Greg KH wrote:
>>> > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
>>> > > commit bdb498c20040616e94b05c31a0ceb3e134b7e829
>>> > > Author: Jiri Slaby <jslaby@suse.cz>
>>> > > Date:   Tue Aug 7 21:48:04 2012 +0200
>>> > >
>>> > >     TTY: hvc_console, add tty install
>>> > >
>>> > > added hvc_install but did not move 'tty->driver_data = NULL;' from
>>> > > hvc_open's fail path to hvc_cleanup.
>>> > >
>>> > > IOW hvc_open now NULLs tty->driver_data even for another task which
>>> > > opened the tty earlier. The same holds for
>>> > > "tty_port_tty_set(&hp->port,
>>> > > NULL);" there. And actually "tty_port_put(&hp->port);" is also
>>> > > incorrect
>>> > > for the 2nd task opening the tty.
>>> > >

...

> These are the traces you get when the issue happens:
> [  154.212291] hvc_install called for pid: 666
> [  154.216705] hvc_open called for pid: 666
> [  154.233657] hvc_open: request_irq failed with rc -22.
> [  154.238934] hvc_open called for pid: 678
> [  154.243012] Unable to handle kernel NULL pointer dereference at
> virtual address 00000000000000c4
> # hvc_install isn't called for pid: 678 as the file wasn't closed yet.

Nice. Does the attached help?

I wonder how comes the tty_port_put in hvc_open does not cause a UAF? I
would say hvc_open fails, tty_port_put is called. It decrements the
reference taken in hvc_install. So far so good.

Now, this should happen IMO:
tty_open
  -> hvc_open (fails)
    -> tty_port_put
  -> tty_release
    -> tty_release_struct
      -> tty_kref_put
        -> queue_release_one_tty
SCHEDULED WORKQUEUE
release_one_tty
  -> hvc_cleanup
    -> tty_port_put (should die terrible death now)

What am I missing?

thanks,
-- 
js
suse labs

[-- Attachment #2: 0001-hvc_console-fix-open.patch --]
[-- Type: text/x-patch, Size: 2381 bytes --]

From d891cdfcbd3b41eb23ddfc8d9e6cbe038ff8fb72 Mon Sep 17 00:00:00 2001
From: Jiri Slaby <jslaby@suse.cz>
Date: Wed, 20 May 2020 11:29:25 +0200
Subject: [PATCH] hvc_console: fix open

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/tty/hvc/hvc_console.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 436cc51c92c3..cdcc64ea2554 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -371,15 +371,14 @@ static int hvc_open(struct tty_struct *tty, struct file * filp)
 	 * tty fields and return the kref reference.
 	 */
 	if (rc) {
-		tty_port_tty_set(&hp->port, NULL);
-		tty->driver_data = NULL;
-		tty_port_put(&hp->port);
 		printk(KERN_ERR "hvc_open: request_irq failed with rc %d.\n", rc);
-	} else
+	} else {
 		/* We are ready... raise DTR/RTS */
 		if (C_BAUD(tty))
 			if (hp->ops->dtr_rts)
 				hp->ops->dtr_rts(hp, 1);
+		tty_port_set_initialized(&hp->port, true);
+	}
 
 	/* Force wakeup of the polling thread */
 	hvc_kick();
@@ -389,22 +388,12 @@ static int hvc_open(struct tty_struct *tty, struct file * filp)
 
 static void hvc_close(struct tty_struct *tty, struct file * filp)
 {
-	struct hvc_struct *hp;
+	struct hvc_struct *hp = tty->driver_data;
 	unsigned long flags;
 
 	if (tty_hung_up_p(filp))
 		return;
 
-	/*
-	 * No driver_data means that this close was issued after a failed
-	 * hvc_open by the tty layer's release_dev() function and we can just
-	 * exit cleanly because the kref reference wasn't made.
-	 */
-	if (!tty->driver_data)
-		return;
-
-	hp = tty->driver_data;
-
 	spin_lock_irqsave(&hp->port.lock, flags);
 
 	if (--hp->port.count == 0) {
@@ -412,6 +401,9 @@ static void hvc_close(struct tty_struct *tty, struct file * filp)
 		/* We are done with the tty pointer now. */
 		tty_port_tty_set(&hp->port, NULL);
 
+		if (!tty_port_initialized(&hp->port))
+			return;
+
 		if (C_HUPCL(tty))
 			if (hp->ops->dtr_rts)
 				hp->ops->dtr_rts(hp, 0);
@@ -428,6 +420,7 @@ static void hvc_close(struct tty_struct *tty, struct file * filp)
 		 * waking periodically to check chars_in_buffer().
 		 */
 		tty_wait_until_sent(tty, HVC_CLOSE_WAIT);
+		tty_port_set_initialized(&hp->port, false);
 	} else {
 		if (hp->port.count < 0)
 			printk(KERN_ERR "hvc_close %X: oops, count is %d\n",
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-15  7:30                     ` Greg KH
@ 2020-05-15 19:21                       ` rananta
  0 siblings, 0 replies; 18+ messages in thread
From: rananta @ 2020-05-15 19:21 UTC (permalink / raw)
  To: Greg KH; +Cc: linuxppc-dev, andrew, Jiri Slaby, linux-kernel

On 2020-05-15 00:30, Greg KH wrote:
> On Thu, May 14, 2020 at 04:22:10PM -0700, rananta@codeaurora.org wrote:
>> On 2020-05-13 00:04, Greg KH wrote:
>> > On Tue, May 12, 2020 at 02:39:50PM -0700, rananta@codeaurora.org wrote:
>> > > On 2020-05-12 01:25, Greg KH wrote:
>> > > > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
>> > > > > On 11. 05. 20, 9:39, Greg KH wrote:
>> > > > > > On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
>> > > > > >> On 2020-05-09 23:48, Greg KH wrote:
>> > > > > >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>> > > > > >>>> On 2020-05-06 02:48, Greg KH wrote:
>> > > > > >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>> > > > > >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
>> > > > > >>>>>> open() on /dev/hvcX. In such a scenario, if the
>> > > > > >>>>>> hp->ops->notifier_add()
>> > > > > >>>>>> callback in the function fails, where it sets the tty->driver_data to
>> > > > > >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
>> > > > > >>>>>> abort.
>> > > > > >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
>> > > > > >>>>>> before
>> > > > > >>>>>> proceeding ahead.
>> > > > > >>>>>>
>> > > > > >>>>>> The issue can be easily reproduced by launching two tasks
>> > > > > >>>>>> simultaneously
>> > > > > >>>>>> that does nothing but open() and close() on /dev/hvcX.
>> > > > > >>>>>> For example:
>> > > > > >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>> > > > > >>>>>>
>> > > > > >>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>> > > > > >>>>>> ---
>> > > > > >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>> > > > > >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>> > > > > >>>>>>
>> > > > > >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
>> > > > > >>>>>> b/drivers/tty/hvc/hvc_console.c
>> > > > > >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
>> > > > > >>>>>> --- a/drivers/tty/hvc/hvc_console.c
>> > > > > >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
>> > > > > >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>> > > > > >>>>>>   */
>> > > > > >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
>> > > > > >>>>>>
>> > > > > >>>>>> +/* Mutex to serialize hvc_open */
>> > > > > >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
>> > > > > >>>>>>  /*
>> > > > > >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
>> > > > > >>>>>> based
>> > > > > >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
>> > > > > >>>>>> to
>> > > > > >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
>> > > > > >>>>>> *driver, struct tty_struct *tty)
>> > > > > >>>>>>   */
>> > > > > >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>> > > > > >>>>>>  {
>> > > > > >>>>>> -	struct hvc_struct *hp = tty->driver_data;
>> > > > > >>>>>> +	struct hvc_struct *hp;
>> > > > > >>>>>>  	unsigned long flags;
>> > > > > >>>>>>  	int rc = 0;
>> > > > > >>>>>>
>> > > > > >>>>>> +	mutex_lock(&hvc_open_mutex);
>> > > > > >>>>>> +
>> > > > > >>>>>> +	hp = tty->driver_data;
>> > > > > >>>>>> +	if (!hp) {
>> > > > > >>>>>> +		rc = -EIO;
>> > > > > >>>>>> +		goto out;
>> > > > > >>>>>> +	}
>> > > > > >>>>>> +
>> > > > > >>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
>> > > > > >>>>>>  	/* Check and then increment for fast path open. */
>> > > > > >>>>>>  	if (hp->port.count++ > 0) {
>> > > > > >>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
>> > > > > >>>>>>  		hvc_kick();
>> > > > > >>>>>> -		return 0;
>> > > > > >>>>>> +		goto out;
>> > > > > >>>>>>  	} /* else count == 0 */
>> > > > > >>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
>> > > > > >>>>>
>> > > > > >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
>> > > > > >>>>> trying to open-code all of this?
>> > > > > >>>>>
>> > > > > >>>>> Keeping a single mutext for open will not protect it from close, it will
>> > > > > >>>>> just slow things down a bit.  There should already be a tty lock held by
>> > > > > >>>>> the tty core for open() to keep it from racing things, right?
>> > > > > >>>> The tty lock should have been held, but not likely across
>> > > > > >>>> ->install() and
>> > > > > >>>> ->open() callbacks, thus resulting in a race between hvc_install() and
>> > > > > >>>> hvc_open(),
>> > > > > >>>
>> > > > > >>> How?  The tty lock is held in install, and should not conflict with
>> > > > > >>> open(), otherwise, we would be seeing this happen in all tty drivers,
>> > > > > >>> right?
>> > > > > >>>
>> > > > > >> Well, I was expecting the same, but IIRC, I see that the open() was being
>> > > > > >> called in parallel for the same device node.
>> > > > > >
>> > > > > > So open and install are happening at the same time?  And the tty_lock()
>> > > > > > does not protect the needed fields from being protected properly?  If
>> > > > > > not, what fields are being touched without the lock?
>> > > > > >
>> > > > > >> Is it expected that the tty core would allow only one thread to
>> > > > > >> access the dev-node, while blocking the other, or is it the client
>> > > > > >> driver's responsibility to handle the exclusiveness?
>> > > > > >
>> > > > > > The tty core should handle this correctly, for things that can mess
>> > > > > > stuff up (like install and open at the same time).  A driver should not
>> > > > > > have to worry about that.
>> > > > > >
>> > > > > >>>> where hvc_install() sets a data and the hvc_open() clears it.
>> > > > > >>>> hvc_open()
>> > > > > >>>> doesn't
>> > > > > >>>> check if the data was set to NULL and proceeds.
>> > > > > >>>
>> > > > > >>> What data is being set that hvc_open is checking?
>> > > > > >> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
>> > > > > >> one of the paths).
>> > > > > >
>> > > > > > I see no use of private_data in drivers/tty/hvc/ so what exactly are you
>> > > > > > referring to?
>> > > > >
>> > > > > He likely means tty->driver_data. And there exactly lays the issue.
>> > > > >
>> > > > > commit bdb498c20040616e94b05c31a0ceb3e134b7e829
>> > > > > Author: Jiri Slaby <jslaby@suse.cz>
>> > > > > Date:   Tue Aug 7 21:48:04 2012 +0200
>> > > > >
>> > > > >     TTY: hvc_console, add tty install
>> > > > >
>> > > > > added hvc_install but did not move 'tty->driver_data = NULL;' from
>> > > > > hvc_open's fail path to hvc_cleanup.
>> > > > >
>> > > > > IOW hvc_open now NULLs tty->driver_data even for another task which
>> > > > > opened the tty earlier. The same holds for
>> > > > > "tty_port_tty_set(&hp->port,
>> > > > > NULL);" there. And actually "tty_port_put(&hp->port);" is also
>> > > > > incorrect
>> > > > > for the 2nd task opening the tty.
>> > > > >
>> > > > > So, a mutex with tty->driver_data check in open is not definitely the
>> > > > > way to go. This mess needs to be sorted out properly. Sure, a good
>> > > > > start
>> > > > > would be a conversion to tty_port_open. Right after dropping "tty:
>> > > > > hvc:
>> > > > > Fix data abort due to race in hvc_open" from tty/tty-next :).
>> > > >
>> > > > I've now reverted this commit so we can start from a "clean" place.
>> > > >
>> > > > > What I *don't* understand is why hp->ops->notifier_add fails, given
>> > > > > the
>> > > > > open does not allow multiple opens anyway?
>> > > >
>> > > > I don't understand that either.  Raghavendra, can you show a real trace
>> > > > for this issue that shows this?
>> > > >
>> > > Let me know if this helps:
>> > >
>> > > [  265.332900] Unable to handle kernel NULL pointer dereference at
>> > > virtual
>> > > address 00000000000000a8
>> > > [  265.332920] Mem abort info:
>> > > [  265.332934]   ESR = 0x96000006
>> > > [  265.332950]   EC = 0x25: DABT (current EL), IL = 32 bits
>> > > [  265.332963]   SET = 0, FnV = 0
>> > > [  265.332975]   EA = 0, S1PTW = 0
>> > > [  265.332985] Data abort info:
>> > > [  265.332997]   ISV = 0, ISS = 0x00000006
>> > > [  265.333008]   CM = 0, WnR = 0
>> > > [  265.333025] user pgtable: 4k pages, 39-bit VAs,
>> > > pgdp=00000001620f3000
>> > > [  265.333038] [00000000000000a8] pgd=00000001620f2003,
>> > > pud=00000001620f2003, pmd=0000000000000000
>> > > [  265.333071] Internal error: Oops: 96000006 [#1] PREEMPT SMP
>> > > [  265.333424] CPU: 1 PID: 5653 Comm: stress-ng-dev Tainted: G S
>> > > W  O
>> > > 5.4.12-g04866e0 #1
>> > > [  265.333458] pstate: 80400085 (Nzcv daIf +PAN -UAO)
>> > > [  265.333499] pc : _raw_spin_lock_irqsave+0x40/0x7c
>> > > [  265.333517] lr : _raw_spin_lock_irqsave+0x38/0x7c
>> > > [  265.333530] sp : ffffffc02436ba40
>> > > [  265.333542] x29: ffffffc02436ba40 x28: 0000000000020800
>> > > [  265.333562] x27: ffffffdfb4046490 x26: ffffff8101b83400
>> > > [  265.333580] x25: ffffff80e163ad00 x24: ffffffdfb45c7798
>> > > [  265.333598] x23: ffffff8101b83668 x22: ffffffdfb4974000
>> > > [  265.333617] x21: 0000000000000001 x20: 00000000000000a8
>> > > [  265.333634] x19: 0000000000000000 x18: ffffff80e0b0d460
>> > > [  265.333652] x17: 0000000000000000 x16: 0000000001000000
>> > > [  265.333670] x15: 0000000001000000 x14: 00000000f8000000
>> > > [  265.333688] x13: 0000000000000000 x12: 0000000000000001
>> > > [  265.333706] x11: 17f5f16765f64600 x10: 17f5f16765f64600
>> > > [  265.333724] x9 : ffffffdfb3444244 x8 : 0000000000000000
>> > > [  265.333741] x7 : 0000000000000000 x6 : 0000000000000000
>> > > [  265.333759] x5 : 0000000000000000 x4 : 0000000000000002
>> > > [  265.333776] x3 : ffffffc02436b9c0 x2 : ffffffdfb40456e0
>> > > [  265.333794] x1 : ffffffc02436b9c0 x0 : ffffffdfb3444244
>> > > [  265.333812] Call trace:
>> > > [  265.333831]  _raw_spin_lock_irqsave+0x40/0x7c
>> > > [  265.333859]  28$61deaf328f140fd7df47c115ec866fa5+0x28/0x174
>> > > [  265.333882]  tty_open$86bd494905ebe22944bf63b711173de3+0x3d0/0x584
>> > > [  265.333921]
>> > > chrdev_open$4083aaa799bca8e0e1e0c8dc1947aa96+0x1c4/0x248
>> > > [  265.333940]  do_dentry_open+0x258/0x3b0
>> > > [  265.333956]  vfs_open+0x2c/0x38
>> > > [  265.333975]  path_openat+0x898/0xedc
>> > > [  265.333991]  do_filp_open+0x78/0x124
>> > > [  265.334006]  do_sys_open+0x13c/0x298
>> > > [  265.334022]  __arm64_sys_openat+0x28/0x34
>> > > [  265.334044]  el0_svc_common+0xb8/0x1b4
>> > > [  265.334059]  el0_svc_handler+0x6c/0x88
>> > > [  265.334079]  el0_svc+0x8/0xc
>> > > [  265.334110] Code: 52800035 97b9fec7 aa1f03e8 f9800291 (885ffe81)
>> > > [  265.334130] ---[ end trace ac90e3099a98e99f ]---
>> > > [  265.334146] Kernel panic - not syncing: Fatal exception
>> >
>> > Hm, do you have a strace showing the close happening at the same time?
>> > What about install()?
>> 
>> Yes, I do see the close happening, at which point the issue is not 
>> seen.
>> It's only seen if the second task came in before this close was 
>> called. For
>> this task, as the file was already opened, it doesn't go through
>> hvc_install.
>> 
>> (I figured adding some logs in the drivers would be helpful than 
>> straces to
>> also include hvc_install)
>> 
>> These are the traces you get when the issue happens:
>> [  154.212291] hvc_install called for pid: 666
>> [  154.216705] hvc_open called for pid: 666
>> [  154.233657] hvc_open: request_irq failed with rc -22.
>> [  154.238934] hvc_open called for pid: 678
>> [  154.243012] Unable to handle kernel NULL pointer dereference at 
>> virtual
>> address 00000000000000c4
>> # hvc_install isn't called for pid: 678 as the file wasn't closed yet.
>> 
>> And these are the traces we get when things are normal:
>> [   76.318861] hvc_install called for pid: 537
>> [   76.323240] hvc_open called for pid: 537
>> [   76.340177] hvc_open: request_irq failed with rc -22.
>> [   76.345384] hvc_close called for pid: 537
>> [   76.349555] hvc_install called for pid: 538
>> [   76.353921] hvc_open called for pid: 538
>> # hvc_open for the second task (pid: 538) seems to be fine here as the 
>> file
>> was closed prior to the second task tried to open the file.
>> 
>> >
>> > And what line in hvc_open() does that offset correspond to?
>> It's the point where it tries to acquire the spinlock for the first 
>> time:
>> spin_lock_irqsave(&hp->port.lock, flags);
> 
> Ah, is this a console?  Maybe this is the same issue that other console
> drivers have been having recently, look at:
> 	https://lore.kernel.org/r/20200428184050.6501-1-john.stultz@linaro.org
> and
> 	https://lore.kernel.org/r/1589019852-21505-2-git-send-email-sagar.kadam@sifive.com
> 
> Is that what you need here too?
> 
No. The spinlock is initialized here it's just that the data-structure 
that holds the lock (hp) is NULL because of the race.
> thanks,
> 
> greg k-h

Thank you.
Raghavendra

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-14 23:22                   ` rananta
@ 2020-05-15  7:30                     ` Greg KH
  2020-05-15 19:21                       ` rananta
  2020-05-20  9:38                     ` Jiri Slaby
  1 sibling, 1 reply; 18+ messages in thread
From: Greg KH @ 2020-05-15  7:30 UTC (permalink / raw)
  To: rananta; +Cc: linuxppc-dev, andrew, Jiri Slaby, linux-kernel

On Thu, May 14, 2020 at 04:22:10PM -0700, rananta@codeaurora.org wrote:
> On 2020-05-13 00:04, Greg KH wrote:
> > On Tue, May 12, 2020 at 02:39:50PM -0700, rananta@codeaurora.org wrote:
> > > On 2020-05-12 01:25, Greg KH wrote:
> > > > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
> > > > > On 11. 05. 20, 9:39, Greg KH wrote:
> > > > > > On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
> > > > > >> On 2020-05-09 23:48, Greg KH wrote:
> > > > > >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
> > > > > >>>> On 2020-05-06 02:48, Greg KH wrote:
> > > > > >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > > > >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
> > > > > >>>>>> open() on /dev/hvcX. In such a scenario, if the
> > > > > >>>>>> hp->ops->notifier_add()
> > > > > >>>>>> callback in the function fails, where it sets the tty->driver_data to
> > > > > >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > >>>>>> abort.
> > > > > >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > >>>>>> before
> > > > > >>>>>> proceeding ahead.
> > > > > >>>>>>
> > > > > >>>>>> The issue can be easily reproduced by launching two tasks
> > > > > >>>>>> simultaneously
> > > > > >>>>>> that does nothing but open() and close() on /dev/hvcX.
> > > > > >>>>>> For example:
> > > > > >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > > >>>>>>
> > > > > >>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > > > >>>>>> ---
> > > > > >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > > > > >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >>>>>>
> > > > > >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> b/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > >>>>>> --- a/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
> > > > > >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > > >>>>>>   */
> > > > > >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > > >>>>>>
> > > > > >>>>>> +/* Mutex to serialize hvc_open */
> > > > > >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
> > > > > >>>>>>  /*
> > > > > >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
> > > > > >>>>>> based
> > > > > >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > >>>>>> to
> > > > > >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > >>>>>> *driver, struct tty_struct *tty)
> > > > > >>>>>>   */
> > > > > >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > > >>>>>>  {
> > > > > >>>>>> -	struct hvc_struct *hp = tty->driver_data;
> > > > > >>>>>> +	struct hvc_struct *hp;
> > > > > >>>>>>  	unsigned long flags;
> > > > > >>>>>>  	int rc = 0;
> > > > > >>>>>>
> > > > > >>>>>> +	mutex_lock(&hvc_open_mutex);
> > > > > >>>>>> +
> > > > > >>>>>> +	hp = tty->driver_data;
> > > > > >>>>>> +	if (!hp) {
> > > > > >>>>>> +		rc = -EIO;
> > > > > >>>>>> +		goto out;
> > > > > >>>>>> +	}
> > > > > >>>>>> +
> > > > > >>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
> > > > > >>>>>>  	/* Check and then increment for fast path open. */
> > > > > >>>>>>  	if (hp->port.count++ > 0) {
> > > > > >>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > > >>>>>>  		hvc_kick();
> > > > > >>>>>> -		return 0;
> > > > > >>>>>> +		goto out;
> > > > > >>>>>>  	} /* else count == 0 */
> > > > > >>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > > >>>>>
> > > > > >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
> > > > > >>>>> trying to open-code all of this?
> > > > > >>>>>
> > > > > >>>>> Keeping a single mutext for open will not protect it from close, it will
> > > > > >>>>> just slow things down a bit.  There should already be a tty lock held by
> > > > > >>>>> the tty core for open() to keep it from racing things, right?
> > > > > >>>> The tty lock should have been held, but not likely across
> > > > > >>>> ->install() and
> > > > > >>>> ->open() callbacks, thus resulting in a race between hvc_install() and
> > > > > >>>> hvc_open(),
> > > > > >>>
> > > > > >>> How?  The tty lock is held in install, and should not conflict with
> > > > > >>> open(), otherwise, we would be seeing this happen in all tty drivers,
> > > > > >>> right?
> > > > > >>>
> > > > > >> Well, I was expecting the same, but IIRC, I see that the open() was being
> > > > > >> called in parallel for the same device node.
> > > > > >
> > > > > > So open and install are happening at the same time?  And the tty_lock()
> > > > > > does not protect the needed fields from being protected properly?  If
> > > > > > not, what fields are being touched without the lock?
> > > > > >
> > > > > >> Is it expected that the tty core would allow only one thread to
> > > > > >> access the dev-node, while blocking the other, or is it the client
> > > > > >> driver's responsibility to handle the exclusiveness?
> > > > > >
> > > > > > The tty core should handle this correctly, for things that can mess
> > > > > > stuff up (like install and open at the same time).  A driver should not
> > > > > > have to worry about that.
> > > > > >
> > > > > >>>> where hvc_install() sets a data and the hvc_open() clears it.
> > > > > >>>> hvc_open()
> > > > > >>>> doesn't
> > > > > >>>> check if the data was set to NULL and proceeds.
> > > > > >>>
> > > > > >>> What data is being set that hvc_open is checking?
> > > > > >> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
> > > > > >> one of the paths).
> > > > > >
> > > > > > I see no use of private_data in drivers/tty/hvc/ so what exactly are you
> > > > > > referring to?
> > > > >
> > > > > He likely means tty->driver_data. And there exactly lays the issue.
> > > > >
> > > > > commit bdb498c20040616e94b05c31a0ceb3e134b7e829
> > > > > Author: Jiri Slaby <jslaby@suse.cz>
> > > > > Date:   Tue Aug 7 21:48:04 2012 +0200
> > > > >
> > > > >     TTY: hvc_console, add tty install
> > > > >
> > > > > added hvc_install but did not move 'tty->driver_data = NULL;' from
> > > > > hvc_open's fail path to hvc_cleanup.
> > > > >
> > > > > IOW hvc_open now NULLs tty->driver_data even for another task which
> > > > > opened the tty earlier. The same holds for
> > > > > "tty_port_tty_set(&hp->port,
> > > > > NULL);" there. And actually "tty_port_put(&hp->port);" is also
> > > > > incorrect
> > > > > for the 2nd task opening the tty.
> > > > >
> > > > > So, a mutex with tty->driver_data check in open is not definitely the
> > > > > way to go. This mess needs to be sorted out properly. Sure, a good
> > > > > start
> > > > > would be a conversion to tty_port_open. Right after dropping "tty:
> > > > > hvc:
> > > > > Fix data abort due to race in hvc_open" from tty/tty-next :).
> > > >
> > > > I've now reverted this commit so we can start from a "clean" place.
> > > >
> > > > > What I *don't* understand is why hp->ops->notifier_add fails, given
> > > > > the
> > > > > open does not allow multiple opens anyway?
> > > >
> > > > I don't understand that either.  Raghavendra, can you show a real trace
> > > > for this issue that shows this?
> > > >
> > > Let me know if this helps:
> > > 
> > > [  265.332900] Unable to handle kernel NULL pointer dereference at
> > > virtual
> > > address 00000000000000a8
> > > [  265.332920] Mem abort info:
> > > [  265.332934]   ESR = 0x96000006
> > > [  265.332950]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > [  265.332963]   SET = 0, FnV = 0
> > > [  265.332975]   EA = 0, S1PTW = 0
> > > [  265.332985] Data abort info:
> > > [  265.332997]   ISV = 0, ISS = 0x00000006
> > > [  265.333008]   CM = 0, WnR = 0
> > > [  265.333025] user pgtable: 4k pages, 39-bit VAs,
> > > pgdp=00000001620f3000
> > > [  265.333038] [00000000000000a8] pgd=00000001620f2003,
> > > pud=00000001620f2003, pmd=0000000000000000
> > > [  265.333071] Internal error: Oops: 96000006 [#1] PREEMPT SMP
> > > [  265.333424] CPU: 1 PID: 5653 Comm: stress-ng-dev Tainted: G S
> > > W  O
> > > 5.4.12-g04866e0 #1
> > > [  265.333458] pstate: 80400085 (Nzcv daIf +PAN -UAO)
> > > [  265.333499] pc : _raw_spin_lock_irqsave+0x40/0x7c
> > > [  265.333517] lr : _raw_spin_lock_irqsave+0x38/0x7c
> > > [  265.333530] sp : ffffffc02436ba40
> > > [  265.333542] x29: ffffffc02436ba40 x28: 0000000000020800
> > > [  265.333562] x27: ffffffdfb4046490 x26: ffffff8101b83400
> > > [  265.333580] x25: ffffff80e163ad00 x24: ffffffdfb45c7798
> > > [  265.333598] x23: ffffff8101b83668 x22: ffffffdfb4974000
> > > [  265.333617] x21: 0000000000000001 x20: 00000000000000a8
> > > [  265.333634] x19: 0000000000000000 x18: ffffff80e0b0d460
> > > [  265.333652] x17: 0000000000000000 x16: 0000000001000000
> > > [  265.333670] x15: 0000000001000000 x14: 00000000f8000000
> > > [  265.333688] x13: 0000000000000000 x12: 0000000000000001
> > > [  265.333706] x11: 17f5f16765f64600 x10: 17f5f16765f64600
> > > [  265.333724] x9 : ffffffdfb3444244 x8 : 0000000000000000
> > > [  265.333741] x7 : 0000000000000000 x6 : 0000000000000000
> > > [  265.333759] x5 : 0000000000000000 x4 : 0000000000000002
> > > [  265.333776] x3 : ffffffc02436b9c0 x2 : ffffffdfb40456e0
> > > [  265.333794] x1 : ffffffc02436b9c0 x0 : ffffffdfb3444244
> > > [  265.333812] Call trace:
> > > [  265.333831]  _raw_spin_lock_irqsave+0x40/0x7c
> > > [  265.333859]  28$61deaf328f140fd7df47c115ec866fa5+0x28/0x174
> > > [  265.333882]  tty_open$86bd494905ebe22944bf63b711173de3+0x3d0/0x584
> > > [  265.333921]
> > > chrdev_open$4083aaa799bca8e0e1e0c8dc1947aa96+0x1c4/0x248
> > > [  265.333940]  do_dentry_open+0x258/0x3b0
> > > [  265.333956]  vfs_open+0x2c/0x38
> > > [  265.333975]  path_openat+0x898/0xedc
> > > [  265.333991]  do_filp_open+0x78/0x124
> > > [  265.334006]  do_sys_open+0x13c/0x298
> > > [  265.334022]  __arm64_sys_openat+0x28/0x34
> > > [  265.334044]  el0_svc_common+0xb8/0x1b4
> > > [  265.334059]  el0_svc_handler+0x6c/0x88
> > > [  265.334079]  el0_svc+0x8/0xc
> > > [  265.334110] Code: 52800035 97b9fec7 aa1f03e8 f9800291 (885ffe81)
> > > [  265.334130] ---[ end trace ac90e3099a98e99f ]---
> > > [  265.334146] Kernel panic - not syncing: Fatal exception
> > 
> > Hm, do you have a strace showing the close happening at the same time?
> > What about install()?
> 
> Yes, I do see the close happening, at which point the issue is not seen.
> It's only seen if the second task came in before this close was called. For
> this task, as the file was already opened, it doesn't go through
> hvc_install.
> 
> (I figured adding some logs in the drivers would be helpful than straces to
> also include hvc_install)
> 
> These are the traces you get when the issue happens:
> [  154.212291] hvc_install called for pid: 666
> [  154.216705] hvc_open called for pid: 666
> [  154.233657] hvc_open: request_irq failed with rc -22.
> [  154.238934] hvc_open called for pid: 678
> [  154.243012] Unable to handle kernel NULL pointer dereference at virtual
> address 00000000000000c4
> # hvc_install isn't called for pid: 678 as the file wasn't closed yet.
> 
> And these are the traces we get when things are normal:
> [   76.318861] hvc_install called for pid: 537
> [   76.323240] hvc_open called for pid: 537
> [   76.340177] hvc_open: request_irq failed with rc -22.
> [   76.345384] hvc_close called for pid: 537
> [   76.349555] hvc_install called for pid: 538
> [   76.353921] hvc_open called for pid: 538
> # hvc_open for the second task (pid: 538) seems to be fine here as the file
> was closed prior to the second task tried to open the file.
> 
> > 
> > And what line in hvc_open() does that offset correspond to?
> It's the point where it tries to acquire the spinlock for the first time:
> spin_lock_irqsave(&hp->port.lock, flags);

Ah, is this a console?  Maybe this is the same issue that other console
drivers have been having recently, look at:
	https://lore.kernel.org/r/20200428184050.6501-1-john.stultz@linaro.org
and
	https://lore.kernel.org/r/1589019852-21505-2-git-send-email-sagar.kadam@sifive.com

Is that what you need here too?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-13  7:04                 ` Greg KH
@ 2020-05-14 23:22                   ` rananta
  2020-05-15  7:30                     ` Greg KH
  2020-05-20  9:38                     ` Jiri Slaby
  0 siblings, 2 replies; 18+ messages in thread
From: rananta @ 2020-05-14 23:22 UTC (permalink / raw)
  To: Greg KH; +Cc: linuxppc-dev, andrew, Jiri Slaby, linux-kernel

On 2020-05-13 00:04, Greg KH wrote:
> On Tue, May 12, 2020 at 02:39:50PM -0700, rananta@codeaurora.org wrote:
>> On 2020-05-12 01:25, Greg KH wrote:
>> > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
>> > > On 11. 05. 20, 9:39, Greg KH wrote:
>> > > > On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
>> > > >> On 2020-05-09 23:48, Greg KH wrote:
>> > > >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>> > > >>>> On 2020-05-06 02:48, Greg KH wrote:
>> > > >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>> > > >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
>> > > >>>>>> open() on /dev/hvcX. In such a scenario, if the
>> > > >>>>>> hp->ops->notifier_add()
>> > > >>>>>> callback in the function fails, where it sets the tty->driver_data to
>> > > >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
>> > > >>>>>> abort.
>> > > >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
>> > > >>>>>> before
>> > > >>>>>> proceeding ahead.
>> > > >>>>>>
>> > > >>>>>> The issue can be easily reproduced by launching two tasks
>> > > >>>>>> simultaneously
>> > > >>>>>> that does nothing but open() and close() on /dev/hvcX.
>> > > >>>>>> For example:
>> > > >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>> > > >>>>>>
>> > > >>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>> > > >>>>>> ---
>> > > >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>> > > >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>> > > >>>>>>
>> > > >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
>> > > >>>>>> b/drivers/tty/hvc/hvc_console.c
>> > > >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
>> > > >>>>>> --- a/drivers/tty/hvc/hvc_console.c
>> > > >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
>> > > >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>> > > >>>>>>   */
>> > > >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
>> > > >>>>>>
>> > > >>>>>> +/* Mutex to serialize hvc_open */
>> > > >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
>> > > >>>>>>  /*
>> > > >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
>> > > >>>>>> based
>> > > >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
>> > > >>>>>> to
>> > > >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
>> > > >>>>>> *driver, struct tty_struct *tty)
>> > > >>>>>>   */
>> > > >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>> > > >>>>>>  {
>> > > >>>>>> -	struct hvc_struct *hp = tty->driver_data;
>> > > >>>>>> +	struct hvc_struct *hp;
>> > > >>>>>>  	unsigned long flags;
>> > > >>>>>>  	int rc = 0;
>> > > >>>>>>
>> > > >>>>>> +	mutex_lock(&hvc_open_mutex);
>> > > >>>>>> +
>> > > >>>>>> +	hp = tty->driver_data;
>> > > >>>>>> +	if (!hp) {
>> > > >>>>>> +		rc = -EIO;
>> > > >>>>>> +		goto out;
>> > > >>>>>> +	}
>> > > >>>>>> +
>> > > >>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
>> > > >>>>>>  	/* Check and then increment for fast path open. */
>> > > >>>>>>  	if (hp->port.count++ > 0) {
>> > > >>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
>> > > >>>>>>  		hvc_kick();
>> > > >>>>>> -		return 0;
>> > > >>>>>> +		goto out;
>> > > >>>>>>  	} /* else count == 0 */
>> > > >>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
>> > > >>>>>
>> > > >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
>> > > >>>>> trying to open-code all of this?
>> > > >>>>>
>> > > >>>>> Keeping a single mutext for open will not protect it from close, it will
>> > > >>>>> just slow things down a bit.  There should already be a tty lock held by
>> > > >>>>> the tty core for open() to keep it from racing things, right?
>> > > >>>> The tty lock should have been held, but not likely across
>> > > >>>> ->install() and
>> > > >>>> ->open() callbacks, thus resulting in a race between hvc_install() and
>> > > >>>> hvc_open(),
>> > > >>>
>> > > >>> How?  The tty lock is held in install, and should not conflict with
>> > > >>> open(), otherwise, we would be seeing this happen in all tty drivers,
>> > > >>> right?
>> > > >>>
>> > > >> Well, I was expecting the same, but IIRC, I see that the open() was being
>> > > >> called in parallel for the same device node.
>> > > >
>> > > > So open and install are happening at the same time?  And the tty_lock()
>> > > > does not protect the needed fields from being protected properly?  If
>> > > > not, what fields are being touched without the lock?
>> > > >
>> > > >> Is it expected that the tty core would allow only one thread to
>> > > >> access the dev-node, while blocking the other, or is it the client
>> > > >> driver's responsibility to handle the exclusiveness?
>> > > >
>> > > > The tty core should handle this correctly, for things that can mess
>> > > > stuff up (like install and open at the same time).  A driver should not
>> > > > have to worry about that.
>> > > >
>> > > >>>> where hvc_install() sets a data and the hvc_open() clears it.
>> > > >>>> hvc_open()
>> > > >>>> doesn't
>> > > >>>> check if the data was set to NULL and proceeds.
>> > > >>>
>> > > >>> What data is being set that hvc_open is checking?
>> > > >> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
>> > > >> one of the paths).
>> > > >
>> > > > I see no use of private_data in drivers/tty/hvc/ so what exactly are you
>> > > > referring to?
>> > >
>> > > He likely means tty->driver_data. And there exactly lays the issue.
>> > >
>> > > commit bdb498c20040616e94b05c31a0ceb3e134b7e829
>> > > Author: Jiri Slaby <jslaby@suse.cz>
>> > > Date:   Tue Aug 7 21:48:04 2012 +0200
>> > >
>> > >     TTY: hvc_console, add tty install
>> > >
>> > > added hvc_install but did not move 'tty->driver_data = NULL;' from
>> > > hvc_open's fail path to hvc_cleanup.
>> > >
>> > > IOW hvc_open now NULLs tty->driver_data even for another task which
>> > > opened the tty earlier. The same holds for
>> > > "tty_port_tty_set(&hp->port,
>> > > NULL);" there. And actually "tty_port_put(&hp->port);" is also
>> > > incorrect
>> > > for the 2nd task opening the tty.
>> > >
>> > > So, a mutex with tty->driver_data check in open is not definitely the
>> > > way to go. This mess needs to be sorted out properly. Sure, a good
>> > > start
>> > > would be a conversion to tty_port_open. Right after dropping "tty:
>> > > hvc:
>> > > Fix data abort due to race in hvc_open" from tty/tty-next :).
>> >
>> > I've now reverted this commit so we can start from a "clean" place.
>> >
>> > > What I *don't* understand is why hp->ops->notifier_add fails, given
>> > > the
>> > > open does not allow multiple opens anyway?
>> >
>> > I don't understand that either.  Raghavendra, can you show a real trace
>> > for this issue that shows this?
>> >
>> Let me know if this helps:
>> 
>> [  265.332900] Unable to handle kernel NULL pointer dereference at 
>> virtual
>> address 00000000000000a8
>> [  265.332920] Mem abort info:
>> [  265.332934]   ESR = 0x96000006
>> [  265.332950]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [  265.332963]   SET = 0, FnV = 0
>> [  265.332975]   EA = 0, S1PTW = 0
>> [  265.332985] Data abort info:
>> [  265.332997]   ISV = 0, ISS = 0x00000006
>> [  265.333008]   CM = 0, WnR = 0
>> [  265.333025] user pgtable: 4k pages, 39-bit VAs, 
>> pgdp=00000001620f3000
>> [  265.333038] [00000000000000a8] pgd=00000001620f2003,
>> pud=00000001620f2003, pmd=0000000000000000
>> [  265.333071] Internal error: Oops: 96000006 [#1] PREEMPT SMP
>> [  265.333424] CPU: 1 PID: 5653 Comm: stress-ng-dev Tainted: G S      
>> W  O
>> 5.4.12-g04866e0 #1
>> [  265.333458] pstate: 80400085 (Nzcv daIf +PAN -UAO)
>> [  265.333499] pc : _raw_spin_lock_irqsave+0x40/0x7c
>> [  265.333517] lr : _raw_spin_lock_irqsave+0x38/0x7c
>> [  265.333530] sp : ffffffc02436ba40
>> [  265.333542] x29: ffffffc02436ba40 x28: 0000000000020800
>> [  265.333562] x27: ffffffdfb4046490 x26: ffffff8101b83400
>> [  265.333580] x25: ffffff80e163ad00 x24: ffffffdfb45c7798
>> [  265.333598] x23: ffffff8101b83668 x22: ffffffdfb4974000
>> [  265.333617] x21: 0000000000000001 x20: 00000000000000a8
>> [  265.333634] x19: 0000000000000000 x18: ffffff80e0b0d460
>> [  265.333652] x17: 0000000000000000 x16: 0000000001000000
>> [  265.333670] x15: 0000000001000000 x14: 00000000f8000000
>> [  265.333688] x13: 0000000000000000 x12: 0000000000000001
>> [  265.333706] x11: 17f5f16765f64600 x10: 17f5f16765f64600
>> [  265.333724] x9 : ffffffdfb3444244 x8 : 0000000000000000
>> [  265.333741] x7 : 0000000000000000 x6 : 0000000000000000
>> [  265.333759] x5 : 0000000000000000 x4 : 0000000000000002
>> [  265.333776] x3 : ffffffc02436b9c0 x2 : ffffffdfb40456e0
>> [  265.333794] x1 : ffffffc02436b9c0 x0 : ffffffdfb3444244
>> [  265.333812] Call trace:
>> [  265.333831]  _raw_spin_lock_irqsave+0x40/0x7c
>> [  265.333859]  28$61deaf328f140fd7df47c115ec866fa5+0x28/0x174
>> [  265.333882]  tty_open$86bd494905ebe22944bf63b711173de3+0x3d0/0x584
>> [  265.333921]  
>> chrdev_open$4083aaa799bca8e0e1e0c8dc1947aa96+0x1c4/0x248
>> [  265.333940]  do_dentry_open+0x258/0x3b0
>> [  265.333956]  vfs_open+0x2c/0x38
>> [  265.333975]  path_openat+0x898/0xedc
>> [  265.333991]  do_filp_open+0x78/0x124
>> [  265.334006]  do_sys_open+0x13c/0x298
>> [  265.334022]  __arm64_sys_openat+0x28/0x34
>> [  265.334044]  el0_svc_common+0xb8/0x1b4
>> [  265.334059]  el0_svc_handler+0x6c/0x88
>> [  265.334079]  el0_svc+0x8/0xc
>> [  265.334110] Code: 52800035 97b9fec7 aa1f03e8 f9800291 (885ffe81)
>> [  265.334130] ---[ end trace ac90e3099a98e99f ]---
>> [  265.334146] Kernel panic - not syncing: Fatal exception
> 
> Hm, do you have a strace showing the close happening at the same time?
> What about install()?

Yes, I do see the close happening, at which point the issue is not seen.
It's only seen if the second task came in before this close was called. 
For this task, as the file was already opened, it doesn't go through 
hvc_install.

(I figured adding some logs in the drivers would be helpful than straces 
to also include hvc_install)

These are the traces you get when the issue happens:
[  154.212291] hvc_install called for pid: 666
[  154.216705] hvc_open called for pid: 666
[  154.233657] hvc_open: request_irq failed with rc -22.
[  154.238934] hvc_open called for pid: 678
[  154.243012] Unable to handle kernel NULL pointer dereference at 
virtual address 00000000000000c4
# hvc_install isn't called for pid: 678 as the file wasn't closed yet.

And these are the traces we get when things are normal:
[   76.318861] hvc_install called for pid: 537
[   76.323240] hvc_open called for pid: 537
[   76.340177] hvc_open: request_irq failed with rc -22.
[   76.345384] hvc_close called for pid: 537
[   76.349555] hvc_install called for pid: 538
[   76.353921] hvc_open called for pid: 538
# hvc_open for the second task (pid: 538) seems to be fine here as the 
file was closed prior to the second task tried to open the file.

> 
> And what line in hvc_open() does that offset correspond to?
It's the point where it tries to acquire the spinlock for the first 
time: spin_lock_irqsave(&hp->port.lock, flags);

> 
> thanks,
> 
> greg k-h

Thank you.
Raghavendra

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-12 21:39               ` rananta
@ 2020-05-13  7:04                 ` Greg KH
  2020-05-14 23:22                   ` rananta
  0 siblings, 1 reply; 18+ messages in thread
From: Greg KH @ 2020-05-13  7:04 UTC (permalink / raw)
  To: rananta; +Cc: linuxppc-dev, andrew, Jiri Slaby, linux-kernel

On Tue, May 12, 2020 at 02:39:50PM -0700, rananta@codeaurora.org wrote:
> On 2020-05-12 01:25, Greg KH wrote:
> > On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
> > > On 11. 05. 20, 9:39, Greg KH wrote:
> > > > On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
> > > >> On 2020-05-09 23:48, Greg KH wrote:
> > > >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
> > > >>>> On 2020-05-06 02:48, Greg KH wrote:
> > > >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
> > > >>>>>> open() on /dev/hvcX. In such a scenario, if the
> > > >>>>>> hp->ops->notifier_add()
> > > >>>>>> callback in the function fails, where it sets the tty->driver_data to
> > > >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > >>>>>> abort.
> > > >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
> > > >>>>>> before
> > > >>>>>> proceeding ahead.
> > > >>>>>>
> > > >>>>>> The issue can be easily reproduced by launching two tasks
> > > >>>>>> simultaneously
> > > >>>>>> that does nothing but open() and close() on /dev/hvcX.
> > > >>>>>> For example:
> > > >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > >>>>>>
> > > >>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > >>>>>> ---
> > > >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > > >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
> > > >>>>>>
> > > >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
> > > >>>>>> b/drivers/tty/hvc/hvc_console.c
> > > >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
> > > >>>>>> --- a/drivers/tty/hvc/hvc_console.c
> > > >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
> > > >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > >>>>>>   */
> > > >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
> > > >>>>>>
> > > >>>>>> +/* Mutex to serialize hvc_open */
> > > >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
> > > >>>>>>  /*
> > > >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
> > > >>>>>> based
> > > >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
> > > >>>>>> to
> > > >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > >>>>>> *driver, struct tty_struct *tty)
> > > >>>>>>   */
> > > >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > >>>>>>  {
> > > >>>>>> -	struct hvc_struct *hp = tty->driver_data;
> > > >>>>>> +	struct hvc_struct *hp;
> > > >>>>>>  	unsigned long flags;
> > > >>>>>>  	int rc = 0;
> > > >>>>>>
> > > >>>>>> +	mutex_lock(&hvc_open_mutex);
> > > >>>>>> +
> > > >>>>>> +	hp = tty->driver_data;
> > > >>>>>> +	if (!hp) {
> > > >>>>>> +		rc = -EIO;
> > > >>>>>> +		goto out;
> > > >>>>>> +	}
> > > >>>>>> +
> > > >>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
> > > >>>>>>  	/* Check and then increment for fast path open. */
> > > >>>>>>  	if (hp->port.count++ > 0) {
> > > >>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > > >>>>>>  		hvc_kick();
> > > >>>>>> -		return 0;
> > > >>>>>> +		goto out;
> > > >>>>>>  	} /* else count == 0 */
> > > >>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > > >>>>>
> > > >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
> > > >>>>> trying to open-code all of this?
> > > >>>>>
> > > >>>>> Keeping a single mutext for open will not protect it from close, it will
> > > >>>>> just slow things down a bit.  There should already be a tty lock held by
> > > >>>>> the tty core for open() to keep it from racing things, right?
> > > >>>> The tty lock should have been held, but not likely across
> > > >>>> ->install() and
> > > >>>> ->open() callbacks, thus resulting in a race between hvc_install() and
> > > >>>> hvc_open(),
> > > >>>
> > > >>> How?  The tty lock is held in install, and should not conflict with
> > > >>> open(), otherwise, we would be seeing this happen in all tty drivers,
> > > >>> right?
> > > >>>
> > > >> Well, I was expecting the same, but IIRC, I see that the open() was being
> > > >> called in parallel for the same device node.
> > > >
> > > > So open and install are happening at the same time?  And the tty_lock()
> > > > does not protect the needed fields from being protected properly?  If
> > > > not, what fields are being touched without the lock?
> > > >
> > > >> Is it expected that the tty core would allow only one thread to
> > > >> access the dev-node, while blocking the other, or is it the client
> > > >> driver's responsibility to handle the exclusiveness?
> > > >
> > > > The tty core should handle this correctly, for things that can mess
> > > > stuff up (like install and open at the same time).  A driver should not
> > > > have to worry about that.
> > > >
> > > >>>> where hvc_install() sets a data and the hvc_open() clears it.
> > > >>>> hvc_open()
> > > >>>> doesn't
> > > >>>> check if the data was set to NULL and proceeds.
> > > >>>
> > > >>> What data is being set that hvc_open is checking?
> > > >> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
> > > >> one of the paths).
> > > >
> > > > I see no use of private_data in drivers/tty/hvc/ so what exactly are you
> > > > referring to?
> > > 
> > > He likely means tty->driver_data. And there exactly lays the issue.
> > > 
> > > commit bdb498c20040616e94b05c31a0ceb3e134b7e829
> > > Author: Jiri Slaby <jslaby@suse.cz>
> > > Date:   Tue Aug 7 21:48:04 2012 +0200
> > > 
> > >     TTY: hvc_console, add tty install
> > > 
> > > added hvc_install but did not move 'tty->driver_data = NULL;' from
> > > hvc_open's fail path to hvc_cleanup.
> > > 
> > > IOW hvc_open now NULLs tty->driver_data even for another task which
> > > opened the tty earlier. The same holds for
> > > "tty_port_tty_set(&hp->port,
> > > NULL);" there. And actually "tty_port_put(&hp->port);" is also
> > > incorrect
> > > for the 2nd task opening the tty.
> > > 
> > > So, a mutex with tty->driver_data check in open is not definitely the
> > > way to go. This mess needs to be sorted out properly. Sure, a good
> > > start
> > > would be a conversion to tty_port_open. Right after dropping "tty:
> > > hvc:
> > > Fix data abort due to race in hvc_open" from tty/tty-next :).
> > 
> > I've now reverted this commit so we can start from a "clean" place.
> > 
> > > What I *don't* understand is why hp->ops->notifier_add fails, given
> > > the
> > > open does not allow multiple opens anyway?
> > 
> > I don't understand that either.  Raghavendra, can you show a real trace
> > for this issue that shows this?
> > 
> Let me know if this helps:
> 
> [  265.332900] Unable to handle kernel NULL pointer dereference at virtual
> address 00000000000000a8
> [  265.332920] Mem abort info:
> [  265.332934]   ESR = 0x96000006
> [  265.332950]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  265.332963]   SET = 0, FnV = 0
> [  265.332975]   EA = 0, S1PTW = 0
> [  265.332985] Data abort info:
> [  265.332997]   ISV = 0, ISS = 0x00000006
> [  265.333008]   CM = 0, WnR = 0
> [  265.333025] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001620f3000
> [  265.333038] [00000000000000a8] pgd=00000001620f2003,
> pud=00000001620f2003, pmd=0000000000000000
> [  265.333071] Internal error: Oops: 96000006 [#1] PREEMPT SMP
> [  265.333424] CPU: 1 PID: 5653 Comm: stress-ng-dev Tainted: G S      W  O
> 5.4.12-g04866e0 #1
> [  265.333458] pstate: 80400085 (Nzcv daIf +PAN -UAO)
> [  265.333499] pc : _raw_spin_lock_irqsave+0x40/0x7c
> [  265.333517] lr : _raw_spin_lock_irqsave+0x38/0x7c
> [  265.333530] sp : ffffffc02436ba40
> [  265.333542] x29: ffffffc02436ba40 x28: 0000000000020800
> [  265.333562] x27: ffffffdfb4046490 x26: ffffff8101b83400
> [  265.333580] x25: ffffff80e163ad00 x24: ffffffdfb45c7798
> [  265.333598] x23: ffffff8101b83668 x22: ffffffdfb4974000
> [  265.333617] x21: 0000000000000001 x20: 00000000000000a8
> [  265.333634] x19: 0000000000000000 x18: ffffff80e0b0d460
> [  265.333652] x17: 0000000000000000 x16: 0000000001000000
> [  265.333670] x15: 0000000001000000 x14: 00000000f8000000
> [  265.333688] x13: 0000000000000000 x12: 0000000000000001
> [  265.333706] x11: 17f5f16765f64600 x10: 17f5f16765f64600
> [  265.333724] x9 : ffffffdfb3444244 x8 : 0000000000000000
> [  265.333741] x7 : 0000000000000000 x6 : 0000000000000000
> [  265.333759] x5 : 0000000000000000 x4 : 0000000000000002
> [  265.333776] x3 : ffffffc02436b9c0 x2 : ffffffdfb40456e0
> [  265.333794] x1 : ffffffc02436b9c0 x0 : ffffffdfb3444244
> [  265.333812] Call trace:
> [  265.333831]  _raw_spin_lock_irqsave+0x40/0x7c
> [  265.333859]  hvc_open$61deaf328f140fd7df47c115ec866fa5+0x28/0x174
> [  265.333882]  tty_open$86bd494905ebe22944bf63b711173de3+0x3d0/0x584
> [  265.333921]  chrdev_open$4083aaa799bca8e0e1e0c8dc1947aa96+0x1c4/0x248
> [  265.333940]  do_dentry_open+0x258/0x3b0
> [  265.333956]  vfs_open+0x2c/0x38
> [  265.333975]  path_openat+0x898/0xedc
> [  265.333991]  do_filp_open+0x78/0x124
> [  265.334006]  do_sys_open+0x13c/0x298
> [  265.334022]  __arm64_sys_openat+0x28/0x34
> [  265.334044]  el0_svc_common+0xb8/0x1b4
> [  265.334059]  el0_svc_handler+0x6c/0x88
> [  265.334079]  el0_svc+0x8/0xc
> [  265.334110] Code: 52800035 97b9fec7 aa1f03e8 f9800291 (885ffe81)
> [  265.334130] ---[ end trace ac90e3099a98e99f ]---
> [  265.334146] Kernel panic - not syncing: Fatal exception

Hm, do you have a strace showing the close happening at the same time?
What about install()?

And what line in hvc_open() does that offset correspond to?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-12  8:25             ` Greg KH
@ 2020-05-12 21:39               ` rananta
  2020-05-13  7:04                 ` Greg KH
  0 siblings, 1 reply; 18+ messages in thread
From: rananta @ 2020-05-12 21:39 UTC (permalink / raw)
  To: Greg KH; +Cc: linuxppc-dev, andrew, Jiri Slaby, linux-kernel

On 2020-05-12 01:25, Greg KH wrote:
> On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
>> On 11. 05. 20, 9:39, Greg KH wrote:
>> > On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
>> >> On 2020-05-09 23:48, Greg KH wrote:
>> >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>> >>>> On 2020-05-06 02:48, Greg KH wrote:
>> >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>> >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
>> >>>>>> open() on /dev/hvcX. In such a scenario, if the
>> >>>>>> hp->ops->notifier_add()
>> >>>>>> callback in the function fails, where it sets the tty->driver_data to
>> >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
>> >>>>>> abort.
>> >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
>> >>>>>> before
>> >>>>>> proceeding ahead.
>> >>>>>>
>> >>>>>> The issue can be easily reproduced by launching two tasks
>> >>>>>> simultaneously
>> >>>>>> that does nothing but open() and close() on /dev/hvcX.
>> >>>>>> For example:
>> >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>> >>>>>>
>> >>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>> >>>>>> ---
>> >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>> >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>> >>>>>>
>> >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
>> >>>>>> b/drivers/tty/hvc/hvc_console.c
>> >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
>> >>>>>> --- a/drivers/tty/hvc/hvc_console.c
>> >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
>> >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>> >>>>>>   */
>> >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
>> >>>>>>
>> >>>>>> +/* Mutex to serialize hvc_open */
>> >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
>> >>>>>>  /*
>> >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
>> >>>>>> based
>> >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
>> >>>>>> to
>> >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
>> >>>>>> *driver, struct tty_struct *tty)
>> >>>>>>   */
>> >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>> >>>>>>  {
>> >>>>>> -	struct hvc_struct *hp = tty->driver_data;
>> >>>>>> +	struct hvc_struct *hp;
>> >>>>>>  	unsigned long flags;
>> >>>>>>  	int rc = 0;
>> >>>>>>
>> >>>>>> +	mutex_lock(&hvc_open_mutex);
>> >>>>>> +
>> >>>>>> +	hp = tty->driver_data;
>> >>>>>> +	if (!hp) {
>> >>>>>> +		rc = -EIO;
>> >>>>>> +		goto out;
>> >>>>>> +	}
>> >>>>>> +
>> >>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
>> >>>>>>  	/* Check and then increment for fast path open. */
>> >>>>>>  	if (hp->port.count++ > 0) {
>> >>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
>> >>>>>>  		hvc_kick();
>> >>>>>> -		return 0;
>> >>>>>> +		goto out;
>> >>>>>>  	} /* else count == 0 */
>> >>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
>> >>>>>
>> >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
>> >>>>> trying to open-code all of this?
>> >>>>>
>> >>>>> Keeping a single mutext for open will not protect it from close, it will
>> >>>>> just slow things down a bit.  There should already be a tty lock held by
>> >>>>> the tty core for open() to keep it from racing things, right?
>> >>>> The tty lock should have been held, but not likely across
>> >>>> ->install() and
>> >>>> ->open() callbacks, thus resulting in a race between hvc_install() and
>> >>>> hvc_open(),
>> >>>
>> >>> How?  The tty lock is held in install, and should not conflict with
>> >>> open(), otherwise, we would be seeing this happen in all tty drivers,
>> >>> right?
>> >>>
>> >> Well, I was expecting the same, but IIRC, I see that the open() was being
>> >> called in parallel for the same device node.
>> >
>> > So open and install are happening at the same time?  And the tty_lock()
>> > does not protect the needed fields from being protected properly?  If
>> > not, what fields are being touched without the lock?
>> >
>> >> Is it expected that the tty core would allow only one thread to
>> >> access the dev-node, while blocking the other, or is it the client
>> >> driver's responsibility to handle the exclusiveness?
>> >
>> > The tty core should handle this correctly, for things that can mess
>> > stuff up (like install and open at the same time).  A driver should not
>> > have to worry about that.
>> >
>> >>>> where hvc_install() sets a data and the hvc_open() clears it.
>> >>>> hvc_open()
>> >>>> doesn't
>> >>>> check if the data was set to NULL and proceeds.
>> >>>
>> >>> What data is being set that hvc_open is checking?
>> >> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
>> >> one of the paths).
>> >
>> > I see no use of private_data in drivers/tty/hvc/ so what exactly are you
>> > referring to?
>> 
>> He likely means tty->driver_data. And there exactly lays the issue.
>> 
>> commit bdb498c20040616e94b05c31a0ceb3e134b7e829
>> Author: Jiri Slaby <jslaby@suse.cz>
>> Date:   Tue Aug 7 21:48:04 2012 +0200
>> 
>>     TTY: hvc_console, add tty install
>> 
>> added hvc_install but did not move 'tty->driver_data = NULL;' from
>> hvc_open's fail path to hvc_cleanup.
>> 
>> IOW hvc_open now NULLs tty->driver_data even for another task which
>> opened the tty earlier. The same holds for 
>> "tty_port_tty_set(&hp->port,
>> NULL);" there. And actually "tty_port_put(&hp->port);" is also 
>> incorrect
>> for the 2nd task opening the tty.
>> 
>> So, a mutex with tty->driver_data check in open is not definitely the
>> way to go. This mess needs to be sorted out properly. Sure, a good 
>> start
>> would be a conversion to tty_port_open. Right after dropping "tty: 
>> hvc:
>> Fix data abort due to race in hvc_open" from tty/tty-next :).
> 
> I've now reverted this commit so we can start from a "clean" place.
> 
>> What I *don't* understand is why hp->ops->notifier_add fails, given 
>> the
>> open does not allow multiple opens anyway?
> 
> I don't understand that either.  Raghavendra, can you show a real trace
> for this issue that shows this?
> 
Let me know if this helps:

[  265.332900] Unable to handle kernel NULL pointer dereference at 
virtual address 00000000000000a8
[  265.332920] Mem abort info:
[  265.332934]   ESR = 0x96000006
[  265.332950]   EC = 0x25: DABT (current EL), IL = 32 bits
[  265.332963]   SET = 0, FnV = 0
[  265.332975]   EA = 0, S1PTW = 0
[  265.332985] Data abort info:
[  265.332997]   ISV = 0, ISS = 0x00000006
[  265.333008]   CM = 0, WnR = 0
[  265.333025] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001620f3000
[  265.333038] [00000000000000a8] pgd=00000001620f2003, 
pud=00000001620f2003, pmd=0000000000000000
[  265.333071] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[  265.333424] CPU: 1 PID: 5653 Comm: stress-ng-dev Tainted: G S      W  
O      5.4.12-g04866e0 #1
[  265.333458] pstate: 80400085 (Nzcv daIf +PAN -UAO)
[  265.333499] pc : _raw_spin_lock_irqsave+0x40/0x7c
[  265.333517] lr : _raw_spin_lock_irqsave+0x38/0x7c
[  265.333530] sp : ffffffc02436ba40
[  265.333542] x29: ffffffc02436ba40 x28: 0000000000020800
[  265.333562] x27: ffffffdfb4046490 x26: ffffff8101b83400
[  265.333580] x25: ffffff80e163ad00 x24: ffffffdfb45c7798
[  265.333598] x23: ffffff8101b83668 x22: ffffffdfb4974000
[  265.333617] x21: 0000000000000001 x20: 00000000000000a8
[  265.333634] x19: 0000000000000000 x18: ffffff80e0b0d460
[  265.333652] x17: 0000000000000000 x16: 0000000001000000
[  265.333670] x15: 0000000001000000 x14: 00000000f8000000
[  265.333688] x13: 0000000000000000 x12: 0000000000000001
[  265.333706] x11: 17f5f16765f64600 x10: 17f5f16765f64600
[  265.333724] x9 : ffffffdfb3444244 x8 : 0000000000000000
[  265.333741] x7 : 0000000000000000 x6 : 0000000000000000
[  265.333759] x5 : 0000000000000000 x4 : 0000000000000002
[  265.333776] x3 : ffffffc02436b9c0 x2 : ffffffdfb40456e0
[  265.333794] x1 : ffffffc02436b9c0 x0 : ffffffdfb3444244
[  265.333812] Call trace:
[  265.333831]  _raw_spin_lock_irqsave+0x40/0x7c
[  265.333859]  hvc_open$61deaf328f140fd7df47c115ec866fa5+0x28/0x174
[  265.333882]  tty_open$86bd494905ebe22944bf63b711173de3+0x3d0/0x584
[  265.333921]  chrdev_open$4083aaa799bca8e0e1e0c8dc1947aa96+0x1c4/0x248
[  265.333940]  do_dentry_open+0x258/0x3b0
[  265.333956]  vfs_open+0x2c/0x38
[  265.333975]  path_openat+0x898/0xedc
[  265.333991]  do_filp_open+0x78/0x124
[  265.334006]  do_sys_open+0x13c/0x298
[  265.334022]  __arm64_sys_openat+0x28/0x34
[  265.334044]  el0_svc_common+0xb8/0x1b4
[  265.334059]  el0_svc_handler+0x6c/0x88
[  265.334079]  el0_svc+0x8/0xc
[  265.334110] Code: 52800035 97b9fec7 aa1f03e8 f9800291 (885ffe81)
[  265.334130] ---[ end trace ac90e3099a98e99f ]---
[  265.334146] Kernel panic - not syncing: Fatal exception
> thanks,
> 
> greg k-h

Thank you.
Raghavendra

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-12  7:22           ` Jiri Slaby
@ 2020-05-12  8:25             ` Greg KH
  2020-05-12 21:39               ` rananta
  0 siblings, 1 reply; 18+ messages in thread
From: Greg KH @ 2020-05-12  8:25 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: rananta, andrew, linuxppc-dev, linux-kernel

On Tue, May 12, 2020 at 09:22:15AM +0200, Jiri Slaby wrote:
> On 11. 05. 20, 9:39, Greg KH wrote:
> > On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
> >> On 2020-05-09 23:48, Greg KH wrote:
> >>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
> >>>> On 2020-05-06 02:48, Greg KH wrote:
> >>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> >>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
> >>>>>> open() on /dev/hvcX. In such a scenario, if the
> >>>>>> hp->ops->notifier_add()
> >>>>>> callback in the function fails, where it sets the tty->driver_data to
> >>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
> >>>>>> abort.
> >>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
> >>>>>> before
> >>>>>> proceeding ahead.
> >>>>>>
> >>>>>> The issue can be easily reproduced by launching two tasks
> >>>>>> simultaneously
> >>>>>> that does nothing but open() and close() on /dev/hvcX.
> >>>>>> For example:
> >>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> >>>>>>
> >>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> >>>>>> ---
> >>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> >>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
> >>>>>> b/drivers/tty/hvc/hvc_console.c
> >>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
> >>>>>> --- a/drivers/tty/hvc/hvc_console.c
> >>>>>> +++ b/drivers/tty/hvc/hvc_console.c
> >>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> >>>>>>   */
> >>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
> >>>>>>
> >>>>>> +/* Mutex to serialize hvc_open */
> >>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
> >>>>>>  /*
> >>>>>>   * This value is used to assign a tty->index value to a hvc_struct
> >>>>>> based
> >>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
> >>>>>> to
> >>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> >>>>>> *driver, struct tty_struct *tty)
> >>>>>>   */
> >>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
> >>>>>>  {
> >>>>>> -	struct hvc_struct *hp = tty->driver_data;
> >>>>>> +	struct hvc_struct *hp;
> >>>>>>  	unsigned long flags;
> >>>>>>  	int rc = 0;
> >>>>>>
> >>>>>> +	mutex_lock(&hvc_open_mutex);
> >>>>>> +
> >>>>>> +	hp = tty->driver_data;
> >>>>>> +	if (!hp) {
> >>>>>> +		rc = -EIO;
> >>>>>> +		goto out;
> >>>>>> +	}
> >>>>>> +
> >>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
> >>>>>>  	/* Check and then increment for fast path open. */
> >>>>>>  	if (hp->port.count++ > 0) {
> >>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
> >>>>>>  		hvc_kick();
> >>>>>> -		return 0;
> >>>>>> +		goto out;
> >>>>>>  	} /* else count == 0 */
> >>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
> >>>>>
> >>>>> Wait, why isn't this driver just calling tty_port_open() instead of
> >>>>> trying to open-code all of this?
> >>>>>
> >>>>> Keeping a single mutext for open will not protect it from close, it will
> >>>>> just slow things down a bit.  There should already be a tty lock held by
> >>>>> the tty core for open() to keep it from racing things, right?
> >>>> The tty lock should have been held, but not likely across
> >>>> ->install() and
> >>>> ->open() callbacks, thus resulting in a race between hvc_install() and
> >>>> hvc_open(),
> >>>
> >>> How?  The tty lock is held in install, and should not conflict with
> >>> open(), otherwise, we would be seeing this happen in all tty drivers,
> >>> right?
> >>>
> >> Well, I was expecting the same, but IIRC, I see that the open() was being
> >> called in parallel for the same device node.
> > 
> > So open and install are happening at the same time?  And the tty_lock()
> > does not protect the needed fields from being protected properly?  If
> > not, what fields are being touched without the lock?
> > 
> >> Is it expected that the tty core would allow only one thread to
> >> access the dev-node, while blocking the other, or is it the client
> >> driver's responsibility to handle the exclusiveness?
> > 
> > The tty core should handle this correctly, for things that can mess
> > stuff up (like install and open at the same time).  A driver should not
> > have to worry about that.
> > 
> >>>> where hvc_install() sets a data and the hvc_open() clears it.
> >>>> hvc_open()
> >>>> doesn't
> >>>> check if the data was set to NULL and proceeds.
> >>>
> >>> What data is being set that hvc_open is checking?
> >> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
> >> one of the paths).
> > 
> > I see no use of private_data in drivers/tty/hvc/ so what exactly are you
> > referring to?
> 
> He likely means tty->driver_data. And there exactly lays the issue.
> 
> commit bdb498c20040616e94b05c31a0ceb3e134b7e829
> Author: Jiri Slaby <jslaby@suse.cz>
> Date:   Tue Aug 7 21:48:04 2012 +0200
> 
>     TTY: hvc_console, add tty install
> 
> added hvc_install but did not move 'tty->driver_data = NULL;' from
> hvc_open's fail path to hvc_cleanup.
> 
> IOW hvc_open now NULLs tty->driver_data even for another task which
> opened the tty earlier. The same holds for "tty_port_tty_set(&hp->port,
> NULL);" there. And actually "tty_port_put(&hp->port);" is also incorrect
> for the 2nd task opening the tty.
> 
> So, a mutex with tty->driver_data check in open is not definitely the
> way to go. This mess needs to be sorted out properly. Sure, a good start
> would be a conversion to tty_port_open. Right after dropping "tty: hvc:
> Fix data abort due to race in hvc_open" from tty/tty-next :).

I've now reverted this commit so we can start from a "clean" place.

> What I *don't* understand is why hp->ops->notifier_add fails, given the
> open does not allow multiple opens anyway?

I don't understand that either.  Raghavendra, can you show a real trace
for this issue that shows this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-11  7:39         ` Greg KH
@ 2020-05-12  7:22           ` Jiri Slaby
  2020-05-12  8:25             ` Greg KH
  0 siblings, 1 reply; 18+ messages in thread
From: Jiri Slaby @ 2020-05-12  7:22 UTC (permalink / raw)
  To: Greg KH, rananta; +Cc: andrew, linuxppc-dev, linux-kernel

On 11. 05. 20, 9:39, Greg KH wrote:
> On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
>> On 2020-05-09 23:48, Greg KH wrote:
>>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>>>> On 2020-05-06 02:48, Greg KH wrote:
>>>>> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>>>>>> Potentially, hvc_open() can be called in parallel when two tasks calls
>>>>>> open() on /dev/hvcX. In such a scenario, if the
>>>>>> hp->ops->notifier_add()
>>>>>> callback in the function fails, where it sets the tty->driver_data to
>>>>>> NULL, the parallel hvc_open() can see this NULL and cause a memory
>>>>>> abort.
>>>>>> Hence, serialize hvc_open and check if tty->private_data is NULL
>>>>>> before
>>>>>> proceeding ahead.
>>>>>>
>>>>>> The issue can be easily reproduced by launching two tasks
>>>>>> simultaneously
>>>>>> that does nothing but open() and close() on /dev/hvcX.
>>>>>> For example:
>>>>>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>>>>>>
>>>>>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>>>>>> ---
>>>>>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/tty/hvc/hvc_console.c
>>>>>> b/drivers/tty/hvc/hvc_console.c
>>>>>> index 436cc51c92c3..ebe26fe5ac09 100644
>>>>>> --- a/drivers/tty/hvc/hvc_console.c
>>>>>> +++ b/drivers/tty/hvc/hvc_console.c
>>>>>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>>>>>>   */
>>>>>>  static DEFINE_MUTEX(hvc_structs_mutex);
>>>>>>
>>>>>> +/* Mutex to serialize hvc_open */
>>>>>> +static DEFINE_MUTEX(hvc_open_mutex);
>>>>>>  /*
>>>>>>   * This value is used to assign a tty->index value to a hvc_struct
>>>>>> based
>>>>>>   * upon order of exposure via hvc_probe(), when we can not match it
>>>>>> to
>>>>>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
>>>>>> *driver, struct tty_struct *tty)
>>>>>>   */
>>>>>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>>>>>>  {
>>>>>> -	struct hvc_struct *hp = tty->driver_data;
>>>>>> +	struct hvc_struct *hp;
>>>>>>  	unsigned long flags;
>>>>>>  	int rc = 0;
>>>>>>
>>>>>> +	mutex_lock(&hvc_open_mutex);
>>>>>> +
>>>>>> +	hp = tty->driver_data;
>>>>>> +	if (!hp) {
>>>>>> +		rc = -EIO;
>>>>>> +		goto out;
>>>>>> +	}
>>>>>> +
>>>>>>  	spin_lock_irqsave(&hp->port.lock, flags);
>>>>>>  	/* Check and then increment for fast path open. */
>>>>>>  	if (hp->port.count++ > 0) {
>>>>>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
>>>>>>  		hvc_kick();
>>>>>> -		return 0;
>>>>>> +		goto out;
>>>>>>  	} /* else count == 0 */
>>>>>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
>>>>>
>>>>> Wait, why isn't this driver just calling tty_port_open() instead of
>>>>> trying to open-code all of this?
>>>>>
>>>>> Keeping a single mutext for open will not protect it from close, it will
>>>>> just slow things down a bit.  There should already be a tty lock held by
>>>>> the tty core for open() to keep it from racing things, right?
>>>> The tty lock should have been held, but not likely across
>>>> ->install() and
>>>> ->open() callbacks, thus resulting in a race between hvc_install() and
>>>> hvc_open(),
>>>
>>> How?  The tty lock is held in install, and should not conflict with
>>> open(), otherwise, we would be seeing this happen in all tty drivers,
>>> right?
>>>
>> Well, I was expecting the same, but IIRC, I see that the open() was being
>> called in parallel for the same device node.
> 
> So open and install are happening at the same time?  And the tty_lock()
> does not protect the needed fields from being protected properly?  If
> not, what fields are being touched without the lock?
> 
>> Is it expected that the tty core would allow only one thread to
>> access the dev-node, while blocking the other, or is it the client
>> driver's responsibility to handle the exclusiveness?
> 
> The tty core should handle this correctly, for things that can mess
> stuff up (like install and open at the same time).  A driver should not
> have to worry about that.
> 
>>>> where hvc_install() sets a data and the hvc_open() clears it.
>>>> hvc_open()
>>>> doesn't
>>>> check if the data was set to NULL and proceeds.
>>>
>>> What data is being set that hvc_open is checking?
>> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
>> one of the paths).
> 
> I see no use of private_data in drivers/tty/hvc/ so what exactly are you
> referring to?

He likely means tty->driver_data. And there exactly lays the issue.

commit bdb498c20040616e94b05c31a0ceb3e134b7e829
Author: Jiri Slaby <jslaby@suse.cz>
Date:   Tue Aug 7 21:48:04 2012 +0200

    TTY: hvc_console, add tty install

added hvc_install but did not move 'tty->driver_data = NULL;' from
hvc_open's fail path to hvc_cleanup.

IOW hvc_open now NULLs tty->driver_data even for another task which
opened the tty earlier. The same holds for "tty_port_tty_set(&hp->port,
NULL);" there. And actually "tty_port_put(&hp->port);" is also incorrect
for the 2nd task opening the tty.

So, a mutex with tty->driver_data check in open is not definitely the
way to go. This mess needs to be sorted out properly. Sure, a good start
would be a conversion to tty_port_open. Right after dropping "tty: hvc:
Fix data abort due to race in hvc_open" from tty/tty-next :).

What I *don't* understand is why hp->ops->notifier_add fails, given the
open does not allow multiple opens anyway?

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-11  7:34         ` rananta
@ 2020-05-11  7:41           ` Greg KH
  0 siblings, 0 replies; 18+ messages in thread
From: Greg KH @ 2020-05-11  7:41 UTC (permalink / raw)
  To: rananta; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On Mon, May 11, 2020 at 12:34:44AM -0700, rananta@codeaurora.org wrote:
> On 2020-05-11 00:23, rananta@codeaurora.org wrote:
> > On 2020-05-09 23:48, Greg KH wrote:
> > > On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org
> > > wrote:
> > > > On 2020-05-06 02:48, Greg KH wrote:
> > > > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > > > > Potentially, hvc_open() can be called in parallel when two tasks calls
> > > > > > open() on /dev/hvcX. In such a scenario, if the
> > > > > > hp->ops->notifier_add()
> > > > > > callback in the function fails, where it sets the tty->driver_data to
> > > > > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > > abort.
> > > > > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > > before
> > > > > > proceeding ahead.
> > > > > >
> > > > > > The issue can be easily reproduced by launching two tasks
> > > > > > simultaneously
> > > > > > that does nothing but open() and close() on /dev/hvcX.
> > > > > > For example:
> > > > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > > >
> > > > > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > > > > ---
> > > > > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > > b/drivers/tty/hvc/hvc_console.c
> > > > > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > > --- a/drivers/tty/hvc/hvc_console.c
> > > > > > +++ b/drivers/tty/hvc/hvc_console.c
> > > > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > > >   */
> > > > > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > > >
> > > > > > +/* Mutex to serialize hvc_open */
> > > > > > +static DEFINE_MUTEX(hvc_open_mutex);
> > > > > >  /*
> > > > > >   * This value is used to assign a tty->index value to a hvc_struct
> > > > > > based
> > > > > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > > to
> > > > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > > *driver, struct tty_struct *tty)
> > > > > >   */
> > > > > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > > >  {
> > > > > > -	struct hvc_struct *hp = tty->driver_data;
> > > > > > +	struct hvc_struct *hp;
> > > > > >  	unsigned long flags;
> > > > > >  	int rc = 0;
> > > > > >
> > > > > > +	mutex_lock(&hvc_open_mutex);
> > > > > > +
> > > > > > +	hp = tty->driver_data;
> > > > > > +	if (!hp) {
> > > > > > +		rc = -EIO;
> > > > > > +		goto out;
> > > > > > +	}
> > > > > > +
> > > > > >  	spin_lock_irqsave(&hp->port.lock, flags);
> > > > > >  	/* Check and then increment for fast path open. */
> > > > > >  	if (hp->port.count++ > 0) {
> > > > > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > > >  		hvc_kick();
> > > > > > -		return 0;
> > > > > > +		goto out;
> > > > > >  	} /* else count == 0 */
> > > > > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > >
> > > > > Wait, why isn't this driver just calling tty_port_open() instead of
> > > > > trying to open-code all of this?
> > > > >
> > > > > Keeping a single mutext for open will not protect it from close, it will
> > > > > just slow things down a bit.  There should already be a tty lock held by
> > > > > the tty core for open() to keep it from racing things, right?
> > > > The tty lock should have been held, but not likely across
> > > > ->install() and
> > > > ->open() callbacks, thus resulting in a race between
> > > > hvc_install() and
> > > > hvc_open(),
> > > 
> > > How?  The tty lock is held in install, and should not conflict with
> > > open(), otherwise, we would be seeing this happen in all tty drivers,
> > > right?
> > > 
> > Well, I was expecting the same, but IIRC, I see that the open() was
> > being
> > called in parallel for the same device node.
> > 
> > Is it expected that the tty core would allow only one thread to
> > access the dev-node, while blocking the other, or is it the client
> > driver's responsibility to handle the exclusiveness?
> Or is there any optimization going on where the second call doesn't go
> through
> install(), but calls open() directly as the file was already opened by the
> first
> thread?

Yes, it should only happen once, look at the logic in tty_kopen().

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-11  7:23       ` rananta
  2020-05-11  7:34         ` rananta
@ 2020-05-11  7:39         ` Greg KH
  2020-05-12  7:22           ` Jiri Slaby
  1 sibling, 1 reply; 18+ messages in thread
From: Greg KH @ 2020-05-11  7:39 UTC (permalink / raw)
  To: rananta; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On Mon, May 11, 2020 at 12:23:58AM -0700, rananta@codeaurora.org wrote:
> On 2020-05-09 23:48, Greg KH wrote:
> > On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
> > > On 2020-05-06 02:48, Greg KH wrote:
> > > > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > > > Potentially, hvc_open() can be called in parallel when two tasks calls
> > > > > open() on /dev/hvcX. In such a scenario, if the
> > > > > hp->ops->notifier_add()
> > > > > callback in the function fails, where it sets the tty->driver_data to
> > > > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > > > abort.
> > > > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > > > before
> > > > > proceeding ahead.
> > > > >
> > > > > The issue can be easily reproduced by launching two tasks
> > > > > simultaneously
> > > > > that does nothing but open() and close() on /dev/hvcX.
> > > > > For example:
> > > > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > > >
> > > > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > > > ---
> > > > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > > > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > > > b/drivers/tty/hvc/hvc_console.c
> > > > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > > > --- a/drivers/tty/hvc/hvc_console.c
> > > > > +++ b/drivers/tty/hvc/hvc_console.c
> > > > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > > > >   */
> > > > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > > >
> > > > > +/* Mutex to serialize hvc_open */
> > > > > +static DEFINE_MUTEX(hvc_open_mutex);
> > > > >  /*
> > > > >   * This value is used to assign a tty->index value to a hvc_struct
> > > > > based
> > > > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > > > to
> > > > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > > > *driver, struct tty_struct *tty)
> > > > >   */
> > > > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > > > >  {
> > > > > -	struct hvc_struct *hp = tty->driver_data;
> > > > > +	struct hvc_struct *hp;
> > > > >  	unsigned long flags;
> > > > >  	int rc = 0;
> > > > >
> > > > > +	mutex_lock(&hvc_open_mutex);
> > > > > +
> > > > > +	hp = tty->driver_data;
> > > > > +	if (!hp) {
> > > > > +		rc = -EIO;
> > > > > +		goto out;
> > > > > +	}
> > > > > +
> > > > >  	spin_lock_irqsave(&hp->port.lock, flags);
> > > > >  	/* Check and then increment for fast path open. */
> > > > >  	if (hp->port.count++ > 0) {
> > > > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > > > >  		hvc_kick();
> > > > > -		return 0;
> > > > > +		goto out;
> > > > >  	} /* else count == 0 */
> > > > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > > >
> > > > Wait, why isn't this driver just calling tty_port_open() instead of
> > > > trying to open-code all of this?
> > > >
> > > > Keeping a single mutext for open will not protect it from close, it will
> > > > just slow things down a bit.  There should already be a tty lock held by
> > > > the tty core for open() to keep it from racing things, right?
> > > The tty lock should have been held, but not likely across
> > > ->install() and
> > > ->open() callbacks, thus resulting in a race between hvc_install() and
> > > hvc_open(),
> > 
> > How?  The tty lock is held in install, and should not conflict with
> > open(), otherwise, we would be seeing this happen in all tty drivers,
> > right?
> > 
> Well, I was expecting the same, but IIRC, I see that the open() was being
> called in parallel for the same device node.

So open and install are happening at the same time?  And the tty_lock()
does not protect the needed fields from being protected properly?  If
not, what fields are being touched without the lock?

> Is it expected that the tty core would allow only one thread to
> access the dev-node, while blocking the other, or is it the client
> driver's responsibility to handle the exclusiveness?

The tty core should handle this correctly, for things that can mess
stuff up (like install and open at the same time).  A driver should not
have to worry about that.

> > > where hvc_install() sets a data and the hvc_open() clears it.
> > > hvc_open()
> > > doesn't
> > > check if the data was set to NULL and proceeds.
> > 
> > What data is being set that hvc_open is checking?
> hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL (in
> one of the paths).

I see no use of private_data in drivers/tty/hvc/ so what exactly are you
referring to?  The file private_data or the port private_data or
something else?

> > And you are not grabbing a lock in your install callback, you are only
> > serializing your open call here, I don't see how this is fixing anything
> > other than perhaps slowing down your codepaths.
> Basically, my intention was to add a NULL check before accessing *hp in
> open().
> The intention of the lock was to protect against this check.
> If the tty layer would have taken care of this, then perhaps there won't be
> a
> need to check for NULL.

Ah, driver_data is what you are referring to, not private_data.

Look at hvc_close(), no locking is done there to test for private_data,
right?  Why not?  The only thing setting driver_data is in install, and
your lock is not touching that.

And again, install and open should not race, if so, the tty core needs
to be fixed.

> > As an arument why this isn't correct, can you answer why this same type
> > of change wouldn't be required for all tty drivers in the tree?
> > 
> I agree, that if it's already taken care by the tty-core, we don't need it
> here.
> Correct me if I'm wrong, but looks like the tty layer is allowing parallel
> accesses
> to open(),

I do not think that happens, try counting the calls to open(), there
should only be one.  If not, that's a bug somewhere else.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-11  7:23       ` rananta
@ 2020-05-11  7:34         ` rananta
  2020-05-11  7:41           ` Greg KH
  2020-05-11  7:39         ` Greg KH
  1 sibling, 1 reply; 18+ messages in thread
From: rananta @ 2020-05-11  7:34 UTC (permalink / raw)
  To: Greg KH; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On 2020-05-11 00:23, rananta@codeaurora.org wrote:
> On 2020-05-09 23:48, Greg KH wrote:
>> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org 
>> wrote:
>>> On 2020-05-06 02:48, Greg KH wrote:
>>> > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>>> > > Potentially, hvc_open() can be called in parallel when two tasks calls
>>> > > open() on /dev/hvcX. In such a scenario, if the
>>> > > hp->ops->notifier_add()
>>> > > callback in the function fails, where it sets the tty->driver_data to
>>> > > NULL, the parallel hvc_open() can see this NULL and cause a memory
>>> > > abort.
>>> > > Hence, serialize hvc_open and check if tty->private_data is NULL
>>> > > before
>>> > > proceeding ahead.
>>> > >
>>> > > The issue can be easily reproduced by launching two tasks
>>> > > simultaneously
>>> > > that does nothing but open() and close() on /dev/hvcX.
>>> > > For example:
>>> > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>>> > >
>>> > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>>> > > ---
>>> > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>>> > >  1 file changed, 14 insertions(+), 2 deletions(-)
>>> > >
>>> > > diff --git a/drivers/tty/hvc/hvc_console.c
>>> > > b/drivers/tty/hvc/hvc_console.c
>>> > > index 436cc51c92c3..ebe26fe5ac09 100644
>>> > > --- a/drivers/tty/hvc/hvc_console.c
>>> > > +++ b/drivers/tty/hvc/hvc_console.c
>>> > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>>> > >   */
>>> > >  static DEFINE_MUTEX(hvc_structs_mutex);
>>> > >
>>> > > +/* Mutex to serialize hvc_open */
>>> > > +static DEFINE_MUTEX(hvc_open_mutex);
>>> > >  /*
>>> > >   * This value is used to assign a tty->index value to a hvc_struct
>>> > > based
>>> > >   * upon order of exposure via hvc_probe(), when we can not match it
>>> > > to
>>> > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
>>> > > *driver, struct tty_struct *tty)
>>> > >   */
>>> > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
>>> > >  {
>>> > > -	struct hvc_struct *hp = tty->driver_data;
>>> > > +	struct hvc_struct *hp;
>>> > >  	unsigned long flags;
>>> > >  	int rc = 0;
>>> > >
>>> > > +	mutex_lock(&hvc_open_mutex);
>>> > > +
>>> > > +	hp = tty->driver_data;
>>> > > +	if (!hp) {
>>> > > +		rc = -EIO;
>>> > > +		goto out;
>>> > > +	}
>>> > > +
>>> > >  	spin_lock_irqsave(&hp->port.lock, flags);
>>> > >  	/* Check and then increment for fast path open. */
>>> > >  	if (hp->port.count++ > 0) {
>>> > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
>>> > >  		hvc_kick();
>>> > > -		return 0;
>>> > > +		goto out;
>>> > >  	} /* else count == 0 */
>>> > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
>>> >
>>> > Wait, why isn't this driver just calling tty_port_open() instead of
>>> > trying to open-code all of this?
>>> >
>>> > Keeping a single mutext for open will not protect it from close, it will
>>> > just slow things down a bit.  There should already be a tty lock held by
>>> > the tty core for open() to keep it from racing things, right?
>>> The tty lock should have been held, but not likely across ->install() 
>>> and
>>> ->open() callbacks, thus resulting in a race between hvc_install() 
>>> and
>>> hvc_open(),
>> 
>> How?  The tty lock is held in install, and should not conflict with
>> open(), otherwise, we would be seeing this happen in all tty drivers,
>> right?
>> 
> Well, I was expecting the same, but IIRC, I see that the open() was 
> being
> called in parallel for the same device node.
> 
> Is it expected that the tty core would allow only one thread to
> access the dev-node, while blocking the other, or is it the client
> driver's responsibility to handle the exclusiveness?
Or is there any optimization going on where the second call doesn't go 
through
install(), but calls open() directly as the file was already opened by 
the first
thread?
>>> where hvc_install() sets a data and the hvc_open() clears it. 
>>> hvc_open()
>>> doesn't
>>> check if the data was set to NULL and proceeds.
>> 
>> What data is being set that hvc_open is checking?
> hvc_install sets tty->private_data to hp, while hvc_open sets it to
> NULL (in one of the paths).
>> 
>> And you are not grabbing a lock in your install callback, you are only
>> serializing your open call here, I don't see how this is fixing 
>> anything
>> other than perhaps slowing down your codepaths.
> Basically, my intention was to add a NULL check before accessing *hp in 
> open().
> The intention of the lock was to protect against this check.
> If the tty layer would have taken care of this, then perhaps there 
> won't be a
> need to check for NULL.
>> 
>> As an arument why this isn't correct, can you answer why this same 
>> type
>> of change wouldn't be required for all tty drivers in the tree?
>> 
> I agree, that if it's already taken care by the tty-core, we don't need 
> it here.
> Correct me if I'm wrong, but looks like the tty layer is allowing
> parallel accesses
> to open(),
>> thanks,
>> 
>> greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-10  6:48     ` Greg KH
@ 2020-05-11  7:23       ` rananta
  2020-05-11  7:34         ` rananta
  2020-05-11  7:39         ` Greg KH
  0 siblings, 2 replies; 18+ messages in thread
From: rananta @ 2020-05-11  7:23 UTC (permalink / raw)
  To: Greg KH; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On 2020-05-09 23:48, Greg KH wrote:
> On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
>> On 2020-05-06 02:48, Greg KH wrote:
>> > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>> > > Potentially, hvc_open() can be called in parallel when two tasks calls
>> > > open() on /dev/hvcX. In such a scenario, if the
>> > > hp->ops->notifier_add()
>> > > callback in the function fails, where it sets the tty->driver_data to
>> > > NULL, the parallel hvc_open() can see this NULL and cause a memory
>> > > abort.
>> > > Hence, serialize hvc_open and check if tty->private_data is NULL
>> > > before
>> > > proceeding ahead.
>> > >
>> > > The issue can be easily reproduced by launching two tasks
>> > > simultaneously
>> > > that does nothing but open() and close() on /dev/hvcX.
>> > > For example:
>> > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>> > >
>> > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>> > > ---
>> > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>> > >  1 file changed, 14 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/drivers/tty/hvc/hvc_console.c
>> > > b/drivers/tty/hvc/hvc_console.c
>> > > index 436cc51c92c3..ebe26fe5ac09 100644
>> > > --- a/drivers/tty/hvc/hvc_console.c
>> > > +++ b/drivers/tty/hvc/hvc_console.c
>> > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>> > >   */
>> > >  static DEFINE_MUTEX(hvc_structs_mutex);
>> > >
>> > > +/* Mutex to serialize hvc_open */
>> > > +static DEFINE_MUTEX(hvc_open_mutex);
>> > >  /*
>> > >   * This value is used to assign a tty->index value to a hvc_struct
>> > > based
>> > >   * upon order of exposure via hvc_probe(), when we can not match it
>> > > to
>> > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
>> > > *driver, struct tty_struct *tty)
>> > >   */
>> > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
>> > >  {
>> > > -	struct hvc_struct *hp = tty->driver_data;
>> > > +	struct hvc_struct *hp;
>> > >  	unsigned long flags;
>> > >  	int rc = 0;
>> > >
>> > > +	mutex_lock(&hvc_open_mutex);
>> > > +
>> > > +	hp = tty->driver_data;
>> > > +	if (!hp) {
>> > > +		rc = -EIO;
>> > > +		goto out;
>> > > +	}
>> > > +
>> > >  	spin_lock_irqsave(&hp->port.lock, flags);
>> > >  	/* Check and then increment for fast path open. */
>> > >  	if (hp->port.count++ > 0) {
>> > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
>> > >  		hvc_kick();
>> > > -		return 0;
>> > > +		goto out;
>> > >  	} /* else count == 0 */
>> > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
>> >
>> > Wait, why isn't this driver just calling tty_port_open() instead of
>> > trying to open-code all of this?
>> >
>> > Keeping a single mutext for open will not protect it from close, it will
>> > just slow things down a bit.  There should already be a tty lock held by
>> > the tty core for open() to keep it from racing things, right?
>> The tty lock should have been held, but not likely across ->install() 
>> and
>> ->open() callbacks, thus resulting in a race between hvc_install() and
>> hvc_open(),
> 
> How?  The tty lock is held in install, and should not conflict with
> open(), otherwise, we would be seeing this happen in all tty drivers,
> right?
> 
Well, I was expecting the same, but IIRC, I see that the open() was 
being
called in parallel for the same device node.

Is it expected that the tty core would allow only one thread to
access the dev-node, while blocking the other, or is it the client
driver's responsibility to handle the exclusiveness?
>> where hvc_install() sets a data and the hvc_open() clears it. 
>> hvc_open()
>> doesn't
>> check if the data was set to NULL and proceeds.
> 
> What data is being set that hvc_open is checking?
hvc_install sets tty->private_data to hp, while hvc_open sets it to NULL 
(in one of the paths).
> 
> And you are not grabbing a lock in your install callback, you are only
> serializing your open call here, I don't see how this is fixing 
> anything
> other than perhaps slowing down your codepaths.
Basically, my intention was to add a NULL check before accessing *hp in 
open().
The intention of the lock was to protect against this check.
If the tty layer would have taken care of this, then perhaps there won't 
be a
need to check for NULL.
> 
> As an arument why this isn't correct, can you answer why this same type
> of change wouldn't be required for all tty drivers in the tree?
> 
I agree, that if it's already taken care by the tty-core, we don't need 
it here.
Correct me if I'm wrong, but looks like the tty layer is allowing 
parallel accesses
to open(),
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-10  1:30   ` rananta
@ 2020-05-10  6:48     ` Greg KH
  2020-05-11  7:23       ` rananta
  0 siblings, 1 reply; 18+ messages in thread
From: Greg KH @ 2020-05-10  6:48 UTC (permalink / raw)
  To: rananta; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On Sat, May 09, 2020 at 06:30:56PM -0700, rananta@codeaurora.org wrote:
> On 2020-05-06 02:48, Greg KH wrote:
> > On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> > > Potentially, hvc_open() can be called in parallel when two tasks calls
> > > open() on /dev/hvcX. In such a scenario, if the
> > > hp->ops->notifier_add()
> > > callback in the function fails, where it sets the tty->driver_data to
> > > NULL, the parallel hvc_open() can see this NULL and cause a memory
> > > abort.
> > > Hence, serialize hvc_open and check if tty->private_data is NULL
> > > before
> > > proceeding ahead.
> > > 
> > > The issue can be easily reproduced by launching two tasks
> > > simultaneously
> > > that does nothing but open() and close() on /dev/hvcX.
> > > For example:
> > > $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> > > 
> > > Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> > > ---
> > >  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
> > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/tty/hvc/hvc_console.c
> > > b/drivers/tty/hvc/hvc_console.c
> > > index 436cc51c92c3..ebe26fe5ac09 100644
> > > --- a/drivers/tty/hvc/hvc_console.c
> > > +++ b/drivers/tty/hvc/hvc_console.c
> > > @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
> > >   */
> > >  static DEFINE_MUTEX(hvc_structs_mutex);
> > > 
> > > +/* Mutex to serialize hvc_open */
> > > +static DEFINE_MUTEX(hvc_open_mutex);
> > >  /*
> > >   * This value is used to assign a tty->index value to a hvc_struct
> > > based
> > >   * upon order of exposure via hvc_probe(), when we can not match it
> > > to
> > > @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver
> > > *driver, struct tty_struct *tty)
> > >   */
> > >  static int hvc_open(struct tty_struct *tty, struct file * filp)
> > >  {
> > > -	struct hvc_struct *hp = tty->driver_data;
> > > +	struct hvc_struct *hp;
> > >  	unsigned long flags;
> > >  	int rc = 0;
> > > 
> > > +	mutex_lock(&hvc_open_mutex);
> > > +
> > > +	hp = tty->driver_data;
> > > +	if (!hp) {
> > > +		rc = -EIO;
> > > +		goto out;
> > > +	}
> > > +
> > >  	spin_lock_irqsave(&hp->port.lock, flags);
> > >  	/* Check and then increment for fast path open. */
> > >  	if (hp->port.count++ > 0) {
> > >  		spin_unlock_irqrestore(&hp->port.lock, flags);
> > >  		hvc_kick();
> > > -		return 0;
> > > +		goto out;
> > >  	} /* else count == 0 */
> > >  	spin_unlock_irqrestore(&hp->port.lock, flags);
> > 
> > Wait, why isn't this driver just calling tty_port_open() instead of
> > trying to open-code all of this?
> > 
> > Keeping a single mutext for open will not protect it from close, it will
> > just slow things down a bit.  There should already be a tty lock held by
> > the tty core for open() to keep it from racing things, right?
> The tty lock should have been held, but not likely across ->install() and
> ->open() callbacks, thus resulting in a race between hvc_install() and
> hvc_open(),

How?  The tty lock is held in install, and should not conflict with
open(), otherwise we would be seeing this happen in all tty drivers,
right?

> where hvc_install() sets a data and the hvc_open() clears it. hvc_open()
> doesn't
> check if the data was set to NULL and proceeds.

What data is being set that hvc_open is checking?

And you are not grabbing a lock in your install callback, you are only
serializing your open call here, I don't see how this is fixing anything
other than perhaps slowing down your codepaths.

As an arument why this isn't correct, can you answer why this same type
of change wouldn't be required for all tty drivers in the tree?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-05-06  9:48 ` Greg KH
@ 2020-05-10  1:30   ` rananta
  2020-05-10  6:48     ` Greg KH
  0 siblings, 1 reply; 18+ messages in thread
From: rananta @ 2020-05-10  1:30 UTC (permalink / raw)
  To: Greg KH; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On 2020-05-06 02:48, Greg KH wrote:
> On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
>> Potentially, hvc_open() can be called in parallel when two tasks calls
>> open() on /dev/hvcX. In such a scenario, if the 
>> hp->ops->notifier_add()
>> callback in the function fails, where it sets the tty->driver_data to
>> NULL, the parallel hvc_open() can see this NULL and cause a memory 
>> abort.
>> Hence, serialize hvc_open and check if tty->private_data is NULL 
>> before
>> proceeding ahead.
>> 
>> The issue can be easily reproduced by launching two tasks 
>> simultaneously
>> that does nothing but open() and close() on /dev/hvcX.
>> For example:
>> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
>> 
>> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
>> ---
>>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/tty/hvc/hvc_console.c 
>> b/drivers/tty/hvc/hvc_console.c
>> index 436cc51c92c3..ebe26fe5ac09 100644
>> --- a/drivers/tty/hvc/hvc_console.c
>> +++ b/drivers/tty/hvc/hvc_console.c
>> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>>   */
>>  static DEFINE_MUTEX(hvc_structs_mutex);
>> 
>> +/* Mutex to serialize hvc_open */
>> +static DEFINE_MUTEX(hvc_open_mutex);
>>  /*
>>   * This value is used to assign a tty->index value to a hvc_struct 
>> based
>>   * upon order of exposure via hvc_probe(), when we can not match it 
>> to
>> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver 
>> *driver, struct tty_struct *tty)
>>   */
>>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>>  {
>> -	struct hvc_struct *hp = tty->driver_data;
>> +	struct hvc_struct *hp;
>>  	unsigned long flags;
>>  	int rc = 0;
>> 
>> +	mutex_lock(&hvc_open_mutex);
>> +
>> +	hp = tty->driver_data;
>> +	if (!hp) {
>> +		rc = -EIO;
>> +		goto out;
>> +	}
>> +
>>  	spin_lock_irqsave(&hp->port.lock, flags);
>>  	/* Check and then increment for fast path open. */
>>  	if (hp->port.count++ > 0) {
>>  		spin_unlock_irqrestore(&hp->port.lock, flags);
>>  		hvc_kick();
>> -		return 0;
>> +		goto out;
>>  	} /* else count == 0 */
>>  	spin_unlock_irqrestore(&hp->port.lock, flags);
> 
> Wait, why isn't this driver just calling tty_port_open() instead of
> trying to open-code all of this?
> 
> Keeping a single mutext for open will not protect it from close, it 
> will
> just slow things down a bit.  There should already be a tty lock held 
> by
> the tty core for open() to keep it from racing things, right?
The tty lock should have been held, but not likely across ->install() 
and
->open() callbacks, thus resulting in a race between hvc_install() and 
hvc_open(),
where hvc_install() sets a data and the hvc_open() clears it. hvc_open() 
doesn't
check if the data was set to NULL and proceeds.
> 
> Try just removing all of this logic and replacing it with a call to
> tty_port_open() and see if that fixes this issue.
> 
> As "proof" of this, I don't see other serial drivers needing a single
> mutex for their open calls, do you?
> 
> thanks,
> 
> greg k-h

Thank you.
Raghavendra

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] tty: hvc: Fix data abort due to race in hvc_open
  2020-04-28  3:26 Raghavendra Rao Ananta
@ 2020-05-06  9:48 ` Greg KH
  2020-05-10  1:30   ` rananta
  0 siblings, 1 reply; 18+ messages in thread
From: Greg KH @ 2020-05-06  9:48 UTC (permalink / raw)
  To: Raghavendra Rao Ananta; +Cc: andrew, linuxppc-dev, linux-kernel, jslaby

On Mon, Apr 27, 2020 at 08:26:01PM -0700, Raghavendra Rao Ananta wrote:
> Potentially, hvc_open() can be called in parallel when two tasks calls
> open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
> callback in the function fails, where it sets the tty->driver_data to
> NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
> Hence, serialize hvc_open and check if tty->private_data is NULL before
> proceeding ahead.
> 
> The issue can be easily reproduced by launching two tasks simultaneously
> that does nothing but open() and close() on /dev/hvcX.
> For example:
> $ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
> 
> Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
> ---
>  drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> index 436cc51c92c3..ebe26fe5ac09 100644
> --- a/drivers/tty/hvc/hvc_console.c
> +++ b/drivers/tty/hvc/hvc_console.c
> @@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
>   */
>  static DEFINE_MUTEX(hvc_structs_mutex);
>  
> +/* Mutex to serialize hvc_open */
> +static DEFINE_MUTEX(hvc_open_mutex);
>  /*
>   * This value is used to assign a tty->index value to a hvc_struct based
>   * upon order of exposure via hvc_probe(), when we can not match it to
> @@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty)
>   */
>  static int hvc_open(struct tty_struct *tty, struct file * filp)
>  {
> -	struct hvc_struct *hp = tty->driver_data;
> +	struct hvc_struct *hp;
>  	unsigned long flags;
>  	int rc = 0;
>  
> +	mutex_lock(&hvc_open_mutex);
> +
> +	hp = tty->driver_data;
> +	if (!hp) {
> +		rc = -EIO;
> +		goto out;
> +	}
> +
>  	spin_lock_irqsave(&hp->port.lock, flags);
>  	/* Check and then increment for fast path open. */
>  	if (hp->port.count++ > 0) {
>  		spin_unlock_irqrestore(&hp->port.lock, flags);
>  		hvc_kick();
> -		return 0;
> +		goto out;
>  	} /* else count == 0 */
>  	spin_unlock_irqrestore(&hp->port.lock, flags);

Wait, why isn't this driver just calling tty_port_open() instead of
trying to open-code all of this?

Keeping a single mutext for open will not protect it from close, it will
just slow things down a bit.  There should already be a tty lock held by
the tty core for open() to keep it from racing things, right?

Try just removing all of this logic and replacing it with a call to
tty_port_open() and see if that fixes this issue.

As "proof" of this, I don't see other serial drivers needing a single
mutex for their open calls, do you?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] tty: hvc: Fix data abort due to race in hvc_open
@ 2020-04-28  3:26 Raghavendra Rao Ananta
  2020-05-06  9:48 ` Greg KH
  0 siblings, 1 reply; 18+ messages in thread
From: Raghavendra Rao Ananta @ 2020-04-28  3:26 UTC (permalink / raw)
  To: gregkh, jslaby, andrew; +Cc: rananta, linuxppc-dev, linux-kernel

Potentially, hvc_open() can be called in parallel when two tasks calls
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, where it sets the tty->driver_data to
NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
Hence, serialize hvc_open and check if tty->private_data is NULL before
proceeding ahead.

The issue can be easily reproduced by launching two tasks simultaneously
that does nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &

Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
---
 drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 436cc51c92c3..ebe26fe5ac09 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_MUTEX(hvc_structs_mutex);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-	struct hvc_struct *hp = tty->driver_data;
+	struct hvc_struct *hp;
 	unsigned long flags;
 	int rc = 0;
 
+	mutex_lock(&hvc_open_mutex);
+
+	hp = tty->driver_data;
+	if (!hp) {
+		rc = -EIO;
+		goto out;
+	}
+
 	spin_lock_irqsave(&hp->port.lock, flags);
 	/* Check and then increment for fast path open. */
 	if (hp->port.count++ > 0) {
 		spin_unlock_irqrestore(&hp->port.lock, flags);
 		hvc_kick();
-		return 0;
+		goto out;
 	} /* else count == 0 */
 	spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -384,6 +394,8 @@ static int hvc_open(struct tty_struct *tty, struct file * filp)
 	/* Force wakeup of the polling thread */
 	hvc_kick();
 
+out:
+	mutex_unlock(&hvc_open_mutex);
 	return rc;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-05-20 13:52 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-28 12:48 [PATCH] tty: hvc: Fix data abort due to race in hvc_open Markus Elfring
  -- strict thread matches above, loose matches on Subject: below --
2020-04-28  3:26 Raghavendra Rao Ananta
2020-05-06  9:48 ` Greg KH
2020-05-10  1:30   ` rananta
2020-05-10  6:48     ` Greg KH
2020-05-11  7:23       ` rananta
2020-05-11  7:34         ` rananta
2020-05-11  7:41           ` Greg KH
2020-05-11  7:39         ` Greg KH
2020-05-12  7:22           ` Jiri Slaby
2020-05-12  8:25             ` Greg KH
2020-05-12 21:39               ` rananta
2020-05-13  7:04                 ` Greg KH
2020-05-14 23:22                   ` rananta
2020-05-15  7:30                     ` Greg KH
2020-05-15 19:21                       ` rananta
2020-05-20  9:38                     ` Jiri Slaby
2020-05-20 13:49                       ` rananta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).