linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen()
@ 2018-09-11  1:48 Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 1/6] tty: Drop tty->count on tty_reopen() failure Dmitry Safonov
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Jiri Slaby, Tetsuo Handa,
	Peter Zijlstra, Paul E. McKenney, syzbot+3aa9784721dfb90e984d

Hi all,

Three fixes that worth to have in the @stable, as we've hit them on v4.9
stable.

And for linux-next - adding lockdep asserts for line discipline changing
code, verifying that write ldisc sem will be held forthwith.

The last patch is optional and probably, timeout can be dropped for
read_lock(). I'll do it if everyone agrees.

Rong Chen, could you kindly re-run this version to see if the lockup
from v1 still happens? I wasn't able to reproduce it..

Thanks,
Dima

Changes since v2:
- Added reviewed-by tags
- Hopefully, fixed reported by 0-day issue.
- Added optional fix for wait_readers decrement

Changes since v1:
- Added tested-by/reported-by tags
- Dropped 3/4 (locking tty pair for lockdep sake),
  Because of that - not adding lockdep_assert_held() in tty_ldisc_open()
- Added 4/4 cleanup to inc tty->count only on success of
  tty_ldisc_reinit()
- lock ldisc without (5*HZ) timeout in tty_reopen()

v1 link: 
lkml.kernel.org/r/<20180829022353.23568-1-dima@arista.com>

Huuge cc list:
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Nathan March <nathan@gt.net>
Cc: Pasi Kärkkäinen <pasik@iki.fi>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: "Rong, Chen" <rong.a.chen@intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
(please, ignore if I Cc'ed you mistakenly)

Dmitry Safonov (6):
  tty: Drop tty->count on tty_reopen() failure
  tty/ldsem: Update waiter->task before waking up reader
  tty: Hold tty_ldisc_lock() during tty_reopen()
  tty/lockdep: Add ldisc_sem asserts
  tty: Simplify tty->count math in tty_reopen()
  tty/ldsem: Decrement wait_readers on timeouted down_read()

 drivers/tty/tty_io.c    | 12 ++++++++----
 drivers/tty/tty_ldisc.c |  5 +++++
 drivers/tty/tty_ldsem.c |  5 ++++-
 3 files changed, 17 insertions(+), 5 deletions(-)

-- 
2.13.6


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCHv3 1/6] tty: Drop tty->count on tty_reopen() failure
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
@ 2018-09-11  1:48 ` Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader Dmitry Safonov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, Jiri Slaby,
	Jiri Slaby, Tetsuo Handa, stable, Greg Kroah-Hartman

In case of tty_ldisc_reinit() failure, tty->count should be decremented
back, otherwise we will never release_tty().
Tetsuo reported that it fixes noisy warnings on tty release like:
  pts pts4033: tty_release: tty->count(10529) != (#fd's(7) + #kopen's(0))

Fixes: commit 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")

Cc: stable@vger.kernel.org # v4.6+
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Reviewed-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.com>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_io.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 32bc3e3fe4d3..5e5da9acaf0a 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1255,6 +1255,7 @@ static void tty_driver_remove_tty(struct tty_driver *driver, struct tty_struct *
 static int tty_reopen(struct tty_struct *tty)
 {
 	struct tty_driver *driver = tty->driver;
+	int retval;
 
 	if (driver->type == TTY_DRIVER_TYPE_PTY &&
 	    driver->subtype == PTY_TYPE_MASTER)
@@ -1268,10 +1269,14 @@ static int tty_reopen(struct tty_struct *tty)
 
 	tty->count++;
 
-	if (!tty->ldisc)
-		return tty_ldisc_reinit(tty, tty->termios.c_line);
+	if (tty->ldisc)
+		return 0;
 
-	return 0;
+	retval = tty_ldisc_reinit(tty, tty->termios.c_line);
+	if (retval)
+		tty->count--;
+
+	return retval;
 }
 
 /**
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 1/6] tty: Drop tty->count on tty_reopen() failure Dmitry Safonov
@ 2018-09-11  1:48 ` Dmitry Safonov
  2018-09-11  5:04   ` Sergey Senozhatsky
  2018-09-11 11:40   ` Peter Zijlstra
  2018-09-11  1:48 ` [PATCHv3 3/6] tty: Hold tty_ldisc_lock() during tty_reopen() Dmitry Safonov
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Peter Zijlstra, Paul E. McKenney

There is a couple of reports about lockup in ldsem_down_read() without
anyone holding write end of ldisc semaphore:
lkml.kernel.org/r/<20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com>
lkml.kernel.org/r/<20180907045041.GF1110@shao2-debian>

They all looked like a missed wake up.
I wasn't lucky enough to reproduce it, but it seems like reader on
another CPU can miss waiter->task update and schedule again, resulting
in indefinite (MAX_SCHEDULE_TIMEOUT) sleep.

Make sure waked up reader will see waiter->task == NULL.

Cc: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_ldsem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 0c98d88f795a..832accbbcb6d 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct ld_semaphore *sem)
 		tsk = waiter->task;
 		smp_mb();
 		waiter->task = NULL;
+		/* Make sure down_read_failed() will see !waiter->task update */
+		smp_wmb();
 		wake_up_process(tsk);
 		put_task_struct(tsk);
 	}
@@ -217,7 +219,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
 	for (;;) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
 
-		if (!waiter.task)
+		if (!READ_ONCE(waiter.task))
 			break;
 		if (!timeout)
 			break;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv3 3/6] tty: Hold tty_ldisc_lock() during tty_reopen()
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 1/6] tty: Drop tty->count on tty_reopen() failure Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader Dmitry Safonov
@ 2018-09-11  1:48 ` Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, Jiri Slaby,
	syzbot+3aa9784721dfb90e984d, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby

tty_ldisc_reinit() doesn't race with neither tty_ldisc_hangup()
nor set_ldisc() nor tty_ldisc_release() as they use tty lock.
But it races with anyone who expects line discipline to be the same
after hoding read semaphore in tty_ldisc_ref().

We've seen the following crash on v4.9.108 stable:

BUG: unable to handle kernel paging request at 0000000000002260
IP: [..] n_tty_receive_buf_common+0x5f/0x86d
Workqueue: events_unbound flush_to_ldisc
Call Trace:
 [..] n_tty_receive_buf2
 [..] tty_ldisc_receive_buf
 [..] flush_to_ldisc
 [..] process_one_work
 [..] worker_thread
 [..] kthread
 [..] ret_from_fork

tty_ldisc_reinit() should be called with ldisc_sem hold for writing,
which will protect any reader against line discipline changes.

Backport-first: b027e2298bd5 ("tty: fix data race between tty_init_dev
and flush of buf")
Cc: stable@vger.kernel.org

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Reviewed-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: syzbot+3aa9784721dfb90e984d@syzkaller.appspotmail.com
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_io.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 5e5da9acaf0a..a947719b4626 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1255,7 +1255,7 @@ static void tty_driver_remove_tty(struct tty_driver *driver, struct tty_struct *
 static int tty_reopen(struct tty_struct *tty)
 {
 	struct tty_driver *driver = tty->driver;
-	int retval;
+	int retval = 0;
 
 	if (driver->type == TTY_DRIVER_TYPE_PTY &&
 	    driver->subtype == PTY_TYPE_MASTER)
@@ -1267,15 +1267,18 @@ static int tty_reopen(struct tty_struct *tty)
 	if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN))
 		return -EBUSY;
 
-	tty->count++;
+	tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
 
+	tty->count++;
 	if (tty->ldisc)
-		return 0;
+		goto out_unlock;
 
 	retval = tty_ldisc_reinit(tty, tty->termios.c_line);
 	if (retval)
 		tty->count--;
 
+out_unlock:
+	tty_ldisc_unlock(tty);
 	return retval;
 }
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
                   ` (2 preceding siblings ...)
  2018-09-11  1:48 ` [PATCHv3 3/6] tty: Hold tty_ldisc_lock() during tty_reopen() Dmitry Safonov
@ 2018-09-11  1:48 ` Dmitry Safonov
  2018-09-11 11:59   ` Peter Zijlstra
  2018-09-11 12:01   ` Peter Zijlstra
  2018-09-11  1:48 ` [PATCHv3 5/6] tty: Simplify tty->count math in tty_reopen() Dmitry Safonov
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

Make sure under CONFIG_LOCKDEP that each change to line discipline
is done with held write semaphor.
Otherwise potential reader will have a good time dereferencing
incomplete/uninitialized ldisc.

Exception here is tty_ldisc_open(), as it's called without ldisc_sem
locked by tty_init_dev() for the tty->link.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_ldisc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index fc4c97cae01e..202cb645582f 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -471,6 +471,7 @@ static int tty_ldisc_open(struct tty_struct *tty, struct tty_ldisc *ld)
 
 static void tty_ldisc_close(struct tty_struct *tty, struct tty_ldisc *ld)
 {
+	lockdep_assert_held(&tty->ldisc_sem);
 	WARN_ON(!test_bit(TTY_LDISC_OPEN, &tty->flags));
 	clear_bit(TTY_LDISC_OPEN, &tty->flags);
 	if (ld->ops->close)
@@ -492,6 +493,7 @@ static int tty_ldisc_failto(struct tty_struct *tty, int ld)
 	struct tty_ldisc *disc = tty_ldisc_get(tty, ld);
 	int r;
 
+	lockdep_assert_held(&tty->ldisc_sem);
 	if (IS_ERR(disc))
 		return PTR_ERR(disc);
 	tty->ldisc = disc;
@@ -615,6 +617,7 @@ EXPORT_SYMBOL_GPL(tty_set_ldisc);
  */
 static void tty_ldisc_kill(struct tty_struct *tty)
 {
+	lockdep_assert_held(&tty->ldisc_sem);
 	if (!tty->ldisc)
 		return;
 	/*
@@ -662,6 +665,7 @@ int tty_ldisc_reinit(struct tty_struct *tty, int disc)
 	struct tty_ldisc *ld;
 	int retval;
 
+	lockdep_assert_held(&tty->ldisc_sem);
 	ld = tty_ldisc_get(tty, disc);
 	if (IS_ERR(ld)) {
 		BUG_ON(disc == N_TTY);
@@ -825,6 +829,7 @@ int tty_ldisc_init(struct tty_struct *tty)
  */
 void tty_ldisc_deinit(struct tty_struct *tty)
 {
+	/* no ldisc_sem, tty is being destroyed */
 	if (tty->ldisc)
 		tty_ldisc_put(tty->ldisc);
 	tty->ldisc = NULL;
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv3 5/6] tty: Simplify tty->count math in tty_reopen()
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
                   ` (3 preceding siblings ...)
  2018-09-11  1:48 ` [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
@ 2018-09-11  1:48 ` Dmitry Safonov
  2018-09-11  1:48 ` [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read() Dmitry Safonov
  2018-09-11 12:16 ` [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Mark Rutland
  6 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, Jiri Slaby,
	Jiri Slaby, Greg Kroah-Hartman

As notted by Jiri, tty_ldisc_reinit() shouldn't rely on tty counter.
Simplify math by increasing the counter after reinit success.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Link: lkml.kernel.org/r/<20180829022353.23568-2-dima@arista.com>
Suggested-by: Jiri Slaby <jslaby@suse.com>
Reviewed-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_io.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index a947719b4626..7f968ac14bbd 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1268,17 +1268,13 @@ static int tty_reopen(struct tty_struct *tty)
 		return -EBUSY;
 
 	tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
+	if (!tty->ldisc)
+		retval = tty_ldisc_reinit(tty, tty->termios.c_line);
+	tty_ldisc_unlock(tty);
 
-	tty->count++;
-	if (tty->ldisc)
-		goto out_unlock;
+	if (retval == 0)
+		tty->count++;
 
-	retval = tty_ldisc_reinit(tty, tty->termios.c_line);
-	if (retval)
-		tty->count--;
-
-out_unlock:
-	tty_ldisc_unlock(tty);
 	return retval;
 }
 
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read()
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
                   ` (4 preceding siblings ...)
  2018-09-11  1:48 ` [PATCHv3 5/6] tty: Simplify tty->count math in tty_reopen() Dmitry Safonov
@ 2018-09-11  1:48 ` Dmitry Safonov
  2018-09-11 12:02   ` Peter Zijlstra
  2018-09-11 12:16 ` [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Mark Rutland
  6 siblings, 1 reply; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11  1:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

It seems like when ldsem_down_read() fails with timeout, it misses
update for sem->wait_readers. By that reason, when writer finally
releases write end of the semaphore __ldsem_wake_readers() does adjust
sem->count with wrong value:
sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS)

I.e, if update comes with 1 missed wait_readers decrement, sem->count
will be 0x100000001 which means that there is active reader and it'll
make any further writer to fail in acquiring the semaphore.

It looks like, this is a dead-code, because ldsem_down_read() is never
called with timeout different than MAX_SCHEDULE_TIMEOUT, so it might be
worth to delete timeout parameter and error path fall-back..

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_ldsem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 832accbbcb6d..f7966ab7b450 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -237,6 +237,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
 		raw_spin_lock_irq(&sem->wait_lock);
 		if (waiter.task) {
 			atomic_long_add_return(-LDSEM_WAIT_BIAS, &sem->count);
+			sem->wait_readers--;
 			list_del(&waiter.list);
 			raw_spin_unlock_irq(&sem->wait_lock);
 			put_task_struct(waiter.task);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  1:48 ` [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader Dmitry Safonov
@ 2018-09-11  5:04   ` Sergey Senozhatsky
  2018-09-11  5:41     ` Sergey Senozhatsky
  2018-09-11 11:43     ` Peter Zijlstra
  2018-09-11 11:40   ` Peter Zijlstra
  1 sibling, 2 replies; 24+ messages in thread
From: Sergey Senozhatsky @ 2018-09-11  5:04 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Peter Zijlstra, Paul E. McKenney

On (09/11/18 02:48), Dmitry Safonov wrote:
> There is a couple of reports about lockup in ldsem_down_read() without
> anyone holding write end of ldisc semaphore:
> lkml.kernel.org/r/<20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com>
> lkml.kernel.org/r/<20180907045041.GF1110@shao2-debian>
> 
> They all looked like a missed wake up.
> I wasn't lucky enough to reproduce it, but it seems like reader on
> another CPU can miss waiter->task update and schedule again, resulting
> in indefinite (MAX_SCHEDULE_TIMEOUT) sleep.

Certainly, something suspicious is going on.

> @@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct ld_semaphore *sem)
>  		tsk = waiter->task;
>  		smp_mb();
>  		waiter->task = NULL;
> +		/* Make sure down_read_failed() will see !waiter->task update */
> +		smp_wmb();
>  		wake_up_process(tsk);

Hmm. I think wake_up_process() executes a full memory barrier, because
it accesses task state.

>  		put_task_struct(tsk);
>  	}
> @@ -217,7 +219,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
>  	for (;;) {
>  		set_current_state(TASK_UNINTERRUPTIBLE);

I think that set_current_state() also executes memory barrier. Just
because it accesses task state.

> -		if (!waiter.task)
> +		if (!READ_ONCE(waiter.task))
>  			break;
>  		if (!timeout)
>  			break;

	-ss

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  5:04   ` Sergey Senozhatsky
@ 2018-09-11  5:41     ` Sergey Senozhatsky
  2018-09-11 11:04       ` Kirill Tkhai
  2018-09-11 11:44       ` Peter Zijlstra
  2018-09-11 11:43     ` Peter Zijlstra
  1 sibling, 2 replies; 24+ messages in thread
From: Sergey Senozhatsky @ 2018-09-11  5:41 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen, Tan Xiaojun,
	Tetsuo Handa, stable, Greg Kroah-Hartman, Jiri Slaby,
	Peter Zijlstra, Paul E. McKenney, Sergey Senozhatsky

On (09/11/18 14:04), Sergey Senozhatsky wrote:
> >  	for (;;) {
> >  		set_current_state(TASK_UNINTERRUPTIBLE);
> 
> I think that set_current_state() also executes memory barrier. Just
> because it accesses task state.
> 
> > -		if (!waiter.task)
> > +		if (!READ_ONCE(waiter.task))
> >  			break;
> >  		if (!timeout)
> >  			break;

This READ_ONCE(waiter.task) looks interesting. Maybe could be moved
to a loop condition

	while (!READ_ONCE(waiter.task)) {
		...
	}

	-ss

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  5:41     ` Sergey Senozhatsky
@ 2018-09-11 11:04       ` Kirill Tkhai
  2018-09-11 11:44       ` Peter Zijlstra
  1 sibling, 0 replies; 24+ messages in thread
From: Kirill Tkhai @ 2018-09-11 11:04 UTC (permalink / raw)
  To: Sergey Senozhatsky, Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen, Tan Xiaojun,
	Tetsuo Handa, stable, Greg Kroah-Hartman, Jiri Slaby,
	Peter Zijlstra, Paul E. McKenney

On 9/11/18 8:41 AM, Sergey Senozhatsky wrote:
> On (09/11/18 14:04), Sergey Senozhatsky wrote:
>>>  	for (;;) {
>>>  		set_current_state(TASK_UNINTERRUPTIBLE);
>>
>> I think that set_current_state() also executes memory barrier. Just
>> because it accesses task state.
>>
>>> -		if (!waiter.task)
>>> +		if (!READ_ONCE(waiter.task))
>>>  			break;
>>>  		if (!timeout)
>>>  			break;
> 
> This READ_ONCE(waiter.task) looks interesting. Maybe could be moved
> to a loop condition
> 
> 	while (!READ_ONCE(waiter.task)) {
> 		...
> 	}

We can't reorder event check and set_current_state(),
because this will lead to missing of wakeup:

	Documentation/memory-barriers.txt

Also, it looks like READ_ONCE() is not need. In case of compiler
had optimized this, then all wait_event() in kernel w/o READ_ONCE
would have not worked like expected, wouldn't they?

Kirill

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  1:48 ` [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader Dmitry Safonov
  2018-09-11  5:04   ` Sergey Senozhatsky
@ 2018-09-11 11:40   ` Peter Zijlstra
  2018-09-11 12:48     ` Dmitry Safonov
  1 sibling, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 11:40 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney

On Tue, Sep 11, 2018 at 02:48:17AM +0100, Dmitry Safonov wrote:
> There is a couple of reports about lockup in ldsem_down_read() without
> anyone holding write end of ldisc semaphore:
> lkml.kernel.org/r/<20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com>
> lkml.kernel.org/r/<20180907045041.GF1110@shao2-debian>
> 
> They all looked like a missed wake up.
> I wasn't lucky enough to reproduce it, but it seems like reader on
> another CPU can miss waiter->task update and schedule again, resulting
> in indefinite (MAX_SCHEDULE_TIMEOUT) sleep.
> 
> Make sure waked up reader will see waiter->task == NULL.

> --- a/drivers/tty/tty_ldsem.c
> +++ b/drivers/tty/tty_ldsem.c
> @@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct ld_semaphore *sem)
>  		tsk = waiter->task;
>  		smp_mb();
>  		waiter->task = NULL;
> +		/* Make sure down_read_failed() will see !waiter->task update */
> +		smp_wmb();
>  		wake_up_process(tsk);

This is 'wrong', wake_up_process() should imply sufficient for this to
already be true. 

>  		put_task_struct(tsk);
>  	}

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  5:04   ` Sergey Senozhatsky
  2018-09-11  5:41     ` Sergey Senozhatsky
@ 2018-09-11 11:43     ` Peter Zijlstra
  1 sibling, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 11:43 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Daniel Axtens,
	Dmitry Vyukov, Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen, Tan Xiaojun,
	Tetsuo Handa, stable, Greg Kroah-Hartman, Jiri Slaby,
	Paul E. McKenney

On Tue, Sep 11, 2018 at 02:04:49PM +0900, Sergey Senozhatsky wrote:
> On (09/11/18 02:48), Dmitry Safonov wrote:
> > There is a couple of reports about lockup in ldsem_down_read() without
> > anyone holding write end of ldisc semaphore:
> > lkml.kernel.org/r/<20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com>
> > lkml.kernel.org/r/<20180907045041.GF1110@shao2-debian>
> > 
> > They all looked like a missed wake up.
> > I wasn't lucky enough to reproduce it, but it seems like reader on
> > another CPU can miss waiter->task update and schedule again, resulting
> > in indefinite (MAX_SCHEDULE_TIMEOUT) sleep.
> 
> Certainly, something suspicious is going on.
> 
> > @@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct ld_semaphore *sem)
> >  		tsk = waiter->task;
> >  		smp_mb();
> >  		waiter->task = NULL;
> > +		/* Make sure down_read_failed() will see !waiter->task update */
> > +		smp_wmb();
> >  		wake_up_process(tsk);
> 
> Hmm. I think wake_up_process() executes a full memory barrier, because
> it accesses task state.
> 
> >  		put_task_struct(tsk);
> >  	}
> > @@ -217,7 +219,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
> >  	for (;;) {
> >  		set_current_state(TASK_UNINTERRUPTIBLE);
> 
> I think that set_current_state() also executes memory barrier. Just
> because it accesses task state.

In both cases, the rationale, 'because it accesses task state' is
utterly wrong.

The reasoning can be found in the comment near set_current_state().

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11  5:41     ` Sergey Senozhatsky
  2018-09-11 11:04       ` Kirill Tkhai
@ 2018-09-11 11:44       ` Peter Zijlstra
  1 sibling, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 11:44 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Dmitry Safonov, linux-kernel, Dmitry Safonov, Daniel Axtens,
	Dmitry Vyukov, Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen, Tan Xiaojun,
	Tetsuo Handa, stable, Greg Kroah-Hartman, Jiri Slaby,
	Paul E. McKenney

On Tue, Sep 11, 2018 at 02:41:29PM +0900, Sergey Senozhatsky wrote:
> On (09/11/18 14:04), Sergey Senozhatsky wrote:
> > >  	for (;;) {
> > >  		set_current_state(TASK_UNINTERRUPTIBLE);
> > 
> > I think that set_current_state() also executes memory barrier. Just
> > because it accesses task state.
> > 
> > > -		if (!waiter.task)
> > > +		if (!READ_ONCE(waiter.task))
> > >  			break;
> > >  		if (!timeout)
> > >  			break;
> 
> This READ_ONCE(waiter.task) looks interesting. Maybe could be moved
> to a loop condition
> 
> 	while (!READ_ONCE(waiter.task)) {
> 		...
> 	}

No, it must be after set_current_state(). See that same comment.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts
  2018-09-11  1:48 ` [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
@ 2018-09-11 11:59   ` Peter Zijlstra
  2018-09-11 12:01   ` Peter Zijlstra
  1 sibling, 0 replies; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 11:59 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

Subject: tty/ldsem: Convert to regular lockdep annotations

For some reason ldsem has its own lockdep wrappers, make them go away.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 drivers/tty/tty_ldsem.c | 51 ++++++++++++++-----------------------------------
 1 file changed, 14 insertions(+), 37 deletions(-)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 0c98d88f795a..60beada5be6c 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -34,29 +34,6 @@
 #include <linux/sched/task.h>
 
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __acq(l, s, t, r, c, n, i)		\
-				lock_acquire(&(l)->dep_map, s, t, r, c, n, i)
-# define __rel(l, n, i)				\
-				lock_release(&(l)->dep_map, n, i)
-#define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 1, NULL, i)
-#define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 1, n, i)
-#define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 1, NULL, i)
-#define lockdep_release(l, n, i)		__rel(l, n, i)
-#else
-# define lockdep_acquire(l, s, t, i)		do { } while (0)
-# define lockdep_acquire_nest(l, s, t, n, i)	do { } while (0)
-# define lockdep_acquire_read(l, s, t, i)	do { } while (0)
-# define lockdep_release(l, n, i)		do { } while (0)
-#endif
-
-#ifdef CONFIG_LOCK_STAT
-# define lock_stat(_lock, stat)		lock_##stat(&(_lock)->dep_map, _RET_IP_)
-#else
-# define lock_stat(_lock, stat)		do { } while (0)
-#endif
-
-
 #if BITS_PER_LONG == 64
 # define LDSEM_ACTIVE_MASK	0xffffffffL
 #else
@@ -310,17 +287,17 @@ static int __ldsem_down_read_nested(struct ld_semaphore *sem,
 {
 	long count;
 
-	lockdep_acquire_read(sem, subclass, 0, _RET_IP_);
+	rwsem_acquire_read(&sem->dep_map, subclass, 0, _RET_IP_);
 
 	count = atomic_long_add_return(LDSEM_READ_BIAS, &sem->count);
 	if (count <= 0) {
-		lock_stat(sem, contended);
+		lock_contended(&sem->dep_map, _RET_IP_);
 		if (!down_read_failed(sem, count, timeout)) {
-			lockdep_release(sem, 1, _RET_IP_);
+			rwsem_release(&sem->dep_map, 1, _RET_IP_);
 			return 0;
 		}
 	}
-	lock_stat(sem, acquired);
+	lock_acquired(&sem->dep_map, _RET_IP_);
 	return 1;
 }
 
@@ -329,17 +306,17 @@ static int __ldsem_down_write_nested(struct ld_semaphore *sem,
 {
 	long count;
 
-	lockdep_acquire(sem, subclass, 0, _RET_IP_);
+	rwsem_acquire(&sem->dep_map, subclass, 0, _RET_IP_);
 
 	count = atomic_long_add_return(LDSEM_WRITE_BIAS, &sem->count);
 	if ((count & LDSEM_ACTIVE_MASK) != LDSEM_ACTIVE_BIAS) {
-		lock_stat(sem, contended);
+		lock_contended(&sem->dep_map, _RET_IP_);
 		if (!down_write_failed(sem, count, timeout)) {
-			lockdep_release(sem, 1, _RET_IP_);
+			rwsem_release(&sem->dep_map, 1, _RET_IP_);
 			return 0;
 		}
 	}
-	lock_stat(sem, acquired);
+	lock_acquired(&sem->dep_map, _RET_IP_);
 	return 1;
 }
 
@@ -362,8 +339,8 @@ int ldsem_down_read_trylock(struct ld_semaphore *sem)
 
 	while (count >= 0) {
 		if (atomic_long_try_cmpxchg(&sem->count, &count, count + LDSEM_READ_BIAS)) {
-			lockdep_acquire_read(sem, 0, 1, _RET_IP_);
-			lock_stat(sem, acquired);
+			rwsem_acquire_read(&sem->dep_map, 0, 1, _RET_IP_);
+			lock_acquired(&sem->dep_map, _RET_IP_);
 			return 1;
 		}
 	}
@@ -388,8 +365,8 @@ int ldsem_down_write_trylock(struct ld_semaphore *sem)
 
 	while ((count & LDSEM_ACTIVE_MASK) == 0) {
 		if (atomic_long_try_cmpxchg(&sem->count, &count, count + LDSEM_WRITE_BIAS)) {
-			lockdep_acquire(sem, 0, 1, _RET_IP_);
-			lock_stat(sem, acquired);
+			rwsem_acquire(&sem->dep_map, 0, 1, _RET_IP_);
+			lock_acquired(&sem->dep_map, _RET_IP_);
 			return 1;
 		}
 	}
@@ -403,7 +380,7 @@ void ldsem_up_read(struct ld_semaphore *sem)
 {
 	long count;
 
-	lockdep_release(sem, 1, _RET_IP_);
+	rwsem_release(&sem->dep_map, 1, _RET_IP_);
 
 	count = atomic_long_add_return(-LDSEM_READ_BIAS, &sem->count);
 	if (count < 0 && (count & LDSEM_ACTIVE_MASK) == 0)
@@ -417,7 +394,7 @@ void ldsem_up_write(struct ld_semaphore *sem)
 {
 	long count;
 
-	lockdep_release(sem, 1, _RET_IP_);
+	rwsem_release(&sem->dep_map, 1, _RET_IP_);
 
 	count = atomic_long_add_return(-LDSEM_WRITE_BIAS, &sem->count);
 	if (count < 0)

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts
  2018-09-11  1:48 ` [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
  2018-09-11 11:59   ` Peter Zijlstra
@ 2018-09-11 12:01   ` Peter Zijlstra
  2018-09-11 12:53     ` Dmitry Safonov
  1 sibling, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 12:01 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, Sep 11, 2018 at 02:48:19AM +0100, Dmitry Safonov wrote:
> Make sure under CONFIG_LOCKDEP that each change to line discipline
> is done with held write semaphor.

But you don't do that. You just assert it is held, not for writing.

> Otherwise potential reader will have a good time dereferencing
> incomplete/uninitialized ldisc.
> 
> Exception here is tty_ldisc_open(), as it's called without ldisc_sem
> locked by tty_init_dev() for the tty->link.

You fail to explain how that is not broken...

> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jiri Slaby <jslaby@suse.com>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  drivers/tty/tty_ldisc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> index fc4c97cae01e..202cb645582f 100644
> --- a/drivers/tty/tty_ldisc.c
> +++ b/drivers/tty/tty_ldisc.c
> @@ -471,6 +471,7 @@ static int tty_ldisc_open(struct tty_struct *tty, struct tty_ldisc *ld)
>  
>  static void tty_ldisc_close(struct tty_struct *tty, struct tty_ldisc *ld)
>  {
> +	lockdep_assert_held(&tty->ldisc_sem);

Did you want:

	lockdep_assert_held_exclusive(&tty->ldisc_sem);

?

>  	WARN_ON(!test_bit(TTY_LDISC_OPEN, &tty->flags));
>  	clear_bit(TTY_LDISC_OPEN, &tty->flags);
>  	if (ld->ops->close)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read()
  2018-09-11  1:48 ` [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read() Dmitry Safonov
@ 2018-09-11 12:02   ` Peter Zijlstra
  2018-09-11 13:01     ` Dmitry Safonov
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 12:02 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, Sep 11, 2018 at 02:48:21AM +0100, Dmitry Safonov wrote:
> It seems like when ldsem_down_read() fails with timeout, it misses
> update for sem->wait_readers. By that reason, when writer finally
> releases write end of the semaphore __ldsem_wake_readers() does adjust
> sem->count with wrong value:
> sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS)
> 
> I.e, if update comes with 1 missed wait_readers decrement, sem->count
> will be 0x100000001 which means that there is active reader and it'll
> make any further writer to fail in acquiring the semaphore.
> 
> It looks like, this is a dead-code, because ldsem_down_read() is never
> called with timeout different than MAX_SCHEDULE_TIMEOUT, so it might be
> worth to delete timeout parameter and error path fall-back..

You might want to think about ditching that ldsem thing entirely, and
use a regular rwsem ?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen()
  2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
                   ` (5 preceding siblings ...)
  2018-09-11  1:48 ` [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read() Dmitry Safonov
@ 2018-09-11 12:16 ` Mark Rutland
  2018-09-11 12:42   ` Dmitry Safonov
  6 siblings, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2018-09-11 12:16 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Jiri Slaby, Peter Zijlstra,
	Paul E. McKenney, syzbot+3aa9784721dfb90e984d

On Tue, Sep 11, 2018 at 02:48:15AM +0100, Dmitry Safonov wrote:
> Hi all,

Hi,

> Three fixes that worth to have in the @stable, as we've hit them on v4.9
> stable.
> 
> And for linux-next - adding lockdep asserts for line discipline changing
> code, verifying that write ldisc sem will be held forthwith.
> 
> The last patch is optional and probably, timeout can be dropped for
> read_lock(). I'll do it if everyone agrees.
> 
> Rong Chen, could you kindly re-run this version to see if the lockup
> from v1 still happens? I wasn't able to reproduce it..

These patches seem to fix issues I've been seeing on arm64 for a while
but hadn't managed to track down.

For patches 1, 3, and 5, feel free to add:

Tested-by: Mark Rutland <mark.rutland@arm.com>

On vanilla v4.19-rc2, the below reproducer would fire in seconds,
whereas with those patches applied, I have not seen issues after 10s of
minutes of testing.

Thanks,
Mark.

Syzkaller hit 'KASAN: user-memory-access Write in n_tty_set_termios' bug.

IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
ipV6: ADDRCONF(NETDEV_UP): veth1: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
==================================================================
BUG: KASAN: user-memory-access in memset include/linux/string.h:330 [inline]
BUG: KASAN: user-memory-access in bitmap_zero include/linux/bitmap.h:216 [inline]
BUG: KASAN: user-memory-access in n_tty_set_termios+0xe4/0xd08 drivers/tty/n_tty.c:1784
Write of size 512 at addr 0000000000001060 by task syz-executor0/3007

CPU: 1 PID: 3007 Comm: syz-executor0 Not tainted 4.19.0-rc2-dirty #4
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x340 arch/arm64/include/asm/ptrace.h:270
 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:152
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xec/0x150 lib/dump_stack.c:113
 kasan_report_error mm/kasan/report.c:352 [inline]
 kasan_report+0x228/0x360 mm/kasan/report.c:412
 check_memory_region_inline mm/kasan/kasan.c:253 [inline]
 check_memory_region+0x114/0x1c8 mm/kasan/kasan.c:267
 memset+0x2c/0x50 mm/kasan/kasan.c:285
 memset include/linux/string.h:330 [inline]
 bitmap_zero include/linux/bitmap.h:216 [inline]
 n_tty_set_termios+0xe4/0xd08 drivers/tty/n_tty.c:1784
 tty_set_termios+0x538/0x760 drivers/tty/tty_ioctl.c:341
 set_termios+0x348/0x968 drivers/tty/tty_ioctl.c:414
 tty_mode_ioctl+0x8f0/0xc60 drivers/tty/tty_ioctl.c:779
 n_tty_ioctl_helper+0x6c/0x390 drivers/tty/tty_ioctl.c:940
 n_tty_ioctl+0x6c/0x490 drivers/tty/n_tty.c:2450
 tty_ioctl+0x610/0x19a8 drivers/tty/tty_io.c:2655
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:501 [inline]
 do_vfs_ioctl+0x1bc/0x1618 fs/ioctl.c:685
 ksys_ioctl+0xbc/0x108 fs/ioctl.c:702
 __do_sys_ioctl fs/ioctl.c:709 [inline]
 __se_sys_ioctl fs/ioctl.c:707 [inline]
 __arm64_sys_ioctl+0x6c/0xa0 fs/ioctl.c:707
 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
 el0_svc_common+0x150/0x288 arch/arm64/kernel/syscall.c:84
 el0_svc_handler+0x54/0xf0 arch/arm64/kernel/syscall.c:130
 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:917
==================================================================


Syzkaller reproducer:
# {Threaded:true Collide:true Repeat:true RepeatTimes:0 Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true UseTmpDir:true EnableCgroups:true EnableNetdev:true ResetNet:true HandleSegv:true Repro:false Trace:false}
r0 = openat$ptmx(0xffffffffffffff9c, &(0x7f0000000000)='/dev/ptmx\x00', 0x0, 0x0)
ioctl$TIOCGPTPEER(r0, 0x40045431, 0x6e0000)
r1 = syz_open_pts(r0, 0x0)
ioctl$TCXONC(r1, 0x5437, 0x0)
ioctl$TIOCGSOFTCAR(r0, 0x5419, &(0x7f00000000c0))
r2 = semget(0x0, 0x1, 0x1a)
semctl$IPC_INFO(r2, 0x0, 0x3, &(0x7f0000000100)=""/166)
syz_open_pts(r0, 0x2)
ioctl$TCSETAW(r0, 0x5407, &(0x7f0000000080))


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen()
  2018-09-11 12:16 ` [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Mark Rutland
@ 2018-09-11 12:42   ` Dmitry Safonov
  0 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11 12:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Jiri Slaby, Peter Zijlstra,
	Paul E. McKenney, syzbot+3aa9784721dfb90e984d

On Tue, 2018-09-11 at 13:16 +0100, Mark Rutland wrote:
> On Tue, Sep 11, 2018 at 02:48:15AM +0100, Dmitry Safonov wrote:
> > Hi all,
> 
> Hi,
> 
> > Three fixes that worth to have in the @stable, as we've hit them on
> > v4.9
> > stable.
> > 
> > And for linux-next - adding lockdep asserts for line discipline
> > changing
> > code, verifying that write ldisc sem will be held forthwith.
> > 
> > The last patch is optional and probably, timeout can be dropped for
> > read_lock(). I'll do it if everyone agrees.
> > 
> > Rong Chen, could you kindly re-run this version to see if the
> > lockup
> > from v1 still happens? I wasn't able to reproduce it..
> 
> These patches seem to fix issues I've been seeing on arm64 for a
> while
> but hadn't managed to track down.
> 
> For patches 1, 3, and 5, feel free to add:
> 
> Tested-by: Mark Rutland <mark.rutland@arm.com>

Thanks, Mark!
Will add on the next version.

> 
> On vanilla v4.19-rc2, the below reproducer would fire in seconds,
> whereas with those patches applied, I have not seen issues after 10s
> of
> minutes of testing.
> 
> Thanks,
> Mark.
> 
> Syzkaller hit 'KASAN: user-memory-access Write in n_tty_set_termios'
> bug.
> 
> IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
> ipV6: ADDRCONF(NETDEV_UP): veth1: link is not ready
> IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
> ==================================================================
> BUG: KASAN: user-memory-access in memset include/linux/string.h:330
> [inline]
> BUG: KASAN: user-memory-access in bitmap_zero
> include/linux/bitmap.h:216 [inline]
> BUG: KASAN: user-memory-access in n_tty_set_termios+0xe4/0xd08
> drivers/tty/n_tty.c:1784
> Write of size 512 at addr 0000000000001060 by task syz-executor0/3007
> 
> CPU: 1 PID: 3007 Comm: syz-executor0 Not tainted 4.19.0-rc2-dirty #4
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  dump_backtrace+0x0/0x340 arch/arm64/include/asm/ptrace.h:270
>  show_stack+0x20/0x30 arch/arm64/kernel/traps.c:152
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0xec/0x150 lib/dump_stack.c:113
>  kasan_report_error mm/kasan/report.c:352 [inline]
>  kasan_report+0x228/0x360 mm/kasan/report.c:412
>  check_memory_region_inline mm/kasan/kasan.c:253 [inline]
>  check_memory_region+0x114/0x1c8 mm/kasan/kasan.c:267
>  memset+0x2c/0x50 mm/kasan/kasan.c:285
>  memset include/linux/string.h:330 [inline]
>  bitmap_zero include/linux/bitmap.h:216 [inline]
>  n_tty_set_termios+0xe4/0xd08 drivers/tty/n_tty.c:1784
>  tty_set_termios+0x538/0x760 drivers/tty/tty_ioctl.c:341
>  set_termios+0x348/0x968 drivers/tty/tty_ioctl.c:414
>  tty_mode_ioctl+0x8f0/0xc60 drivers/tty/tty_ioctl.c:779
>  n_tty_ioctl_helper+0x6c/0x390 drivers/tty/tty_ioctl.c:940
>  n_tty_ioctl+0x6c/0x490 drivers/tty/n_tty.c:2450
>  tty_ioctl+0x610/0x19a8 drivers/tty/tty_io.c:2655
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  file_ioctl fs/ioctl.c:501 [inline]
>  do_vfs_ioctl+0x1bc/0x1618 fs/ioctl.c:685
>  ksys_ioctl+0xbc/0x108 fs/ioctl.c:702
>  __do_sys_ioctl fs/ioctl.c:709 [inline]
>  __se_sys_ioctl fs/ioctl.c:707 [inline]
>  __arm64_sys_ioctl+0x6c/0xa0 fs/ioctl.c:707
>  __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
>  invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
>  el0_svc_common+0x150/0x288 arch/arm64/kernel/syscall.c:84
>  el0_svc_handler+0x54/0xf0 arch/arm64/kernel/syscall.c:130
>  el0_svc+0x8/0xc arch/arm64/kernel/entry.S:917
> ==================================================================
> 
> 
> Syzkaller reproducer:
> # {Threaded:true Collide:true Repeat:true RepeatTimes:0 Procs:1
> Sandbox:none Fault:false FaultCall:-1 FaultNth:0 EnableTun:true
> UseTmpDir:true EnableCgroups:true EnableNetdev:true ResetNet:true
> HandleSegv:true Repro:false Trace:false}
> r0 = openat$ptmx(0xffffffffffffff9c,
> &(0x7f0000000000)='/dev/ptmx\x00', 0x0, 0x0)
> ioctl$TIOCGPTPEER(r0, 0x40045431, 0x6e0000)
> r1 = syz_open_pts(r0, 0x0)
> ioctl$TCXONC(r1, 0x5437, 0x0)
> ioctl$TIOCGSOFTCAR(r0, 0x5419, &(0x7f00000000c0))
> r2 = semget(0x0, 0x1, 0x1a)
> semctl$IPC_INFO(r2, 0x0, 0x3, &(0x7f0000000100)=""/166)
> syz_open_pts(r0, 0x2)
> ioctl$TCSETAW(r0, 0x5407, &(0x7f0000000080))
> 

-- 
Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader
  2018-09-11 11:40   ` Peter Zijlstra
@ 2018-09-11 12:48     ` Dmitry Safonov
  0 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11 12:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa, stable,
	Greg Kroah-Hartman, Jiri Slaby, Paul E. McKenney

On Tue, 2018-09-11 at 13:40 +0200, Peter Zijlstra wrote:
> On Tue, Sep 11, 2018 at 02:48:17AM +0100, Dmitry Safonov wrote:
> > There is a couple of reports about lockup in ldsem_down_read()
> > without
> > anyone holding write end of ldisc semaphore:
> > lkml.kernel.org/r/<20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.int
> > el.com>
> > lkml.kernel.org/r/<20180907045041.GF1110@shao2-debian>
> > 
> > They all looked like a missed wake up.
> > I wasn't lucky enough to reproduce it, but it seems like reader on
> > another CPU can miss waiter->task update and schedule again,
> > resulting
> > in indefinite (MAX_SCHEDULE_TIMEOUT) sleep.
> > 
> > Make sure waked up reader will see waiter->task == NULL.
> > --- a/drivers/tty/tty_ldsem.c
> > +++ b/drivers/tty/tty_ldsem.c
> > @@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct
> > ld_semaphore *sem)
> >  		tsk = waiter->task;
> >  		smp_mb();
> >  		waiter->task = NULL;
> > +		/* Make sure down_read_failed() will see !waiter-
> > >task update */
> > +		smp_wmb();
> >  		wake_up_process(tsk);
> 
> This is 'wrong', wake_up_process() should imply sufficient for this
> to
> already be true. 

Yeah, thanks.
It was stupid of me not to check that..
Saw the smoke that would describe the reports and made too long-going
conjectures. Need more covfefe and staring into that code.

> 
> >  		put_task_struct(tsk);
> >  	}

-- 
Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts
  2018-09-11 12:01   ` Peter Zijlstra
@ 2018-09-11 12:53     ` Dmitry Safonov
  0 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11 12:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, 2018-09-11 at 14:01 +0200, Peter Zijlstra wrote:
> On Tue, Sep 11, 2018 at 02:48:19AM +0100, Dmitry Safonov wrote:
> > Make sure under CONFIG_LOCKDEP that each change to line discipline
> > is done with held write semaphor.
> 
> But you don't do that. You just assert it is held, not for writing.
> 
> > Otherwise potential reader will have a good time dereferencing
> > incomplete/uninitialized ldisc.
> > 
> > Exception here is tty_ldisc_open(), as it's called without
> > ldisc_sem
> > locked by tty_init_dev() for the tty->link.
> 
> You fail to explain how that is not broken...

I'll add the explanation to the next version.

> 
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Jiri Slaby <jslaby@suse.com>
> > Signed-off-by: Dmitry Safonov <dima@arista.com>
> > ---
> >  drivers/tty/tty_ldisc.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> > index fc4c97cae01e..202cb645582f 100644
> > --- a/drivers/tty/tty_ldisc.c
> > +++ b/drivers/tty/tty_ldisc.c
> > @@ -471,6 +471,7 @@ static int tty_ldisc_open(struct tty_struct
> > *tty, struct tty_ldisc *ld)
> >  
> >  static void tty_ldisc_close(struct tty_struct *tty, struct
> > tty_ldisc *ld)
> >  {
> > +	lockdep_assert_held(&tty->ldisc_sem);
> 
> Did you want:
> 
> 	lockdep_assert_held_exclusive(&tty->ldisc_sem);
> 
> ?

Oh, yes, thanks.

> 
> >  	WARN_ON(!test_bit(TTY_LDISC_OPEN, &tty->flags));
> >  	clear_bit(TTY_LDISC_OPEN, &tty->flags);
> >  	if (ld->ops->close)

-- 
Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read()
  2018-09-11 12:02   ` Peter Zijlstra
@ 2018-09-11 13:01     ` Dmitry Safonov
  2018-09-11 13:33       ` Dmitry Safonov
  0 siblings, 1 reply; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11 13:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, 2018-09-11 at 14:02 +0200, Peter Zijlstra wrote:
> On Tue, Sep 11, 2018 at 02:48:21AM +0100, Dmitry Safonov wrote:
> > It seems like when ldsem_down_read() fails with timeout, it misses
> > update for sem->wait_readers. By that reason, when writer finally
> > releases write end of the semaphore __ldsem_wake_readers() does
> > adjust
> > sem->count with wrong value:
> > sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS)
> > 
> > I.e, if update comes with 1 missed wait_readers decrement, sem-
> > >count
> > will be 0x100000001 which means that there is active reader and
> > it'll
> > make any further writer to fail in acquiring the semaphore.
> > 
> > It looks like, this is a dead-code, because ldsem_down_read() is
> > never
> > called with timeout different than MAX_SCHEDULE_TIMEOUT, so it
> > might be
> > worth to delete timeout parameter and error path fall-back..
> 
> You might want to think about ditching that ldsem thing entirely, and
> use a regular rwsem ?

Yeah, but AFAICS, regular rwsem will need to have a timeout then (for
write). So, I thought fixing this pile would be simpler than adding
timeout and probably writer-priority to generic rwsem?

And I guess, we still will need fixes for stable for the bugs here..

I expect that timeouts are ABI, while the gain of adding priority may
be measured. I'll give it a shot (adding timeout/priority for linux-
next) to rwsem if you say it's acceptable.

-- 
Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read()
  2018-09-11 13:01     ` Dmitry Safonov
@ 2018-09-11 13:33       ` Dmitry Safonov
  2018-09-11 13:50         ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11 13:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, 2018-09-11 at 14:01 +0100, Dmitry Safonov wrote:
> On Tue, 2018-09-11 at 14:02 +0200, Peter Zijlstra wrote:
> > On Tue, Sep 11, 2018 at 02:48:21AM +0100, Dmitry Safonov wrote:
> > > It seems like when ldsem_down_read() fails with timeout, it
> > > misses
> > > update for sem->wait_readers. By that reason, when writer finally
> > > releases write end of the semaphore __ldsem_wake_readers() does
> > > adjust
> > > sem->count with wrong value:
> > > sem->wait_readers * (LDSEM_ACTIVE_BIAS - LDSEM_WAIT_BIAS)
> > > 
> > > I.e, if update comes with 1 missed wait_readers decrement, sem-
> > > > count
> > > 
> > > will be 0x100000001 which means that there is active reader and
> > > it'll
> > > make any further writer to fail in acquiring the semaphore.
> > > 
> > > It looks like, this is a dead-code, because ldsem_down_read() is
> > > never
> > > called with timeout different than MAX_SCHEDULE_TIMEOUT, so it
> > > might be
> > > worth to delete timeout parameter and error path fall-back..
> > 
> > You might want to think about ditching that ldsem thing entirely,
> > and
> > use a regular rwsem ?
> 
> Yeah, but AFAICS, regular rwsem will need to have a timeout then (for
> write). So, I thought fixing this pile would be simpler than adding
> timeout and probably writer-priority to generic rwsem?
> 
> And I guess, we still will need fixes for stable for the bugs here..
> 
> I expect that timeouts are ABI, while the gain of adding priority may
> be measured. I'll give it a shot (adding timeout/priority for linux-
> next) to rwsem if you say it's acceptable.

Actually, priority looks quite simple: we can add writers in the head
of wait_list and it probably may work.
Timeout looks also not a rocket science.
So, I can try to do that if you say it's acceptable (with the gain
measures).

After this can of worms that I need to fix regardless.

-- 
Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read()
  2018-09-11 13:33       ` Dmitry Safonov
@ 2018-09-11 13:50         ` Peter Zijlstra
  2018-09-11 15:04           ` Dmitry Safonov
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2018-09-11 13:50 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, Sep 11, 2018 at 02:33:22PM +0100, Dmitry Safonov wrote:
> > > You might want to think about ditching that ldsem thing entirely,
> > > and use a regular rwsem ?
> > 
> > Yeah, but AFAICS, regular rwsem will need to have a timeout then (for
> > write). So, I thought fixing this pile would be simpler than adding
> > timeout and probably writer-priority to generic rwsem?
> > 
> > And I guess, we still will need fixes for stable for the bugs here..
> > 
> > I expect that timeouts are ABI, while the gain of adding priority may
> > be measured. I'll give it a shot (adding timeout/priority for linux-
> > next) to rwsem if you say it's acceptable.
> 
> Actually, priority looks quite simple: we can add writers in the head
> of wait_list and it probably may work.
> Timeout looks also not a rocket science.
> So, I can try to do that if you say it's acceptable (with the gain
> measures).

So why do you need writer priority? The comment that goes with ldsems
doesn't explain I think, it just says it has it.

In general I dislike unfair locks, they always cause trouble.

> After this can of worms that I need to fix regardless.

Sure.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read()
  2018-09-11 13:50         ` Peter Zijlstra
@ 2018-09-11 15:04           ` Dmitry Safonov
  0 siblings, 0 replies; 24+ messages in thread
From: Dmitry Safonov @ 2018-09-11 15:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Dmitry Safonov, Daniel Axtens, Dmitry Vyukov,
	Michael Neuling, Mikulas Patocka, Nathan March,
	Pasi Kärkkäinen, Peter Hurley, Rong, Chen,
	Sergey Senozhatsky, Tan Xiaojun, Tetsuo Handa,
	Greg Kroah-Hartman, Jiri Slaby

On Tue, 2018-09-11 at 15:50 +0200, Peter Zijlstra wrote:
> On Tue, Sep 11, 2018 at 02:33:22PM +0100, Dmitry Safonov wrote:
> > > > You might want to think about ditching that ldsem thing
> > > > entirely,
> > > > and use a regular rwsem ?
> > > 
> > > Yeah, but AFAICS, regular rwsem will need to have a timeout then
> > > (for
> > > write). So, I thought fixing this pile would be simpler than
> > > adding
> > > timeout and probably writer-priority to generic rwsem?
> > > 
> > > And I guess, we still will need fixes for stable for the bugs
> > > here..
> > > 
> > > I expect that timeouts are ABI, while the gain of adding priority
> > > may
> > > be measured. I'll give it a shot (adding timeout/priority for
> > > linux-
> > > next) to rwsem if you say it's acceptable.
> > 
> > Actually, priority looks quite simple: we can add writers in the
> > head
> > of wait_list and it probably may work.
> > Timeout looks also not a rocket science.
> > So, I can try to do that if you say it's acceptable (with the gain
> > measures).
> 
> So why do you need writer priority? The comment that goes with ldsems
> doesn't explain I think, it just says it has it.

Well, as far as I can fetch from the  commit 4898e640caf0, it describes
that you should halt and scrap pending i/o (reader side) to prevent the
loss or change of the current line dicipline (write lock).
So, AFAIU, line discipline is expected to change within 5 sec by ABI
and write-priority makes it more likely.

> In general I dislike unfair locks, they always cause trouble.
> 
> > After this can of worms that I need to fix regardless.
> 
> Sure.

-- 
Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-09-11 15:04 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-11  1:48 [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Dmitry Safonov
2018-09-11  1:48 ` [PATCHv3 1/6] tty: Drop tty->count on tty_reopen() failure Dmitry Safonov
2018-09-11  1:48 ` [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader Dmitry Safonov
2018-09-11  5:04   ` Sergey Senozhatsky
2018-09-11  5:41     ` Sergey Senozhatsky
2018-09-11 11:04       ` Kirill Tkhai
2018-09-11 11:44       ` Peter Zijlstra
2018-09-11 11:43     ` Peter Zijlstra
2018-09-11 11:40   ` Peter Zijlstra
2018-09-11 12:48     ` Dmitry Safonov
2018-09-11  1:48 ` [PATCHv3 3/6] tty: Hold tty_ldisc_lock() during tty_reopen() Dmitry Safonov
2018-09-11  1:48 ` [PATCHv3 4/6] tty/lockdep: Add ldisc_sem asserts Dmitry Safonov
2018-09-11 11:59   ` Peter Zijlstra
2018-09-11 12:01   ` Peter Zijlstra
2018-09-11 12:53     ` Dmitry Safonov
2018-09-11  1:48 ` [PATCHv3 5/6] tty: Simplify tty->count math in tty_reopen() Dmitry Safonov
2018-09-11  1:48 ` [PATCHv3 6/6] tty/ldsem: Decrement wait_readers on timeouted down_read() Dmitry Safonov
2018-09-11 12:02   ` Peter Zijlstra
2018-09-11 13:01     ` Dmitry Safonov
2018-09-11 13:33       ` Dmitry Safonov
2018-09-11 13:50         ` Peter Zijlstra
2018-09-11 15:04           ` Dmitry Safonov
2018-09-11 12:16 ` [PATCHv3 0/6] tty: Hold write ldisc sem in tty_reopen() Mark Rutland
2018-09-11 12:42   ` Dmitry Safonov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).