All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Hurley <peter@hurleysoftware.com>,
	Jiri Slaby <jirislaby@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled
Date: Fri, 3 Dec 2021 21:32:21 +0900	[thread overview]
Message-ID: <5d055fbd-e94a-fe54-d3e0-982dc455ed1a@i-love.sakura.ne.jp> (raw)
In-Reply-To: <3017492.JFOoIcAZ2s@localhost.localdomain>

On 2021/12/03 20:00, Fabio M. De Francesco wrote:
> On Thursday, December 2, 2021 7:35:16 PM CET Linus Torvalds wrote:
>> On Thu, Dec 2, 2021 at 7:41 AM Tetsuo Handa
>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>
>>>> Looking at the backtrace, I see
>>>>
>>>>    n_hdlc_send_frames+0x24b/0x490 drivers/tty/n_hdlc.c:290
>>>>    tty_wakeup+0xe1/0x120 drivers/tty/tty_io.c:534
>>>>    __start_tty drivers/tty/tty_io.c:806 [inline]
>>>>    __start_tty+0xfb/0x130 drivers/tty/tty_io.c:799
>>>>
>>>> and apparently it's that hdlc line discipline (and
>>>> n_hdlc_send_frames() in particular) that is the problem here.
>>>>
>>>> I think that's where the fix should be.
>>>
>>> Do you mean that we should change the behavior of n_hdlc_send_frames()
>>> rather than trying to make __start_tty() schedulable again?
>>
>> I wouldn't change n_hdlc_send_frames() itself. It does what it says it does.
>>
>> But n_hdlc_tty_wakeup() probably shouldn't call it directly. Other tty
>> line disciplines don't do that kind of thing - although I only looked
>> at a couple. They all seem to just set bits and prepare things. Like a
>> wakeup function should do.
>>
>> So I think n_hdlc_tty_wakeup() should perhaps only do a
>> "schedule_work()" or similar to get that n_hdlc_send_frames() started,
>> rather than doing it itself.
>>
>> Example: net/nfc/nci/uart.c. It does that
>>
>>         schedule_work(&nu->write_work);
>>
>> instead of actually trying to do a write from a wakeup routine
>> (similar examples in ppp - "tasklet_schedule(&ap->tsk)" etc).
>>
>> I mean, it's called "wakeup", not "write". So I think the fundamental
>> confusion here is in hdlc, not the tty layer.
>>
>>               Linus
>>

OK.

> This is what I understand from the above argument: do a schedule_work() to get 
> that n_hdlc_send_frames() started; in this way, n_hdlc_tty_wakeup() can
> return to the caller and n_hdlc_send_frames() is executed asynchronously 
> (i.e., no longer in an atomic context). 

Yes. If we copy how net/nfc/nci/uart.c does, the fix would look like:

--------------------
 drivers/tty/n_hdlc.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index 7e0884ecc74f..a71fcac60925 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -140,6 +140,8 @@ struct n_hdlc {
 	struct n_hdlc_buf_list	rx_buf_list;
 	struct n_hdlc_buf_list	tx_free_buf_list;
 	struct n_hdlc_buf_list	rx_free_buf_list;
+	struct work_struct	write_work;
+	struct tty_struct	*tty_for_write_work;
 };
 
 /*
@@ -210,6 +212,8 @@ static void n_hdlc_tty_close(struct tty_struct *tty)
 	wake_up_interruptible(&tty->read_wait);
 	wake_up_interruptible(&tty->write_wait);
 
+	cancel_work_sync(&n_hdlc->write_work);
+
 	n_hdlc_free_buf_list(&n_hdlc->rx_free_buf_list);
 	n_hdlc_free_buf_list(&n_hdlc->tx_free_buf_list);
 	n_hdlc_free_buf_list(&n_hdlc->rx_buf_list);
@@ -334,6 +338,20 @@ static void n_hdlc_send_frames(struct n_hdlc *n_hdlc, struct tty_struct *tty)
 		goto check_again;
 }	/* end of n_hdlc_send_frames() */
 
+/**
+ * n_hdlc_tty_write_work - Asynchronous callback for transmit wakeup
+ * @work: pointer to work_struct
+ *
+ * Called when low level device driver can accept more send data.
+ */
+static void n_hdlc_tty_write_work(struct work_struct *work)
+{
+	struct n_hdlc *n_hdlc = container_of(work, struct n_hdlc, write_work);
+	struct tty_struct *tty = n_hdlc->tty_for_write_work;
+
+	n_hdlc_send_frames(n_hdlc, tty);
+}	/* end of n_hdlc_tty_write_work() */
+
 /**
  * n_hdlc_tty_wakeup - Callback for transmit wakeup
  * @tty: pointer to associated tty instance data
@@ -344,7 +362,8 @@ static void n_hdlc_tty_wakeup(struct tty_struct *tty)
 {
 	struct n_hdlc *n_hdlc = tty->disc_data;
 
-	n_hdlc_send_frames(n_hdlc, tty);
+	n_hdlc->tty_for_write_work = tty;
+	schedule_work(&n_hdlc->write_work);
 }	/* end of n_hdlc_tty_wakeup() */
 
 /**
@@ -706,6 +725,7 @@ static struct n_hdlc *n_hdlc_alloc(void)
 	if (!n_hdlc)
 		return NULL;
 
+	INIT_WORK(&n_hdlc->write_work, n_hdlc_tty_write_work);
 	spin_lock_init(&n_hdlc->rx_free_buf_list.spinlock);
 	spin_lock_init(&n_hdlc->tx_free_buf_list.spinlock);
 	spin_lock_init(&n_hdlc->rx_buf_list.spinlock);
--------------------

> 
> I hope that I'm not missing something. If the above summary is correct, 
> please forgive a newbie for the following questions... 
> 
> Commit f9e053dcfc02 ("tty: Serialize tty flow control changes with flow_lock") 
> has introduced spinlocks to serialize flow control changes and avoid the 
> concurrent executions of __start_tty() and __stop_tty().
> 
> This is an excerpt from the above-mentioned commit:
> 
> ->
>     Introduce tty->flow_lock spinlock to serialize tty flow control changes.
>     Split out unlocked __start_tty()/__stop_tty() flavors for use by
>     ioctl(TCXONC) in follow-on patch.
> <-
> 
> This is the reason why we are dealing with this bug. Currently we have __start_tty() 
> called with an acquired spinlock and IRQs disabled and the calls chain leads to 
> console_lock() while in atomic context.

If we hit a race window described in that commit

    CPU 0                          | CPU 1
    stop_tty()                     |
      lock ctrl_lock               |
      tty->stopped = 1             |
      unlock ctrl_lock             |
                                   | start_tty()
                                   |   lock ctrl_lock
                                   |   tty->stopped = 0
                                   |   unlock ctrl_lock
                                   |   driver->start()
      driver->stop()               |

    In this case, the flow control state now indicates the tty has
    been started, but the actual hardware state has actually been stopped.

, the tty->stopped flag remains 0 despite driver->stop() is called after
driver->start() finished. tty->stopped (the flow control state) says "not stopped"
but the actual hardware state is "stopped".

> 
> In summation, my questions are... 
> 
> 1) Why do we still need to protect __start_tty() and __stop_tty() with spin_lock_irq() 
> if the solution to the bug is to execute n_hdlc_send_frames() asynchronously?

Without serialization, tty->stopped flag and the actual hardware state can mismatch.

> 
> 2) If it is true that we need to avoid concurrent executions of __start_tty() and 
> __stop_tty(), can we just use a Mutex in the IOCTL's helper?

Yes if all __start_tty() and __stop_tty() callers were schedulable context.
But stop_tty() says that stop_tty() might be called from atomic context.
Thus, we can't use a mutex for protecting tty->stopped flag.

> 
> Thanks,
> 
> Fabio M. De Francesco

By the way, even with above patch, I think

    CPU 0                  | CPU 1                  |  CPU 2
    stop_tty()             |                        |
      lock flow.lock       |                        |
      tty->stopped = 1     |                        |
      driver->stop()       |                        |
      unlock flow.lock     |                        |
                           | start_tty()            |
                           |   lock flow.lock       |
                           |   tty->stopped = 0     |
                           |   driver->start() => Schedules n_hdlc_send_frames()
                           |   unlock flow.lock     |
    stop_tty()             |                        |
      lock flow.lock       |                        |
      tty->stopped = 1     |                        |
      driver->stop()       |                        |
      unlock flow.lock     |                        |
                           |                        | Starts n_hdlc_send_frames()

(that is, the n_hdlc is writing to consoles despite tty->stopped is 1) can happen
until n_hdlc_send_frames() completes.

Then, even scheduling next n_hdlc_send_frames() while previous n_hdlc_send_frames() is
possible? In the worst case, multiple CPUs can run n_hdlc_send_frames() concurrently?

    CPU 0                  | CPU 1                  |  CPU 2                 | CPU 3
    stop_tty()             |                        |                        |
      lock flow.lock       |                        |                        |
      tty->stopped = 1     |                        |                        |
      driver->stop()       |                        |                        |
      unlock flow.lock     |                        |                        |
                           | start_tty()            |                        |
                           |   lock flow.lock       |                        |
                           |   tty->stopped = 0     |                        |
                           |   driver->start() => Schedules n_hdlc_send_frames()
                           |   unlock flow.lock     |                        |
                           |                        | Starts n_hdlc_send_frames()
    stop_tty()             |                        |                        |
      lock flow.lock       |                        |                        |
      tty->stopped = 1     |                        |                        |
      driver->stop()       |                        |                        |
      unlock flow.lock     |                        |                        |
                           | start_tty()            |                        |
                           |   lock flow.lock       |                        |
                           |   tty->stopped = 0     |                        |
                           |   driver->start() => Schedules n_hdlc_send_frames()
                           |   unlock flow.lock     |                        |
                           |                        |                        | Starts n_hdlc_send_frames()

Ah, OK. n_hdlc->tbusy is there for serialization.


  reply	other threads:[~2021-12-03 12:32 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-16 14:49 [PATCH] vt: Fix sleeping functions called from atomic context Fabio M. De Francesco
2021-11-16 14:58 ` Greg Kroah-Hartman
2021-11-16 15:35   ` Fabio M. De Francesco
2021-11-16 16:59     ` Greg Kroah-Hartman
2021-11-16 17:28       ` Fabio M. De Francesco
2021-11-17  8:23       ` Fabio M. De Francesco
2021-11-17  8:54         ` Greg Kroah-Hartman
2021-11-17 10:51           ` Tetsuo Handa
2021-11-18  8:31             ` Fabio M. De Francesco
2021-11-18  9:38               ` Fabio M. De Francesco
2021-11-18 12:14                 ` Tetsuo Handa
2021-11-18 17:01                   ` Fabio M. De Francesco
2021-11-19 14:55                     ` [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled Tetsuo Handa
2021-12-01 13:40                       ` Tetsuo Handa
2021-12-01 14:20                         ` Greg Kroah-Hartman
2021-12-01 19:05                         ` Linus Torvalds
2021-12-02 15:40                           ` Tetsuo Handa
2021-12-02 18:35                             ` Linus Torvalds
2021-12-03  5:03                               ` Jiri Slaby
2021-12-03 11:00                               ` Fabio M. De Francesco
2021-12-03 12:32                                 ` Tetsuo Handa [this message]
2021-12-03 14:51                                   ` Fabio M. De Francesco
2021-11-17 12:38           ` [PATCH] vt: Fix sleeping functions called from atomic context Fabio M. De Francesco
2021-11-17  1:55 ` Tetsuo Handa
2021-11-17  7:02   ` Fabio M. De Francesco
2021-12-06 11:44 ` [PATCH] tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous Tetsuo Handa
2021-12-06 18:07   ` Linus Torvalds
2021-12-09 13:18     ` Tetsuo Handa
2021-12-15 11:52       ` [PATCH (resend)] " Tetsuo Handa
2021-12-06 19:06   ` [PATCH] " Fabio M. De Francesco

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5d055fbd-e94a-fe54-d3e0-982dc455ed1a@i-love.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=fmdefrancesco@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jirislaby@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peter@hurleysoftware.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.