linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Hurley <peter@hurleysoftware.com>,
	Jiri Slaby <jirislaby@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled
Date: Fri, 03 Dec 2021 15:51:58 +0100	[thread overview]
Message-ID: <2064700.cDd5PexU1D@localhost.localdomain> (raw)
In-Reply-To: <5d055fbd-e94a-fe54-d3e0-982dc455ed1a@i-love.sakura.ne.jp>

On Friday, December 3, 2021 1:32:21 PM CET Tetsuo Handa wrote:
> On 2021/12/03 20:00, Fabio M. De Francesco wrote:
> > On Thursday, December 2, 2021 7:35:16 PM CET Linus Torvalds wrote:
> >> [...]
> >> Example: net/nfc/nci/uart.c. It does that
> >>
> >>         schedule_work(&nu->write_work);
> >>
> >> instead of actually trying to do a write from a wakeup routine
> >> (similar examples in ppp - "tasklet_schedule(&ap->tsk)" etc).
> >>
> >> I mean, it's called "wakeup", not "write". So I think the fundamental
> >> confusion here is in hdlc, not the tty layer.
> >>
> >>               Linus
> >>
> 
> OK.
> 
> > This is what I understand from the above argument: do a schedule_work() to get 
> > that n_hdlc_send_frames() started; in this way, n_hdlc_tty_wakeup() can
> > return to the caller and n_hdlc_send_frames() is executed asynchronously 
> > (i.e., no longer in an atomic context). 
> 
> Yes. 

Thanks for confirming :)

> [...]
>
> > [...]
> > 
> > Commit f9e053dcfc02 ("tty: Serialize tty flow control changes with flow_lock") 
> > has introduced spinlocks to serialize flow control changes and avoid the 
> > concurrent executions of __start_tty() and __stop_tty().
> > 
> > [...]
> >
> > This is the reason why we are dealing with this bug. Currently we have __start_tty() 
> > called with an acquired spinlock and IRQs disabled and the calls chain leads to 
> > console_lock() while in atomic context.
> 
> If we hit a race window described in that commit
> 
>     CPU 0                          | CPU 1
>     stop_tty()                     |
>       lock ctrl_lock               |
>       tty->stopped = 1             |
>       unlock ctrl_lock             |
>                                    | start_tty()
>                                    |   lock ctrl_lock
>                                    |   tty->stopped = 0
>                                    |   unlock ctrl_lock
>                                    |   driver->start()
>       driver->stop()               |
> 
>     In this case, the flow control state now indicates the tty has
>     been started, but the actual hardware state has actually been stopped.
> 
> , the tty->stopped flag remains 0 despite driver->stop() is called after
> driver->start() finished. tty->stopped (the flow control state) says "not stopped"
> but the actual hardware state is "stopped".

This is clear and it is the reason why I cited that commit. I understand that
__stop_tty() and __start_tty() cannot run concurrently.

I'm sorry if my poor abilities to express complex thoughts in English may 
have made you to think that I'm not aware of the real race condition issue 
that commit f9e053dcfc02 is trying to address :( 

Unfortunately, by fixing the SAC bug with a mere asynchronous execution of
n_hdlc_send_frames() in a work queue, we may still have different sources of 
race conditions. 

The point is that, when n_hdlc_tty_wakeup() returns to the IOCTL helper (the caller)
that acquired the spinlock, that same spinlock is immediately released without 
even knowing whether or not n_hdlc_send_frames() has had the chance to run 
and complete. Further __start_tty() or __stop_tty() calls are allowed to acquire that
spinlock again and we lose serialization because they still run concurrently. 

Actually, in the final part of your email you say that more than one instance of 
__start_tty() cannot run because of a variable that checks that another thread is 
still running the same code. I haven't yet checked, however you showed other
steps that can lead to races and I agree that the problem is not yet fixed.

This is why I think that n_hdlc_send_frames(), when runs asynchronously with
the change that Linus suggested, should also signal completions or change a 
a condition variable. 

Still using locks locks where it is needed, we should somehow use completions 
or condition variables to enforce an execution alternation between __stop_tty 
and __start_tty.

Either:

1) wait_for_completion() and complete();

Or:

2) wait_event(tty_event, tty->flow.tco_stopped == true) and 
wait_event(tty_event, tty->flow.tco_stopped == false) before entering respectively
__start_tty() and __stop_tty().

Regards,

Fabio M. De Francesco





  reply	other threads:[~2021-12-03 14:52 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-16 14:49 [PATCH] vt: Fix sleeping functions called from atomic context Fabio M. De Francesco
2021-11-16 14:58 ` Greg Kroah-Hartman
2021-11-16 15:35   ` Fabio M. De Francesco
2021-11-16 16:59     ` Greg Kroah-Hartman
2021-11-16 17:28       ` Fabio M. De Francesco
2021-11-17  8:23       ` Fabio M. De Francesco
2021-11-17  8:54         ` Greg Kroah-Hartman
2021-11-17 10:51           ` Tetsuo Handa
2021-11-18  8:31             ` Fabio M. De Francesco
2021-11-18  9:38               ` Fabio M. De Francesco
2021-11-18 12:14                 ` Tetsuo Handa
2021-11-18 17:01                   ` Fabio M. De Francesco
2021-11-19 14:55                     ` [PATCH] tty: vt: make do_con_write() no-op if IRQ is disabled Tetsuo Handa
2021-12-01 13:40                       ` Tetsuo Handa
2021-12-01 14:20                         ` Greg Kroah-Hartman
2021-12-01 19:05                         ` Linus Torvalds
2021-12-02 15:40                           ` Tetsuo Handa
2021-12-02 18:35                             ` Linus Torvalds
2021-12-03  5:03                               ` Jiri Slaby
2021-12-03 11:00                               ` Fabio M. De Francesco
2021-12-03 12:32                                 ` Tetsuo Handa
2021-12-03 14:51                                   ` Fabio M. De Francesco [this message]
2021-11-17 12:38           ` [PATCH] vt: Fix sleeping functions called from atomic context Fabio M. De Francesco
2021-11-17  1:55 ` Tetsuo Handa
2021-11-17  7:02   ` Fabio M. De Francesco
2021-12-06 11:44 ` [PATCH] tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous Tetsuo Handa
2021-12-06 18:07   ` Linus Torvalds
2021-12-09 13:18     ` Tetsuo Handa
2021-12-15 11:52       ` [PATCH (resend)] " Tetsuo Handa
2021-12-06 19:06   ` [PATCH] " Fabio M. De Francesco

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2064700.cDd5PexU1D@localhost.localdomain \
    --to=fmdefrancesco@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jirislaby@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=peter@hurleysoftware.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).