All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: "D. Starke" <daniel.starke@siemens.com>
Cc: linux-serial@vger.kernel.org, jirislaby@kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 6/9] tty: n_gsm: fix deadlock and link starvation in outgoing data path
Date: Fri, 10 Jun 2022 13:55:53 +0200	[thread overview]
Message-ID: <YqMxSQIvHbl/ofWB@kroah.com> (raw)
In-Reply-To: <20220530144512.2731-6-daniel.starke@siemens.com>

On Mon, May 30, 2022 at 04:45:09PM +0200, D. Starke wrote:
> From: Daniel Starke <daniel.starke@siemens.com>
> 
> The current implementation queues up new control and user packets as needed
> and processes this queue down to the ldisc in the same code path.
> That means that the upper and the lower layer are hard coupled in the code.
> Due to this deadlocks can happen as seen below while transmitting data,
> especially during ldisc congestion. Furthermore, the data channels starve
> the control channel on high transmission load on the ldisc.
> 
> Introduce an additional control channel data queue to prevent timeouts and
> link hangups during ldisc congestion. This is being processed before the
> user channel data queue in gsm_data_kick(), i.e. with the highest priority.
> Put the queue to ldisc data path into a workqueue and trigger it whenever
> new data has been put into the transmission queue. Change
> gsm_dlci_data_sweep() accordingly to fill up the transmission queue until
> TX_THRESH_HI. This solves the locking issue, keeps latency low and provides
> good performance on high data load.
> Note that now all packets from a DLCI are removed from the internal queue
> if the associated DLCI was closed. This ensures that no data is sent by the
> introduced write task to an already closed DLCI.
> 
> BUG: spinlock recursion on CPU#0, test_v24_loop/124
>  lock: serial8250_ports+0x3a8/0x7500, .magic: dead4ead, .owner: test_v24_loop/124, .owner_cpu: 0
> CPU: 0 PID: 124 Comm: test_v24_loop Tainted: G           O      5.18.0-rc2 #3
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
>  <IRQ>
>  dump_stack_lvl+0x34/0x44
>  do_raw_spin_lock+0x76/0xa0
>  _raw_spin_lock_irqsave+0x72/0x80
>  uart_write_room+0x3b/0xc0
>  gsm_data_kick+0x14b/0x240 [n_gsm]
>  gsmld_write_wakeup+0x35/0x70 [n_gsm]
>  tty_wakeup+0x53/0x60
>  tty_port_default_wakeup+0x1b/0x30
>  serial8250_tx_chars+0x12f/0x220
>  serial8250_handle_irq.part.0+0xfe/0x150
>  serial8250_default_handle_irq+0x48/0x80
>  serial8250_interrupt+0x56/0xa0
>  __handle_irq_event_percpu+0x78/0x1f0
>  handle_irq_event+0x34/0x70
>  handle_fasteoi_irq+0x90/0x1e0
>  __common_interrupt+0x69/0x100
>  common_interrupt+0x48/0xc0
>  asm_common_interrupt+0x1e/0x40
> RIP: 0010:__do_softirq+0x83/0x34e
> Code: 2a 0a ff 0f b7 ed c7 44 24 10 0a 00 00 00 48 c7 c7 51 2a 64 82 e8 2d
> e2 d5 ff 65 66 c7 05 83 af 1e 7e 00 00 fb b8 ff ff ff ff <49> c7 c2 40 61
> 80 82 0f bc c5 41 89 c4 41 83 c4 01 0f 84 e6 00 00
> RSP: 0018:ffffc90000003f98 EFLAGS: 00000286
> RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff82642a51 RDI: ffffffff825bb5e7
> RBP: 0000000000000200 R08: 00000008de3271a8 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000030 R14: 0000000000000000 R15: 0000000000000000
>  ? __do_softirq+0x73/0x34e
>  irq_exit_rcu+0xb5/0x100
>  common_interrupt+0xa4/0xc0
>  </IRQ>
>  <TASK>
>  asm_common_interrupt+0x1e/0x40
> RIP: 0010:_raw_spin_unlock_irqrestore+0x2e/0x50
> Code: 00 55 48 89 fd 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 85 28 36 ff
> 48 89 ef e8 cd 58 36 ff 80 e7 02 74 01 fb bf 01 00 00 00 <e8> 3d 97 33 ff
> 65 8b 05 96 23 2b 7e 85 c0 74 03 5b 5d c3 0f 1f 44
> RSP: 0018:ffffc9000020fd08 EFLAGS: 00000202
> RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
> RDX: 0000000000000004 RSI: ffffffff8257fd74 RDI: 0000000000000001
> RBP: ffff8880057de3a0 R08: 00000008de233000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000100 R14: 0000000000000202 R15: ffff8880057df0b8
>  ? _raw_spin_unlock_irqrestore+0x23/0x50
>  gsmtty_write+0x65/0x80 [n_gsm]
>  n_tty_write+0x33f/0x530
>  ? swake_up_all+0xe0/0xe0
>  file_tty_write.constprop.0+0x1b1/0x320
>  ? n_tty_flush_buffer+0xb0/0xb0
>  new_sync_write+0x10c/0x190
>  vfs_write+0x282/0x310
>  ksys_write+0x68/0xe0
>  do_syscall_64+0x3b/0x90
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f3e5e35c15c
> Code: 8b 7c 24 08 89 c5 e8 c5 ff ff ff 89 ef 89 44 24 08 e8 58 bc 02 00 8b
> 44 24 08 48 83 c4 10 5d c3 48 63 ff b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff
> ff 76 10 48 8b 15 fd fc 05 00 f7 d8 64 89 02 48 83
> RSP: 002b:00007ffcee77cd18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 00007ffcee77cd70 RCX: 00007f3e5e35c15c
> RDX: 0000000000000100 RSI: 00007ffcee77cd90 RDI: 0000000000000003
> RBP: 0000000000000100 R08: 0000000000000000 R09: 7efefefefefefeff
> R10: 00007f3e5e3bddeb R11: 0000000000000246 R12: 00007ffcee77ce8f
> R13: 0000000000000001 R14: 000056214404e010 R15: 00007ffcee77cd90
>  </TASK>
> 
> Fixes: e1eaea46bb40 ("tty: n_gsm line discipline")
> Cc: stable@vger.kernel.org
> Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
> ---
>  drivers/tty/n_gsm.c | 410 ++++++++++++++++++++++++++++++--------------
>  1 file changed, 280 insertions(+), 130 deletions(-)

This is a bit huge for stable backports, especially given that a huge
number of the previous stable backports have totally failed and no one
has submitted new versions.

So why is this needed for stable?  Same for all of these in the
series...

thanks,

greg k-h

  reply	other threads:[~2022-06-10 11:56 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-30 14:45 [PATCH v3 1/9] tty: n_gsm: fix user open not possible at responder until initiator open D. Starke
2022-05-30 14:45 ` [PATCH v3 2/9] tty: n_gsm: fix tty registration before control channel open D. Starke
2022-05-30 14:45 ` [PATCH v3 3/9] tty: n_gsm: fix wrong queuing behavior in gsm_dlci_data_output() D. Starke
2022-05-30 14:45 ` [PATCH v3 4/9] tty: n_gsm: fix missing timer to handle stalled links D. Starke
2022-05-30 14:45 ` [PATCH v3 5/9] tty: n_gsm: fix non flow control frames during mux flow off D. Starke
2022-05-30 14:45 ` [PATCH v3 6/9] tty: n_gsm: fix deadlock and link starvation in outgoing data path D. Starke
2022-06-10 11:55   ` Greg KH [this message]
2022-05-30 14:45 ` [PATCH v3 7/9] tty: n_gsm: fix packet re-transmission without open control channel D. Starke
2022-05-30 14:45 ` [PATCH v3 8/9] tty: n_gsm: fix resource allocation order in gsm_activate_mux() D. Starke
2022-05-30 14:45 ` [PATCH v3 9/9] tty: n_gsm: fix race condition in gsmld_write() D. Starke
2022-06-23 12:26 [PATCH v3 6/9] tty: n_gsm: fix deadlock and link starvation in outgoing data path Starke, Daniel
2022-06-27 11:54 ` Greg KH
2022-06-28  9:41 Starke, Daniel
2022-06-30 14:59 ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YqMxSQIvHbl/ofWB@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=daniel.starke@siemens.com \
    --cc=jirislaby@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-serial@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.