LKML Archive on lore.kernel.org
 help / color / Atom feed
* UART/TTY console deadlock
@ 2020-06-22 17:30 Raul Rangel
  2020-06-22 17:37 ` Andy Shevchenko
  0 siblings, 1 reply; 31+ messages in thread
From: Raul Rangel @ 2020-06-22 17:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, Andy Shevchenko, kurt, S, Shirish

We are trying an S3 suspend stress test and occasionally while
entering S3 we get a console deadlock. Is this a known issue? I'm not
really sure why a probe would be happening while suspending.

The kernel command line is `console=ttyS0` and kernel 5.4.39.

Here is the log:

[  278.885831] leds mmc0::: calling led_resume+0x0/0x1c @ 3909,
parent: 0000:02:00.0
[  278.885835] leds mmc0::: led_resume+0x0/0x1c returned 0 after 0 usecs
[  278.890230]  secondary_startup_64+0xa4/0xb0
[  278.890236] handlers:
[  279.752138] [<00000000b8e5d375>] i8042_interrupt
[  279.756915] Disabling IRQ #1
[  279.759801]
[  279.759802] ======================================================
[  279.759803] WARNING: possible circular locking dependency detected
[  279.759803] 5.4.39 #55 Not tainted
[  279.759804] ------------------------------------------------------
[  279.759804] swapper/0/0 is trying to acquire lock:
[  279.759805] ffffffffab65b6c0 (console_owner){-...}, at:
console_lock_spinning_enable+0x31/0x57
[  279.759806]
[  279.759807] but task is already holding lock:
[  279.759807] ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at:
__report_bad_irq+0x5b/0xba
[  279.759809]
[  279.759809] which lock already depends on the new lock.
[  279.759810]
[  279.759810]
[  279.759811] the existing dependency chain (in reverse order) is:
[  279.759811]
[  279.759811] -> #2 (&irq_desc_lock_class){-.-.}:
[  279.759813]        _raw_spin_lock_irqsave+0x61/0x8d
[  279.759813]        __irq_get_desc_lock+0x65/0x89
[  279.759814]        __disable_irq_nosync+0x3b/0x93
[  279.759814]        serial8250_do_startup+0x451/0x75c
[  279.759815]        uart_startup+0x1b4/0x2ff
[  279.759815]        uart_port_activate+0x73/0xa0
[  279.759815]        tty_port_open+0xae/0x10a
[  279.759816]        uart_open+0x1b/0x26
[  279.759816]        tty_open+0x24d/0x3a0
[  279.759817]        chrdev_open+0xd5/0x1cc
[  279.759817]        do_dentry_open+0x299/0x3c8
[  279.759817]        path_openat+0x434/0x1100
[  279.759818]        do_filp_open+0x9b/0x10a
[  279.759818]        do_sys_open+0x15f/0x3d7
[  279.759819]        kernel_init_freeable+0x157/0x1dd
[  279.759819]        kernel_init+0xe/0x105
[  279.759819]        ret_from_fork+0x27/0x50
[  279.759820]
[  279.759820] -> #1 (&port_lock_key){-.-.}:
[  279.759822]        _raw_spin_lock_irqsave+0x61/0x8d
[  279.759822]        serial8250_console_write+0xa7/0x2a0
[  279.759823]        console_unlock+0x3b7/0x528
[  279.759823]        vprintk_emit+0x111/0x17f
[  279.759823]        printk+0x59/0x73
[  279.759824]        register_console+0x336/0x3a4
[  279.759824]        uart_add_one_port+0x51b/0x5be
[  279.759825]        serial8250_register_8250_port+0x454/0x55e
[  279.759825]        dw8250_probe+0x4dc/0x5b9
[  279.759825]        platform_drv_probe+0x67/0x8b
[  279.759826]        really_probe+0x14a/0x422
[  279.759826]        driver_probe_device+0x66/0x130
[  279.759827]        device_driver_attach+0x42/0x5b
[  279.759827]        __driver_attach+0xca/0x139
[  279.759827]        bus_for_each_dev+0x97/0xc9
[  279.759828]        bus_add_driver+0x12b/0x228
[  279.759828]        driver_register+0x64/0xed
[  279.759829]        do_one_initcall+0x20c/0x4a6
[  279.759829]        do_initcall_level+0xb5/0xc5
[  279.759829]        do_basic_setup+0x4c/0x58
[  279.759830]        kernel_init_freeable+0x13f/0x1dd
[  279.759830]        kernel_init+0xe/0x105
[  279.759831]        ret_from_fork+0x27/0x50
[  279.759831]
[  279.759831] -> #0 (console_owner){-...}:
[  279.759833]        __lock_acquire+0x118d/0x2714
[  279.759833]        lock_acquire+0x203/0x258
[  279.759834]        console_lock_spinning_enable+0x51/0x57
[  279.759834]        console_unlock+0x25d/0x528
[  279.759834]        vprintk_emit+0x111/0x17f
[  279.759835]        printk+0x59/0x73
[  279.759835]        __report_bad_irq+0xa3/0xba
[  279.759836]        note_interrupt+0x19a/0x1d6
[  279.759836]        handle_irq_event_percpu+0x57/0x79
[  279.759836]        handle_irq_event+0x36/0x55
[  279.759837]        handle_fasteoi_irq+0xc2/0x18a
[  279.759837]        do_IRQ+0xb3/0x157
[  279.759838]        ret_from_intr+0x0/0x1d
[  279.759838]        cpuidle_enter_state+0x12f/0x1fd
[  279.759838]        cpuidle_enter+0x2e/0x3d
[  279.759839]        do_idle+0x1ce/0x2ce
[  279.759839]        cpu_startup_entry+0x1d/0x1f
[  279.759840]        start_kernel+0x406/0x46a
[  279.759840]        secondary_startup_64+0xa4/0xb0
[  279.759840]
[  279.759841] other info that might help us debug this:
[  279.759841]
[  279.759841] Chain exists of:
[  279.759842]   console_owner --> &port_lock_key --> &irq_desc_lock_class
[  279.759844]
[  279.759844]  Possible unsafe locking scenario:
[  279.759845]
[  279.759845]        CPU0                    CPU1
[  279.759845]        ----                    ----
[  279.759846]   lock(&irq_desc_lock_class);
[  279.759847]                                lock(&port_lock_key);
[  279.759848]                                lock(&irq_desc_lock_class);
[  279.759849]   lock(console_owner);
[  279.759850]
[  279.759850]  *** DEADLOCK ***
[  279.759850]
[  279.759851] 2 locks held by swapper/0/0:
[  279.759851]  #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at:
__report_bad_irq+0x5b/0xba
[  279.759853]  #1: ffffffffab65b5c0 (console_lock){+.+.}, at:
console_trylock_spinning+0x20/0x181
[  279.759855]
[  279.759855] stack backtrace:
[  279.759856] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
[  279.759856] Hardware name: XXXXXX
[  279.759857] Call Trace:
[  279.759857]  <IRQ>
[  279.759857]  dump_stack+0xbf/0x133
[  279.759858]  ? print_circular_bug+0xd6/0xe9
[  279.759858]  check_noncircular+0x1b9/0x1c3
[  279.759858]  __lock_acquire+0x118d/0x2714
[  279.759859]  lock_acquire+0x203/0x258
[  279.759859]  ? console_lock_spinning_enable+0x31/0x57
[  279.759860]  console_lock_spinning_enable+0x51/0x57
[  279.759860]  ? console_lock_spinning_enable+0x31/0x57
[  279.759861]  console_unlock+0x25d/0x528
[  279.759861]  ? console_trylock+0x18/0x4e
[  279.759861]  vprintk_emit+0x111/0x17f
[  279.759862]  ? lock_acquire+0x203/0x258
[  279.759862]  printk+0x59/0x73
[  279.759862]  __report_bad_irq+0xa3/0xba
[  279.759863]  note_interrupt+0x19a/0x1d6
[  279.759863]  handle_irq_event_percpu+0x57/0x79
[  279.759864]  handle_irq_event+0x36/0x55
[  279.759864]  handle_fasteoi_irq+0xc2/0x18a
[  279.759864]  do_IRQ+0xb3/0x157
[  279.759865]  common_interrupt+0xf/0xf
[  279.759865]  </IRQ>
[  279.759866] RIP: 0010:cpuidle_enter_state+0x12f/0x1fd
[  279.759867] Code: aa aa aa aa aa aa aa 48 89 45 c8 9c 8f 45 c8 0f
ba 65 c8 09 0f 82 cc 00 00 00 31 ff e8 57 25 76 ff e8 36 bb 7b ff fb
45 85 ff <78> 47 44 89 f8 48 6b d0 68 49 8b 4c 16 48 4c 2b 65 b8 4c 89
63 18
[  279.759867] RSP: 0018:ffffffffab607e20 EFLAGS: 00000202 ORIG_RAX:
ffffffffffffffda
[  279.759869] RAX: 86733df5ce911a00 RBX: ffff888104709000 RCX: ffffffffa9bb229b
[  279.759870] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffffffab62d780
[  279.759870] RBP: ffffffffab607e68 R08: dffffc0000000000 R09: fffffbfff56eb96f
[  279.759871] R10: fffffbfff56eb96f R11: ffffffffa9f4f800 R12: 00000040eb0f7ad6
[  279.759871] R13: 0000000000000000 R14: ffffffffab6caf68 R15: 0000000000000001
[  279.759871]  ? flatten_lpi_states+0x190/0x1d7
[  279.759872]  ? trace_irq_enable_rcuidle+0x22/0xee
[  279.759872]  cpuidle_enter+0x2e/0x3d
[  279.759872]  do_idle+0x1ce/0x2ce
[  279.759873]  cpu_startup_entry+0x1d/0x1f
[  279.759873]  start_kernel+0x406/0x46a
[  279.759873]  secondary_startup_64+0xa4/0xb0
[  279.927956] i2c_designware AMD0010:00: controller timed out
[  280.388120] elan_i2c i2c-ELAN0000:00: reading cmd (0x0307) fail.
[  280.394413] elan_i2c i2c-ELAN0000:00: failed to read current power
state: -110
[  280.542626] elan_i2c i2c-ELAN0000:00: elan_resume+0x0/0x5f returned
0 after 1607813 usecs


Thanks,
Raul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-22 17:30 UART/TTY console deadlock Raul Rangel
@ 2020-06-22 17:37 ` Andy Shevchenko
  2020-06-30  3:58   ` Sergey Senozhatsky
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Shevchenko @ 2020-06-22 17:37 UTC (permalink / raw)
  To: Raul Rangel, Tony Lindgren, Petr Mladek, Sergey Senozhatsky
  Cc: linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S, Shirish

+Cc: Tony and console people

On Mon, Jun 22, 2020 at 8:32 PM Raul Rangel <rrangel@google.com> wrote:
>
> We are trying an S3 suspend stress test and occasionally while
> entering S3 we get a console deadlock. Is this a known issue? I'm not
> really sure why a probe would be happening while suspending.
>
> The kernel command line is `console=ttyS0` and kernel 5.4.39.
>
> Here is the log:
>
> [  278.885831] leds mmc0::: calling led_resume+0x0/0x1c @ 3909,
> parent: 0000:02:00.0
> [  278.885835] leds mmc0::: led_resume+0x0/0x1c returned 0 after 0 usecs
> [  278.890230]  secondary_startup_64+0xa4/0xb0
> [  278.890236] handlers:
> [  279.752138] [<00000000b8e5d375>] i8042_interrupt
> [  279.756915] Disabling IRQ #1
> [  279.759801]
> [  279.759802] ======================================================
> [  279.759803] WARNING: possible circular locking dependency detected
> [  279.759803] 5.4.39 #55 Not tainted
> [  279.759804] ------------------------------------------------------
> [  279.759804] swapper/0/0 is trying to acquire lock:
> [  279.759805] ffffffffab65b6c0 (console_owner){-...}, at:
> console_lock_spinning_enable+0x31/0x57
> [  279.759806]
> [  279.759807] but task is already holding lock:
> [  279.759807] ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at:
> __report_bad_irq+0x5b/0xba
> [  279.759809]
> [  279.759809] which lock already depends on the new lock.
> [  279.759810]
> [  279.759810]
> [  279.759811] the existing dependency chain (in reverse order) is:
> [  279.759811]
> [  279.759811] -> #2 (&irq_desc_lock_class){-.-.}:
> [  279.759813]        _raw_spin_lock_irqsave+0x61/0x8d
> [  279.759813]        __irq_get_desc_lock+0x65/0x89
> [  279.759814]        __disable_irq_nosync+0x3b/0x93
> [  279.759814]        serial8250_do_startup+0x451/0x75c
> [  279.759815]        uart_startup+0x1b4/0x2ff
> [  279.759815]        uart_port_activate+0x73/0xa0
> [  279.759815]        tty_port_open+0xae/0x10a
> [  279.759816]        uart_open+0x1b/0x26
> [  279.759816]        tty_open+0x24d/0x3a0
> [  279.759817]        chrdev_open+0xd5/0x1cc
> [  279.759817]        do_dentry_open+0x299/0x3c8
> [  279.759817]        path_openat+0x434/0x1100
> [  279.759818]        do_filp_open+0x9b/0x10a
> [  279.759818]        do_sys_open+0x15f/0x3d7
> [  279.759819]        kernel_init_freeable+0x157/0x1dd
> [  279.759819]        kernel_init+0xe/0x105
> [  279.759819]        ret_from_fork+0x27/0x50
> [  279.759820]
> [  279.759820] -> #1 (&port_lock_key){-.-.}:
> [  279.759822]        _raw_spin_lock_irqsave+0x61/0x8d
> [  279.759822]        serial8250_console_write+0xa7/0x2a0
> [  279.759823]        console_unlock+0x3b7/0x528
> [  279.759823]        vprintk_emit+0x111/0x17f
> [  279.759823]        printk+0x59/0x73
> [  279.759824]        register_console+0x336/0x3a4
> [  279.759824]        uart_add_one_port+0x51b/0x5be
> [  279.759825]        serial8250_register_8250_port+0x454/0x55e
> [  279.759825]        dw8250_probe+0x4dc/0x5b9
> [  279.759825]        platform_drv_probe+0x67/0x8b
> [  279.759826]        really_probe+0x14a/0x422
> [  279.759826]        driver_probe_device+0x66/0x130
> [  279.759827]        device_driver_attach+0x42/0x5b
> [  279.759827]        __driver_attach+0xca/0x139
> [  279.759827]        bus_for_each_dev+0x97/0xc9
> [  279.759828]        bus_add_driver+0x12b/0x228
> [  279.759828]        driver_register+0x64/0xed
> [  279.759829]        do_one_initcall+0x20c/0x4a6
> [  279.759829]        do_initcall_level+0xb5/0xc5
> [  279.759829]        do_basic_setup+0x4c/0x58
> [  279.759830]        kernel_init_freeable+0x13f/0x1dd
> [  279.759830]        kernel_init+0xe/0x105
> [  279.759831]        ret_from_fork+0x27/0x50
> [  279.759831]
> [  279.759831] -> #0 (console_owner){-...}:
> [  279.759833]        __lock_acquire+0x118d/0x2714
> [  279.759833]        lock_acquire+0x203/0x258
> [  279.759834]        console_lock_spinning_enable+0x51/0x57
> [  279.759834]        console_unlock+0x25d/0x528
> [  279.759834]        vprintk_emit+0x111/0x17f
> [  279.759835]        printk+0x59/0x73
> [  279.759835]        __report_bad_irq+0xa3/0xba
> [  279.759836]        note_interrupt+0x19a/0x1d6
> [  279.759836]        handle_irq_event_percpu+0x57/0x79
> [  279.759836]        handle_irq_event+0x36/0x55
> [  279.759837]        handle_fasteoi_irq+0xc2/0x18a
> [  279.759837]        do_IRQ+0xb3/0x157
> [  279.759838]        ret_from_intr+0x0/0x1d
> [  279.759838]        cpuidle_enter_state+0x12f/0x1fd
> [  279.759838]        cpuidle_enter+0x2e/0x3d
> [  279.759839]        do_idle+0x1ce/0x2ce
> [  279.759839]        cpu_startup_entry+0x1d/0x1f
> [  279.759840]        start_kernel+0x406/0x46a
> [  279.759840]        secondary_startup_64+0xa4/0xb0
> [  279.759840]
> [  279.759841] other info that might help us debug this:
> [  279.759841]
> [  279.759841] Chain exists of:
> [  279.759842]   console_owner --> &port_lock_key --> &irq_desc_lock_class
> [  279.759844]
> [  279.759844]  Possible unsafe locking scenario:
> [  279.759845]
> [  279.759845]        CPU0                    CPU1
> [  279.759845]        ----                    ----
> [  279.759846]   lock(&irq_desc_lock_class);
> [  279.759847]                                lock(&port_lock_key);
> [  279.759848]                                lock(&irq_desc_lock_class);
> [  279.759849]   lock(console_owner);
> [  279.759850]
> [  279.759850]  *** DEADLOCK ***
> [  279.759850]
> [  279.759851] 2 locks held by swapper/0/0:
> [  279.759851]  #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at:
> __report_bad_irq+0x5b/0xba
> [  279.759853]  #1: ffffffffab65b5c0 (console_lock){+.+.}, at:
> console_trylock_spinning+0x20/0x181
> [  279.759855]
> [  279.759855] stack backtrace:
> [  279.759856] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
> [  279.759856] Hardware name: XXXXXX
> [  279.759857] Call Trace:
> [  279.759857]  <IRQ>
> [  279.759857]  dump_stack+0xbf/0x133
> [  279.759858]  ? print_circular_bug+0xd6/0xe9
> [  279.759858]  check_noncircular+0x1b9/0x1c3
> [  279.759858]  __lock_acquire+0x118d/0x2714
> [  279.759859]  lock_acquire+0x203/0x258
> [  279.759859]  ? console_lock_spinning_enable+0x31/0x57
> [  279.759860]  console_lock_spinning_enable+0x51/0x57
> [  279.759860]  ? console_lock_spinning_enable+0x31/0x57
> [  279.759861]  console_unlock+0x25d/0x528
> [  279.759861]  ? console_trylock+0x18/0x4e
> [  279.759861]  vprintk_emit+0x111/0x17f
> [  279.759862]  ? lock_acquire+0x203/0x258
> [  279.759862]  printk+0x59/0x73
> [  279.759862]  __report_bad_irq+0xa3/0xba
> [  279.759863]  note_interrupt+0x19a/0x1d6
> [  279.759863]  handle_irq_event_percpu+0x57/0x79
> [  279.759864]  handle_irq_event+0x36/0x55
> [  279.759864]  handle_fasteoi_irq+0xc2/0x18a
> [  279.759864]  do_IRQ+0xb3/0x157
> [  279.759865]  common_interrupt+0xf/0xf
> [  279.759865]  </IRQ>
> [  279.759866] RIP: 0010:cpuidle_enter_state+0x12f/0x1fd
> [  279.759867] Code: aa aa aa aa aa aa aa 48 89 45 c8 9c 8f 45 c8 0f
> ba 65 c8 09 0f 82 cc 00 00 00 31 ff e8 57 25 76 ff e8 36 bb 7b ff fb
> 45 85 ff <78> 47 44 89 f8 48 6b d0 68 49 8b 4c 16 48 4c 2b 65 b8 4c 89
> 63 18
> [  279.759867] RSP: 0018:ffffffffab607e20 EFLAGS: 00000202 ORIG_RAX:
> ffffffffffffffda
> [  279.759869] RAX: 86733df5ce911a00 RBX: ffff888104709000 RCX: ffffffffa9bb229b
> [  279.759870] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffffffab62d780
> [  279.759870] RBP: ffffffffab607e68 R08: dffffc0000000000 R09: fffffbfff56eb96f
> [  279.759871] R10: fffffbfff56eb96f R11: ffffffffa9f4f800 R12: 00000040eb0f7ad6
> [  279.759871] R13: 0000000000000000 R14: ffffffffab6caf68 R15: 0000000000000001
> [  279.759871]  ? flatten_lpi_states+0x190/0x1d7
> [  279.759872]  ? trace_irq_enable_rcuidle+0x22/0xee
> [  279.759872]  cpuidle_enter+0x2e/0x3d
> [  279.759872]  do_idle+0x1ce/0x2ce
> [  279.759873]  cpu_startup_entry+0x1d/0x1f
> [  279.759873]  start_kernel+0x406/0x46a
> [  279.759873]  secondary_startup_64+0xa4/0xb0
> [  279.927956] i2c_designware AMD0010:00: controller timed out
> [  280.388120] elan_i2c i2c-ELAN0000:00: reading cmd (0x0307) fail.
> [  280.394413] elan_i2c i2c-ELAN0000:00: failed to read current power
> state: -110
> [  280.542626] elan_i2c i2c-ELAN0000:00: elan_resume+0x0/0x5f returned
> 0 after 1607813 usecs
>
>
> Thanks,
> Raul



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-22 17:37 ` Andy Shevchenko
@ 2020-06-30  3:58   ` Sergey Senozhatsky
  2020-06-30 10:21     ` Petr Mladek
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-06-30  3:58 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Raul Rangel, Tony Lindgren, Petr Mladek, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

Cc-ing more people

https://lore.kernel.org/lkml/CAHp75Vd8nTzmZdnhpTDChdc11zyCaSfeigbxaCpOWZ1Lv9ZBMw@mail.gmail.com

On (20/06/22 20:37), Andy Shevchenko wrote:
> > [  279.759811] -> #2 (&irq_desc_lock_class){-.-.}:
> > [  279.759813]        _raw_spin_lock_irqsave+0x61/0x8d
> > [  279.759813]        __irq_get_desc_lock+0x65/0x89
> > [  279.759814]        __disable_irq_nosync+0x3b/0x93
> > [  279.759814]        serial8250_do_startup+0x451/0x75c
> > [  279.759815]        uart_startup+0x1b4/0x2ff
> > [  279.759815]        uart_port_activate+0x73/0xa0
> > [  279.759815]        tty_port_open+0xae/0x10a
> > [  279.759816]        uart_open+0x1b/0x26
> > [  279.759816]        tty_open+0x24d/0x3a0
> > [  279.759817]        chrdev_open+0xd5/0x1cc
> > [  279.759817]        do_dentry_open+0x299/0x3c8
> > [  279.759817]        path_openat+0x434/0x1100
> > [  279.759818]        do_filp_open+0x9b/0x10a
> > [  279.759818]        do_sys_open+0x15f/0x3d7
> > [  279.759819]        kernel_init_freeable+0x157/0x1dd
> > [  279.759819]        kernel_init+0xe/0x105
> > [  279.759819]        ret_from_fork+0x27/0x50
> > [  279.759820]
> > [  279.759820] -> #1 (&port_lock_key){-.-.}:
> > [  279.759822]        _raw_spin_lock_irqsave+0x61/0x8d
> > [  279.759822]        serial8250_console_write+0xa7/0x2a0
> > [  279.759823]        console_unlock+0x3b7/0x528
> > [  279.759823]        vprintk_emit+0x111/0x17f
> > [  279.759823]        printk+0x59/0x73
> > [  279.759824]        register_console+0x336/0x3a4
> > [  279.759824]        uart_add_one_port+0x51b/0x5be
> > [  279.759825]        serial8250_register_8250_port+0x454/0x55e
> > [  279.759825]        dw8250_probe+0x4dc/0x5b9
> > [  279.759825]        platform_drv_probe+0x67/0x8b
> > [  279.759826]        really_probe+0x14a/0x422
> > [  279.759826]        driver_probe_device+0x66/0x130
> > [  279.759827]        device_driver_attach+0x42/0x5b
> > [  279.759827]        __driver_attach+0xca/0x139
> > [  279.759827]        bus_for_each_dev+0x97/0xc9
> > [  279.759828]        bus_add_driver+0x12b/0x228
> > [  279.759828]        driver_register+0x64/0xed
> > [  279.759829]        do_one_initcall+0x20c/0x4a6
> > [  279.759829]        do_initcall_level+0xb5/0xc5
> > [  279.759829]        do_basic_setup+0x4c/0x58
> > [  279.759830]        kernel_init_freeable+0x13f/0x1dd
> > [  279.759830]        kernel_init+0xe/0x105
> > [  279.759831]        ret_from_fork+0x27/0x50
> > [  279.759831]
> > [  279.759831] -> #0 (console_owner){-...}:
> > [  279.759833]        __lock_acquire+0x118d/0x2714
> > [  279.759833]        lock_acquire+0x203/0x258
> > [  279.759834]        console_lock_spinning_enable+0x51/0x57
> > [  279.759834]        console_unlock+0x25d/0x528
> > [  279.759834]        vprintk_emit+0x111/0x17f
> > [  279.759835]        printk+0x59/0x73
> > [  279.759835]        __report_bad_irq+0xa3/0xba
> > [  279.759836]        note_interrupt+0x19a/0x1d6
> > [  279.759836]        handle_irq_event_percpu+0x57/0x79
> > [  279.759836]        handle_irq_event+0x36/0x55
> > [  279.759837]        handle_fasteoi_irq+0xc2/0x18a
> > [  279.759837]        do_IRQ+0xb3/0x157
> > [  279.759838]        ret_from_intr+0x0/0x1d
> > [  279.759838]        cpuidle_enter_state+0x12f/0x1fd
> > [  279.759838]        cpuidle_enter+0x2e/0x3d
> > [  279.759839]        do_idle+0x1ce/0x2ce
> > [  279.759839]        cpu_startup_entry+0x1d/0x1f
> > [  279.759840]        start_kernel+0x406/0x46a
> > [  279.759840]        secondary_startup_64+0xa4/0xb0

Hmm. So this is uart->port => desc->lock vs desc->lock => uart->port

chain #1:

 serial8250_do_startup()
  spin_lock_irqsave(&port->lock);
   disable_irq_nosync(port->irq);
    raw_spin_lock_irqsave(&desc->lock)

chain #2:

  __report_bad_irq()
   raw_spin_lock_irqsave(&desc->lock)
    for_each_action_of_desc()
     printk()
      spin_lock_irqsave(&port->lock);


Breaking up chain #2 is not an option, I suppose. Those are a rather
important KERN_ERR messages, printk_deferred() will upset people badly.

So... Do we need to hold uart->port when we disable port->irq? What do we
race with? Module removal? The function bumps device PM counter (albeit
for UART_CAP_RPM ports only). But, at the same time, we do a whole bunch
of unprotected port->FOO accesses in serial8250_do_startup(). We even set
the IRQF_SHARED up->port.irqflags without grabbing the port->lock:

	 up->port.irqflags |= IRQF_SHARED;
	 spin_lock_irqsave(&port->lock, flags);
	 if (up->port.irqflags & IRQF_SHARED)
	    disable_irq_nosync(port->irq);

IOW, can we do something like this?

---
 drivers/tty/serial/8250/8250_port.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index d64ca77d9cfa..ad30991e1b3b 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
 
 	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
 		unsigned char iir1;
+		bool irq_shared = up->port.irqflags & IRQF_SHARED;
+
+		if (irq_shared)
+			disable_irq_nosync(port->irq);
+
 		/*
 		 * Test for UARTs that do not reassert THRE when the
 		 * transmitter is idle and the interrupt has already
@@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
 		 * allow register changes to become visible.
 		 */
 		spin_lock_irqsave(&port->lock, flags);
-		if (up->port.irqflags & IRQF_SHARED)
-			disable_irq_nosync(port->irq);
 
 		wait_for_xmitr(up, UART_LSR_THRE);
 		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
@@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
 		iir = serial_port_in(port, UART_IIR);
 		serial_port_out(port, UART_IER, 0);
 
-		if (port->irqflags & IRQF_SHARED)
-			enable_irq(port->irq);
 		spin_unlock_irqrestore(&port->lock, flags);
+		if (irq_shared)
+			enable_irq(port->irq);
 
 		/*
 		 * If the interrupt is not reasserted, or we otherwise
-- 
2.27.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30  3:58   ` Sergey Senozhatsky
@ 2020-06-30 10:21     ` Petr Mladek
  2020-06-30 10:55       ` Sergey Senozhatsky
  0 siblings, 1 reply; 31+ messages in thread
From: Petr Mladek @ 2020-06-30 10:21 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andy Shevchenko, Raul Rangel, Tony Lindgren, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

On Tue 2020-06-30 12:58:16, Sergey Senozhatsky wrote:
> Cc-ing more people
> 
> https://lore.kernel.org/lkml/CAHp75Vd8nTzmZdnhpTDChdc11zyCaSfeigbxaCpOWZ1Lv9ZBMw@mail.gmail.com
> 
> On (20/06/22 20:37), Andy Shevchenko wrote:
> > > [  279.759811] -> #2 (&irq_desc_lock_class){-.-.}:
> > > [  279.759813]        _raw_spin_lock_irqsave+0x61/0x8d
> > > [  279.759813]        __irq_get_desc_lock+0x65/0x89
> > > [  279.759814]        __disable_irq_nosync+0x3b/0x93
> > > [  279.759814]        serial8250_do_startup+0x451/0x75c
> > > [  279.759815]        uart_startup+0x1b4/0x2ff
> > > [  279.759815]        uart_port_activate+0x73/0xa0
> > > [  279.759815]        tty_port_open+0xae/0x10a
> > > [  279.759816]        uart_open+0x1b/0x26
> > > [  279.759816]        tty_open+0x24d/0x3a0
> > > [  279.759817]        chrdev_open+0xd5/0x1cc
> > > [  279.759817]        do_dentry_open+0x299/0x3c8
> > > [  279.759817]        path_openat+0x434/0x1100
> > > [  279.759818]        do_filp_open+0x9b/0x10a
> > > [  279.759818]        do_sys_open+0x15f/0x3d7
> > > [  279.759819]        kernel_init_freeable+0x157/0x1dd
> > > [  279.759819]        kernel_init+0xe/0x105
> > > [  279.759819]        ret_from_fork+0x27/0x50
> > > [  279.759820]
> > > [  279.759820] -> #1 (&port_lock_key){-.-.}:
> > > [  279.759822]        _raw_spin_lock_irqsave+0x61/0x8d
> > > [  279.759822]        serial8250_console_write+0xa7/0x2a0
> > > [  279.759823]        console_unlock+0x3b7/0x528
> > > [  279.759823]        vprintk_emit+0x111/0x17f
> > > [  279.759823]        printk+0x59/0x73
> > > [  279.759824]        register_console+0x336/0x3a4
> > > [  279.759824]        uart_add_one_port+0x51b/0x5be
> > > [  279.759825]        serial8250_register_8250_port+0x454/0x55e
> > > [  279.759825]        dw8250_probe+0x4dc/0x5b9
> > > [  279.759825]        platform_drv_probe+0x67/0x8b
> > > [  279.759826]        really_probe+0x14a/0x422
> > > [  279.759826]        driver_probe_device+0x66/0x130
> > > [  279.759827]        device_driver_attach+0x42/0x5b
> > > [  279.759827]        __driver_attach+0xca/0x139
> > > [  279.759827]        bus_for_each_dev+0x97/0xc9
> > > [  279.759828]        bus_add_driver+0x12b/0x228
> > > [  279.759828]        driver_register+0x64/0xed
> > > [  279.759829]        do_one_initcall+0x20c/0x4a6
> > > [  279.759829]        do_initcall_level+0xb5/0xc5
> > > [  279.759829]        do_basic_setup+0x4c/0x58
> > > [  279.759830]        kernel_init_freeable+0x13f/0x1dd
> > > [  279.759830]        kernel_init+0xe/0x105
> > > [  279.759831]        ret_from_fork+0x27/0x50
> > > [  279.759831]
> > > [  279.759831] -> #0 (console_owner){-...}:
> > > [  279.759833]        __lock_acquire+0x118d/0x2714
> > > [  279.759833]        lock_acquire+0x203/0x258
> > > [  279.759834]        console_lock_spinning_enable+0x51/0x57
> > > [  279.759834]        console_unlock+0x25d/0x528
> > > [  279.759834]        vprintk_emit+0x111/0x17f
> > > [  279.759835]        printk+0x59/0x73
> > > [  279.759835]        __report_bad_irq+0xa3/0xba
> > > [  279.759836]        note_interrupt+0x19a/0x1d6
> > > [  279.759836]        handle_irq_event_percpu+0x57/0x79
> > > [  279.759836]        handle_irq_event+0x36/0x55
> > > [  279.759837]        handle_fasteoi_irq+0xc2/0x18a
> > > [  279.759837]        do_IRQ+0xb3/0x157
> > > [  279.759838]        ret_from_intr+0x0/0x1d
> > > [  279.759838]        cpuidle_enter_state+0x12f/0x1fd
> > > [  279.759838]        cpuidle_enter+0x2e/0x3d
> > > [  279.759839]        do_idle+0x1ce/0x2ce
> > > [  279.759839]        cpu_startup_entry+0x1d/0x1f
> > > [  279.759840]        start_kernel+0x406/0x46a
> > > [  279.759840]        secondary_startup_64+0xa4/0xb0
> 
> Hmm. So this is uart->port => desc->lock vs desc->lock => uart->port
> 
> chain #1:
> 
>  serial8250_do_startup()
>   spin_lock_irqsave(&port->lock);
>    disable_irq_nosync(port->irq);
>     raw_spin_lock_irqsave(&desc->lock)
> 
> chain #2:
> 
>   __report_bad_irq()
>    raw_spin_lock_irqsave(&desc->lock)
>     for_each_action_of_desc()
>      printk()
>       spin_lock_irqsave(&port->lock);
> 
> 
> Breaking up chain #2 is not an option, I suppose. Those are a rather
> important KERN_ERR messages, printk_deferred() will upset people badly.

Yes, we should avoid printk_deferred() unless there is another solution.


> So... Do we need to hold uart->port when we disable port->irq? What do we
> race with? Module removal? The function bumps device PM counter (albeit
> for UART_CAP_RPM ports only).

Honestly, I do not see where a PM counter gets incremented.

Anyway, __disable_irq_nosync() does nothing when
irq_get_desc_buslock() returns NULL. And irq_get_desc_buslock()
takes desc->lock when desc exist. This should be enough to
synchronize any calls.

> But, at the same time, we do a whole bunch
> of unprotected port->FOO accesses in serial8250_do_startup(). We even set
> the IRQF_SHARED up->port.irqflags without grabbing the port->lock:
> 
> 	 up->port.irqflags |= IRQF_SHARED;
> 	 spin_lock_irqsave(&port->lock, flags);
> 	 if (up->port.irqflags & IRQF_SHARED)
> 	    disable_irq_nosync(port->irq);

Yup, this looks suspicious. We set a flag in port.irqflags and take the lock
only when the flag was set. Either everything needs to be done under
the lock or the lock is not needed.

Well, I might have missed something. I do not fully understand meaning
and relation of all the structures.

Anyway, I believe that this is a false positive. If I get it correctly
serial8250_do_startup() must be called before the serial port could
be registered as a console. It means that it could not be called
from inside printk().

Sigh, I do not know how to tell lockdep about these false positives.
And I am never 100% sure that we could shuffle locks in the various
the console drivers.


> IOW, can we do something like this?
> 
> ---
>  drivers/tty/serial/8250/8250_port.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index d64ca77d9cfa..ad30991e1b3b 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
>  
>  	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
>  		unsigned char iir1;
> +		bool irq_shared = up->port.irqflags & IRQF_SHARED;
> +
> +		if (irq_shared)
> +			disable_irq_nosync(port->irq);
> +
>  		/*
>  		 * Test for UARTs that do not reassert THRE when the
>  		 * transmitter is idle and the interrupt has already
> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
>  		 * allow register changes to become visible.
>  		 */
>  		spin_lock_irqsave(&port->lock, flags);
> -		if (up->port.irqflags & IRQF_SHARED)
> -			disable_irq_nosync(port->irq);
>  
>  		wait_for_xmitr(up, UART_LSR_THRE);
>  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
>  		iir = serial_port_in(port, UART_IIR);
>  		serial_port_out(port, UART_IER, 0);
>  
> -		if (port->irqflags & IRQF_SHARED)
> -			enable_irq(port->irq);
>  		spin_unlock_irqrestore(&port->lock, flags);
> +		if (irq_shared)
> +			enable_irq(port->irq);
>  
>  		/*
>  		 * If the interrupt is not reasserted, or we otherwise

I think that it might be safe but I am not 100% sure, sigh.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 10:21     ` Petr Mladek
@ 2020-06-30 10:55       ` Sergey Senozhatsky
  2020-06-30 11:40         ` Andy Shevchenko
  2020-06-30 12:22         ` Petr Mladek
  0 siblings, 2 replies; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-06-30 10:55 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andy Shevchenko, Raul Rangel, Tony Lindgren,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman,
	Andy Shevchenko, kurt, S, Shirish, Peter Zijlstra, John Ogness,
	Steven Rostedt

On (20/06/30 12:21), Petr Mladek wrote:
> > So... Do we need to hold uart->port when we disable port->irq? What do we
> > race with? Module removal? The function bumps device PM counter (albeit
> > for UART_CAP_RPM ports only).
> 
> Honestly, I do not see where a PM counter gets incremented.

serial8250_do_startup()
 serial8250_rpm_get()
  pm_runtime_get_sync(p->port.dev)

But this does not happen for all ports, just for UART_CAP_RPM ones.

> Anyway, __disable_irq_nosync() does nothing when
> irq_get_desc_buslock() returns NULL. And irq_get_desc_buslock()
> takes desc->lock when desc exist. This should be enough to
> synchronize any calls.
> 
> > But, at the same time, we do a whole bunch
> > of unprotected port->FOO accesses in serial8250_do_startup(). We even set
> > the IRQF_SHARED up->port.irqflags without grabbing the port->lock:
> > 
> > 	 up->port.irqflags |= IRQF_SHARED;
> > 	 spin_lock_irqsave(&port->lock, flags);
> > 	 if (up->port.irqflags & IRQF_SHARED)
> > 	    disable_irq_nosync(port->irq);
> 
> Yup, this looks suspicious. We set a flag in port.irqflags and take the lock
> only when the flag was set. Either everything needs to be done under
> the lock or the lock is not needed.
> 
> Well, I might have missed something. I do not fully understand meaning
> and relation of all the structures.
> 
> Anyway, I believe that this is a false positive. If I get it correctly
> serial8250_do_startup() must be called before the serial port could
> be registered as a console. It means that it could not be called
> from inside printk().

From my understanding, I'm afraid we are talking about actual deadlock
here, not about false positive report. Quoting the original email:

 : We are trying an S3 suspend stress test and occasionally while
 : entering S3 we get a console deadlock.

[..]

> >  drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> > index d64ca77d9cfa..ad30991e1b3b 100644
> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> >  
> >  	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> >  		unsigned char iir1;
> > +		bool irq_shared = up->port.irqflags & IRQF_SHARED;
> > +
> > +		if (irq_shared)
> > +			disable_irq_nosync(port->irq);
> > +
> >  		/*
> >  		 * Test for UARTs that do not reassert THRE when the
> >  		 * transmitter is idle and the interrupt has already
> > @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> >  		 * allow register changes to become visible.
> >  		 */
> >  		spin_lock_irqsave(&port->lock, flags);
> > -		if (up->port.irqflags & IRQF_SHARED)
> > -			disable_irq_nosync(port->irq);
> >  
> >  		wait_for_xmitr(up, UART_LSR_THRE);
> >  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> >  		iir = serial_port_in(port, UART_IIR);
> >  		serial_port_out(port, UART_IER, 0);
> >  
> > -		if (port->irqflags & IRQF_SHARED)
> > -			enable_irq(port->irq);
> >  		spin_unlock_irqrestore(&port->lock, flags);
> > +		if (irq_shared)
> > +			enable_irq(port->irq);
> >  
> >  		/*
> >  		 * If the interrupt is not reasserted, or we otherwise
> 
> I think that it might be safe but I am not 100% sure, sigh.

Yeah, I'm not 100%, but I'd give it a try.

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 10:55       ` Sergey Senozhatsky
@ 2020-06-30 11:40         ` Andy Shevchenko
  2020-06-30 12:22         ` Petr Mladek
  1 sibling, 0 replies; 31+ messages in thread
From: Andy Shevchenko @ 2020-06-30 11:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Raul Rangel, Tony Lindgren, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, kurt, S, Shirish,
	Peter Zijlstra, John Ogness, Steven Rostedt

On Tue, Jun 30, 2020 at 07:55:12PM +0900, Sergey Senozhatsky wrote:
> On (20/06/30 12:21), Petr Mladek wrote:

...

> > I think that it might be safe but I am not 100% sure, sigh.
> 
> Yeah, I'm not 100%, but I'd give it a try.

Thanks for the patch! I think we have to wait for the original reporter.
Raul, can you test the change Sergey proposed in the thread?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 10:55       ` Sergey Senozhatsky
  2020-06-30 11:40         ` Andy Shevchenko
@ 2020-06-30 12:22         ` Petr Mladek
  2020-06-30 13:05           ` Sergey Senozhatsky
  1 sibling, 1 reply; 31+ messages in thread
From: Petr Mladek @ 2020-06-30 12:22 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andy Shevchenko, Raul Rangel, Tony Lindgren, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

On Tue 2020-06-30 19:55:12, Sergey Senozhatsky wrote:
> On (20/06/30 12:21), Petr Mladek wrote:
> > Anyway, I believe that this is a false positive. If I get it correctly
> > serial8250_do_startup() must be called before the serial port could
> > be registered as a console. It means that it could not be called
> > from inside printk().
> 
> >From my understanding, I'm afraid we are talking about actual deadlock
> here, not about false positive report. Quoting the original email:

IMHO, it is not clear. The log contains lockdep report. It does not mean
the deadlock really happened. The lockdep reports cyclic dependency but it
is not able to say if the related code chains might be called at the
same time in reality.

And this combination has signs of false positives. As I said, I think
that the serial console could not be registered by printk() before
serial8250_do_startup() has finished.

Also there are some unrelated messages after the last backtrace.

Finally. we would be able to see the messages only when panic() was
called and oops_in_progress set. And there is nothing about panic().


>  : We are trying an S3 suspend stress test and occasionally while
>  : entering S3 we get a console deadlock.

Raul, have the system really died or have you just seen the lockdep
report, please?


> [..]
> 
> > >  drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> > >  1 file changed, 7 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> > > index d64ca77d9cfa..ad30991e1b3b 100644
> > > --- a/drivers/tty/serial/8250/8250_port.c
> > > +++ b/drivers/tty/serial/8250/8250_port.c
> > > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> > >  
> > >  	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> > >  		unsigned char iir1;
> > > +		bool irq_shared = up->port.irqflags & IRQF_SHARED;
> > > +
> > > +		if (irq_shared)
> > > +			disable_irq_nosync(port->irq);
> > > +
> > >  		/*
> > >  		 * Test for UARTs that do not reassert THRE when the
> > >  		 * transmitter is idle and the interrupt has already
> > > @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> > >  		 * allow register changes to become visible.
> > >  		 */
> > >  		spin_lock_irqsave(&port->lock, flags);
> > > -		if (up->port.irqflags & IRQF_SHARED)
> > > -			disable_irq_nosync(port->irq);
> > >  
> > >  		wait_for_xmitr(up, UART_LSR_THRE);
> > >  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > > @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> > >  		iir = serial_port_in(port, UART_IIR);
> > >  		serial_port_out(port, UART_IER, 0);
> > >  
> > > -		if (port->irqflags & IRQF_SHARED)
> > > -			enable_irq(port->irq);
> > >  		spin_unlock_irqrestore(&port->lock, flags);
> > > +		if (irq_shared)
> > > +			enable_irq(port->irq);
> > >  
> > >  		/*
> > >  		 * If the interrupt is not reasserted, or we otherwise
> > 
> > I think that it might be safe but I am not 100% sure, sigh.
> 
> Yeah, I'm not 100%, but I'd give it a try.

I do not feel brave enough to ack it today. But I am all for trying it
if anyone more familiar with the code is fine with it.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 12:22         ` Petr Mladek
@ 2020-06-30 13:05           ` Sergey Senozhatsky
  2020-06-30 18:02             ` Tony Lindgren
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-06-30 13:05 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andy Shevchenko, Raul Rangel, Tony Lindgren,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman,
	Andy Shevchenko, kurt, S, Shirish, Peter Zijlstra, John Ogness,
	Steven Rostedt

On (20/06/30 14:22), Petr Mladek wrote:
> On Tue 2020-06-30 19:55:12, Sergey Senozhatsky wrote:
> > On (20/06/30 12:21), Petr Mladek wrote:
> > > Anyway, I believe that this is a false positive. If I get it correctly
> > > serial8250_do_startup() must be called before the serial port could
> > > be registered as a console. It means that it could not be called
> > > from inside printk().
> > 
> > >From my understanding, I'm afraid we are talking about actual deadlock
> > here, not about false positive report. Quoting the original email:
> 
> IMHO, it is not clear.
>

Yea. Maybe it is, maybe it's not.

Regardless, after lockdep tells us there is something it doesn't like,
it disables itself, so false positives render lockdep useless.

> > > > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> > > >  
> > > >  	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> > > >  		unsigned char iir1;
> > > > +		bool irq_shared = up->port.irqflags & IRQF_SHARED;
> > > > +
> > > > +		if (irq_shared)
> > > > +			disable_irq_nosync(port->irq);
> > > > +
> > > >  		/*
> > > >  		 * Test for UARTs that do not reassert THRE when the
> > > >  		 * transmitter is idle and the interrupt has already
> > > > @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> > > >  		 * allow register changes to become visible.
> > > >  		 */
> > > >  		spin_lock_irqsave(&port->lock, flags);
> > > > -		if (up->port.irqflags & IRQF_SHARED)
> > > > -			disable_irq_nosync(port->irq);
> > > >  
> > > >  		wait_for_xmitr(up, UART_LSR_THRE);
> > > >  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > > > @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> > > >  		iir = serial_port_in(port, UART_IIR);
> > > >  		serial_port_out(port, UART_IER, 0);
> > > >  
> > > > -		if (port->irqflags & IRQF_SHARED)
> > > > -			enable_irq(port->irq);
> > > >  		spin_unlock_irqrestore(&port->lock, flags);
> > > > +		if (irq_shared)
> > > > +			enable_irq(port->irq);
> > > >  
> > > >  		/*
> > > >  		 * If the interrupt is not reasserted, or we otherwise
> > > 
> > > I think that it might be safe but I am not 100% sure, sigh.
> > 
> > Yeah, I'm not 100%, but I'd give it a try.
> 
> I do not feel brave enough to ack it today. But I am all for trying it
> if anyone more familiar with the code is fine with it.

I see. Well, I suppose we need Ack-s from tty/serial/8250 maintainers.
I would not be very happy if _only_ printk people Ack the patch.

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 13:05           ` Sergey Senozhatsky
@ 2020-06-30 18:02             ` Tony Lindgren
  2020-07-01  6:44               ` S, Shirish
  2020-07-02  5:12               ` Sergey Senozhatsky
  0 siblings, 2 replies; 31+ messages in thread
From: Tony Lindgren @ 2020-06-30 18:02 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andy Shevchenko, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

* Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200630 13:06]:
> On (20/06/30 14:22), Petr Mladek wrote:
...

> > > > > @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> > > > >  		 * allow register changes to become visible.
> > > > >  		 */
> > > > >  		spin_lock_irqsave(&port->lock, flags);
> > > > > -		if (up->port.irqflags & IRQF_SHARED)
> > > > > -			disable_irq_nosync(port->irq);
> > > > >  
> > > > >  		wait_for_xmitr(up, UART_LSR_THRE);
> > > > >  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > > > > @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> > > > >  		iir = serial_port_in(port, UART_IIR);
> > > > >  		serial_port_out(port, UART_IER, 0);
> > > > >  
> > > > > -		if (port->irqflags & IRQF_SHARED)
> > > > > -			enable_irq(port->irq);
> > > > >  		spin_unlock_irqrestore(&port->lock, flags);
> > > > > +		if (irq_shared)
> > > > > +			enable_irq(port->irq);
> > > > >  
> > > > >  		/*
> > > > >  		 * If the interrupt is not reasserted, or we otherwise
> > > > 
> > > > I think that it might be safe but I am not 100% sure, sigh.
> > > 
> > > Yeah, I'm not 100%, but I'd give it a try.
> > 
> > I do not feel brave enough to ack it today. But I am all for trying it
> > if anyone more familiar with the code is fine with it.
> 
> I see. Well, I suppose we need Ack-s from tty/serial/8250 maintainers.
> I would not be very happy if _only_ printk people Ack the patch.

This conditional disable for irq_shared does not look nice to me
from the other device point of view :)

Would it be possible to just set up te dummy interrupt handler
for the startup, then change it back afterwards? See for example
omap8250_no_handle_irq().

The other device for irq_shared should be capable of dealing
with spurious interrupts if it's shared.

Regards,

Tony

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 18:02             ` Tony Lindgren
@ 2020-07-01  6:44               ` S, Shirish
  2020-07-02  3:48                 ` S, Shirish
  2020-07-02  5:12               ` Sergey Senozhatsky
  1 sibling, 1 reply; 31+ messages in thread
From: S, Shirish @ 2020-07-01  6:44 UTC (permalink / raw)
  To: Tony Lindgren, Sergey Senozhatsky
  Cc: Petr Mladek, Andy Shevchenko, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt


On 6/30/2020 11:32 PM, Tony Lindgren wrote:
> * Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200630 13:06]:
>> On (20/06/30 14:22), Petr Mladek wrote:
> ...
>
>>>>>> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
>>>>>>   		 * allow register changes to become visible.
>>>>>>   		 */
>>>>>>   		spin_lock_irqsave(&port->lock, flags);
>>>>>> -		if (up->port.irqflags & IRQF_SHARED)
>>>>>> -			disable_irq_nosync(port->irq);
>>>>>>   
>>>>>>   		wait_for_xmitr(up, UART_LSR_THRE);
>>>>>>   		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
>>>>>> @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
>>>>>>   		iir = serial_port_in(port, UART_IIR);
>>>>>>   		serial_port_out(port, UART_IER, 0);
>>>>>>   
>>>>>> -		if (port->irqflags & IRQF_SHARED)
>>>>>> -			enable_irq(port->irq);
>>>>>>   		spin_unlock_irqrestore(&port->lock, flags);
>>>>>> +		if (irq_shared)
>>>>>> +			enable_irq(port->irq);
>>>>>>   
>>>>>>   		/*
>>>>>>   		 * If the interrupt is not reasserted, or we otherwise
>>>>> I think that it might be safe but I am not 100% sure, sigh.
>>>> Yeah, I'm not 100%, but I'd give it a try.
>>> I do not feel brave enough to ack it today. But I am all for trying it
>>> if anyone more familiar with the code is fine with it.
>> I see. Well, I suppose we need Ack-s from tty/serial/8250 maintainers.
>> I would not be very happy if _only_ printk people Ack the patch.

FWIW, the lockdep trace is not seen anymore with the patch applied.

Regards,

Shirish S

> This conditional disable for irq_shared does not look nice to me
> from the other device point of view :)
>
> Would it be possible to just set up te dummy interrupt handler
> for the startup, then change it back afterwards? See for example
> omap8250_no_handle_irq().
>
> The other device for irq_shared should be capable of dealing
> with spurious interrupts if it's shared.
>
> Regards,
>
> Tony

-- 
Regards,
Shirish S


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: UART/TTY console deadlock
  2020-07-01  6:44               ` S, Shirish
@ 2020-07-02  3:48                 ` S, Shirish
  2020-07-02  6:11                   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 31+ messages in thread
From: S, Shirish @ 2020-07-02  3:48 UTC (permalink / raw)
  To: Tony Lindgren, Sergey Senozhatsky
  Cc: Petr Mladek, Andy Shevchenko, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt,
	Peter Zijlstra, John Ogness, Steven Rostedt

Hi All,

Can we land this patch upstream?
Feel free to add my tested-by.

Thanks.

Regards,
Shirish S

-----Original Message-----
From: S, Shirish <Shirish.S@amd.com> 
Sent: Wednesday, July 1, 2020 12:15 PM
To: Tony Lindgren <tony@atomide.com>; Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>; Andy Shevchenko <andy.shevchenko@gmail.com>; Raul Rangel <rrangel@google.com>; Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>; linux-kernel <linux-kernel@vger.kernel.org>; Greg Kroah-Hartman <gregkh@linuxfoundation.org>; Andy Shevchenko <andriy.shevchenko@linux.intel.com>; kurt@linutronix.de; S, Shirish <Shirish.S@amd.com>; Peter Zijlstra <peterz@infradead.org>; John Ogness <john.ogness@linutronix.de>; Steven Rostedt <rostedt@goodmis.org>
Subject: Re: UART/TTY console deadlock


On 6/30/2020 11:32 PM, Tony Lindgren wrote:
> * Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200630 13:06]:
>> On (20/06/30 14:22), Petr Mladek wrote:
> ...
>
>>>>>> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
>>>>>>   		 * allow register changes to become visible.
>>>>>>   		 */
>>>>>>   		spin_lock_irqsave(&port->lock, flags);
>>>>>> -		if (up->port.irqflags & IRQF_SHARED)
>>>>>> -			disable_irq_nosync(port->irq);
>>>>>>   
>>>>>>   		wait_for_xmitr(up, UART_LSR_THRE);
>>>>>>   		serial_port_out_sync(port, UART_IER, UART_IER_THRI); @@ 
>>>>>> -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
>>>>>>   		iir = serial_port_in(port, UART_IIR);
>>>>>>   		serial_port_out(port, UART_IER, 0);
>>>>>>   
>>>>>> -		if (port->irqflags & IRQF_SHARED)
>>>>>> -			enable_irq(port->irq);
>>>>>>   		spin_unlock_irqrestore(&port->lock, flags);
>>>>>> +		if (irq_shared)
>>>>>> +			enable_irq(port->irq);
>>>>>>   
>>>>>>   		/*
>>>>>>   		 * If the interrupt is not reasserted, or we otherwise
>>>>> I think that it might be safe but I am not 100% sure, sigh.
>>>> Yeah, I'm not 100%, but I'd give it a try.
>>> I do not feel brave enough to ack it today. But I am all for trying 
>>> it if anyone more familiar with the code is fine with it.
>> I see. Well, I suppose we need Ack-s from tty/serial/8250 maintainers.
>> I would not be very happy if _only_ printk people Ack the patch.

FWIW, the lockdep trace is not seen anymore with the patch applied.

Regards,

Shirish S

> This conditional disable for irq_shared does not look nice to me from 
> the other device point of view :)
>
> Would it be possible to just set up te dummy interrupt handler for the 
> startup, then change it back afterwards? See for example 
> omap8250_no_handle_irq().
>
> The other device for irq_shared should be capable of dealing with 
> spurious interrupts if it's shared.
>
> Regards,
>
> Tony

--
Regards,
Shirish S


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-06-30 18:02             ` Tony Lindgren
  2020-07-01  6:44               ` S, Shirish
@ 2020-07-02  5:12               ` Sergey Senozhatsky
  2020-07-02  5:40                 ` Sergey Senozhatsky
  2020-07-02 16:05                 ` Tony Lindgren
  1 sibling, 2 replies; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-02  5:12 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Sergey Senozhatsky, Petr Mladek, Andy Shevchenko, Raul Rangel,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman,
	Andy Shevchenko, kurt, S, Shirish, Peter Zijlstra, John Ogness,
	Steven Rostedt

On (20/06/30 11:02), Tony Lindgren wrote:
> This conditional disable for irq_shared does not look nice to me
> from the other device point of view :)
> 
> Would it be possible to just set up te dummy interrupt handler
> for the startup, then change it back afterwards? See for example
> omap8250_no_handle_irq().

I think we can do it. serial8250_do_startup() and irq handler take
port->lock, so they should be synchronized.

Something like this then?

---

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f77bf820b7a3..61ad6f5f0f88 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2140,6 +2140,11 @@ static void serial8250_put_poll_char(struct uart_port *port,
 
 #endif /* CONFIG_CONSOLE_POLL */
 
+static int serial8250_dummy_handle_irq(struct uart_port *port)
+{
+	return 0;
+}
+
 int serial8250_do_startup(struct uart_port *port)
 {
 	struct uart_8250_port *up = up_to_u8250p(port);
@@ -2271,6 +2276,9 @@ int serial8250_do_startup(struct uart_port *port)
 
 	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
 		unsigned char iir1;
+		int (*handler)(struct uart_port *);
+
+		handler = port->handle_irq;
 		/*
 		 * Test for UARTs that do not reassert THRE when the
 		 * transmitter is idle and the interrupt has already
@@ -2280,8 +2288,7 @@ int serial8250_do_startup(struct uart_port *port)
 		 * allow register changes to become visible.
 		 */
 		spin_lock_irqsave(&port->lock, flags);
-		if (up->port.irqflags & IRQF_SHARED)
-			disable_irq_nosync(port->irq);
+		port->handle_irq = serial8250_dummy_handle_irq;
 
 		wait_for_xmitr(up, UART_LSR_THRE);
 		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
@@ -2293,8 +2300,7 @@ int serial8250_do_startup(struct uart_port *port)
 		iir = serial_port_in(port, UART_IIR);
 		serial_port_out(port, UART_IER, 0);
 
-		if (port->irqflags & IRQF_SHARED)
-			enable_irq(port->irq);
+		port->handle_irq = handler;
 		spin_unlock_irqrestore(&port->lock, flags);
 
 		/*

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  5:12               ` Sergey Senozhatsky
@ 2020-07-02  5:40                 ` Sergey Senozhatsky
  2020-07-02  8:20                   ` Andy Shevchenko
  2020-07-02 16:05                 ` Tony Lindgren
  1 sibling, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-02  5:40 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Petr Mladek, Andy Shevchenko, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt,
	Sergey Senozhatsky

On (20/07/02 14:12), Sergey Senozhatsky wrote:
> Date: Thu, 2 Jul 2020 14:12:13 +0900
> From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> To: Tony Lindgren <tony@atomide.com>
> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Petr Mladek
>  <pmladek@suse.com>, Andy Shevchenko <andy.shevchenko@gmail.com>, Raul
>  Rangel <rrangel@google.com>, Sergey Senozhatsky
>  <sergey.senozhatsky.work@gmail.com>, linux-kernel
>  <linux-kernel@vger.kernel.org>, Greg Kroah-Hartman
>  <gregkh@linuxfoundation.org>, Andy Shevchenko
>  <andriy.shevchenko@linux.intel.com>, kurt@linutronix.de, "S, Shirish"
>  <Shirish.S@amd.com>, Peter Zijlstra <peterz@infradead.org>, John Ogness
>  <john.ogness@linutronix.de>, Steven Rostedt <rostedt@goodmis.org>
> Subject: Re: UART/TTY console deadlock
> Message-ID: <20200702051213.GB3450@jagdpanzerIV.localdomain>
> 
> On (20/06/30 11:02), Tony Lindgren wrote:
> > This conditional disable for irq_shared does not look nice to me
> > from the other device point of view :)
> > 
> > Would it be possible to just set up te dummy interrupt handler
> > for the startup, then change it back afterwards? See for example
> > omap8250_no_handle_irq().
> 
> I think we can do it. serial8250_do_startup() and irq handler take
> port->lock, so they should be synchronized.

Hmm, hold on. Why does it disable IRQ in the first place? IRQ handlers
should grab the port->lock. So if there is already running IRQ, then
serial8250_do_startup() will wait until IRQ handler unlocks the port->lock.
If serial8250_do_startup() grabs the port->lock first, then IRQ will wait
for serial8250_do_startup() to unlock it. serial8250_do_startup() does
not release the port->unlock until its done:

		spin_lock_irqsave(&port->lock, flags);

		wait_for_xmitr(up, UART_LSR_THRE);
		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
		udelay(1); /* allow THRE to set */
		iir1 = serial_port_in(port, UART_IIR);
		serial_port_out(port, UART_IER, 0);
		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
		udelay(1); /* allow a working UART time to re-assert THRE */
		iir = serial_port_in(port, UART_IIR);
		serial_port_out(port, UART_IER, 0);

		spin_unlock_irqrestore(&port->lock, flags);

so IRQ will not see the inconsistent device state.

What exactly is the purpose of disable_irq_nosync()? Can we just remove
disable_irq_nosync()/enable_irq() instead? Are there any IRQ handlers
that don't acquire the port->lock?

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  3:48                 ` S, Shirish
@ 2020-07-02  6:11                   ` Greg Kroah-Hartman
  2020-07-02  6:14                     ` S, Shirish
  0 siblings, 1 reply; 31+ messages in thread
From: Greg Kroah-Hartman @ 2020-07-02  6:11 UTC (permalink / raw)
  To: S, Shirish
  Cc: Tony Lindgren, Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Raul Rangel, Sergey Senozhatsky, linux-kernel, Andy Shevchenko,
	kurt, Peter Zijlstra, John Ogness, Steven Rostedt

On Thu, Jul 02, 2020 at 03:48:43AM +0000, S, Shirish wrote:
> Hi All,
> 
> Can we land this patch upstream?

What patch?

> Feel free to add my tested-by.

Please send it in a form that I can add it, to the patch in question.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: UART/TTY console deadlock
  2020-07-02  6:11                   ` Greg Kroah-Hartman
@ 2020-07-02  6:14                     ` S, Shirish
  2020-07-02  6:34                       ` Sergey Senozhatsky
  2020-07-02  7:11                       ` Greg Kroah-Hartman
  0 siblings, 2 replies; 31+ messages in thread
From: S, Shirish @ 2020-07-02  6:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tony Lindgren, Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Raul Rangel, Sergey Senozhatsky, linux-kernel, Andy Shevchenko,
	kurt, Peter Zijlstra, John Ogness, Steven Rostedt

This one:
https://lkml.org/lkml/2020/6/30/394

I did reply to all, not sure what I am missing while replying.



Regards,
Shirish S

-----Original Message-----
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> 
Sent: Thursday, July 2, 2020 11:42 AM
To: S, Shirish <Shirish.S@amd.com>
Cc: Tony Lindgren <tony@atomide.com>; Sergey Senozhatsky <sergey.senozhatsky@gmail.com>; Petr Mladek <pmladek@suse.com>; Andy Shevchenko <andy.shevchenko@gmail.com>; Raul Rangel <rrangel@google.com>; Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>; linux-kernel <linux-kernel@vger.kernel.org>; Andy Shevchenko <andriy.shevchenko@linux.intel.com>; kurt@linutronix.de; Peter Zijlstra <peterz@infradead.org>; John Ogness <john.ogness@linutronix.de>; Steven Rostedt <rostedt@goodmis.org>
Subject: Re: UART/TTY console deadlock

On Thu, Jul 02, 2020 at 03:48:43AM +0000, S, Shirish wrote:
> Hi All,
> 
> Can we land this patch upstream?

What patch?

> Feel free to add my tested-by.

Please send it in a form that I can add it, to the patch in question.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  6:14                     ` S, Shirish
@ 2020-07-02  6:34                       ` Sergey Senozhatsky
  2020-07-02  7:11                       ` Greg Kroah-Hartman
  1 sibling, 0 replies; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-02  6:34 UTC (permalink / raw)
  To: S, Shirish
  Cc: Greg Kroah-Hartman, Tony Lindgren, Sergey Senozhatsky,
	Petr Mladek, Andy Shevchenko, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Andy Shevchenko, kurt, Peter Zijlstra, John Ogness,
	Steven Rostedt

On (20/07/02 06:14), S, Shirish wrote:
> This one:
> https://lkml.org/lkml/2020/6/30/394
> 
> I did reply to all, not sure what I am missing while replying.

It's not a formal patch yet. Give us some more time to figure out
how do we want to fix the problem at hand.

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  6:14                     ` S, Shirish
  2020-07-02  6:34                       ` Sergey Senozhatsky
@ 2020-07-02  7:11                       ` Greg Kroah-Hartman
  1 sibling, 0 replies; 31+ messages in thread
From: Greg Kroah-Hartman @ 2020-07-02  7:11 UTC (permalink / raw)
  To: S, Shirish
  Cc: Tony Lindgren, Sergey Senozhatsky, Petr Mladek, Andy Shevchenko,
	Raul Rangel, Sergey Senozhatsky, linux-kernel, Andy Shevchenko,
	kurt, Peter Zijlstra, John Ogness, Steven Rostedt

A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Thu, Jul 02, 2020 at 06:14:05AM +0000, S, Shirish wrote:
> This one:
> https://lkml.org/lkml/2020/6/30/394

Please use lore.kernel.org for links, lkml.org is known for having
reliability issues and is not under our control.

> I did reply to all, not sure what I am missing while replying.

Proper email quoting?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  5:40                 ` Sergey Senozhatsky
@ 2020-07-02  8:20                   ` Andy Shevchenko
  2020-07-03 10:53                     ` Sergey Senozhatsky
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Shevchenko @ 2020-07-02  8:20 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tony Lindgren, Petr Mladek, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, kurt, S, Shirish,
	Peter Zijlstra, John Ogness, Steven Rostedt

On Thu, Jul 02, 2020 at 02:40:33PM +0900, Sergey Senozhatsky wrote:
> On (20/07/02 14:12), Sergey Senozhatsky wrote:
> > From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > On (20/06/30 11:02), Tony Lindgren wrote:

...

> > I think we can do it. serial8250_do_startup() and irq handler take
> > port->lock, so they should be synchronized.
> 
> Hmm, hold on. Why does it disable IRQ in the first place? IRQ handlers
> should grab the port->lock. So if there is already running IRQ, then
> serial8250_do_startup() will wait until IRQ handler unlocks the port->lock.
> If serial8250_do_startup() grabs the port->lock first, then IRQ will wait
> for serial8250_do_startup() to unlock it. serial8250_do_startup() does
> not release the port->unlock until its done:
> 
> 		spin_lock_irqsave(&port->lock, flags);
> 
> 		wait_for_xmitr(up, UART_LSR_THRE);
> 		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> 		udelay(1); /* allow THRE to set */
> 		iir1 = serial_port_in(port, UART_IIR);
> 		serial_port_out(port, UART_IER, 0);
> 		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> 		udelay(1); /* allow a working UART time to re-assert THRE */
> 		iir = serial_port_in(port, UART_IIR);
> 		serial_port_out(port, UART_IER, 0);
> 
> 		spin_unlock_irqrestore(&port->lock, flags);
> 
> so IRQ will not see the inconsistent device state.
> 
> What exactly is the purpose of disable_irq_nosync()? Can we just remove
> disable_irq_nosync()/enable_irq() instead? Are there any IRQ handlers
> that don't acquire the port->lock?

I didn't look into this deeply, but my understanding that this is something for
special case when you have several UART ports sharing the IRQ (multi-port card)
and IRQ even maybe undesirable b/c it will confuse real IRQ handler. I don't
remember details, but AFAIR IRQ handler does a busyloop to service as much as
possible and in between it may release the lock (again, multi-port UART cards),
that's why we better avoid IRQ event in the first place.
But it's my pure speculation.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  5:12               ` Sergey Senozhatsky
  2020-07-02  5:40                 ` Sergey Senozhatsky
@ 2020-07-02 16:05                 ` Tony Lindgren
  2020-07-03 10:32                   ` Sergey Senozhatsky
  1 sibling, 1 reply; 31+ messages in thread
From: Tony Lindgren @ 2020-07-02 16:05 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andy Shevchenko, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

* Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200702 05:13]:
> On (20/06/30 11:02), Tony Lindgren wrote:
> > This conditional disable for irq_shared does not look nice to me
> > from the other device point of view :)
> > 
> > Would it be possible to just set up te dummy interrupt handler
> > for the startup, then change it back afterwards? See for example
> > omap8250_no_handle_irq().
> 
> I think we can do it. serial8250_do_startup() and irq handler take
> port->lock, so they should be synchronized.
> 
> Something like this then?

Yeah thanks this should work better for shared irq.

Regards,

Tony

> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index f77bf820b7a3..61ad6f5f0f88 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2140,6 +2140,11 @@ static void serial8250_put_poll_char(struct uart_port *port,
>  
>  #endif /* CONFIG_CONSOLE_POLL */
>  
> +static int serial8250_dummy_handle_irq(struct uart_port *port)
> +{
> +	return 0;
> +}
> +
>  int serial8250_do_startup(struct uart_port *port)
>  {
>  	struct uart_8250_port *up = up_to_u8250p(port);
> @@ -2271,6 +2276,9 @@ int serial8250_do_startup(struct uart_port *port)
>  
>  	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
>  		unsigned char iir1;
> +		int (*handler)(struct uart_port *);
> +
> +		handler = port->handle_irq;
>  		/*
>  		 * Test for UARTs that do not reassert THRE when the
>  		 * transmitter is idle and the interrupt has already
> @@ -2280,8 +2288,7 @@ int serial8250_do_startup(struct uart_port *port)
>  		 * allow register changes to become visible.
>  		 */
>  		spin_lock_irqsave(&port->lock, flags);
> -		if (up->port.irqflags & IRQF_SHARED)
> -			disable_irq_nosync(port->irq);
> +		port->handle_irq = serial8250_dummy_handle_irq;
>  
>  		wait_for_xmitr(up, UART_LSR_THRE);
>  		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> @@ -2293,8 +2300,7 @@ int serial8250_do_startup(struct uart_port *port)
>  		iir = serial_port_in(port, UART_IIR);
>  		serial_port_out(port, UART_IER, 0);
>  
> -		if (port->irqflags & IRQF_SHARED)
> -			enable_irq(port->irq);
> +		port->handle_irq = handler;
>  		spin_unlock_irqrestore(&port->lock, flags);
>  
>  		/*

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02 16:05                 ` Tony Lindgren
@ 2020-07-03 10:32                   ` Sergey Senozhatsky
  2020-07-04 11:35                     ` Andy Shevchenko
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-03 10:32 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Sergey Senozhatsky, Petr Mladek, Andy Shevchenko, Raul Rangel,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman,
	Andy Shevchenko, kurt, S, Shirish, Peter Zijlstra, John Ogness,
	Steven Rostedt

On (20/07/02 09:05), Tony Lindgren wrote:
> * Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200702 05:13]:
> > On (20/06/30 11:02), Tony Lindgren wrote:
> > > This conditional disable for irq_shared does not look nice to me
> > > from the other device point of view :)
> > > 
> > > Would it be possible to just set up te dummy interrupt handler
> > > for the startup, then change it back afterwards? See for example
> > > omap8250_no_handle_irq().
> > 
> > I think we can do it. serial8250_do_startup() and irq handler take
> > port->lock, so they should be synchronized.
> > 
> > Something like this then?
> 
> Yeah thanks this should work better for shared irq.

This is, basically, an equivalent of

---
 drivers/tty/serial/8250/8250_port.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index d64ca77d9cfa..dba7747d2ddd 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2275,6 +2275,7 @@ int serial8250_do_startup(struct uart_port *port)
 
 	if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
 		unsigned char iir1;
+
 		/*
 		 * Test for UARTs that do not reassert THRE when the
 		 * transmitter is idle and the interrupt has already
@@ -2284,8 +2285,6 @@ int serial8250_do_startup(struct uart_port *port)
 		 * allow register changes to become visible.
 		 */
 		spin_lock_irqsave(&port->lock, flags);
-		if (up->port.irqflags & IRQF_SHARED)
-			disable_irq_nosync(port->irq);
 
 		wait_for_xmitr(up, UART_LSR_THRE);
 		serial_port_out_sync(port, UART_IER, UART_IER_THRI);
@@ -2297,8 +2296,6 @@ int serial8250_do_startup(struct uart_port *port)
 		iir = serial_port_in(port, UART_IIR);
 		serial_port_out(port, UART_IER, 0);
 
-		if (port->irqflags & IRQF_SHARED)
-			enable_irq(port->irq);
 		spin_unlock_irqrestore(&port->lock, flags);
 
 		/*
-- 
2.27.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-02  8:20                   ` Andy Shevchenko
@ 2020-07-03 10:53                     ` Sergey Senozhatsky
  2020-07-04 11:37                       ` Andy Shevchenko
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-03 10:53 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Sergey Senozhatsky, Tony Lindgren, Petr Mladek, Raul Rangel,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

On (20/07/02 11:20), Andy Shevchenko wrote:
> 
> I didn't look into this deeply, but my understanding that this is something for
> special case when you have several UART ports sharing the IRQ (multi-port card)
> and IRQ even maybe undesirable b/c it will confuse real IRQ handler. I don't
> remember details, but AFAIR IRQ handler does a busyloop to service as much as
> possible and in between it may release the lock (again, multi-port UART cards),
> that's why we better avoid IRQ event in the first place.
> But it's my pure speculation.

Hmm. Would different IRQ handlers use different UART port structures? ->irq
might be the only thing they share. Even if they use the same port, we
keep IRQs disabled on local CPU, and UART port locked. To unlock the UART
port during IRQ, the handler (on another CPU) first needs to acquire the
port->lock, which is locked by serial8250_do_startup().

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-03 10:32                   ` Sergey Senozhatsky
@ 2020-07-04 11:35                     ` Andy Shevchenko
  2020-07-04 11:59                       ` Andy Shevchenko
  2020-07-06 11:31                       ` Kurt Kanzenbach
  0 siblings, 2 replies; 31+ messages in thread
From: Andy Shevchenko @ 2020-07-04 11:35 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tony Lindgren, Petr Mladek, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, kurt, S,
	Shirish, Peter Zijlstra, John Ogness, Steven Rostedt

On Fri, Jul 3, 2020 at 1:32 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (20/07/02 09:05), Tony Lindgren wrote:
> > * Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200702 05:13]:
> > > On (20/06/30 11:02), Tony Lindgren wrote:
> > > > This conditional disable for irq_shared does not look nice to me
> > > > from the other device point of view :)
> > > >
> > > > Would it be possible to just set up te dummy interrupt handler
> > > > for the startup, then change it back afterwards? See for example
> > > > omap8250_no_handle_irq().
> > >
> > > I think we can do it. serial8250_do_startup() and irq handler take
> > > port->lock, so they should be synchronized.
> > >
> > > Something like this then?
> >
> > Yeah thanks this should work better for shared irq.
>
> This is, basically, an equivalent of
>
> ---
>  drivers/tty/serial/8250/8250_port.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index d64ca77d9cfa..dba7747d2ddd 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2275,6 +2275,7 @@ int serial8250_do_startup(struct uart_port *port)
>
>         if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
>                 unsigned char iir1;
> +
>                 /*
>                  * Test for UARTs that do not reassert THRE when the
>                  * transmitter is idle and the interrupt has already
> @@ -2284,8 +2285,6 @@ int serial8250_do_startup(struct uart_port *port)
>                  * allow register changes to become visible.
>                  */
>                 spin_lock_irqsave(&port->lock, flags);
> -               if (up->port.irqflags & IRQF_SHARED)
> -                       disable_irq_nosync(port->irq);
>
>                 wait_for_xmitr(up, UART_LSR_THRE);
>                 serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> @@ -2297,8 +2296,6 @@ int serial8250_do_startup(struct uart_port *port)
>                 iir = serial_port_in(port, UART_IIR);
>                 serial_port_out(port, UART_IER, 0);
>
> -               if (port->irqflags & IRQF_SHARED)
> -                       enable_irq(port->irq);
>                 spin_unlock_irqrestore(&port->lock, flags);
>
>                 /*

...which effectively is a revert of

768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and
RT kernels")

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-03 10:53                     ` Sergey Senozhatsky
@ 2020-07-04 11:37                       ` Andy Shevchenko
  0 siblings, 0 replies; 31+ messages in thread
From: Andy Shevchenko @ 2020-07-04 11:37 UTC (permalink / raw)
  To: Sergey Senozhatsky, Andy Shevchenko
  Cc: Tony Lindgren, Petr Mladek, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, kurt, S, Shirish,
	Peter Zijlstra, John Ogness, Steven Rostedt

On Fri, Jul 3, 2020 at 1:53 PM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (20/07/02 11:20), Andy Shevchenko wrote:
> >
> > I didn't look into this deeply, but my understanding that this is something for
> > special case when you have several UART ports sharing the IRQ (multi-port card)
> > and IRQ even maybe undesirable b/c it will confuse real IRQ handler. I don't
> > remember details, but AFAIR IRQ handler does a busyloop to service as much as
> > possible and in between it may release the lock (again, multi-port UART cards),
> > that's why we better avoid IRQ event in the first place.
> > But it's my pure speculation.
>
> Hmm. Would different IRQ handlers use different UART port structures? ->irq
> might be the only thing they share. Even if they use the same port, we
> keep IRQs disabled on local CPU, and UART port locked. To unlock the UART
> port during IRQ, the handler (on another CPU) first needs to acquire the
> port->lock, which is locked by serial8250_do_startup().

It appears to be a RT kernel related. Don't know the state of affairs
in RT nowadays, but maybe still relevant.
See the commit 768aec0b5bcc ("serial: 8250: fix shared interrupts
issues with SMP and RT kernels").


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-04 11:35                     ` Andy Shevchenko
@ 2020-07-04 11:59                       ` Andy Shevchenko
  2020-07-06 11:31                       ` Kurt Kanzenbach
  1 sibling, 0 replies; 31+ messages in thread
From: Andy Shevchenko @ 2020-07-04 11:59 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tony Lindgren, Petr Mladek, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, kurt, S, Shirish,
	Peter Zijlstra, John Ogness, Steven Rostedt

On Sat, Jul 04, 2020 at 02:35:46PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 3, 2020 at 1:32 PM Sergey Senozhatsky
> <sergey.senozhatsky@gmail.com> wrote:
> > On (20/07/02 09:05), Tony Lindgren wrote:
> > > * Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200702 05:13]:
> > > > On (20/06/30 11:02), Tony Lindgren wrote:

...

> > This is, basically, an equivalent of
> >
> > ---
> >  drivers/tty/serial/8250/8250_port.c | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> > index d64ca77d9cfa..dba7747d2ddd 100644
> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2275,6 +2275,7 @@ int serial8250_do_startup(struct uart_port *port)
> >
> >         if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> >                 unsigned char iir1;
> > +
> >                 /*
> >                  * Test for UARTs that do not reassert THRE when the
> >                  * transmitter is idle and the interrupt has already
> > @@ -2284,8 +2285,6 @@ int serial8250_do_startup(struct uart_port *port)
> >                  * allow register changes to become visible.
> >                  */
> >                 spin_lock_irqsave(&port->lock, flags);
> > -               if (up->port.irqflags & IRQF_SHARED)
> > -                       disable_irq_nosync(port->irq);
> >
> >                 wait_for_xmitr(up, UART_LSR_THRE);
> >                 serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > @@ -2297,8 +2296,6 @@ int serial8250_do_startup(struct uart_port *port)
> >                 iir = serial_port_in(port, UART_IIR);
> >                 serial_port_out(port, UART_IER, 0);
> >
> > -               if (port->irqflags & IRQF_SHARED)
> > -                       enable_irq(port->irq);
> >                 spin_unlock_irqrestore(&port->lock, flags);
> >
> >                 /*
> 
> ...which effectively is a revert of
> 
> 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and
> RT kernels")

(without c389d27b5e64 ("8250.c: port.lock is irq-safe") applied)

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-04 11:35                     ` Andy Shevchenko
  2020-07-04 11:59                       ` Andy Shevchenko
@ 2020-07-06 11:31                       ` Kurt Kanzenbach
  2020-07-06 14:43                         ` Sergey Senozhatsky
  1 sibling, 1 reply; 31+ messages in thread
From: Kurt Kanzenbach @ 2020-07-06 11:31 UTC (permalink / raw)
  To: Andy Shevchenko, Sergey Senozhatsky
  Cc: Tony Lindgren, Petr Mladek, Raul Rangel, Sergey Senozhatsky,
	linux-kernel, Greg Kroah-Hartman, Andy Shevchenko, S, Shirish,
	Peter Zijlstra, John Ogness, Steven Rostedt


[-- Attachment #1: Type: text/plain, Size: 2735 bytes --]

Hi,

On Sat Jul 04 2020, Andy Shevchenko wrote:
> On Fri, Jul 3, 2020 at 1:32 PM Sergey Senozhatsky
> <sergey.senozhatsky@gmail.com> wrote:
>>
>> On (20/07/02 09:05), Tony Lindgren wrote:
>> > * Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [200702 05:13]:
>> > > On (20/06/30 11:02), Tony Lindgren wrote:
>> > > > This conditional disable for irq_shared does not look nice to me
>> > > > from the other device point of view :)
>> > > >
>> > > > Would it be possible to just set up te dummy interrupt handler
>> > > > for the startup, then change it back afterwards? See for example
>> > > > omap8250_no_handle_irq().
>> > >
>> > > I think we can do it. serial8250_do_startup() and irq handler take
>> > > port->lock, so they should be synchronized.
>> > >
>> > > Something like this then?
>> >
>> > Yeah thanks this should work better for shared irq.
>>
>> This is, basically, an equivalent of
>>
>> ---
>>  drivers/tty/serial/8250/8250_port.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
>> index d64ca77d9cfa..dba7747d2ddd 100644
>> --- a/drivers/tty/serial/8250/8250_port.c
>> +++ b/drivers/tty/serial/8250/8250_port.c
>> @@ -2275,6 +2275,7 @@ int serial8250_do_startup(struct uart_port *port)
>>
>>         if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
>>                 unsigned char iir1;
>> +
>>                 /*
>>                  * Test for UARTs that do not reassert THRE when the
>>                  * transmitter is idle and the interrupt has already
>> @@ -2284,8 +2285,6 @@ int serial8250_do_startup(struct uart_port *port)
>>                  * allow register changes to become visible.
>>                  */
>>                 spin_lock_irqsave(&port->lock, flags);
>> -               if (up->port.irqflags & IRQF_SHARED)
>> -                       disable_irq_nosync(port->irq);
>>
>>                 wait_for_xmitr(up, UART_LSR_THRE);
>>                 serial_port_out_sync(port, UART_IER, UART_IER_THRI);
>> @@ -2297,8 +2296,6 @@ int serial8250_do_startup(struct uart_port *port)
>>                 iir = serial_port_in(port, UART_IIR);
>>                 serial_port_out(port, UART_IER, 0);
>>
>> -               if (port->irqflags & IRQF_SHARED)
>> -                       enable_irq(port->irq);
>>                 spin_unlock_irqrestore(&port->lock, flags);
>>
>>                 /*
>
> ...which effectively is a revert of
>
> 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and
> RT kernels")

Please, don't revert that commit. I've faced the same issue as described
in the commit log. There is hardware available with shared UART
interrupt lines.

Thanks,
Kurt

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-06 11:31                       ` Kurt Kanzenbach
@ 2020-07-06 14:43                         ` Sergey Senozhatsky
  2020-07-08  7:40                           ` Kurt Kanzenbach
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-06 14:43 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: Andy Shevchenko, Sergey Senozhatsky, Tony Lindgren, Petr Mladek,
	Raul Rangel, Sergey Senozhatsky, linux-kernel,
	Greg Kroah-Hartman, Andy Shevchenko, S, Shirish, Peter Zijlstra,
	John Ogness, Steven Rostedt

On (20/07/06 13:31), Kurt Kanzenbach wrote:
> >> @@ -2275,6 +2275,7 @@ int serial8250_do_startup(struct uart_port *port)
> >>
> >>         if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> >>                 unsigned char iir1;
> >> +
> >>                 /*
> >>                  * Test for UARTs that do not reassert THRE when the
> >>                  * transmitter is idle and the interrupt has already
> >> @@ -2284,8 +2285,6 @@ int serial8250_do_startup(struct uart_port *port)
> >>                  * allow register changes to become visible.
> >>                  */
> >>                 spin_lock_irqsave(&port->lock, flags);
> >> -               if (up->port.irqflags & IRQF_SHARED)
> >> -                       disable_irq_nosync(port->irq);
> >>
> >>                 wait_for_xmitr(up, UART_LSR_THRE);
> >>                 serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> >> @@ -2297,8 +2296,6 @@ int serial8250_do_startup(struct uart_port *port)
> >>                 iir = serial_port_in(port, UART_IIR);
> >>                 serial_port_out(port, UART_IER, 0);
> >>
> >> -               if (port->irqflags & IRQF_SHARED)
> >> -                       enable_irq(port->irq);
> >>                 spin_unlock_irqrestore(&port->lock, flags);
> >>
> >>                 /*
> >
> > ...which effectively is a revert of
> >
> > 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and
> > RT kernels")
> 
> Please, don't revert that commit. I've faced the same issue as described
> in the commit log. There is hardware available with shared UART
> interrupt lines.

Will this patch break that hardware?
https://lore.kernel.org/lkml/20200702051213.GB3450@jagdpanzerIV.localdomain/

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-06 14:43                         ` Sergey Senozhatsky
@ 2020-07-08  7:40                           ` Kurt Kanzenbach
  2020-07-08  8:07                             ` Sergey Senozhatsky
  0 siblings, 1 reply; 31+ messages in thread
From: Kurt Kanzenbach @ 2020-07-08  7:40 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andy Shevchenko, Sergey Senozhatsky, Tony Lindgren, Petr Mladek,
	Raul Rangel, Sergey Senozhatsky, linux-kernel,
	Greg Kroah-Hartman, Andy Shevchenko, S, Shirish, Peter Zijlstra,
	John Ogness, Steven Rostedt


[-- Attachment #1: Type: text/plain, Size: 2319 bytes --]

On Mon Jul 06 2020, Sergey Senozhatsky wrote:
> On (20/07/06 13:31), Kurt Kanzenbach wrote:
>> >> @@ -2275,6 +2275,7 @@ int serial8250_do_startup(struct uart_port *port)
>> >>
>> >>         if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
>> >>                 unsigned char iir1;
>> >> +
>> >>                 /*
>> >>                  * Test for UARTs that do not reassert THRE when the
>> >>                  * transmitter is idle and the interrupt has already
>> >> @@ -2284,8 +2285,6 @@ int serial8250_do_startup(struct uart_port *port)
>> >>                  * allow register changes to become visible.
>> >>                  */
>> >>                 spin_lock_irqsave(&port->lock, flags);
>> >> -               if (up->port.irqflags & IRQF_SHARED)
>> >> -                       disable_irq_nosync(port->irq);
>> >>
>> >>                 wait_for_xmitr(up, UART_LSR_THRE);
>> >>                 serial_port_out_sync(port, UART_IER, UART_IER_THRI);
>> >> @@ -2297,8 +2296,6 @@ int serial8250_do_startup(struct uart_port *port)
>> >>                 iir = serial_port_in(port, UART_IIR);
>> >>                 serial_port_out(port, UART_IER, 0);
>> >>
>> >> -               if (port->irqflags & IRQF_SHARED)
>> >> -                       enable_irq(port->irq);
>> >>                 spin_unlock_irqrestore(&port->lock, flags);
>> >>
>> >>                 /*
>> >
>> > ...which effectively is a revert of
>> >
>> > 768aec0b5bcc ("serial: 8250: fix shared interrupts issues with SMP and
>> > RT kernels")
>> 
>> Please, don't revert that commit. I've faced the same issue as described
>> in the commit log. There is hardware available with shared UART
>> interrupt lines.
>
> Will this patch break that hardware?
> https://lore.kernel.org/lkml/20200702051213.GB3450@jagdpanzerIV.localdomain/

I'm not sure how this patch will help with the situation. Because at the
point of that THRE test the irq handler isn't registered. It's
registered a few lines below (up->ops->setup_irq()) meaning the irq line
has to be disabled if shared. Otherwise the kernel might detect a
spurious irq and disables it. That's at least my understanding of the
problem (see commit message from 54e53b2e8081 ("tty: serial: 8250: pass
IRQ shared flag to UART ports")).

Thanks,
Kurt

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-08  7:40                           ` Kurt Kanzenbach
@ 2020-07-08  8:07                             ` Sergey Senozhatsky
  2020-07-08  9:52                               ` Petr Mladek
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-08  8:07 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: Sergey Senozhatsky, Andy Shevchenko, Tony Lindgren, Petr Mladek,
	Raul Rangel, Sergey Senozhatsky, linux-kernel,
	Greg Kroah-Hartman, Andy Shevchenko, S, Shirish, Peter Zijlstra,
	John Ogness, Steven Rostedt

On (20/07/08 09:40), Kurt Kanzenbach wrote:
> I'm not sure how this patch will help with the situation. Because at the
> point of that THRE test the irq handler isn't registered. It's
> registered a few lines below (up->ops->setup_irq()) meaning the irq line
> has to be disabled if shared. Otherwise the kernel might detect a
> spurious irq and disables it. That's at least my understanding of the
> problem (see commit message from 54e53b2e8081 ("tty: serial: 8250: pass
> IRQ shared flag to UART ports")).

So the only remaining approach then is to move
disable_irq_nosync()/enable_irq() out of port->lock
scope.

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-08  8:07                             ` Sergey Senozhatsky
@ 2020-07-08  9:52                               ` Petr Mladek
  2020-07-09 13:22                                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 31+ messages in thread
From: Petr Mladek @ 2020-07-08  9:52 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Kurt Kanzenbach, Andy Shevchenko, Tony Lindgren, Raul Rangel,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman,
	Andy Shevchenko, S, Shirish, Peter Zijlstra, John Ogness,
	Steven Rostedt

On Wed 2020-07-08 17:07:12, Sergey Senozhatsky wrote:
> On (20/07/08 09:40), Kurt Kanzenbach wrote:
> > I'm not sure how this patch will help with the situation. Because at the
> > point of that THRE test the irq handler isn't registered. It's
> > registered a few lines below (up->ops->setup_irq()) meaning the irq line
> > has to be disabled if shared. Otherwise the kernel might detect a
> > spurious irq and disables it. That's at least my understanding of the
> > problem (see commit message from 54e53b2e8081 ("tty: serial: 8250: pass
> > IRQ shared flag to UART ports")).
> 
> So the only remaining approach then is to move
> disable_irq_nosync()/enable_irq() out of port->lock
> scope.

The ultimate solution is the offload of printk console handling to
kthreads. I know that it takes ages to get it in. But it will
also take ages to go around all the cyclic dependencies.

People tend to use printk() anywhere. And any lock taken by
printk() is prone to cause these problems.

We have tried to go around this in the scheduler code by
printk_deferred() and it was never-ending story. I believe
that the port->lock related cycles are another never-ending
story. And attempts to shuffle locks are always complicated
and error-prone. They require deep understanding of the affected
code paths. The approach with printk_deferred() has been pretty
safe at least.

Another approach might be to teach lockdep to remove lock chains
that were taken during the system initialization and will never
happen again. Or something like this.

I still believe that this is a false positive. Console that is being
initialized can't be used by printk() at the same moment. Locks taken
only when the console is being initialized can't cause deadlocks
when the fully initialized console is registered later.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-08  9:52                               ` Petr Mladek
@ 2020-07-09 13:22                                 ` Sergey Senozhatsky
  2020-07-14 16:16                                   ` Raul Rangel
  0 siblings, 1 reply; 31+ messages in thread
From: Sergey Senozhatsky @ 2020-07-09 13:22 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Kurt Kanzenbach, Andy Shevchenko,
	Tony Lindgren, Raul Rangel, Sergey Senozhatsky, linux-kernel,
	Greg Kroah-Hartman, Andy Shevchenko, S, Shirish, Peter Zijlstra,
	John Ogness, Steven Rostedt

On (20/07/08 11:52), Petr Mladek wrote:
> 
> I still believe that this is a false positive.

I've no problems with that. The thing that I'm trying to improve here is the
fact that from lockdep's point of view there are no false positives. lockdep
reports the problem and then waves good bye. It disables itself. There might
be some real suspend locking problems on some particular setups, but it's
hard to tell, because console startup blows up the lockdep first.

	-ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: UART/TTY console deadlock
  2020-07-09 13:22                                 ` Sergey Senozhatsky
@ 2020-07-14 16:16                                   ` Raul Rangel
  0 siblings, 0 replies; 31+ messages in thread
From: Raul Rangel @ 2020-07-14 16:16 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Kurt Kanzenbach, Andy Shevchenko, Tony Lindgren,
	Sergey Senozhatsky, linux-kernel, Greg Kroah-Hartman,
	Andy Shevchenko, S, Shirish, Peter Zijlstra, John Ogness,
	Steven Rostedt

Hey everyone,
So it looks like this was a red herring. We were getting spurious
interrupts which I suspect caused the console code to get called in
some critical section. Once we fixed the spurious interrupts we no
longer see the issue. Sorry for the noise.

Thanks,
Raul

On Thu, Jul 9, 2020 at 7:22 AM Sergey Senozhatsky
<sergey.senozhatsky@gmail.com> wrote:
>
> On (20/07/08 11:52), Petr Mladek wrote:
> >
> > I still believe that this is a false positive.
>
> I've no problems with that. The thing that I'm trying to improve here is the
> fact that from lockdep's point of view there are no false positives. lockdep
> reports the problem and then waves good bye. It disables itself. There might
> be some real suspend locking problems on some particular setups, but it's
> hard to tell, because console startup blows up the lockdep first.
>
>         -ss

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, back to index

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-22 17:30 UART/TTY console deadlock Raul Rangel
2020-06-22 17:37 ` Andy Shevchenko
2020-06-30  3:58   ` Sergey Senozhatsky
2020-06-30 10:21     ` Petr Mladek
2020-06-30 10:55       ` Sergey Senozhatsky
2020-06-30 11:40         ` Andy Shevchenko
2020-06-30 12:22         ` Petr Mladek
2020-06-30 13:05           ` Sergey Senozhatsky
2020-06-30 18:02             ` Tony Lindgren
2020-07-01  6:44               ` S, Shirish
2020-07-02  3:48                 ` S, Shirish
2020-07-02  6:11                   ` Greg Kroah-Hartman
2020-07-02  6:14                     ` S, Shirish
2020-07-02  6:34                       ` Sergey Senozhatsky
2020-07-02  7:11                       ` Greg Kroah-Hartman
2020-07-02  5:12               ` Sergey Senozhatsky
2020-07-02  5:40                 ` Sergey Senozhatsky
2020-07-02  8:20                   ` Andy Shevchenko
2020-07-03 10:53                     ` Sergey Senozhatsky
2020-07-04 11:37                       ` Andy Shevchenko
2020-07-02 16:05                 ` Tony Lindgren
2020-07-03 10:32                   ` Sergey Senozhatsky
2020-07-04 11:35                     ` Andy Shevchenko
2020-07-04 11:59                       ` Andy Shevchenko
2020-07-06 11:31                       ` Kurt Kanzenbach
2020-07-06 14:43                         ` Sergey Senozhatsky
2020-07-08  7:40                           ` Kurt Kanzenbach
2020-07-08  8:07                             ` Sergey Senozhatsky
2020-07-08  9:52                               ` Petr Mladek
2020-07-09 13:22                                 ` Sergey Senozhatsky
2020-07-14 16:16                                   ` Raul Rangel

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git