* Re: netconsole deadlock with virtnet
       [not found]       ` <93b42091-66f2-bb92-6822-473167b2698d@redhat.com>
@ 2020-11-18 14:12         ` Steven Rostedt
  2020-11-23 11:08           ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2020-11-18 14:12 UTC (permalink / raw)
  To: Jason Wang
  Cc: Sergey Senozhatsky, Leon Romanovsky, Michael S. Tsirkin,
	Petr Mladek, John Ogness, virtualization, Amit Shah,
	Itay Aveksis, Ran Rozenstein, netdev


[ Adding netdev as perhaps someone there knows ]

On Wed, 18 Nov 2020 12:09:59 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > This CPU0 lock(_xmit_ETHER#2) -> hard IRQ -> lock(console_owner) is
> > basically
> > 	soft IRQ -> lock(_xmit_ETHER#2) -> hard IRQ -> printk()
> >
> > Then CPU1 spins on xmit, which is owned by CPU0, CPU0 spins on
> > console_owner, which is owned by CPU1?  

It still looks to me like target_list_lock is taken in IRQ context (which can
be the case because printk calls write_msg(), which takes that lock). And
someplace there's a:

	lock(target_list_lock)
	lock(xmit_lock)

which means you can remove the console lock from this scenario completely,
and you still have a possible deadlock between target_list_lock and
xmit_lock.

> 
> 
> If this is true, it looks not a virtio-net specific issue but somewhere 
> else.
> 
> I think all network driver will synchronize through bh instead of hardirq.

I think the issue is where target_list_lock is held when we take xmit_lock.
Is there anywhere in netconsole.c that can end up taking xmit_lock while
holding the target_list_lock? If so, that's the problem. target_list_lock
is something that can be taken in IRQ context, which means *any* other lock
that is taken while holding the target_list_lock must also protect against
interrupts from happening while it is held.
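
As a minimal illustration of that rule (a generic sketch with placeholder
locks A and B, not code from netconsole or virtio-net):

	static DEFINE_SPINLOCK(A);	/* can be taken from hard IRQ, like target_list_lock */
	static DEFINE_SPINLOCK(B);	/* some other lock, like the device xmit_lock */

	static void irq_path(void)	/* e.g. IRQ -> printk() -> write_msg() */
	{
		unsigned long flags;

		spin_lock_irqsave(&A, flags);
		spin_lock(&B);		/* B now nests inside an IRQ-taken lock */
		spin_unlock(&B);
		spin_unlock_irqrestore(&A, flags);
	}

	static void other_path(void)	/* e.g. a NAPI poll, IRQs still enabled */
	{
		spin_lock(&B);
		/* An interrupt arriving here can run irq_path() on this CPU
		 * and spin forever on B: that is the reported deadlock. */
		spin_unlock(&B);
	}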

-- Steve


* Re: netconsole deadlock with virtnet
  2020-11-18 14:12         ` netconsole deadlock with virtnet Steven Rostedt
@ 2020-11-23 11:08           ` Leon Romanovsky
  2020-11-23 14:31             ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2020-11-23 11:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jason Wang, Sergey Senozhatsky, Michael S. Tsirkin, Petr Mladek,
	John Ogness, virtualization, Amit Shah, Itay Aveksis,
	Ran Rozenstein, netdev

On Wed, Nov 18, 2020 at 09:12:57AM -0500, Steven Rostedt wrote:
>
> [ Adding netdev as perhaps someone there knows ]
>
> On Wed, 18 Nov 2020 12:09:59 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > > This CPU0 lock(_xmit_ETHER#2) -> hard IRQ -> lock(console_owner) is
> > > basically
> > > 	soft IRQ -> lock(_xmit_ETHER#2) -> hard IRQ -> printk()
> > >
> > > Then CPU1 spins on xmit, which is owned by CPU0, CPU0 spins on
> > > console_owner, which is owned by CPU1?
>
> It still looks to me that the target_list_lock is taken in IRQ, (which can
> be the case because printk calls write_msg() which takes that lock). And
> someplace there's a:
>
> 	lock(target_list_lock)
> 	lock(xmit_lock)
>
> which means you can remove the console lock from this scenario completely,
> and you still have a possible deadlock between target_list_lock and
> xmit_lock.
>
> >
> >
> > If this is true, it looks not a virtio-net specific issue but somewhere
> > else.
> >
> > I think all network driver will synchronize through bh instead of hardirq.
>
> I think the issue is where target_list_lock is held when we take xmit_lock.
> Is there anywhere in netconsole.c that can end up taking xmit_lock while
> holding the target_list_lock? If so, that's the problem. As
> target_list_lock is something that can be taken in IRQ context, which means
> *any* other lock that is taking while holding the target_list_lock must
> also protect against interrupts from happening while it they are held.

I increased the printk buffer as Petr suggested and the splat is below.
It doesn't happen on x86, but it does on ARM64 and ppc64.

 [   10.027975] =====================================================
 [   10.027976] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
 [   10.027976] 5.10.0-rc4_for_upstream_min_debug_2020_11_22_19_37 #1 Not tainted
 [   10.027977] -----------------------------------------------------
 [   10.027978] modprobe/638 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 [   10.027979] ffff0000c9f63c98 (_xmit_ETHER#2){+.-.}-{2:2}, at: virtnet_poll_tx+0x84/0x120
 [   10.027982]
 [   10.027982] and this task is already holding:
 [   10.027983] ffff800009007018 (target_list_lock){....}-{2:2}, at: write_msg+0x6c/0x120 [netconsole]
 [   10.027985] which would create a new lock dependency:
 [   10.027985]  (target_list_lock){....}-{2:2} -> (_xmit_ETHER#2){+.-.}-{2:2}
 [   10.027989]
 [   10.027989] but this new dependency connects a HARDIRQ-irq-safe lock:
 [   10.027990]  (console_owner){-...}-{0:0}
 [   10.027991]
 [   10.027992] ... which became HARDIRQ-irq-safe at:
 [   10.027992]   __lock_acquire+0xa78/0x1a94
 [   10.027993]   lock_acquire.part.0+0x170/0x360
 [   10.027993]   lock_acquire+0x68/0x8c
 [   10.027994]   console_unlock+0x1e8/0x6a4
 [   10.027994]   vprintk_emit+0x1c4/0x3c4
 [   10.027995]   vprintk_default+0x40/0x4c
 [   10.027995]   vprintk_func+0x10c/0x220
 [   10.027995]   printk+0x68/0x90
 [   10.027996]   crng_fast_load+0x1bc/0x1c0
 [   10.027997]   add_interrupt_randomness+0x280/0x290
 [   10.027997]   handle_irq_event+0x80/0x120
 [   10.027997]   handle_fasteoi_irq+0xac/0x200
 [   10.027998]   __handle_domain_irq+0x84/0xf0
 [   10.027999]   gic_handle_irq+0xd4/0x320
 [   10.027999]   el1_irq+0xd0/0x180
 [   10.028000]   arch_cpu_idle+0x24/0x44
 [   10.028000]   default_idle_call+0x48/0xa0
 [   10.028001]   do_idle+0x260/0x300
 [   10.028001]   cpu_startup_entry+0x30/0x6c
 [   10.028001]   rest_init+0x1b4/0x288
 [   10.028002]   arch_call_rest_init+0x18/0x24
 [   10.028002]   start_kernel+0x5cc/0x608
 [   10.028003]
 [   10.028003] to a HARDIRQ-irq-unsafe lock:
 [   10.028004]  (_xmit_ETHER#2){+.-.}-{2:2}
 [   10.028005]
 [   10.028006] ... which became HARDIRQ-irq-unsafe at:
 [   10.028006] ...  __lock_acquire+0x8bc/0x1a94
 [   10.028007]   lock_acquire.part.0+0x170/0x360
 [   10.028007]   lock_acquire+0x68/0x8c
 [   10.028008]   _raw_spin_trylock+0x80/0xd0
 [   10.028008]   virtnet_poll+0xac/0x360
 [   10.028009]   net_rx_action+0x1b0/0x4e0
 [   10.028010]   __do_softirq+0x1f4/0x638
 [   10.028010]   do_softirq+0xb8/0xcc
 [   10.028010]   __local_bh_enable_ip+0x18c/0x200
 [   10.028011]   virtnet_napi_enable+0xc0/0xd4
 [   10.028011]   virtnet_open+0x98/0x1c0
 [   10.028012]   __dev_open+0x12c/0x200
 [   10.028013]   __dev_change_flags+0x1a0/0x220
 [   10.028013]   dev_change_flags+0x2c/0x70
 [   10.028014]   do_setlink+0x214/0xe20
 [   10.028014]   __rtnl_newlink+0x514/0x820
 [   10.028015]   rtnl_newlink+0x58/0x84
 [   10.028015]   rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028016]   netlink_rcv_skb+0x60/0x124
 [   10.028016]   rtnetlink_rcv+0x20/0x30
 [   10.028017]   netlink_unicast+0x1b4/0x270
 [   10.028017]   netlink_sendmsg+0x1f0/0x400
 [   10.028018]   sock_sendmsg+0x5c/0x70
 [   10.028018]   ____sys_sendmsg+0x24c/0x280
 [   10.028019]   ___sys_sendmsg+0x88/0xd0
 [   10.028019]   __sys_sendmsg+0x70/0xd0
 [   10.028020]   __arm64_sys_sendmsg+0x2c/0x40
 [   10.028021]   el0_svc_common.constprop.0+0x84/0x200
 [   10.028021]   do_el0_svc+0x2c/0x90
 [   10.028021]   el0_svc+0x18/0x50
 [   10.028022]   el0_sync_handler+0xe0/0x350
 [   10.028023]   el0_sync+0x158/0x180
 [   10.028023]
 [   10.028023] other info that might help us debug this:
 [   10.028024]
 [   10.028024] Chain exists of:
 [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2
 [   10.028028]
 [   10.028028]  Possible interrupt unsafe locking scenario:
 [   10.028029]
 [   10.028029]        CPU0                    CPU1
 [   10.028030]        ----                    ----
 [   10.028030]   lock(_xmit_ETHER#2);
 [   10.028032]                                local_irq_disable();
 [   10.028032]                                lock(console_owner);
 [   10.028034]                                lock(target_list_lock);
 [   10.028035]   <Interrupt>
 [   10.028035]     lock(console_owner);
 [   10.028036]
 [   10.028037]  *** DEADLOCK ***
 [   10.028037]
 [   10.028038] 3 locks held by modprobe/638:
 [   10.028038]  #0: ffff800011e1efe0 (console_lock){+.+.}-{0:0}, at: register_console+0x144/0x2f4
 [   10.028040]  #1: ffff800011e1f108 (console_owner){-...}-{0:0}, at: console_unlock+0x17c/0x6a4
 [   10.028043]  #2: ffff800009007018 (target_list_lock){....}-{2:2}, at: write_msg+0x6c/0x120 [netconsole]
 [   10.028045]
 [   10.028046] the dependencies between HARDIRQ-irq-safe lock and the holding lock:
 [   10.028046]  -> (console_owner){-...}-{0:0} ops: 1574 {
 [   10.028049]     IN-HARDIRQ-W at:
 [   10.028050]                          __lock_acquire+0xa78/0x1a94
 [   10.028050]                          lock_acquire.part.0+0x170/0x360
 [   10.028051]                          lock_acquire+0x68/0x8c
 [   10.028051]                          console_unlock+0x1e8/0x6a4
 [   10.028052]                          vprintk_emit+0x1c4/0x3c4
 [   10.028052]                          vprintk_default+0x40/0x4c
 [   10.028053]                          vprintk_func+0x10c/0x220
 [   10.028054]                          printk+0x68/0x90
 [   10.028054]                          crng_fast_load+0x1bc/0x1c0
 [   10.028055]                          add_interrupt_randomness+0x280/0x290
 [   10.028056]                          handle_irq_event+0x80/0x120
 [   10.028056]                          handle_fasteoi_irq+0xac/0x200
 [   10.028057]                          __handle_domain_irq+0x84/0xf0
 [   10.028057]                          gic_handle_irq+0xd4/0x320
 [   10.028058]                          el1_irq+0xd0/0x180
 [   10.028058]                          arch_cpu_idle+0x24/0x44
 [   10.028059]                          default_idle_call+0x48/0xa0
 [   10.028060]                          do_idle+0x260/0x300
 [   10.028061]                          cpu_startup_entry+0x30/0x6c
 [   10.028061]                          rest_init+0x1b4/0x288
 [   10.028062]                          arch_call_rest_init+0x18/0x24
 [   10.028062]                          start_kernel+0x5cc/0x608
 [   10.028063]     INITIAL USE at:
 [   10.028064]                         __lock_acquire+0x2e0/0x1a94
 [   10.028064]                         lock_acquire.part.0+0x170/0x360
 [   10.028065]                         lock_acquire+0x68/0x8c
 [   10.028066]                         console_unlock+0x1e8/0x6a4
 [   10.028067]                         vprintk_emit+0x1c4/0x3c4
 [   10.028067]                         vprintk_default+0x40/0x4c
 [   10.028068]                         vprintk_func+0x10c/0x220
 [   10.028068]                         printk+0x68/0x90
 [   10.028069]                         start_kernel+0x8c/0x608
 [   10.028069]   }
 [   10.028070]   ... key      at: [<ffff800011e1f108>] console_owner_dep_map+0x0/0x28
 [   10.028071]   ... acquired at:
 [   10.028071]    lock_acquire.part.0+0x170/0x360
 [   10.028072]    lock_acquire+0x68/0x8c
 [   10.028072]    _raw_spin_lock_irqsave+0x88/0x15c
 [   10.028073]    write_msg+0x6c/0x120 [netconsole]
 [   10.028073]    console_unlock+0x3ec/0x6a4
 [   10.028074]    register_console+0x17c/0x2f4
 [   10.028075]    init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028075]    do_one_initcall+0x8c/0x480
 [   10.028076]    do_init_module+0x60/0x270
 [   10.028076]    load_module+0x21f8/0x2734
 [   10.028077]    __do_sys_finit_module+0xbc/0x12c
 [   10.028077]    __arm64_sys_finit_module+0x28/0x34
 [   10.028078]    el0_svc_common.constprop.0+0x84/0x200
 [   10.028078]    do_el0_svc+0x2c/0x90
 [   10.028079]    el0_svc+0x18/0x50
 [   10.028079]    el0_sync_handler+0xe0/0x350
 [   10.028080]    el0_sync+0x158/0x180
 [   10.028080]
 [   10.028081] -> (target_list_lock){....}-{2:2} ops: 34 {
 [   10.028083]    INITIAL USE at:
 [   10.028084]                       __lock_acquire+0x2e0/0x1a94
 [   10.028084]                       lock_acquire.part.0+0x170/0x360
 [   10.028085]                       lock_acquire+0x68/0x8c
 [   10.028085]                       _raw_spin_lock_irqsave+0x88/0x15c
 [   10.028086]                       init_netconsole+0x148/0x1000 [netconsole]
 [   10.028087]                       do_one_initcall+0x8c/0x480
 [   10.028087]                       do_init_module+0x60/0x270
 [   10.028088]                       load_module+0x21f8/0x2734
 [   10.028088]                       __do_sys_finit_module+0xbc/0x12c
 [   10.028089]                       __arm64_sys_finit_module+0x28/0x34
 [   10.028090]                       el0_svc_common.constprop.0+0x84/0x200
 [   10.028090]                       do_el0_svc+0x2c/0x90
 [   10.028091]                       el0_svc+0x18/0x50
 [   10.028092]                       el0_sync_handler+0xe0/0x350
 [   10.028092]                       el0_sync+0x158/0x180
 [   10.028093]  }
 [   10.028093]  ... key      at: [<ffff800009007018>] target_list_lock+0x18/0xfffffffffffff000 [netconsole]
 [   10.028094]  ... acquired at:
 [   10.028094]    __lock_acquire+0x134c/0x1a94
 [   10.028095]    lock_acquire.part.0+0x170/0x360
 [   10.028095]    lock_acquire+0x68/0x8c
 [   10.028096]    _raw_spin_lock+0x64/0x90
 [   10.028096]    virtnet_poll_tx+0x84/0x120
 [   10.028097]    netpoll_poll_dev+0x12c/0x350
 [   10.028097]    netpoll_send_skb+0x39c/0x400
 [   10.028098]    netpoll_send_udp+0x2b8/0x440
 [   10.028098]    write_msg+0xfc/0x120 [netconsole]
 [   10.028099]    console_unlock+0x3ec/0x6a4
 [   10.028100]    register_console+0x17c/0x2f4
 [   10.028100]    init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028101]    do_one_initcall+0x8c/0x480
 [   10.028101]    do_init_module+0x60/0x270
 [   10.028102]    load_module+0x21f8/0x2734
 [   10.028102]    __do_sys_finit_module+0xbc/0x12c
 [   10.028103]    __arm64_sys_finit_module+0x28/0x34
 [   10.028103]    el0_svc_common.constprop.0+0x84/0x200
 [   10.028104]    do_el0_svc+0x2c/0x90
 [   10.028104]    el0_svc+0x18/0x50
 [   10.028105]    el0_sync_handler+0xe0/0x350
 [   10.028105]    el0_sync+0x158/0x180
 [   10.028106]
 [   10.028106]
 [   10.028107] the dependencies between the lock to be acquired
 [   10.028107]  and HARDIRQ-irq-unsafe lock:
 [   10.028108] -> (_xmit_ETHER#2){+.-.}-{2:2} ops: 217 {
 [   10.028110]    HARDIRQ-ON-W at:
 [   10.028111]                        __lock_acquire+0x8bc/0x1a94
 [   10.028111]                        lock_acquire.part.0+0x170/0x360
 [   10.028112]                        lock_acquire+0x68/0x8c
 [   10.028113]                        _raw_spin_trylock+0x80/0xd0
 [   10.028113]                        virtnet_poll+0xac/0x360
 [   10.028114]                        net_rx_action+0x1b0/0x4e0
 [   10.028115]                        __do_softirq+0x1f4/0x638
 [   10.028115]                        do_softirq+0xb8/0xcc
 [   10.028116]                        __local_bh_enable_ip+0x18c/0x200
 [   10.028116]                        virtnet_napi_enable+0xc0/0xd4
 [   10.028117]                        virtnet_open+0x98/0x1c0
 [   10.028118]                        __dev_open+0x12c/0x200
 [   10.028118]                        __dev_change_flags+0x1a0/0x220
 [   10.028119]                        dev_change_flags+0x2c/0x70
 [   10.028119]                        do_setlink+0x214/0xe20
 [   10.028120]                        __rtnl_newlink+0x514/0x820
 [   10.028120]                        rtnl_newlink+0x58/0x84
 [   10.028121]                        rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028122]                        netlink_rcv_skb+0x60/0x124
 [   10.028122]                        rtnetlink_rcv+0x20/0x30
 [   10.028123]                        netlink_unicast+0x1b4/0x270
 [   10.028124]                        netlink_sendmsg+0x1f0/0x400
 [   10.028124]                        sock_sendmsg+0x5c/0x70
 [   10.028125]                        ____sys_sendmsg+0x24c/0x280
 [   10.028125]                        ___sys_sendmsg+0x88/0xd0
 [   10.028126]                        __sys_sendmsg+0x70/0xd0
 [   10.028127]                        __arm64_sys_sendmsg+0x2c/0x40
 [   10.028128]                        el0_svc_common.constprop.0+0x84/0x200
 [   10.028128]                        do_el0_svc+0x2c/0x90
 [   10.028129]                        el0_svc+0x18/0x50
 [   10.028129]                        el0_sync_handler+0xe0/0x350
 [   10.028130]                        el0_sync+0x158/0x180
 [   10.028130]    IN-SOFTIRQ-W at:
 [   10.028131]                        __lock_acquire+0x894/0x1a94
 [   10.028132]                        lock_acquire.part.0+0x170/0x360
 [   10.028132]                        lock_acquire+0x68/0x8c
 [   10.028133]                        _raw_spin_lock+0x64/0x90
 [   10.028134]                        virtnet_poll_tx+0x84/0x120
 [   10.028134]                        net_rx_action+0x1b0/0x4e0
 [   10.028135]                        __do_softirq+0x1f4/0x638
 [   10.028135]                        do_softirq+0xb8/0xcc
 [   10.028136]                        __local_bh_enable_ip+0x18c/0x200
 [   10.028137]                        virtnet_napi_enable+0xc0/0xd4
 [   10.028137]                        virtnet_open+0x14c/0x1c0
 [   10.028138]                        __dev_open+0x12c/0x200
 [   10.028138]                        __dev_change_flags+0x1a0/0x220
 [   10.028139]                        dev_change_flags+0x2c/0x70
 [   10.028140]                        do_setlink+0x214/0xe20
 [   10.028140]                        __rtnl_newlink+0x514/0x820
 [   10.028141]                        rtnl_newlink+0x58/0x84
 [   10.028141]                        rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028142]                        netlink_rcv_skb+0x60/0x124
 [   10.028142]                        rtnetlink_rcv+0x20/0x30
 [   10.028143]                        netlink_unicast+0x1b4/0x270
 [   10.028144]                        netlink_sendmsg+0x1f0/0x400
 [   10.028144]                        sock_sendmsg+0x5c/0x70
 [   10.028145]                        ____sys_sendmsg+0x24c/0x280
 [   10.028146]                        ___sys_sendmsg+0x88/0xd0
 [   10.028146]                        __sys_sendmsg+0x70/0xd0
 [   10.028147]                        __arm64_sys_sendmsg+0x2c/0x40
 [   10.028148]                        el0_svc_common.constprop.0+0x84/0x200
 [   10.028148]                        do_el0_svc+0x2c/0x90
 [   10.028149]                        el0_svc+0x18/0x50
 [   10.028149]                        el0_sync_handler+0xe0/0x350
 [   10.028150]                        el0_sync+0x158/0x180
 [   10.028150]    INITIAL USE at:
 [   10.028151]                       __lock_acquire+0x2e0/0x1a94
 [   10.028152]                       lock_acquire.part.0+0x170/0x360
 [   10.028153]                       lock_acquire+0x68/0x8c
 [   10.028153]                       _raw_spin_trylock+0x80/0xd0
 [   10.028154]                       virtnet_poll+0xac/0x360
 [   10.028154]                       net_rx_action+0x1b0/0x4e0
 [   10.028155]                       __do_softirq+0x1f4/0x638
 [   10.028155]                       do_softirq+0xb8/0xcc
 [   10.028156]                       __local_bh_enable_ip+0x18c/0x200
 [   10.028157]                       virtnet_napi_enable+0xc0/0xd4
 [   10.028157]                       virtnet_open+0x98/0x1c0
 [   10.028158]                       __dev_open+0x12c/0x200
 [   10.028158]                       __dev_change_flags+0x1a0/0x220
 [   10.028159]                       dev_change_flags+0x2c/0x70
 [   10.028159]                       do_setlink+0x214/0xe20
 [   10.028160]                       __rtnl_newlink+0x514/0x820
 [   10.028161]                       rtnl_newlink+0x58/0x84
 [   10.028161]                       rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028162]                       netlink_rcv_skb+0x60/0x124
 [   10.028162]                       rtnetlink_rcv+0x20/0x30
 [   10.028163]                       netlink_unicast+0x1b4/0x270
 [   10.028163]                       netlink_sendmsg+0x1f0/0x400
 [   10.028164]                       sock_sendmsg+0x5c/0x70
 [   10.028165]                       ____sys_sendmsg+0x24c/0x280
 [   10.028165]                       ___sys_sendmsg+0x88/0xd0
 [   10.028166]                       __sys_sendmsg+0x70/0xd0
 [   10.028166]                       __arm64_sys_sendmsg+0x2c/0x40
 [   10.028167]                       el0_svc_common.constprop.0+0x84/0x200
 [   10.028168]                       do_el0_svc+0x2c/0x90
 [   10.028168]                       el0_svc+0x18/0x50
 [   10.028169]                       el0_sync_handler+0xe0/0x350
 [   10.028169]                       el0_sync+0x158/0x180
 [   10.028170]  }
 [   10.028171]  ... key      at: [<ffff80001312aef8>] netdev_xmit_lock_key+0x10/0x390
 [   10.028171]  ... acquired at:
 [   10.028172]    __lock_acquire+0x134c/0x1a94
 [   10.028172]    lock_acquire.part.0+0x170/0x360
 [   10.028173]    lock_acquire+0x68/0x8c
 [   10.028173]    _raw_spin_lock+0x64/0x90
 [   10.028174]    virtnet_poll_tx+0x84/0x120
 [   10.028174]    netpoll_poll_dev+0x12c/0x350
 [   10.028175]    netpoll_send_skb+0x39c/0x400
 [   10.028175]    netpoll_send_udp+0x2b8/0x440
 [   10.028176]    write_msg+0xfc/0x120 [netconsole]
 [   10.028176]    console_unlock+0x3ec/0x6a4
 [   10.028177]    register_console+0x17c/0x2f4
 [   10.028178]    init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028178]    do_one_initcall+0x8c/0x480
 [   10.028179]    do_init_module+0x60/0x270
 [   10.028179]    load_module+0x21f8/0x2734
 [   10.028180]    __do_sys_finit_module+0xbc/0x12c
 [   10.028180]    __arm64_sys_finit_module+0x28/0x34
 [   10.028181]    el0_svc_common.constprop.0+0x84/0x200
 [   10.028181]    do_el0_svc+0x2c/0x90
 [   10.028182]    el0_svc+0x18/0x50
 [   10.028182]    el0_sync_handler+0xe0/0x350
 [   10.028183]    el0_sync+0x158/0x180
 [   10.028183]
 [   10.028183]
 [   10.028184] stack backtrace:
 [   10.028185] CPU: 14 PID: 638 Comm: modprobe Not tainted 5.10.0-rc4_for_upstream_min_debug_2020_11_22_19_37 #1
 [   10.028186] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
 [   10.028186] Call trace:
 [   10.028186]  dump_backtrace+0x0/0x1d0
 [   10.028187]  show_stack+0x20/0x3c
 [   10.028187]  dump_stack+0xec/0x138
 [   10.028188]  check_irq_usage+0x6b8/0x6cc
 [   10.028188]  __lock_acquire+0x134c/0x1a94
 [   10.028189]  lock_acquire.part.0+0x170/0x360
 [   10.028189]  lock_acquire+0x68/0x8c
 [   10.028190]  _raw_spin_lock+0x64/0x90
 [   10.028191]  virtnet_poll_tx+0x84/0x120
 [   10.028191]  netpoll_poll_dev+0x12c/0x350
 [   10.028192]  netpoll_send_skb+0x39c/0x400
 [   10.028192]  netpoll_send_udp+0x2b8/0x440
 [   10.028193]  write_msg+0xfc/0x120 [netconsole]
 [   10.028193]  console_unlock+0x3ec/0x6a4
 [   10.028194]  register_console+0x17c/0x2f4
 [   10.028194]  init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028195]  do_one_initcall+0x8c/0x480
 [   10.028195]  do_init_module+0x60/0x270
 [   10.028196]  load_module+0x21f8/0x2734
 [   10.028197]  __do_sys_finit_module+0xbc/0x12c
 [   10.028197]  __arm64_sys_finit_module+0x28/0x34
 [   10.028198]  el0_svc_common.constprop.0+0x84/0x200
 [   10.028198]  do_el0_svc+0x2c/0x90
 [   10.028199]  el0_svc+0x18/0x50
 [   10.028199]  el0_sync_handler+0xe0/0x350
 [   10.028200]  el0_sync+0x158/0x180
 [   10.073569] random: crng init done
 [   10.073964] printk: console [netcon0] enabled
 [   10.074704] random: 7 urandom warning(s) missed due to ratelimiting
 [   10.075340] netconsole: network logging started

>
> -- Steve


* Re: netconsole deadlock with virtnet
  2020-11-23 11:08           ` Leon Romanovsky
@ 2020-11-23 14:31             ` Steven Rostedt
  2020-11-23 18:52               ` Jakub Kicinski
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2020-11-23 14:31 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Wang, Sergey Senozhatsky, Michael S. Tsirkin, Petr Mladek,
	John Ogness, virtualization, Amit Shah, Itay Aveksis,
	Ran Rozenstein, netdev

On Mon, 23 Nov 2020 13:08:55 +0200
Leon Romanovsky <leon@kernel.org> wrote:


>  [   10.028024] Chain exists of:
>  [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2

Note, the problem is that we have a location that grabs the xmit_lock while
holding target_list_lock (and possibly console_owner).


>  [   10.028028]
>  [   10.028028]  Possible interrupt unsafe locking scenario:
>  [   10.028029]
>  [   10.028029]        CPU0                    CPU1
>  [   10.028030]        ----                    ----
>  [   10.028030]   lock(_xmit_ETHER#2);
>  [   10.028032]                                local_irq_disable();
>  [   10.028032]                                lock(console_owner);
>  [   10.028034]                                lock(target_list_lock);
>  [   10.028035]   <Interrupt>
>  [   10.028035]     lock(console_owner);
>  [   10.028036]
>  [   10.028037]  *** DEADLOCK ***
>  [   10.028037]



>  [   10.028107] the dependencies between the lock to be acquired
>  [   10.028107]  and HARDIRQ-irq-unsafe lock:
>  [   10.028108] -> (_xmit_ETHER#2){+.-.}-{2:2} ops: 217 {
>  [   10.028110]    HARDIRQ-ON-W at:
>  [   10.028111]                        __lock_acquire+0x8bc/0x1a94
>  [   10.028111]                        lock_acquire.part.0+0x170/0x360
>  [   10.028112]                        lock_acquire+0x68/0x8c
>  [   10.028113]                        _raw_spin_trylock+0x80/0xd0
>  [   10.028113]                        virtnet_poll+0xac/0x360

xmit_lock is taken in virtnet_poll() (via virtnet_poll_cleantx()).

This is called from the softirq, and interrupts are not disabled.

>  [   10.028114]                        net_rx_action+0x1b0/0x4e0
>  [   10.028115]                        __do_softirq+0x1f4/0x638
>  [   10.028115]                        do_softirq+0xb8/0xcc
>  [   10.028116]                        __local_bh_enable_ip+0x18c/0x200
>  [   10.028116]                        virtnet_napi_enable+0xc0/0xd4
>  [   10.028117]                        virtnet_open+0x98/0x1c0
>  [   10.028118]                        __dev_open+0x12c/0x200
>  [   10.028118]                        __dev_change_flags+0x1a0/0x220
>  [   10.028119]                        dev_change_flags+0x2c/0x70
>  [   10.028119]                        do_setlink+0x214/0xe20
>  [   10.028120]                        __rtnl_newlink+0x514/0x820
>  [   10.028120]                        rtnl_newlink+0x58/0x84
>  [   10.028121]                        rtnetlink_rcv_msg+0x184/0x4b4
>  [   10.028122]                        netlink_rcv_skb+0x60/0x124
>  [   10.028122]                        rtnetlink_rcv+0x20/0x30
>  [   10.028123]                        netlink_unicast+0x1b4/0x270
>  [   10.028124]                        netlink_sendmsg+0x1f0/0x400
>  [   10.028124]                        sock_sendmsg+0x5c/0x70
>  [   10.028125]                        ____sys_sendmsg+0x24c/0x280
>  [   10.028125]                        ___sys_sendmsg+0x88/0xd0
>  [   10.028126]                        __sys_sendmsg+0x70/0xd0
>  [   10.028127]                        __arm64_sys_sendmsg+0x2c/0x40
>  [   10.028128]                        el0_svc_common.constprop.0+0x84/0x200
>  [   10.028128]                        do_el0_svc+0x2c/0x90
>  [   10.028129]                        el0_svc+0x18/0x50
>  [   10.028129]                        el0_sync_handler+0xe0/0x350
>  [   10.028130]                        el0_sync+0x158/0x180

[..]

>  [   10.028171]  ... key      at: [<ffff80001312aef8>] netdev_xmit_lock_key+0x10/0x390
>  [   10.028171]  ... acquired at:
>  [   10.028172]    __lock_acquire+0x134c/0x1a94
>  [   10.028172]    lock_acquire.part.0+0x170/0x360
>  [   10.028173]    lock_acquire+0x68/0x8c
>  [   10.028173]    _raw_spin_lock+0x64/0x90
>  [   10.028174]    virtnet_poll_tx+0x84/0x120
>  [   10.028174]    netpoll_poll_dev+0x12c/0x350
>  [   10.028175]    netpoll_send_skb+0x39c/0x400
>  [   10.028175]    netpoll_send_udp+0x2b8/0x440
>  [   10.028176]    write_msg+0xfc/0x120 [netconsole]
>  [   10.028176]    console_unlock+0x3ec/0x6a4

The above shows the problem. We have:

	console_unlock() (which holds the console_owner lock)
	write_msg() (which holds the target_list_lock)

Then write_msg() calls:

	netpoll_send_udp() {
	  netpoll_send_skb() {
	    netpoll_poll_dev() {
	      virtnet_poll_tx() (which takes the xmit_lock!)

  DEADLOCK!


In netpoll_send_skb() I see this:

			/* tickle device maybe there is some cleanup */
			netpoll_poll_dev(np->dev);

Which looks to me like it will call some code that should only be used in
softirq context. It's called with locks held that are taken in interrupt
context, and any locks that are taken in netpoll_poll_dev() must always be
taken with interrupts disabled. That is, if xmit_lock is taken within
netpoll_poll_dev(), then it must always be taken with interrupts disabled.
Otherwise you can have the deadlock that lockdep reported.
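
For reference, this is roughly the netconsole side of that chain (a
simplified sketch of write_msg() from drivers/net/netconsole.c; the real
function also fragments long messages, so details are approximate):

	static void write_msg(struct console *con, const char *msg, unsigned int len)
	{
		struct netconsole_target *nt;
		unsigned long flags;

		/* console_owner is already held by the printk machinery here */
		spin_lock_irqsave(&target_list_lock, flags);
		list_for_each_entry(nt, &target_list, list) {
			if (nt->enabled && netif_running(nt->np.dev))
				/* -> netpoll_send_skb() -> netpoll_poll_dev()
				 * -> virtnet_poll_tx() -> lock(_xmit_ETHER)  */
				netpoll_send_udp(&nt->np, msg, len);
		}
		spin_unlock_irqrestore(&target_list_lock, flags);
	}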

-- Steve




>  [   10.028177]    register_console+0x17c/0x2f4
>  [   10.028178]    init_netconsole+0x20c/0x1000 [netconsole]
>  [   10.028178]    do_one_initcall+0x8c/0x480
>  [   10.028179]    do_init_module+0x60/0x270
>  [   10.028179]    load_module+0x21f8/0x2734
>  [   10.028180]    __do_sys_finit_module+0xbc/0x12c
>  [   10.028180]    __arm64_sys_finit_module+0x28/0x34
>  [   10.028181]    el0_svc_common.constprop.0+0x84/0x200
>  [   10.028181]    do_el0_svc+0x2c/0x90
>  [   10.028182]    el0_svc+0x18/0x50
>  [   10.028182]    el0_sync_handler+0xe0/0x350
>  [   10.028183]    el0_sync+0x158/0x180
>  [   10.028183]
>  [   10.028183]
>  [   10.028184] stack backtrace:
>  [   10.028185] CPU: 14 PID: 638 Comm: modprobe Not tainted 5.10.0-rc4_for_upstream_min_debug_2020_11_22_19_37 #1
>  [   10.028186] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>  [   10.028186] Call trace:
>  [   10.028186]  dump_backtrace+0x0/0x1d0
>  [   10.028187]  show_stack+0x20/0x3c
>  [   10.028187]  dump_stack+0xec/0x138
>  [   10.028188]  check_irq_usage+0x6b8/0x6cc
>  [   10.028188]  __lock_acquire+0x134c/0x1a94
>  [   10.028189]  lock_acquire.part.0+0x170/0x360
>  [   10.028189]  lock_acquire+0x68/0x8c
>  [   10.028190]  _raw_spin_lock+0x64/0x90
>  [   10.028191]  virtnet_poll_tx+0x84/0x120
>  [   10.028191]  netpoll_poll_dev+0x12c/0x350
>  [   10.028192]  netpoll_send_skb+0x39c/0x400
>  [   10.028192]  netpoll_send_udp+0x2b8/0x440
>  [   10.028193]  write_msg+0xfc/0x120 [netconsole]
>  [   10.028193]  console_unlock+0x3ec/0x6a4
>  [   10.028194]  register_console+0x17c/0x2f4
>  [   10.028194]  init_netconsole+0x20c/0x1000 [netconsole]
>  [   10.028195]  do_one_initcall+0x8c/0x480
>  [   10.028195]  do_init_module+0x60/0x270
>  [   10.028196]  load_module+0x21f8/0x2734
>  [   10.028197]  __do_sys_finit_module+0xbc/0x12c
>  [   10.028197]  __arm64_sys_finit_module+0x28/0x34
>  [   10.028198]  el0_svc_common.constprop.0+0x84/0x200
>  [   10.028198]  do_el0_svc+0x2c/0x90
>  [   10.028199]  el0_svc+0x18/0x50
>  [   10.028199]  el0_sync_handler+0xe0/0x350
>  [   10.028200]  el0_sync+0x158/0x180
>  [   10.073569] random: crng init done
>  [   10.073964] printk: console [netcon0] enabled
>  [   10.074704] random: 7 urandom warning(s) missed due to ratelimiting
>  [   10.075340] netconsole: network logging started
> 


* Re: netconsole deadlock with virtnet
  2020-11-23 14:31             ` Steven Rostedt
@ 2020-11-23 18:52               ` Jakub Kicinski
  2020-11-23 19:09                 ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2020-11-23 18:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Leon Romanovsky, Jason Wang, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:
> On Mon, 23 Nov 2020 13:08:55 +0200
> Leon Romanovsky <leon@kernel.org> wrote:
> 
> 
> >  [   10.028024] Chain exists of:
> >  [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2  
> 
> Note, the problem is that we have a location that grabs the xmit_lock while
> holding target_list_lock (and possibly console_owner).

Well, it try_locks the xmit_lock. Does lockdep understand try-locks?

(not that I condone the shenanigans that are going on here)


* Re: netconsole deadlock with virtnet
  2020-11-23 18:52               ` Jakub Kicinski
@ 2020-11-23 19:09                 ` Steven Rostedt
  2020-11-23 19:21                   ` Jakub Kicinski
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2020-11-23 19:09 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Leon Romanovsky, Jason Wang, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Mon, 23 Nov 2020 10:52:52 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:
> > On Mon, 23 Nov 2020 13:08:55 +0200
> > Leon Romanovsky <leon@kernel.org> wrote:
> > 
> >   
> > >  [   10.028024] Chain exists of:
> > >  [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2    
> > 
> > Note, the problem is that we have a location that grabs the xmit_lock while
> > holding target_list_lock (and possibly console_owner).  
> 
> Well, it try_locks the xmit_lock. Does lockdep understand try-locks?
> 
> (not that I condone the shenanigans that are going on here)

Does it?

	virtnet_poll_tx() {
		__netif_tx_lock() {
			spin_lock(&txq->_xmit_lock);

That looks like we can have:


	CPU0		CPU1
	----		----
   lock(xmit_lock)

		    lock(console)
		    lock(target_list_lock)
		    __netif_tx_lock()
		        lock(xmit_lock);

			[BLOCKED]

   <interrupt>
   lock(console)

   [BLOCKED]



 DEADLOCK.


So where is the trylock here?

Perhaps you need the trylock in virtnet_poll_tx()?

-- Steve


* Re: netconsole deadlock with virtnet
  2020-11-23 19:09                 ` Steven Rostedt
@ 2020-11-23 19:21                   ` Jakub Kicinski
  2020-11-24  3:22                     ` Jason Wang
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2020-11-23 19:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Leon Romanovsky, Jason Wang, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Mon, 23 Nov 2020 14:09:34 -0500 Steven Rostedt wrote:
> On Mon, 23 Nov 2020 10:52:52 -0800
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:  
> > > On Mon, 23 Nov 2020 13:08:55 +0200
> > > Leon Romanovsky <leon@kernel.org> wrote:
> > > 
> > >     
> > > >  [   10.028024] Chain exists of:
> > > >  [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2      
> > > 
> > > Note, the problem is that we have a location that grabs the xmit_lock while
> > > holding target_list_lock (and possibly console_owner).    
> > 
> > Well, it try_locks the xmit_lock. Does lockdep understand try-locks?
> > 
> > (not that I condone the shenanigans that are going on here)  
> 
> Does it?
> 
> 	virtnet_poll_tx() {
> 		__netif_tx_lock() {
> 			spin_lock(&txq->_xmit_lock);

Umpf. Right. I was looking at virtnet_poll_cleantx()

> That looks like we can have:
> 
> 
> 	CPU0		CPU1
> 	----		----
>    lock(xmit_lock)
> 
> 		    lock(console)
> 		    lock(target_list_lock)
> 		    __netif_tx_lock()
> 		        lock(xmit_lock);
> 
> 			[BLOCKED]
> 
>    <interrupt>
>    lock(console)
> 
>    [BLOCKED]
> 
> 
> 
>  DEADLOCK.
> 
> 
> So where is the trylock here?
> 
> Perhaps you need the trylock in virtnet_poll_tx()?

That could work. Best if we used normal lock if !!budget, and trylock
when budget is 0. But maybe that's too hairy.

I'm assuming all this trickiness comes from virtqueue_get_buf() needing
locking vs the TX path? It's pretty unusual for the completion path to
need locking vs xmit path.


* Re: netconsole deadlock with virtnet
  2020-11-23 19:21                   ` Jakub Kicinski
@ 2020-11-24  3:22                     ` Jason Wang
  2020-11-24  8:01                       ` Leon Romanovsky
                                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jason Wang @ 2020-11-24  3:22 UTC (permalink / raw)
  To: Jakub Kicinski, Steven Rostedt
  Cc: Leon Romanovsky, Sergey Senozhatsky, Michael S. Tsirkin,
	Petr Mladek, John Ogness, virtualization, Amit Shah,
	Itay Aveksis, Ran Rozenstein, netdev


On 2020/11/24 3:21 AM, Jakub Kicinski wrote:
> On Mon, 23 Nov 2020 14:09:34 -0500 Steven Rostedt wrote:
>> On Mon, 23 Nov 2020 10:52:52 -0800
>> Jakub Kicinski <kuba@kernel.org> wrote:
>>
>>> On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:
>>>> On Mon, 23 Nov 2020 13:08:55 +0200
>>>> Leon Romanovsky <leon@kernel.org> wrote:
>>>>
>>>>      
>>>>>   [   10.028024] Chain exists of:
>>>>>   [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2
>>>> Note, the problem is that we have a location that grabs the xmit_lock while
>>>> holding target_list_lock (and possibly console_owner).
>>> Well, it try_locks the xmit_lock. Does lockdep understand try-locks?
>>>
>>> (not that I condone the shenanigans that are going on here)
>> Does it?
>>
>> 	virtnet_poll_tx() {
>> 		__netif_tx_lock() {
>> 			spin_lock(&txq->_xmit_lock);
> Umpf. Right. I was looking at virtnet_poll_cleantx()
>
>> That looks like we can have:
>>
>>
>> 	CPU0		CPU1
>> 	----		----
>>     lock(xmit_lock)
>>
>> 		    lock(console)
>> 		    lock(target_list_lock)
>> 		    __netif_tx_lock()
>> 		        lock(xmit_lock);
>>
>> 			[BLOCKED]
>>
>>     <interrupt>
>>     lock(console)
>>
>>     [BLOCKED]
>>
>>
>>
>>   DEADLOCK.
>>
>>
>> So where is the trylock here?
>>
>> Perhaps you need the trylock in virtnet_poll_tx()?
> That could work. Best if we used normal lock if !!budget, and trylock
> when budget is 0. But maybe that's too hairy.


If we use trylock, we probably lose (or delay) tx notifications, which may
have side effects on the stack.


>
> I'm assuming all this trickiness comes from virtqueue_get_buf() needing
> locking vs the TX path? It's pretty unusual for the completion path to
> need locking vs xmit path.


Two reasons for doing this:

1) For some historical reason, we try to free transmitted tx packets in
xmit (see free_old_xmit_skbs() in start_xmit()); we can probably remove
this if we remove the non-tx-interrupt mode.
2) The virtio core requires virtqueue_get_buf() to be synchronized with
virtqueue_add(); we can probably solve this, but it requires some
non-trivial refactoring in the virtio core.
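
To make (1) and (2) concrete, a rough sketch of the two paths that end up
sharing the same virtqueue, and therefore the same per-queue tx lock
(simplified; it reuses the function names already mentioned in this thread,
everything else is approximate):

	/* ndo_start_xmit path: the core already holds __netif_tx_lock() here */
	static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
	{
		struct send_queue *sq = ...;	/* per-queue tx state */

		free_old_xmit_skbs(sq);		/* (1) reclaim in the xmit path,
						 *     calls virtqueue_get_buf() */
		virtqueue_add_outbuf(sq->vq, ..., skb, GFP_ATOMIC);
						/* (2) must not race with
						 *     virtqueue_get_buf() on
						 *     the same vq */
		...
	}

	/* tx NAPI path: has to take the same per-queue lock itself */
	static int virtnet_poll_tx(struct napi_struct *napi, int budget)
	{
		struct netdev_queue *txq = netdev_get_tx_queue(dev, index);

		__netif_tx_lock(txq, raw_smp_processor_id());
		free_old_xmit_skbs(sq);		/* virtqueue_get_buf() again */
		__netif_tx_unlock(txq);
		...
	}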

Btw, from a quick search, there are several other drivers that use the tx
lock in their tx NAPI.

Thanks

>



* Re: netconsole deadlock with virtnet
  2020-11-24  3:22                     ` Jason Wang
@ 2020-11-24  8:01                       ` Leon Romanovsky
  2020-11-24  8:57                         ` Jason Wang
  2020-11-24 14:31                       ` Steven Rostedt
  2020-11-24 16:20                       ` Jakub Kicinski
  2 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2020-11-24  8:01 UTC (permalink / raw)
  To: Jason Wang
  Cc: Jakub Kicinski, Steven Rostedt, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Tue, Nov 24, 2020 at 11:22:03AM +0800, Jason Wang wrote:
>
> On 2020/11/24 3:21 AM, Jakub Kicinski wrote:
> > On Mon, 23 Nov 2020 14:09:34 -0500 Steven Rostedt wrote:
> > > On Mon, 23 Nov 2020 10:52:52 -0800
> > > Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > > On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:
> > > > > On Mon, 23 Nov 2020 13:08:55 +0200
> > > > > Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > >   [   10.028024] Chain exists of:
> > > > > >   [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2
> > > > > Note, the problem is that we have a location that grabs the xmit_lock while
> > > > > holding target_list_lock (and possibly console_owner).
> > > > Well, it try_locks the xmit_lock. Does lockdep understand try-locks?
> > > >
> > > > (not that I condone the shenanigans that are going on here)
> > > Does it?
> > >
> > > 	virtnet_poll_tx() {
> > > 		__netif_tx_lock() {
> > > 			spin_lock(&txq->_xmit_lock);
> > Umpf. Right. I was looking at virtnet_poll_cleantx()
> >
> > > That looks like we can have:
> > >
> > >
> > > 	CPU0		CPU1
> > > 	----		----
> > >     lock(xmit_lock)
> > >
> > > 		    lock(console)
> > > 		    lock(target_list_lock)
> > > 		    __netif_tx_lock()
> > > 		        lock(xmit_lock);
> > >
> > > 			[BLOCKED]
> > >
> > >     <interrupt>
> > >     lock(console)
> > >
> > >     [BLOCKED]
> > >
> > >
> > >
> > >   DEADLOCK.
> > >
> > >
> > > So where is the trylock here?
> > >
> > > Perhaps you need the trylock in virtnet_poll_tx()?
> > That could work. Best if we used normal lock if !!budget, and trylock
> > when budget is 0. But maybe that's too hairy.
>
>
> If we use trylock, we probably lose(or delay) tx notification that may have
> side effects to the stack.
>
>
> >
> > I'm assuming all this trickiness comes from virtqueue_get_buf() needing
> > locking vs the TX path? It's pretty unusual for the completion path to
> > need locking vs xmit path.
>
>
> Two reasons for doing this:
>
> 1) For some historical reason, we try to free transmitted tx packets in xmit
> (see free_old_xmit_skbs() in start_xmit()), we can probably remove this if
> we remove the non tx interrupt mode.
> 2) virtio core requires virtqueue_get_buf() to be synchronized with
> virtqueue_add(), we probably can solve this but it requires some non trivial
> refactoring in the virtio core

So how will we solve our lockdep issues?

Thanks

>
> Btw, have a quick search, there are several other drivers that uses tx lock
> in the tx NAPI.
>
> Thanks
>
> >
>


* Re: netconsole deadlock with virtnet
  2020-11-24  8:01                       ` Leon Romanovsky
@ 2020-11-24  8:57                         ` Jason Wang
  2020-11-24  9:26                           ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Wang @ 2020-11-24  8:57 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jakub Kicinski, Steven Rostedt, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev


On 2020/11/24 4:01 PM, Leon Romanovsky wrote:
> On Tue, Nov 24, 2020 at 11:22:03AM +0800, Jason Wang wrote:
>> On 2020/11/24 3:21 AM, Jakub Kicinski wrote:
>>> On Mon, 23 Nov 2020 14:09:34 -0500 Steven Rostedt wrote:
>>>> On Mon, 23 Nov 2020 10:52:52 -0800
>>>> Jakub Kicinski <kuba@kernel.org> wrote:
>>>>
>>>>> On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:
>>>>>> On Mon, 23 Nov 2020 13:08:55 +0200
>>>>>> Leon Romanovsky <leon@kernel.org> wrote:
>>>>>>
>>>>>>>    [   10.028024] Chain exists of:
>>>>>>>    [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2
>>>>>> Note, the problem is that we have a location that grabs the xmit_lock while
>>>>>> holding target_list_lock (and possibly console_owner).
>>>>> Well, it try_locks the xmit_lock. Does lockdep understand try-locks?
>>>>>
>>>>> (not that I condone the shenanigans that are going on here)
>>>> Does it?
>>>>
>>>> 	virtnet_poll_tx() {
>>>> 		__netif_tx_lock() {
>>>> 			spin_lock(&txq->_xmit_lock);
>>> Umpf. Right. I was looking at virtnet_poll_cleantx()
>>>
>>>> That looks like we can have:
>>>>
>>>>
>>>> 	CPU0		CPU1
>>>> 	----		----
>>>>      lock(xmit_lock)
>>>>
>>>> 		    lock(console)
>>>> 		    lock(target_list_lock)
>>>> 		    __netif_tx_lock()
>>>> 		        lock(xmit_lock);
>>>>
>>>> 			[BLOCKED]
>>>>
>>>>      <interrupt>
>>>>      lock(console)
>>>>
>>>>      [BLOCKED]
>>>>
>>>>
>>>>
>>>>    DEADLOCK.
>>>>
>>>>
>>>> So where is the trylock here?
>>>>
>>>> Perhaps you need the trylock in virtnet_poll_tx()?
>>> That could work. Best if we used normal lock if !!budget, and trylock
>>> when budget is 0. But maybe that's too hairy.
>>
>> If we use trylock, we probably lose(or delay) tx notification that may have
>> side effects to the stack.
>>
>>
>>> I'm assuming all this trickiness comes from virtqueue_get_buf() needing
>>> locking vs the TX path? It's pretty unusual for the completion path to
>>> need locking vs xmit path.
>>
>> Two reasons for doing this:
>>
>> 1) For some historical reason, we try to free transmitted tx packets in xmit
>> (see free_old_xmit_skbs() in start_xmit()), we can probably remove this if
>> we remove the non tx interrupt mode.
>> 2) virtio core requires virtqueue_get_buf() to be synchronized with
>> virtqueue_add(), we probably can solve this but it requires some non trivial
>> refactoring in the virtio core
> So how will we solve our lockdep issues?
>
> Thanks


It's not clear to me whether this is a virtio-net specific issue. E.g. the
above deadlock looks like a generic issue, so working around it in
virtio-net may not help other drivers.

Thanks


>
>> Btw, have a quick search, there are several other drivers that uses tx lock
>> in the tx NAPI.
>>
>> Thanks
>>



* Re: netconsole deadlock with virtnet
  2020-11-24  8:57                         ` Jason Wang
@ 2020-11-24  9:26                           ` Leon Romanovsky
  0 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2020-11-24  9:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: Jakub Kicinski, Steven Rostedt, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Tue, Nov 24, 2020 at 04:57:23PM +0800, Jason Wang wrote:
>
> On 2020/11/24 4:01 PM, Leon Romanovsky wrote:
> > On Tue, Nov 24, 2020 at 11:22:03AM +0800, Jason Wang wrote:
> > On 2020/11/24 3:21 AM, Jakub Kicinski wrote:
> > > > On Mon, 23 Nov 2020 14:09:34 -0500 Steven Rostedt wrote:
> > > > > On Mon, 23 Nov 2020 10:52:52 -0800
> > > > > Jakub Kicinski <kuba@kernel.org> wrote:
> > > > >
> > > > > > On Mon, 23 Nov 2020 09:31:28 -0500 Steven Rostedt wrote:
> > > > > > > On Mon, 23 Nov 2020 13:08:55 +0200
> > > > > > > Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > >
> > > > > > > >    [   10.028024] Chain exists of:
> > > > > > > >    [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2
> > > > > > > Note, the problem is that we have a location that grabs the xmit_lock while
> > > > > > > holding target_list_lock (and possibly console_owner).
> > > > > > Well, it try_locks the xmit_lock. Does lockdep understand try-locks?
> > > > > >
> > > > > > (not that I condone the shenanigans that are going on here)
> > > > > Does it?
> > > > >
> > > > > 	virtnet_poll_tx() {
> > > > > 		__netif_tx_lock() {
> > > > > 			spin_lock(&txq->_xmit_lock);
> > > > Umpf. Right. I was looking at virtnet_poll_cleantx()
> > > >
> > > > > That looks like we can have:
> > > > >
> > > > >
> > > > > 	CPU0		CPU1
> > > > > 	----		----
> > > > >      lock(xmit_lock)
> > > > >
> > > > > 		    lock(console)
> > > > > 		    lock(target_list_lock)
> > > > > 		    __netif_tx_lock()
> > > > > 		        lock(xmit_lock);
> > > > >
> > > > > 			[BLOCKED]
> > > > >
> > > > >      <interrupt>
> > > > >      lock(console)
> > > > >
> > > > >      [BLOCKED]
> > > > >
> > > > >
> > > > >
> > > > >    DEADLOCK.
> > > > >
> > > > >
> > > > > So where is the trylock here?
> > > > >
> > > > > Perhaps you need the trylock in virtnet_poll_tx()?
> > > > That could work. Best if we used normal lock if !!budget, and trylock
> > > > when budget is 0. But maybe that's too hairy.
> > >
> > > If we use trylock, we probably lose(or delay) tx notification that may have
> > > side effects to the stack.
> > >
> > >
> > > > I'm assuming all this trickiness comes from virtqueue_get_buf() needing
> > > > locking vs the TX path? It's pretty unusual for the completion path to
> > > > need locking vs xmit path.
> > >
> > > Two reasons for doing this:
> > >
> > > 1) For some historical reason, we try to free transmitted tx packets in xmit
> > > (see free_old_xmit_skbs() in start_xmit()), we can probably remove this if
> > > we remove the non tx interrupt mode.
> > > 2) virtio core requires virtqueue_get_buf() to be synchronized with
> > > virtqueue_add(), we probably can solve this but it requires some non trivial
> > > refactoring in the virtio core
> > So how will we solve our lockdep issues?
> >
> > Thanks
>
>
> It's not clear to me that whether it's a virtio-net specific issue. E.g the
> above deadlock looks like a generic issue so workaround it via virtio-net
> may not help for other drivers.

It is hard to say; no one else has complained except me, and I'm using virtio :).

Thanks

>
> Thanks
>
>
> >
> > > Btw, have a quick search, there are several other drivers that uses tx lock
> > > in the tx NAPI.
> > >
> > > Thanks
> > >
>


* Re: netconsole deadlock with virtnet
  2020-11-24  3:22                     ` Jason Wang
  2020-11-24  8:01                       ` Leon Romanovsky
@ 2020-11-24 14:31                       ` Steven Rostedt
  2020-11-25  6:20                         ` Jason Wang
  2020-11-24 16:20                       ` Jakub Kicinski
  2 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2020-11-24 14:31 UTC (permalink / raw)
  To: Jason Wang
  Cc: Jakub Kicinski, Leon Romanovsky, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Tue, 24 Nov 2020 11:22:03 +0800
Jason Wang <jasowang@redhat.com> wrote:

> Btw, have a quick search, there are several other drivers that uses tx 
> lock in the tx NAPI.

tx NAPI is not the issue. The issue is that write_msg() (in netconsole.c)
calls this polling logic with the target_list_lock held.

Are those other drivers called by netconsole? If not, then this is unique
to virtio-net.

-- Steve


* Re: netconsole deadlock with virtnet
  2020-11-24  3:22                     ` Jason Wang
  2020-11-24  8:01                       ` Leon Romanovsky
  2020-11-24 14:31                       ` Steven Rostedt
@ 2020-11-24 16:20                       ` Jakub Kicinski
  2020-11-25  6:21                         ` Jason Wang
  2 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2020-11-24 16:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Steven Rostedt, Leon Romanovsky, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev

On Tue, 24 Nov 2020 11:22:03 +0800 Jason Wang wrote:
> >> Perhaps you need the trylock in virtnet_poll_tx()?  
> > That could work. Best if we used normal lock if !!budget, and trylock
> > when budget is 0. But maybe that's too hairy.  
> 
> If we use trylock, we probably lose(or delay) tx notification that may 
> have side effects to the stack.

That's why I said only trylock with budget == 0. Only netpoll calls with
budget == 0, AFAIK.
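
For reference, the netpoll path hard-codes that zero budget (abridged from
poll_one_napi() in net/core/netpoll.c, roughly):

	static void poll_one_napi(struct napi_struct *napi)
	{
		int work;

		if (test_and_set_bit(NAPI_STATE_NPSVC, &napi->state))
			return;

		/* budget 0 == "clean the tx path only, don't process rx" */
		work = napi->poll(napi, 0);

		clear_bit(NAPI_STATE_NPSVC, &napi->state);
	}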

> > I'm assuming all this trickiness comes from virtqueue_get_buf() needing
> > locking vs the TX path? It's pretty unusual for the completion path to
> > need locking vs xmit path.  
> 
> Two reasons for doing this:
> 
> 1) For some historical reason, we try to free transmitted tx packets in 
> xmit (see free_old_xmit_skbs() in start_xmit()), we can probably remove 
> this if we remove the non tx interrupt mode.
> 2) virtio core requires virtqueue_get_buf() to be synchronized with 
> virtqueue_add(), we probably can solve this but it requires some non 
> trivial refactoring in the virtio core
> 
> Btw, have a quick search, there are several other drivers that uses tx 
> lock in the tx NAPI.

Unless they do:

	netdev->priv_flags |= IFF_DISABLE_NETPOLL;

they are all broken.
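
For completeness, a driver that wants to opt out would typically set the
flag when it sets up its netdev; a minimal sketch with a hypothetical
driver setup hook:

	static void mydrv_setup(struct net_device *dev)
	{
		ether_setup(dev);
		/* netpoll/netconsole will then refuse to attach to this device */
		dev->priv_flags |= IFF_DISABLE_NETPOLL;
	}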


* Re: netconsole deadlock with virtnet
  2020-11-24 14:31                       ` Steven Rostedt
@ 2020-11-25  6:20                         ` Jason Wang
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Wang @ 2020-11-25  6:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jakub Kicinski, Leon Romanovsky, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev


On 2020/11/24 10:31 PM, Steven Rostedt wrote:
> On Tue, 24 Nov 2020 11:22:03 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>> Btw, have a quick search, there are several other drivers that uses tx
>> lock in the tx NAPI.
> tx NAPI is not the issue. The issue is that write_msg() (in netconsole.c)
> calls this polling logic with the target_list_lock held.


But in the tx NAPI poll it takes the tx lock instead of using a trylock.


>
> Are those other drivers called by netconsole? If not, then this is unique
> to virtio-net.


I think the answer is yes, since netconsole is not disabled in their code.

Thanks


>
> -- Steve
>



* Re: netconsole deadlock with virtnet
  2020-11-24 16:20                       ` Jakub Kicinski
@ 2020-11-25  6:21                         ` Jason Wang
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Wang @ 2020-11-25  6:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Steven Rostedt, Leon Romanovsky, Sergey Senozhatsky,
	Michael S. Tsirkin, Petr Mladek, John Ogness, virtualization,
	Amit Shah, Itay Aveksis, Ran Rozenstein, netdev


On 2020/11/25 12:20 AM, Jakub Kicinski wrote:
> On Tue, 24 Nov 2020 11:22:03 +0800 Jason Wang wrote:
>>>> Perhaps you need the trylock in virtnet_poll_tx()?
>>> That could work. Best if we used normal lock if !!budget, and trylock
>>> when budget is 0. But maybe that's too hairy.
>> If we use trylock, we probably lose(or delay) tx notification that may
>> have side effects to the stack.
> That's why I said only trylock with budget == 0. Only netpoll calls with
> budget == 0, AFAIK.


Oh right.

So I think maybe we can switch to using trylock when budget is zero and
schedule another TX NAPI run if the trylock fails.
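
A rough sketch of that idea (an assumption about how it could look, not an
actual patch; it reuses existing driver helpers such as vq2txq() and
virtqueue_napi_schedule(), with other details omitted):

	static int virtnet_poll_tx(struct napi_struct *napi, int budget)
	{
		struct send_queue *sq = container_of(napi, struct send_queue, napi);
		struct virtnet_info *vi = sq->vq->vdev->priv;
		struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));

		if (!budget) {
			/* netpoll path: the caller may hold IRQ-taken locks,
			 * so never spin on the tx lock here */
			if (!__netif_tx_trylock(txq)) {
				/* lost the race; let a normal NAPI run pick
				 * up the tx completions later instead */
				virtqueue_napi_schedule(napi, sq->vq);
				return 0;
			}
		} else {
			__netif_tx_lock(txq, raw_smp_processor_id());
		}

		free_old_xmit_skbs(sq);		/* argument details omitted */
		__netif_tx_unlock(txq);
		...
		return 0;
	}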


>
>>> I'm assuming all this trickiness comes from virtqueue_get_buf() needing
>>> locking vs the TX path? It's pretty unusual for the completion path to
>>> need locking vs xmit path.
>> Two reasons for doing this:
>>
>> 1) For some historical reason, we try to free transmitted tx packets in
>> xmit (see free_old_xmit_skbs() in start_xmit()), we can probably remove
>> this if we remove the non tx interrupt mode.
>> 2) virtio core requires virtqueue_get_buf() to be synchronized with
>> virtqueue_add(), we probably can solve this but it requires some non
>> trivial refactoring in the virtio core
>>
>> Btw, have a quick search, there are several other drivers that uses tx
>> lock in the tx NAPI.
> Unless they do:
>
> 	netdev->priv_flags |= IFF_DISABLE_NETPOLL;
>
> they are all broken.


Yes.

Thanks




Thread overview: 14+ messages
     [not found] <20201117102341.GR47002@unreal>
     [not found] ` <20201117093325.78f1486d@gandalf.local.home>
     [not found]   ` <X7SK9l0oZ+RTivwF@jagdpanzerIV.localdomain>
     [not found]     ` <X7SRxB6C+9Bm+r4q@jagdpanzerIV.localdomain>
     [not found]       ` <93b42091-66f2-bb92-6822-473167b2698d@redhat.com>
2020-11-18 14:12         ` netconsole deadlock with virtnet Steven Rostedt
2020-11-23 11:08           ` Leon Romanovsky
2020-11-23 14:31             ` Steven Rostedt
2020-11-23 18:52               ` Jakub Kicinski
2020-11-23 19:09                 ` Steven Rostedt
2020-11-23 19:21                   ` Jakub Kicinski
2020-11-24  3:22                     ` Jason Wang
2020-11-24  8:01                       ` Leon Romanovsky
2020-11-24  8:57                         ` Jason Wang
2020-11-24  9:26                           ` Leon Romanovsky
2020-11-24 14:31                       ` Steven Rostedt
2020-11-25  6:20                         ` Jason Wang
2020-11-24 16:20                       ` Jakub Kicinski
2020-11-25  6:21                         ` Jason Wang
