Re: netconsole deadlock with virtnet

* Re: netconsole deadlock with virtnet
       [not found]       ` <93b42091-66f2-bb92-6822-473167b2698d@redhat.com>
@ 2020-11-18 14:12         ` Steven Rostedt
  2020-11-23 11:08           ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2020-11-18 14:12 UTC (permalink / raw)
  To: Jason Wang
  Cc: Sergey Senozhatsky, Leon Romanovsky, Michael S. Tsirkin,
	Petr Mladek, John Ogness, virtualization, Amit Shah,
	Itay Aveksis, Ran Rozenstein, netdev

[ Adding netdev as perhaps someone there knows ]

On Wed, 18 Nov 2020 12:09:59 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > This CPU0 lock(_xmit_ETHER#2) -> hard IRQ -> lock(console_owner) is
> > basically
> > 	soft IRQ -> lock(_xmit_ETHER#2) -> hard IRQ -> printk()
> >
> > Then CPU1 spins on xmit, which is owned by CPU0, CPU0 spins on
> > console_owner, which is owned by CPU1?  

It still looks to me that the target_list_lock is taken in IRQ, (which can
be the case because printk calls write_msg() which takes that lock). And
someplace there's a:

	lock(target_list_lock)
	lock(xmit_lock)

which means you can remove the console lock from this scenario completely,
and you still have a possible deadlock between target_list_lock and
xmit_lock.

> 
> 
> If this is true, it looks not a virtio-net specific issue but somewhere 
> else.
> 
> I think all network driver will synchronize through bh instead of hardirq.

I think the issue is where target_list_lock is held when we take xmit_lock.
Is there anywhere in netconsole.c that can end up taking xmit_lock while
holding the target_list_lock? If so, that's the problem. As
target_list_lock is something that can be taken in IRQ context, which means
*any* other lock that is taking while holding the target_list_lock must
also protect against interrupts from happening while it they are held.

-- Steve

^ permalink raw reply	[flat|nested] 14+ messages in thread