netdev.vger.kernel.org archive mirror
From: Dexuan Cui <decui@microsoft.com>
To: Stefano Garzarella <sgarzare@redhat.com>,
	Hillf Danton <hdanton@sina.com>,
	Jorgen Hansen <jhansen@vmware.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Cc: syzbot <syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"syzkaller-bugs@googlegroups.com"
	<syzkaller-bugs@googlegroups.com>,
	"virtualization@lists.linux-foundation.org" 
	<virtualization@lists.linux-foundation.org>
Subject: RE: INFO: task hung in lock_sock_nested (2)
Date: Tue, 25 Feb 2020 05:44:03 +0000
Message-ID: <HK0P153MB0148B4C74BA6A60E295A03D8BFED0@HK0P153MB0148.APCP153.PROD.OUTLOOK.COM>
In-Reply-To: <20200224100853.wd67e7rqmtidfg7y@steredhat>

> From: Stefano Garzarella <sgarzare@redhat.com>
> Sent: Monday, February 24, 2020 2:09 AM
> ...
> > > syz-executor280 D27912  9768   9766 0x00000000
> > > Call Trace:
> > >  context_switch kernel/sched/core.c:3386 [inline]
> > >  __schedule+0x934/0x1f90 kernel/sched/core.c:4082
> > >  schedule+0xdc/0x2b0 kernel/sched/core.c:4156
> > >  __lock_sock+0x165/0x290 net/core/sock.c:2413
> > >  lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
> > >  virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832
> > >  vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
> > >  vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
> > >  __sys_connect_file+0x161/0x1c0 net/socket.c:1857
> > >  __sys_connect+0x174/0x1b0 net/socket.c:1874
> > >  __do_sys_connect net/socket.c:1885 [inline]
> > >  __se_sys_connect net/socket.c:1882 [inline]
> > >  __x64_sys_connect+0x73/0xb0 net/socket.c:1882
> > >  do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294

My understanding about the call trace is: in vsock_stream_connect() 
after we call lock_sock(sk), we call vsock_assign_transport(), which may
call vsk->transport->release(vsk), i.e. virtio_transport_release(), and in
virtio_transport_release() we try to get the same lock and hang.
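
To make the re-entrancy concrete, here is a small userspace model of the
sequence above. It is only a sketch: struct model_sock and the model_*
helpers are hypothetical stand-ins, not kernel APIs, and where the real
lock_sock() would sleep forever the model simply returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the socket lock; not kernel code. */
struct model_sock {
	bool owned;	/* true while some context holds the lock */
};

/* Returns false at the point where the real lock_sock() would block. */
static bool model_lock_sock(struct model_sock *sk)
{
	if (sk->owned)
		return false;
	sk->owned = true;
	return true;
}

static void model_release_sock(struct model_sock *sk)
{
	sk->owned = false;
}

/* Models virtio_transport_release(): it takes the sock lock itself. */
static bool model_transport_release(struct model_sock *sk)
{
	if (!model_lock_sock(sk))
		return false;	/* the hang seen in the trace */
	model_release_sock(sk);
	return true;
}

/* Models vsock_stream_connect() -> vsock_assign_transport(). */
static bool model_connect(struct model_sock *sk)
{
	bool ok;

	if (!model_lock_sock(sk))
		return false;
	ok = model_transport_release(sk);	/* second acquire, same owner */
	model_release_sock(sk);
	return ok;
}
```

In this model, model_connect() fails because the release path tries to
take a lock the caller already owns, while calling
model_transport_release() on an unlocked socket succeeds.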

> > Seems like vsock needs a word to track lock owner in an attempt to
> > avoid trying to lock sock while the current is the lock owner.

I'm unfamiliar with the g2h/h2g :-)
Here I'm wondering if it's acceptable to add an 'already_locked'
parameter like this:
  bool already_locked = true;
  vsk->transport->release(vsk, already_locked) ?
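
A rough userspace sketch of what the 'already_locked' idea could look
like, only to illustrate the control flow. The demo_* names and struct
demo_sock are hypothetical and not part of any kernel API; the counter
stands in for the real teardown work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket lock; not kernel code. */
struct demo_sock {
	bool owned;		/* true while the lock is held */
	int release_calls;	/* counts completed release work */
};

static bool demo_lock_sock(struct demo_sock *sk)
{
	if (sk->owned)
		return false;	/* the real lock_sock() would block here */
	sk->owned = true;
	return true;
}

static void demo_release_sock(struct demo_sock *sk)
{
	sk->owned = false;
}

/* Transport release that locks only when the caller has not done so. */
static bool demo_transport_release(struct demo_sock *sk, bool already_locked)
{
	if (!already_locked && !demo_lock_sock(sk))
		return false;

	sk->release_calls++;	/* placeholder for the real teardown work */

	if (!already_locked)
		demo_release_sock(sk);
	return true;
}
```

The connect path would pass true because it already holds the lock, and
a caller that does not hold the lock would pass false. The cost is that
every transport's release callback has to carry the extra parameter.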
 
> Thanks for this possible solution.
> What about using sock_owned_by_user()?
 
> We should fix also hyperv_transport, because it could suffer from the same
> problem.

IIUC hyperv_transport doesn't support the h2g/g2h feature, so it should not
suffer from the deadlock issue here?

> At this point, it might be better to call vsk->transport->release(vsk)
> always with the lock taken and remove it in the transports as in the
> following patch.
> 
> What do you think?
> 
> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index 9c5b2a91baad..a073d8efca33 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -753,20 +753,18 @@ static void __vsock_release(struct sock *sk, int level)
>  		vsk = vsock_sk(sk);
>  		pending = NULL;	/* Compiler warning. */
> 
> -		/* The release call is supposed to use lock_sock_nested()
> -		 * rather than lock_sock(), if a sock lock should be acquired.
> -		 */
> -		if (vsk->transport)
> -			vsk->transport->release(vsk);
> -		else if (sk->sk_type == SOCK_STREAM)
> -			vsock_remove_sock(vsk);
> -
>  		/* When "level" is SINGLE_DEPTH_NESTING, use the nested
>  		 * version to avoid the warning "possible recursive locking
>  		 * detected". When "level" is 0, lock_sock_nested(sk, level)
>  		 * is the same as lock_sock(sk).
>  		 */
>  		lock_sock_nested(sk, level);
> +
> +		if (vsk->transport)
> +			vsk->transport->release(vsk);
> +		else if (sk->sk_type == SOCK_STREAM)
> +			vsock_remove_sock(vsk);
> +
>  		sock_orphan(sk);
>  		sk->sk_shutdown = SHUTDOWN_MASK;
> 
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index 3492c021925f..510f25f4a856 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -529,9 +529,7 @@ static void hvs_release(struct vsock_sock *vsk)
>  	struct sock *sk = sk_vsock(vsk);
>  	bool remove_sock;
> 
> -	lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
>  	remove_sock = hvs_close_lock_held(vsk);
> -	release_sock(sk);
>  	if (remove_sock)
>  		vsock_remove_sock(vsk);
>  }

This looks good to me, but do we know why vsk->transport->release(vsk)
is called without holding the lock for 'sk' in the first place?

Thanks,
Dexuan

