From: Stefano Garzarella <sgarzare@redhat.com>
To: Peilin Ye <yepeilin.cs@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Peilin Ye <peilin.ye@bytedance.com>,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC net-next] vsock: Reschedule connect_work for O_NONBLOCK connect() requests
Date: Fri, 5 Aug 2022 14:42:39 +0200
Message-ID: <20220805124239.iy5lkeytqwjyvn7g@sgarzare-redhat>
In-Reply-To: <20220804234447.GA2294@bytedance>

On Thu, Aug 04, 2022 at 04:44:47PM -0700, Peilin Ye wrote:
>Hi Stefano,
>
>On Thu, Aug 04, 2022 at 08:59:23AM +0200, Stefano Garzarella wrote:
>> The last thing I was trying to figure out before sending the patch was
>> whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout().
>>
>> I think we should do that, otherwise a subsequent call to connect() with
>> O_NONBLOCK set would keep returning -EALREADY, even though the timeout has
>> expired.
>>
>> What do you think?
>
>Thanks for bringing this up. After thinking about sock->state, I have 3
>thoughts:
>
>1. I think the root cause of this memleak is that we keep @connect_work
>   pending, even after the 2nd, blocking request times out (or gets
>   interrupted) and sets sock->state back to SS_UNCONNECTED.
>
>   @connect_work is effectively a no-op when sk->sk_state is
>   TCP_CLOS{E,ING} anyway, so why don't we just cancel @connect_work when
>   blocking requests time out or get interrupted?  Something like:
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f04abf662ec6..62628af84164 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> 		lock_sock(sk);
>
> 		if (signal_pending(current)) {
>+			if (cancel_delayed_work(&vsk->connect_work))
>+				sock_put(sk);
>+
> 			err = sock_intr_errno(timeout);
> 			sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
> 			sock->state = SS_UNCONNECTED;
>@@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> 			vsock_remove_connected(vsk);
> 			goto out_wait;
> 		} else if (timeout == 0) {
>+			if (cancel_delayed_work(&vsk->connect_work))
>+				sock_put(sk);
>+
> 			err = -ETIMEDOUT;
> 			sk->sk_state = TCP_CLOSE;
> 			sock->state = SS_UNCONNECTED;
>
>   Then there is no need to worry about rescheduling @connect_work, and
>   the state machine becomes more accurate.  What do you think?  I will
>   ask syzbot to test this.

It could work, but should we set `sk->sk_err` and call sk_error_report()
to wake up the thread waiting on poll()?

Maybe the previous version is simpler.

>
>2. About your suggestion of setting sock->state = SS_UNCONNECTED in
>   vsock_connect_timeout(), I think it makes sense.  Are you going to
>   send a net-next patch for this?

If you have time, feel free to send it.

Since it is a fix, I believe you can use the "net" tree. (Also for this
patch.)

Remember to add the "Fixes" tag, which should be the same.

>
>3. After a TCP_SYN_SENT sock receives VIRTIO_VSOCK_OP_RESPONSE in
>   virtio_transport_recv_connecting(), why don't we cancel
>   @connect_work?
>   Am I missing something?

Because when the timeout fires, vsock_connect_timeout() will just call
sock_put(), since sk->sk_state has changed.

Of course, we can cancel it if we want, but I think it's not worth it.
In the end, this rescheduling patch should solve all the problems.

Thanks,
Stefano
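The reference-count reasoning in point 1 can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_sock`, `toy_schedule()`, `toy_cancel()` and the other `toy_*` helpers are hypothetical stand-ins. It assumes that a non-blocking connect() takes a socket reference before scheduling the delayed work, that scheduling does nothing while the work is still pending, and that the timeout handler drops one reference when it runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these are the kernel's real types. */
struct toy_sock {
	int refcnt;		/* stands in for the socket reference count */
	bool work_pending;	/* stands in for a pending @connect_work */
};

static void toy_hold(struct toy_sock *sk)
{
	sk->refcnt++;
}

static void toy_put(struct toy_sock *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/* Queue the work; returns false (and does nothing) if already pending. */
static bool toy_schedule(struct toy_sock *sk)
{
	if (sk->work_pending)
		return false;
	sk->work_pending = true;
	return true;
}

/* Cancel the work; returns true only if a pending item was removed. */
static bool toy_cancel(struct toy_sock *sk)
{
	bool was_pending = sk->work_pending;

	sk->work_pending = false;
	return was_pending;
}

/* The timeout handler runs once per queued item and drops one reference. */
static void toy_timeout_fires(struct toy_sock *sk)
{
	if (!sk->work_pending)
		return;
	sk->work_pending = false;
	toy_put(sk);
}

/* Non-blocking connect: take a reference, then try to queue the work.
 * The return value of toy_schedule() is ignored on purpose, to model
 * the leak-prone pattern discussed in the thread.
 */
static void toy_nonblocking_connect(struct toy_sock *sk)
{
	toy_hold(sk);
	toy_schedule(sk);
}

/* Blocking connect that times out or is interrupted; optionally cancel
 * the pending work and drop the reference it was holding.
 */
static void toy_blocking_connect_fails(struct toy_sock *sk, bool cancel_work)
{
	if (cancel_work && toy_cancel(sk))
		toy_put(sk);
}

/* Replays the sequence from point 1 and returns the final refcount.
 * The socket starts with one reference owned by the caller.
 */
int run_scenario(bool cancel_work)
{
	struct toy_sock sk = { .refcnt = 1, .work_pending = false };

	toy_nonblocking_connect(&sk);			/* 1st request */
	toy_blocking_connect_fails(&sk, cancel_work);	/* 2nd request */
	toy_nonblocking_connect(&sk);			/* 3rd request */
	toy_timeout_fires(&sk);

	return sk.refcnt;
}
```

In this model, run_scenario(false) ends with two references (one leaked beyond the caller's own), while run_scenario(true) ends with one. The rescheduling approach from the patch under discussion addresses the same imbalance at the scheduling site rather than at the cancel site; the model does not cover the poll() wake-up concern raised above.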
Thread overview: 17+ messages

2022-08-04  2:09 [PATCH RFC net-next] vsock: Reschedule connect_work for O_NONBLOCK connect() requests Peilin Ye
2022-08-04  6:59 ` Stefano Garzarella
2022-08-04  6:59 ` Stefano Garzarella
2022-08-04 23:44 ` Peilin Ye
2022-08-05 12:42 ` Stefano Garzarella [this message]
2022-08-05 12:42 ` Stefano Garzarella
2022-08-05 18:27 ` Peilin Ye
2022-08-07  9:00 ` [PATCH net v2 1/2] vsock: Fix memory leak in vsock_connect() Peilin Ye
2022-08-07  9:00 ` [PATCH net v2 2/2] vsock: Set socket state back to SS_UNCONNECTED in vsock_connect_timeout() Peilin Ye
2022-08-08  7:56 ` Stefano Garzarella
2022-08-08  7:56 ` Stefano Garzarella
2022-08-08  7:55 ` [PATCH net v2 1/2] vsock: Fix memory leak in vsock_connect() Stefano Garzarella
2022-08-08  7:55 ` Stefano Garzarella
2022-08-08 17:45 ` Peilin Ye
2022-08-08 18:04 ` [PATCH net v3 1/2] vsock: Fix memory leak in vsock_connect() Peilin Ye
2022-08-08 18:05 ` [PATCH net v3 2/2] vsock: Set socket state back to SS_UNCONNECTED in vsock_connect_timeout() Peilin Ye
2022-08-10  9:00 ` [PATCH net v3 1/2] vsock: Fix memory leak in vsock_connect() patchwork-bot+netdevbpf