* [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
@ 2019-03-08 21:09 Guillaume Nault
2019-03-08 21:33 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Guillaume Nault @ 2019-03-08 21:09 UTC (permalink / raw)
To: netdev; +Cc: Eric Dumazet
Commit 7716682cc58e ("tcp/dccp: fix another race at listener
dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
{tcp,dccp}_check_req() accordingly. However, TFO and syncookies
weren't modified, thus leaking allocated resources on error.

Contrary to tcp_check_req(), in both syncookies and TFO cases,
we need to drop the request socket. Also, since the child socket is
created with inet_csk_clone_lock(), we have to unlock it and drop an
extra reference (->sk_refcnt is initially set to 2 and
inet_csk_reqsk_queue_add() drops only one ref).

For TFO, we also need to revert the work done by tcp_try_fastopen()
(with reqsk_fastopen_remove()).

Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---

Note for stable backports: this patch relies on da8ab57863ed
("tcp/dccp: remove reqsk_put() from inet_child_forget()"), to prevent
inet_child_forget() from dropping a reference from the request socket.

Therefore, for trees older than 4.14, commit da8ab57863ed has to be
backported before this patch.

 net/ipv4/syncookies.c | 7 ++++++-
 net/ipv4/tcp_input.c  | 8 +++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 606f868d9f3f..e531344611a0 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
 		refcount_set(&req->rsk_refcnt, 1);
 		tcp_sk(child)->tsoffset = tsoff;
 		sock_rps_save_rxhash(child, skb);
-		inet_csk_reqsk_queue_add(sk, req, child);
+		if (!inet_csk_reqsk_queue_add(sk, req, child)) {
+			bh_unlock_sock(child);
+			sock_put(child);
+			child = NULL;
+			reqsk_put(req);
+		}
 	} else {
 		reqsk_free(req);
 	}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4eb0c8ca3c60..5def3c48870e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6498,7 +6498,13 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		af_ops->send_synack(fastopen_sk, dst, &fl, req,
 				    &foc, TCP_SYNACK_FASTOPEN);
 		/* Add the child socket directly into the accept queue */
-		inet_csk_reqsk_queue_add(sk, req, fastopen_sk);
+		if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) {
+			reqsk_fastopen_remove(fastopen_sk, req, false);
+			bh_unlock_sock(fastopen_sk);
+			sock_put(fastopen_sk);
+			reqsk_put(req);
+			goto drop;
+		}
 		sk->sk_data_ready(sk);
 		bh_unlock_sock(fastopen_sk);
 		sock_put(fastopen_sk);
--
2.20.1
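
The reference counting described in the commit message can be illustrated
with a small user-space model. This is only a sketch: every toy_* type and
function is a hypothetical stand-in, not a kernel API. It assumes, as the
commit message states, that the child starts with two references and holds
the lock, and that a failed queue add drops exactly one child reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a child socket. */
struct toy_sock {
	int refcnt;	/* models ->sk_refcnt */
	bool locked;	/* models the lock held after cloning */
	bool freed;
};

/* Hypothetical stand-in for a request socket. */
struct toy_req {
	int refcnt;	/* models ->rsk_refcnt */
	bool freed;
};

/* Drop one reference; the object is "freed" when the count hits zero. */
static void toy_sock_put(struct toy_sock *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0)
		sk->freed = true;
}

static void toy_reqsk_put(struct toy_req *req)
{
	assert(req->refcnt > 0);
	if (--req->refcnt == 0)
		req->freed = true;
}

/* Models the clone step: the child starts locked with two references. */
static void toy_clone_lock(struct toy_sock *child)
{
	child->refcnt = 2;
	child->locked = true;
	child->freed = false;
}

/* Models the queue add: on failure it drops only one child reference. */
static bool toy_queue_add(bool listener_alive, struct toy_sock *child)
{
	if (!listener_alive) {
		toy_sock_put(child);
		return false;
	}
	return true;
}

/* Error handling in the shape of the syncookies hunk. */
static bool toy_cookie_path(bool listener_alive, struct toy_sock *child,
			    struct toy_req *req)
{
	toy_clone_lock(child);
	req->refcnt = 1;
	req->freed = false;

	if (!toy_queue_add(listener_alive, child)) {
		child->locked = false;	/* unlock the child */
		toy_sock_put(child);	/* drop the extra reference */
		toy_reqsk_put(req);	/* drop the request socket */
		return false;
	}
	return true;
}
```

In this model, skipping the extra toy_sock_put() on the failure path leaves
the child with one reference and never frees it, which is the kind of leak
the patch addresses.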
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 21:09 [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures Guillaume Nault
@ 2019-03-08 21:33 ` Eric Dumazet
2019-03-08 22:22 ` Guillaume Nault
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2019-03-08 21:33 UTC (permalink / raw)
To: Guillaume Nault, netdev
On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> Commit 7716682cc58e ("tcp/dccp: fix another race at listener
> dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
> {tcp,dccp}_check_req() accordingly. However, TFO and syncookies
> weren't modified, thus leaking allocated resources on error.
>
> Contrary to tcp_check_req(), in both syncookies and TFO cases,
> we need to drop the request socket. Also, since the child socket is
> created with inet_csk_clone_lock(), we have to unlock it and drop an
> extra reference (->sk_refcnt is initially set to 2 and
> inet_csk_reqsk_queue_add() drops only one ref).
>
> For TFO, we also need to revert the work done by tcp_try_fastopen()
> (with reqsk_fastopen_remove()).
>
> Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle")
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
> ---
>
> Note for stable backports: this patch relies on da8ab57863ed
> ("tcp/dccp: remove reqsk_put() from inet_child_forget()"), to prevent
> inet_child_forget() from dropping a reference from the request socket.
>
> Therefore, for trees older than 4.14, commit da8ab57863ed has to be
> backported before this patch.
>
Thanks for working on this issue (it was on my radar as well)
>
> net/ipv4/syncookies.c | 7 ++++++-
> net/ipv4/tcp_input.c | 8 +++++++-
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
> index 606f868d9f3f..e531344611a0 100644
> --- a/net/ipv4/syncookies.c
> +++ b/net/ipv4/syncookies.c
> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
> refcount_set(&req->rsk_refcnt, 1);
> tcp_sk(child)->tsoffset = tsoff;
> sock_rps_save_rxhash(child, skb);
> - inet_csk_reqsk_queue_add(sk, req, child);
> + if (!inet_csk_reqsk_queue_add(sk, req, child)) {
> + bh_unlock_sock(child);
> + sock_put(child);
> + child = NULL;
> + reqsk_put(req);
Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
here as well ?
I suggest the following maybe :
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 606f868d9f3fde1c3140aa7eecde87d2ec32b5f2..8b28fb66a8fcefba27a2f5e371e9469d4d7e3650 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -216,11 +216,14 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
 		refcount_set(&req->rsk_refcnt, 1);
 		tcp_sk(child)->tsoffset = tsoff;
 		sock_rps_save_rxhash(child, skb);
-		inet_csk_reqsk_queue_add(sk, req, child);
-	} else {
-		reqsk_free(req);
+		if (likely(inet_csk_reqsk_queue_add(sk, req, child)))
+			return child;
+		bh_unlock_sock(child);
+		sock_put(child);
 	}
-	return child;
+
+	reqsk_free(req);
+	return NULL;
 }
 EXPORT_SYMBOL(tcp_get_cookie_sock);
> + }
> } else {
> reqsk_free(req);
> }
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4eb0c8ca3c60..5def3c48870e 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6498,7 +6498,13 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> af_ops->send_synack(fastopen_sk, dst, &fl, req,
> &foc, TCP_SYNACK_FASTOPEN);
> /* Add the child socket directly into the accept queue */
> - inet_csk_reqsk_queue_add(sk, req, fastopen_sk);
> + if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) {
> + reqsk_fastopen_remove(fastopen_sk, req, false);
> + bh_unlock_sock(fastopen_sk);
> + sock_put(fastopen_sk);
> + reqsk_put(req);
> + goto drop;
These two lines can be replaced by :
goto drop_and_free;
> + }
> sk->sk_data_ready(sk);
> bh_unlock_sock(fastopen_sk);
> sock_put(fastopen_sk);
>
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 21:33 ` Eric Dumazet
@ 2019-03-08 22:22 ` Guillaume Nault
2019-03-08 22:34 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Guillaume Nault @ 2019-03-08 22:22 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>
>
> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> > @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
> > refcount_set(&req->rsk_refcnt, 1);
> > tcp_sk(child)->tsoffset = tsoff;
> > sock_rps_save_rxhash(child, skb);
> > - inet_csk_reqsk_queue_add(sk, req, child);
> > + if (!inet_csk_reqsk_queue_add(sk, req, child)) {
> > + bh_unlock_sock(child);
> > + sock_put(child);
> > + child = NULL;
> > + reqsk_put(req);
>
> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
> here as well ?
>
That was my first approach, but reqsk_free() doesn't like it:

static inline void reqsk_free(struct request_sock *req)
{
	/* temporary debugging */
	WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
	...
}

> I suggest the following maybe :
>
> diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
> index 606f868d9f3fde1c3140aa7eecde87d2ec32b5f2..8b28fb66a8fcefba27a2f5e371e9469d4d7e3650 100644
> --- a/net/ipv4/syncookies.c
> +++ b/net/ipv4/syncookies.c
> @@ -216,11 +216,14 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
> refcount_set(&req->rsk_refcnt, 1);
> tcp_sk(child)->tsoffset = tsoff;
> sock_rps_save_rxhash(child, skb);
> - inet_csk_reqsk_queue_add(sk, req, child);
> - } else {
> - reqsk_free(req);
> + if (likely(inet_csk_reqsk_queue_add(sk, req, child)))
> + return child;
> + bh_unlock_sock(child);
> + sock_put(child);
> }
> - return child;
> +
> + reqsk_free(req);
> + return NULL;
> }
> EXPORT_SYMBOL(tcp_get_cookie_sock);
>
>
I prefer this form as well, but I'm not sure if removing the
"temporary" WARN() is appropriate for -net. If it is, I'll resubmit.
Otherwise I can refactor it after net-next reopens. Any opinion?
Guillaume
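
The difference between the two release helpers discussed above can be
sketched in user space. This is a simplified model, not the kernel code:
toy_req, toy_reqsk_free() and toy_reqsk_put() are hypothetical names, and a
counter stands in for WARN_ON_ONCE(). It assumes the put helper decrements
the count and frees only when it reaches zero, while the free helper expects
the count to already be zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request socket. */
struct toy_req {
	int refcnt;	/* models ->rsk_refcnt */
	int warnings;	/* counts would-be WARN_ON_ONCE() hits */
	bool freed;
};

/* Frees unconditionally, but complains if references are still held. */
static void toy_reqsk_free(struct toy_req *req)
{
	if (req->refcnt != 0)
		req->warnings++;
	req->freed = true;
}

/* Drops one reference and frees only when the last one goes away. */
static void toy_reqsk_put(struct toy_req *req)
{
	assert(req->refcnt > 0);
	if (--req->refcnt == 0)
		toy_reqsk_free(req);
}
```

With the count set to 1 before the queue add, calling the free helper
directly trips the check in this model, whereas the put helper brings the
count to zero first and frees without a warning.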
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 22:22 ` Guillaume Nault
@ 2019-03-08 22:34 ` Eric Dumazet
2019-03-08 22:40 ` Guillaume Nault
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2019-03-08 22:34 UTC (permalink / raw)
To: Guillaume Nault; +Cc: netdev
On 03/08/2019 02:22 PM, Guillaume Nault wrote:
> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>>
>>
>> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
>>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
>>> refcount_set(&req->rsk_refcnt, 1);
>>> tcp_sk(child)->tsoffset = tsoff;
>>> sock_rps_save_rxhash(child, skb);
>>> - inet_csk_reqsk_queue_add(sk, req, child);
>>> + if (!inet_csk_reqsk_queue_add(sk, req, child)) {
>>> + bh_unlock_sock(child);
>>> + sock_put(child);
>>> + child = NULL;
>>> + reqsk_put(req);
>>
>> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
>> here as well ?
>>
> That was my first approach, but reqsk_free() doesn't like it:
>
> static inline void reqsk_free(struct request_sock *req)
> {
> /* temporary debugging */
> WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
> ...
> }
Oh right, there is this refcount_set(&req->rsk_refcnt, 1) before the call
to inet_csk_reqsk_queue_add(sk, req, child);
So just change the TFO case only :)
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 22:34 ` Eric Dumazet
@ 2019-03-08 22:40 ` Guillaume Nault
2019-03-08 23:47 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Guillaume Nault @ 2019-03-08 22:40 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
>
>
> On 03/08/2019 02:22 PM, Guillaume Nault wrote:
> > On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
> >>
> >>
> >> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> >>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
> >>> refcount_set(&req->rsk_refcnt, 1);
> >>> tcp_sk(child)->tsoffset = tsoff;
> >>> sock_rps_save_rxhash(child, skb);
> >>> - inet_csk_reqsk_queue_add(sk, req, child);
> >>> + if (!inet_csk_reqsk_queue_add(sk, req, child)) {
> >>> + bh_unlock_sock(child);
> >>> + sock_put(child);
> >>> + child = NULL;
> >>> + reqsk_put(req);
> >>
> >> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
> >> here as well ?
> >>
> > That was my first approach, but reqsk_free() doesn't like it:
> >
> > static inline void reqsk_free(struct request_sock *req)
> > {
> > /* temporary debugging */
> > WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
> > ...
> > }
>
> Oh right, there is this refcount_set(&req->rsk_refcnt, 1) before the call
> to inet_csk_reqsk_queue_add(sk, req, child);
>
> So just change the TFO case only :)
>
Well.. refcount is 1 in the TFO case too.
Long term, do we want to keep the WARN_ON_ONCE()? If so, we should
probably remove the comment.
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 22:40 ` Guillaume Nault
@ 2019-03-08 23:47 ` Eric Dumazet
2019-03-09 0:06 ` David Miller
2019-03-09 9:02 ` Guillaume Nault
0 siblings, 2 replies; 8+ messages in thread
From: Eric Dumazet @ 2019-03-08 23:47 UTC (permalink / raw)
To: Guillaume Nault, Eric Dumazet; +Cc: netdev
On 03/08/2019 02:40 PM, Guillaume Nault wrote:
> On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
>>
>>
>> On 03/08/2019 02:22 PM, Guillaume Nault wrote:
>>> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
>>>>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
>>>>> refcount_set(&req->rsk_refcnt, 1);
>>>>> tcp_sk(child)->tsoffset = tsoff;
>>>>> sock_rps_save_rxhash(child, skb);
>>>>> - inet_csk_reqsk_queue_add(sk, req, child);
>>>>> + if (!inet_csk_reqsk_queue_add(sk, req, child)) {
>>>>> + bh_unlock_sock(child);
>>>>> + sock_put(child);
>>>>> + child = NULL;
>>>>> + reqsk_put(req);
>>>>
>>>> Since we use reqsk_free(req) in the same function, we can use reqsk_free(req)
>>>> here as well ?
>>>>
>>> That was my first approach, but reqsk_free() doesn't like it:
>>>
>>> static inline void reqsk_free(struct request_sock *req)
>>> {
>>> /* temporary debugging */
>>> WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
>>> ...
>>> }
>>
>> Oh right, there is this refcount_set(&req->rsk_refcnt, 1) before the call
>> to inet_csk_reqsk_queue_add(sk, req, child);
>>
>> So just change the TFO case only :)
>>
> Well.. refcount is 1 in the TFO case too.
Arg...
>
> Long term, do we want to keep the WARN_ON_ONCE()? If so, we should
> probably remove the comment.
We want to keep the warning.
We do not have a way to tell if the req was ever inserted in a hash table, so better play safe.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Thanks !
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 23:47 ` Eric Dumazet
@ 2019-03-09 0:06 ` David Miller
2019-03-09 9:02 ` Guillaume Nault
1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2019-03-09 0:06 UTC (permalink / raw)
To: eric.dumazet; +Cc: gnault, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 8 Mar 2019 15:47:25 -0800
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied and queued up for -stable.
* Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures
2019-03-08 23:47 ` Eric Dumazet
2019-03-09 0:06 ` David Miller
@ 2019-03-09 9:02 ` Guillaume Nault
1 sibling, 0 replies; 8+ messages in thread
From: Guillaume Nault @ 2019-03-09 9:02 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
On Fri, Mar 08, 2019 at 03:47:25PM -0800, Eric Dumazet wrote:
>
> On 03/08/2019 02:40 PM, Guillaume Nault wrote:
> > On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
> >
> > Long term, do we want to keep the WARN_ON_ONCE()? If so, we should
> > probably remove the comment.
>
> We want to keep the warning.
>
> We do not have a way to tell if the req was ever inserted in a hash table, so better play safe.
>
Then I'm going to remove the /* temporary debugging */ line, so that
nobody will be tempted to drop the test.
Thanks for your feedback.
Guillaume