All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] net: tcp: add skb drop reasons to connect request
@ 2022-04-26  8:07 menglong8.dong
  2022-04-26  8:07 ` [PATCH net-next 1/2] net: add skb drop reasons to inet " menglong8.dong
  2022-04-26  8:07 ` [PATCH net-next 2/2] net: tcp: add skb drop reasons to route_req() menglong8.dong
  0 siblings, 2 replies; 5+ messages in thread
From: menglong8.dong @ 2022-04-26  8:07 UTC (permalink / raw)
  To: kuba
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, pabeni, benbjiang,
	flyingpeng, imagedong, edumazet, kafai, talalahmad, keescook,
	mengensun, dongli.zhang, linux-kernel, netdev

From: Menglong Dong <imagedong@tencent.com>

Seems now the reasons of skb drop in TCP layer are almost supported,
except the path of connect requesting. So let's just finish it.

The TCP connect requesting is processed by
'inet_csk(sk)->icsk_af_ops->conn_request()'. Yeah, it's a function
pointer, so it's not easy to add function param to it. Luckily, it's
return value can be reused. For now, 0 means a call of 'consume_skb()'
and -1 means 'kfree_skb()', with a RESET be send. Therefore, we can
free skb with 'kfree_skb_reason()' in 'conn_request()' and return 1.
While 1 is returned, we do nothing outside. This work is done in the
1th patch.

And in the 2th patch, skb drop reasons are added to route_req() in
struct tcp_request_sock_ops by adding a function param to it.

Following new skb drop reasons are added:

  SKB_DROP_REASON_LISTENOVERFLOWS
  SKB_DROP_REASON_TCP_REQQFULLDROP
  SKB_DROP_REASON_SECURITY

Menglong Dong (2):
  net: add skb drop reasons to inet connect request
  net: tcp: add skb drop reasons to route_req()

 include/linux/skbuff.h     |  5 +++++
 include/net/tcp.h          |  3 ++-
 include/trace/events/skb.h |  3 +++
 net/dccp/input.c           | 12 +++++-------
 net/ipv4/tcp_input.c       | 23 ++++++++++++++---------
 net/ipv4/tcp_ipv4.c        | 17 +++++++++++++----
 net/ipv6/tcp_ipv6.c        | 14 +++++++++++---
 net/mptcp/subflow.c        | 10 ++++++----
 8 files changed, 59 insertions(+), 28 deletions(-)

-- 
2.36.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH net-next 1/2] net: add skb drop reasons to inet connect request
  2022-04-26  8:07 [PATCH net-next 0/2] net: tcp: add skb drop reasons to connect request menglong8.dong
@ 2022-04-26  8:07 ` menglong8.dong
  2022-04-26 13:32   ` Eric Dumazet
  2022-04-26  8:07 ` [PATCH net-next 2/2] net: tcp: add skb drop reasons to route_req() menglong8.dong
  1 sibling, 1 reply; 5+ messages in thread
From: menglong8.dong @ 2022-04-26  8:07 UTC (permalink / raw)
  To: kuba
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, pabeni, benbjiang,
	flyingpeng, imagedong, edumazet, kafai, talalahmad, keescook,
	mengensun, dongli.zhang, linux-kernel, netdev

From: Menglong Dong <imagedong@tencent.com>

The 'conn_request()' in struct inet_connection_sock_af_ops is used to
process connection requesting for TCP/DCCP. Take TCP for example, it
is just 'tcp_v4_conn_request()'.

When non-zero value is returned by 'tcp_v4_conn_request()', the skb
will be freed by kfree_skb() and a 'reset' packet will be send.
Otherwise, it will be freed normally.

In this code path, 'consume_skb()' is used in many abnormal cases, such
as the accept queue of the listen socket full, which should be
'kfree_skb()'.

Therefore, we make a little change to the 'conn_request()' interface.
When 0 is returned, we call 'consume_skb()' as usual; when negative is
returned, we call 'kfree_skb()' and send a 'reset' as usual; when
positive is returned, which has not happened yet, we do nothing, and
skb will be freed in 'conn_request()'. Then, we can use drop reasons
in 'conn_request()'.

Following new drop reasons are added:

  SKB_DROP_REASON_LISTENOVERFLOWS
  SKB_DROP_REASON_TCP_REQQFULLDROP

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h     |  4 ++++
 include/trace/events/skb.h |  2 ++
 net/dccp/input.c           | 12 +++++-------
 net/ipv4/tcp_input.c       | 21 +++++++++++++--------
 net/ipv4/tcp_ipv4.c        |  3 ++-
 5 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 84d78df60453..f33b3636bbce 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -469,6 +469,10 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_PKT_TOO_BIG,	/* packet size is too big (maybe exceed
 					 * the MTU)
 					 */
+	SKB_DROP_REASON_LISTENOVERFLOWS, /* accept queue of the listen socket is full */
+	SKB_DROP_REASON_TCP_REQQFULLDROP, /* request queue of the listen
+					   * socket is full
+					   */
 	SKB_DROP_REASON_MAX,
 };
 
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index a477bf907498..de6c93670437 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -80,6 +80,8 @@
 	EM(SKB_DROP_REASON_IP_INADDRERRORS, IP_INADDRERRORS)	\
 	EM(SKB_DROP_REASON_IP_INNOROUTES, IP_INNOROUTES)	\
 	EM(SKB_DROP_REASON_PKT_TOO_BIG, PKT_TOO_BIG)		\
+	EM(SKB_DROP_REASON_LISTENOVERFLOWS, LISTENOVERFLOWS)	\
+	EM(SKB_DROP_REASON_TCP_REQQFULLDROP, TCP_REQQFULLDROP)	\
 	EMe(SKB_DROP_REASON_MAX, MAX)
 
 #undef EM
diff --git a/net/dccp/input.c b/net/dccp/input.c
index 2cbb757a894f..ed20dfe83f66 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -574,8 +574,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 	struct dccp_sock *dp = dccp_sk(sk);
 	struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
 	const int old_state = sk->sk_state;
-	bool acceptable;
-	int queued = 0;
+	int err, queued = 0;
 
 	/*
 	 *  Step 3: Process LISTEN state
@@ -606,13 +605,12 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 			 */
 			rcu_read_lock();
 			local_bh_disable();
-			acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
+			err = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb);
 			local_bh_enable();
 			rcu_read_unlock();
-			if (!acceptable)
-				return 1;
-			consume_skb(skb);
-			return 0;
+			if (!err)
+				consume_skb(skb);
+			return err < 0;
 		}
 		if (dh->dccph_type == DCCP_PKT_RESET)
 			goto discard;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index daff631b9486..e0bbbd624246 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6411,7 +6411,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct request_sock *req;
-	int queued = 0;
+	int err, queued = 0;
 	bool acceptable;
 	SKB_DR(reason);
 
@@ -6438,14 +6438,13 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			 */
 			rcu_read_lock();
 			local_bh_disable();
-			acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
+			err = icsk->icsk_af_ops->conn_request(sk, skb);
 			local_bh_enable();
 			rcu_read_unlock();
 
-			if (!acceptable)
-				return 1;
-			consume_skb(skb);
-			return 0;
+			if (!err)
+				consume_skb(skb);
+			return err < 0;
 		}
 		SKB_DR_SET(reason, TCP_FLAGS);
 		goto discard;
@@ -6878,6 +6877,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	bool want_cookie = false;
 	struct dst_entry *dst;
 	struct flowi fl;
+	SKB_DR(reason);
 
 	/* TW buckets are converted to open requests without
 	 * limitations, they conserve resources and peer is
@@ -6886,12 +6886,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
 	     inet_csk_reqsk_queue_is_full(sk)) && !isn) {
 		want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
-		if (!want_cookie)
+		if (!want_cookie) {
+			SKB_DR_SET(reason, TCP_REQQFULLDROP);
 			goto drop;
+		}
 	}
 
 	if (sk_acceptq_is_full(sk)) {
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
+		SKB_DR_SET(reason, LISTENOVERFLOWS);
 		goto drop;
 	}
 
@@ -6947,6 +6950,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 			 */
 			pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
 				    rsk_ops->family);
+			SKB_DR_SET(reason, TCP_REQQFULLDROP);
 			goto drop_and_release;
 		}
 
@@ -7006,7 +7010,8 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 drop_and_free:
 	__reqsk_free(req);
 drop:
+	kfree_skb_reason(skb, reason);
 	tcp_listendrop(sk);
-	return 0;
+	return 1;
 }
 EXPORT_SYMBOL(tcp_conn_request);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 157265aecbed..b8daf49f54a5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1470,7 +1470,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 
 drop:
 	tcp_listendrop(sk);
-	return 0;
+	kfree_skb_reason(skb, SKB_DROP_REASON_IP_INADDRERRORS);
+	return 1;
 }
 EXPORT_SYMBOL(tcp_v4_conn_request);
 
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net-next 2/2] net: tcp: add skb drop reasons to route_req()
  2022-04-26  8:07 [PATCH net-next 0/2] net: tcp: add skb drop reasons to connect request menglong8.dong
  2022-04-26  8:07 ` [PATCH net-next 1/2] net: add skb drop reasons to inet " menglong8.dong
@ 2022-04-26  8:07 ` menglong8.dong
  1 sibling, 0 replies; 5+ messages in thread
From: menglong8.dong @ 2022-04-26  8:07 UTC (permalink / raw)
  To: kuba
  Cc: rostedt, mingo, davem, yoshfuji, dsahern, pabeni, benbjiang,
	flyingpeng, imagedong, edumazet, kafai, talalahmad, keescook,
	mengensun, dongli.zhang, linux-kernel, netdev

From: Menglong Dong <imagedong@tencent.com>

Add skb drop reasons to the route_req() in struct tcp_request_sock_ops.
Following functions are involved:

  tcp_v4_route_req()
  tcp_v6_route_req()
  subflow_v4_route_req()
  subflow_v6_route_req()

And the new reason SKB_DROP_REASON_SECURITY is added, which is used when
skb is dropped by LSM.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/skbuff.h     |  1 +
 include/net/tcp.h          |  3 ++-
 include/trace/events/skb.h |  1 +
 net/ipv4/tcp_input.c       |  2 +-
 net/ipv4/tcp_ipv4.c        | 14 +++++++++++---
 net/ipv6/tcp_ipv6.c        | 14 +++++++++++---
 net/mptcp/subflow.c        | 10 ++++++----
 7 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f33b3636bbce..5909759e1b95 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -473,6 +473,7 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_TCP_REQQFULLDROP, /* request queue of the listen
 					   * socket is full
 					   */
+	SKB_DROP_REASON_SECURITY,	/* dropped by LSM */
 	SKB_DROP_REASON_MAX,
 };
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 679b1964d494..01f841611895 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2075,7 +2075,8 @@ struct tcp_request_sock_ops {
 	struct dst_entry *(*route_req)(const struct sock *sk,
 				       struct sk_buff *skb,
 				       struct flowi *fl,
-				       struct request_sock *req);
+				       struct request_sock *req,
+				       enum skb_drop_reason *reason);
 	u32 (*init_seq)(const struct sk_buff *skb);
 	u32 (*init_ts_off)(const struct net *net, const struct sk_buff *skb);
 	int (*send_synack)(const struct sock *sk, struct dst_entry *dst,
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index de6c93670437..aff57cd43e85 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -82,6 +82,7 @@
 	EM(SKB_DROP_REASON_PKT_TOO_BIG, PKT_TOO_BIG)		\
 	EM(SKB_DROP_REASON_LISTENOVERFLOWS, LISTENOVERFLOWS)	\
 	EM(SKB_DROP_REASON_TCP_REQQFULLDROP, TCP_REQQFULLDROP)	\
+	EM(SKB_DROP_REASON_SECURITY, SECURITY)			\
 	EMe(SKB_DROP_REASON_MAX, MAX)
 
 #undef EM
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e0bbbd624246..2c158593dc37 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6928,7 +6928,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
 	/* Note: tcp_v6_init_req() might override ir_iif for link locals */
 	inet_rsk(req)->ir_iif = inet_request_bound_dev_if(sk, skb);
 
-	dst = af_ops->route_req(sk, skb, &fl, req);
+	dst = af_ops->route_req(sk, skb, &fl, req, &reason);
 	if (!dst)
 		goto drop_and_free;
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index b8daf49f54a5..12acf4823488 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1424,14 +1424,22 @@ static void tcp_v4_init_req(struct request_sock *req,
 static struct dst_entry *tcp_v4_route_req(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct flowi *fl,
-					  struct request_sock *req)
+					  struct request_sock *req,
+					  enum skb_drop_reason *reason)
 {
+	struct dst_entry *dst;
+
 	tcp_v4_init_req(req, sk, skb);
 
-	if (security_inet_conn_request(sk, skb, req))
+	if (security_inet_conn_request(sk, skb, req)) {
+		SKB_DR_SET(*reason, SECURITY);
 		return NULL;
+	}
 
-	return inet_csk_route_req(sk, &fl->u.ip4, req);
+	dst = inet_csk_route_req(sk, &fl->u.ip4, req);
+	if (!dst)
+		SKB_DR_SET(*reason, IP_OUTNOROUTES);
+	return dst;
 }
 
 struct request_sock_ops tcp_request_sock_ops __read_mostly = {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 782df529ff69..d69fef0e0892 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -802,14 +802,22 @@ static void tcp_v6_init_req(struct request_sock *req,
 static struct dst_entry *tcp_v6_route_req(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct flowi *fl,
-					  struct request_sock *req)
+					  struct request_sock *req,
+					  enum skb_drop_reason *reason)
 {
+	struct dst_entry *dst;
+
 	tcp_v6_init_req(req, sk, skb);
 
-	if (security_inet_conn_request(sk, skb, req))
+	if (security_inet_conn_request(sk, skb, req)) {
+		SKB_DR_SET(*reason, SECURITY);
 		return NULL;
+	}
 
-	return inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+	dst = inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+	if (!dst)
+		SKB_DR_SET(*reason, IP_OUTNOROUTES);
+	return dst;
 }
 
 struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index aba260f547da..03d07165cda6 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -283,7 +283,8 @@ EXPORT_SYMBOL_GPL(mptcp_subflow_init_cookie_req);
 static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 					      struct sk_buff *skb,
 					      struct flowi *fl,
-					      struct request_sock *req)
+					      struct request_sock *req,
+					      enum skb_drop_reason *reason)
 {
 	struct dst_entry *dst;
 	int err;
@@ -291,7 +292,7 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 	tcp_rsk(req)->is_mptcp = 1;
 	subflow_init_req(req, sk);
 
-	dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req);
+	dst = tcp_request_sock_ipv4_ops.route_req(sk, skb, fl, req, reason);
 	if (!dst)
 		return NULL;
 
@@ -309,7 +310,8 @@ static struct dst_entry *subflow_v4_route_req(const struct sock *sk,
 static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 					      struct sk_buff *skb,
 					      struct flowi *fl,
-					      struct request_sock *req)
+					      struct request_sock *req,
+					      enum skb_drop_reason *reason)
 {
 	struct dst_entry *dst;
 	int err;
@@ -317,7 +319,7 @@ static struct dst_entry *subflow_v6_route_req(const struct sock *sk,
 	tcp_rsk(req)->is_mptcp = 1;
 	subflow_init_req(req, sk);
 
-	dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req);
+	dst = tcp_request_sock_ipv6_ops.route_req(sk, skb, fl, req, reason);
 	if (!dst)
 		return NULL;
 
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next 1/2] net: add skb drop reasons to inet connect request
  2022-04-26  8:07 ` [PATCH net-next 1/2] net: add skb drop reasons to inet " menglong8.dong
@ 2022-04-26 13:32   ` Eric Dumazet
  2022-04-28  2:31     ` Menglong Dong
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2022-04-26 13:32 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Jakub Kicinski, Steven Rostedt, Ingo Molnar, David Miller,
	Hideaki YOSHIFUJI, David Ahern, Paolo Abeni, benbjiang,
	flyingpeng, Menglong Dong, Martin KaFai Lau, Talal Ahmad,
	Kees Cook, mengensun, Dongli Zhang, LKML, netdev

On Tue, Apr 26, 2022 at 1:07 AM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> The 'conn_request()' in struct inet_connection_sock_af_ops is used to
> process connection requesting for TCP/DCCP. Take TCP for example, it
> is just 'tcp_v4_conn_request()'.
>
> When non-zero value is returned by 'tcp_v4_conn_request()', the skb
> will be freed by kfree_skb() and a 'reset' packet will be send.
> Otherwise, it will be freed normally.
>
> In this code path, 'consume_skb()' is used in many abnormal cases, such
> as the accept queue of the listen socket full, which should be
> 'kfree_skb()'.
>
> Therefore, we make a little change to the 'conn_request()' interface.
> When 0 is returned, we call 'consume_skb()' as usual; when negative is
> returned, we call 'kfree_skb()' and send a 'reset' as usual; when
> positive is returned, which has not happened yet, we do nothing, and
> skb will be freed in 'conn_request()'. Then, we can use drop reasons
> in 'conn_request()'.
>
> Following new drop reasons are added:
>
>   SKB_DROP_REASON_LISTENOVERFLOWS
>   SKB_DROP_REASON_TCP_REQQFULLDROP
>
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  include/linux/skbuff.h     |  4 ++++
>  include/trace/events/skb.h |  2 ++
>  net/dccp/input.c           | 12 +++++-------
>  net/ipv4/tcp_input.c       | 21 +++++++++++++--------
>  net/ipv4/tcp_ipv4.c        |  3 ++-
>  5 files changed, 26 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 84d78df60453..f33b3636bbce 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -469,6 +469,10 @@ enum skb_drop_reason {
>         SKB_DROP_REASON_PKT_TOO_BIG,    /* packet size is too big (maybe exceed
>                                          * the MTU)
>                                          */
> +       SKB_DROP_REASON_LISTENOVERFLOWS, /* accept queue of the listen socket is full */
> +       SKB_DROP_REASON_TCP_REQQFULLDROP, /* request queue of the listen
> +                                          * socket is full
> +                                          */
>         SKB_DROP_REASON_MAX,
>  };
>
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index a477bf907498..de6c93670437 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -80,6 +80,8 @@
>         EM(SKB_DROP_REASON_IP_INADDRERRORS, IP_INADDRERRORS)    \
>         EM(SKB_DROP_REASON_IP_INNOROUTES, IP_INNOROUTES)        \
>         EM(SKB_DROP_REASON_PKT_TOO_BIG, PKT_TOO_BIG)            \
> +       EM(SKB_DROP_REASON_LISTENOVERFLOWS, LISTENOVERFLOWS)    \
> +       EM(SKB_DROP_REASON_TCP_REQQFULLDROP, TCP_REQQFULLDROP)  \
>         EMe(SKB_DROP_REASON_MAX, MAX)
>
>  #undef EM
> diff --git a/net/dccp/input.c b/net/dccp/input.c
> index 2cbb757a894f..ed20dfe83f66 100644
> --- a/net/dccp/input.c
> +++ b/net/dccp/input.c
> @@ -574,8 +574,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
>         struct dccp_sock *dp = dccp_sk(sk);
>         struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
>         const int old_state = sk->sk_state;
> -       bool acceptable;
> -       int queued = 0;
> +       int err, queued = 0;
>
>         /*
>          *  Step 3: Process LISTEN state
> @@ -606,13 +605,12 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
>                          */
>                         rcu_read_lock();
>                         local_bh_disable();
> -                       acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
> +                       err = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb);
>                         local_bh_enable();
>                         rcu_read_unlock();
> -                       if (!acceptable)
> -                               return 1;
> -                       consume_skb(skb);
> -                       return 0;
> +                       if (!err)
> +                               consume_skb(skb);
> +                       return err < 0;
>                 }
>                 if (dh->dccph_type == DCCP_PKT_RESET)
>                         goto discard;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index daff631b9486..e0bbbd624246 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6411,7 +6411,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>         struct inet_connection_sock *icsk = inet_csk(sk);
>         const struct tcphdr *th = tcp_hdr(skb);
>         struct request_sock *req;
> -       int queued = 0;
> +       int err, queued = 0;
>         bool acceptable;
>         SKB_DR(reason);
>
> @@ -6438,14 +6438,13 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>                          */
>                         rcu_read_lock();
>                         local_bh_disable();
> -                       acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
> +                       err = icsk->icsk_af_ops->conn_request(sk, skb);
>                         local_bh_enable();
>                         rcu_read_unlock();
>
> -                       if (!acceptable)
> -                               return 1;
> -                       consume_skb(skb);
> -                       return 0;
> +                       if (!err)
> +                               consume_skb(skb);

Please, do not add more mess like that, where skb is either freed by
the callee or the caller.


> +                       return err < 0;

Where err is set to a negative value ?


>                 }
>                 SKB_DR_SET(reason, TCP_FLAGS);
>                 goto discard;
> @@ -6878,6 +6877,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>         bool want_cookie = false;
>         struct dst_entry *dst;
>         struct flowi fl;
> +       SKB_DR(reason);
>
>         /* TW buckets are converted to open requests without
>          * limitations, they conserve resources and peer is
> @@ -6886,12 +6886,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>         if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
>              inet_csk_reqsk_queue_is_full(sk)) && !isn) {
>                 want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
> -               if (!want_cookie)
> +               if (!want_cookie) {
> +                       SKB_DR_SET(reason, TCP_REQQFULLDROP);
>                         goto drop;
> +               }
>         }
>
>         if (sk_acceptq_is_full(sk)) {
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
> +               SKB_DR_SET(reason, LISTENOVERFLOWS);
>                 goto drop;
>         }
>
> @@ -6947,6 +6950,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>                          */
>                         pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
>                                     rsk_ops->family);
> +                       SKB_DR_SET(reason, TCP_REQQFULLDROP);
>                         goto drop_and_release;
>                 }
>
> @@ -7006,7 +7010,8 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>  drop_and_free:
>         __reqsk_free(req);
>  drop:
> +       kfree_skb_reason(skb, reason);

Ugh no, prefer "return reason" and leave to the caller the freeing part.

Your changes are too invasive and will hurt future backports.


>         tcp_listendrop(sk);
> -       return 0;
> +       return 1;
>  }
>  EXPORT_SYMBOL(tcp_conn_request);
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 157265aecbed..b8daf49f54a5 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1470,7 +1470,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
>
>  drop:
>         tcp_listendrop(sk);
> -       return 0;

This return 0 meant : do not send reset.


> +       kfree_skb_reason(skb, SKB_DROP_REASON_IP_INADDRERRORS);

double kfree_skb() ?

> +       return 1;

-> send RESET

>  }
>  EXPORT_SYMBOL(tcp_v4_conn_request);
>
> --
> 2.36.0
>

I have a hard time understanding this patch.

Where is the related IPv6 change ?

I really wonder if you actually have tested this.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next 1/2] net: add skb drop reasons to inet connect request
  2022-04-26 13:32   ` Eric Dumazet
@ 2022-04-28  2:31     ` Menglong Dong
  0 siblings, 0 replies; 5+ messages in thread
From: Menglong Dong @ 2022-04-28  2:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jakub Kicinski, Steven Rostedt, Ingo Molnar, David Miller,
	Hideaki YOSHIFUJI, David Ahern, Paolo Abeni, Biao Jiang,
	Hao Peng, Menglong Dong, Martin KaFai Lau, Talal Ahmad,
	Kees Cook, Mengen Sun, Dongli Zhang, LKML, netdev

On Tue, Apr 26, 2022 at 9:32 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Apr 26, 2022 at 1:07 AM <menglong8.dong@gmail.com> wrote:
> >
[......]
> > +                       if (!err)
> > +                               consume_skb(skb);
>
> Please, do not add more mess like that, where skb is either freed by
> the callee or the caller.
>

Yeah, this is a little chaotic.....I just can't find a way out :/
keeping thinking

>
> > +                       return err < 0;
>
> Where err is set to a negative value ?

-1 is returned in dccp_v4_conn_request()

>
>
> >                 }
> >                 SKB_DR_SET(reason, TCP_FLAGS);
> >                 goto discard;
> > @@ -6878,6 +6877,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> >         bool want_cookie = false;
> >         struct dst_entry *dst;
> >         struct flowi fl;
> > +       SKB_DR(reason);
> >
> >         /* TW buckets are converted to open requests without
> >          * limitations, they conserve resources and peer is
> > @@ -6886,12 +6886,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> >         if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
> >              inet_csk_reqsk_queue_is_full(sk)) && !isn) {
> >                 want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
> > -               if (!want_cookie)
> > +               if (!want_cookie) {
> > +                       SKB_DR_SET(reason, TCP_REQQFULLDROP);
> >                         goto drop;
> > +               }
> >         }
> >
> >         if (sk_acceptq_is_full(sk)) {
> >                 NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
> > +               SKB_DR_SET(reason, LISTENOVERFLOWS);
> >                 goto drop;
> >         }
> >
> > @@ -6947,6 +6950,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> >                          */
> >                         pr_drop_req(req, ntohs(tcp_hdr(skb)->source),
> >                                     rsk_ops->family);
> > +                       SKB_DR_SET(reason, TCP_REQQFULLDROP);
> >                         goto drop_and_release;
> >                 }
> >
> > @@ -7006,7 +7010,8 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> >  drop_and_free:
> >         __reqsk_free(req);
> >  drop:
> > +       kfree_skb_reason(skb, reason);
>
> Ugh no, prefer "return reason" and leave to the caller the freeing part.
>
> Your changes are too invasive and will hurt future backports.
>

Okey, I'll try some way else.

>
> >         tcp_listendrop(sk);
> > -       return 0;
> > +       return 1;
> >  }
> >  EXPORT_SYMBOL(tcp_conn_request);
> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> > index 157265aecbed..b8daf49f54a5 100644
> > --- a/net/ipv4/tcp_ipv4.c
> > +++ b/net/ipv4/tcp_ipv4.c
> > @@ -1470,7 +1470,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
> >
> >  drop:
> >         tcp_listendrop(sk);
> > -       return 0;
>
> This return 0 meant : do not send reset.
>
>
> > +       kfree_skb_reason(skb, SKB_DROP_REASON_IP_INADDRERRORS);
>
> double kfree_skb() ?
>
> > +       return 1;
>
> -> send RESET
>

No, this return 1 means not send RESET and this skb is already freed in
icsk_af_ops->conn_request(), since I made a change to the caller of
conn_request() in tcp_rcv_state_process() and dccp_rcv_state_process():

    err = icsk->icsk_af_ops->conn_request(sk, skb);
    local_bh_enable();
    rcu_read_unlock();
    if (!err)
        consume_skb(skb);
    return err < 0;

if err==1, the skb will not be freed again, as 0 is returned by
tcp_rcv_state_process()

> >  }
> >  EXPORT_SYMBOL(tcp_v4_conn_request);
> >
> > --
> > 2.36.0
> >
>
> I have a hard time understanding this patch.
>
> Where is the related IPv6 change ?
>
> I really wonder if you actually have tested this.

Yeah, I missed the IPv6....but it still works, the changes are
compatible with current IPv6 code.

In fact, I have tested it, and everything is ok, no double free
happens:

  drop at: tcp_conn_request+0xf1/0xcb0 (0xffffffff81d43271)
  origin: software
  input port ifindex: 1
  timestamp: Thu Apr 28 10:19:42 2022 917631574 nsec
  protocol: 0x800
  length: 74
  original length: 74
  drop reason: LISTENOVERFLOWS

  drop at: tcp_conn_request+0xf1/0xcb0 (0xffffffff81d43271)
  origin: software
  input port ifindex: 1
  timestamp: Thu Apr 28 10:19:43 2022 930983132 nsec
  protocol: 0x800
  length: 74
  original length: 74
  drop reason: LISTENOVERFLOWS

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-04-28  2:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-26  8:07 [PATCH net-next 0/2] net: tcp: add skb drop reasons to connect request menglong8.dong
2022-04-26  8:07 ` [PATCH net-next 1/2] net: add skb drop reasons to inet " menglong8.dong
2022-04-26 13:32   ` Eric Dumazet
2022-04-28  2:31     ` Menglong Dong
2022-04-26  8:07 ` [PATCH net-next 2/2] net: tcp: add skb drop reasons to route_req() menglong8.dong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.