linux-kernel.vger.kernel.org archive mirror
* [PATCH net 0/2] rxrpc: Fixes
@ 2018-10-05 13:42 David Howells
  2018-10-05 13:43 ` [PATCH net 1/2] rxrpc: Fix some missed refs to init_net David Howells
  2018-10-05 13:43 ` [PATCH net 2/2] rxrpc: Fix the data_ready handler David Howells
  0 siblings, 2 replies; 12+ messages in thread
From: David Howells @ 2018-10-05 13:42 UTC (permalink / raw)
  To: netdev; +Cc: dhowells, linux-afs, linux-kernel


Here are two fixes for AF_RXRPC:

 (1) Fix some places that are doing things in the wrong net namespace.

 (2) Fix the reception of UDP packets in three ways:

     (a) Close the window between binding the socket and setting the
     	 data_ready hook in which packets can come in and get lodged in the
     	 receive queue without data_ready seeing them.

     (b) Make sure the UDP receive queue is drained before returning from
     	 the data_ready hook.

     (c) Ignore Tx errors returned by skb_recv_udp().

The patches are tagged here:

	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
	rxrpc-fixes-20181005

and can also be found on the following branch:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes

David
---
David Howells (2):
      rxrpc: Fix some missed refs to init_net
      rxrpc: Fix the data_ready handler


 net/rxrpc/ar-internal.h  |   10 ++++---
 net/rxrpc/call_accept.c  |    2 +
 net/rxrpc/call_object.c  |    4 +--
 net/rxrpc/conn_client.c  |   10 ++++---
 net/rxrpc/input.c        |   68 ++++++++++++++++++++++++++--------------------
 net/rxrpc/local_object.c |   11 ++++---
 net/rxrpc/peer_object.c  |   28 ++++++++++++-------
 7 files changed, 76 insertions(+), 57 deletions(-)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net 1/2] rxrpc: Fix some missed refs to init_net
  2018-10-05 13:42 [PATCH net 0/2] rxrpc: Fixes David Howells
@ 2018-10-05 13:43 ` David Howells
  2018-10-05 13:43 ` [PATCH net 2/2] rxrpc: Fix the data_ready handler David Howells
  1 sibling, 0 replies; 12+ messages in thread
From: David Howells @ 2018-10-05 13:43 UTC (permalink / raw)
  To: netdev; +Cc: dhowells, linux-afs, linux-kernel

Fix some refs to init_net that should've been changed to the appropriate
network namespace.

Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
---

 net/rxrpc/ar-internal.h |   10 ++++++----
 net/rxrpc/call_accept.c |    2 +-
 net/rxrpc/call_object.c |    4 ++--
 net/rxrpc/conn_client.c |   10 ++++++----
 net/rxrpc/input.c       |    4 ++--
 net/rxrpc/peer_object.c |   28 +++++++++++++++++-----------
 6 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index ef9554131434..63c43b3a2096 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -891,8 +891,9 @@ extern unsigned long rxrpc_conn_idle_client_fast_expiry;
 extern struct idr rxrpc_client_conn_ids;
 
 void rxrpc_destroy_client_conn_ids(void);
-int rxrpc_connect_call(struct rxrpc_call *, struct rxrpc_conn_parameters *,
-		       struct sockaddr_rxrpc *, gfp_t);
+int rxrpc_connect_call(struct rxrpc_sock *, struct rxrpc_call *,
+		       struct rxrpc_conn_parameters *, struct sockaddr_rxrpc *,
+		       gfp_t);
 void rxrpc_expose_client_call(struct rxrpc_call *);
 void rxrpc_disconnect_client_call(struct rxrpc_call *);
 void rxrpc_put_client_conn(struct rxrpc_connection *);
@@ -1045,10 +1046,11 @@ void rxrpc_peer_keepalive_worker(struct work_struct *);
  */
 struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *,
 					 const struct sockaddr_rxrpc *);
-struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_local *,
+struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_sock *, struct rxrpc_local *,
 				     struct sockaddr_rxrpc *, gfp_t);
 struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *, gfp_t);
-void rxrpc_new_incoming_peer(struct rxrpc_local *, struct rxrpc_peer *);
+void rxrpc_new_incoming_peer(struct rxrpc_sock *, struct rxrpc_local *,
+			     struct rxrpc_peer *);
 void rxrpc_destroy_all_peers(struct rxrpc_net *);
 struct rxrpc_peer *rxrpc_get_peer(struct rxrpc_peer *);
 struct rxrpc_peer *rxrpc_get_peer_maybe(struct rxrpc_peer *);
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 9c7f26d06a52..f55f67894465 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -287,7 +287,7 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
 					  (peer_tail + 1) &
 					  (RXRPC_BACKLOG_MAX - 1));
 
-			rxrpc_new_incoming_peer(local, peer);
+			rxrpc_new_incoming_peer(rx, local, peer);
 		}
 
 		/* Now allocate and set up the connection */
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index 799f75b6900d..0ca2c2dfd196 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -287,7 +287,7 @@ struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *rx,
 	/* Set up or get a connection record and set the protocol parameters,
 	 * including channel number and call ID.
 	 */
-	ret = rxrpc_connect_call(call, cp, srx, gfp);
+	ret = rxrpc_connect_call(rx, call, cp, srx, gfp);
 	if (ret < 0)
 		goto error;
 
@@ -339,7 +339,7 @@ int rxrpc_retry_client_call(struct rxrpc_sock *rx,
 	/* Set up or get a connection record and set the protocol parameters,
 	 * including channel number and call ID.
 	 */
-	ret = rxrpc_connect_call(call, cp, srx, gfp);
+	ret = rxrpc_connect_call(rx, call, cp, srx, gfp);
 	if (ret < 0)
 		goto error;
 
diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c
index 8acf74fe24c0..521189f4b666 100644
--- a/net/rxrpc/conn_client.c
+++ b/net/rxrpc/conn_client.c
@@ -276,7 +276,8 @@ static bool rxrpc_may_reuse_conn(struct rxrpc_connection *conn)
  * If we return with a connection, the call will be on its waiting list.  It's
  * left to the caller to assign a channel and wake up the call.
  */
-static int rxrpc_get_client_conn(struct rxrpc_call *call,
+static int rxrpc_get_client_conn(struct rxrpc_sock *rx,
+				 struct rxrpc_call *call,
 				 struct rxrpc_conn_parameters *cp,
 				 struct sockaddr_rxrpc *srx,
 				 gfp_t gfp)
@@ -289,7 +290,7 @@ static int rxrpc_get_client_conn(struct rxrpc_call *call,
 
 	_enter("{%d,%lx},", call->debug_id, call->user_call_ID);
 
-	cp->peer = rxrpc_lookup_peer(cp->local, srx, gfp);
+	cp->peer = rxrpc_lookup_peer(rx, cp->local, srx, gfp);
 	if (!cp->peer)
 		goto error;
 
@@ -683,7 +684,8 @@ static int rxrpc_wait_for_channel(struct rxrpc_call *call, gfp_t gfp)
  * find a connection for a call
  * - called in process context with IRQs enabled
  */
-int rxrpc_connect_call(struct rxrpc_call *call,
+int rxrpc_connect_call(struct rxrpc_sock *rx,
+		       struct rxrpc_call *call,
 		       struct rxrpc_conn_parameters *cp,
 		       struct sockaddr_rxrpc *srx,
 		       gfp_t gfp)
@@ -696,7 +698,7 @@ int rxrpc_connect_call(struct rxrpc_call *call,
 	rxrpc_discard_expired_client_conns(&rxnet->client_conn_reaper);
 	rxrpc_cull_active_client_conns(rxnet);
 
-	ret = rxrpc_get_client_conn(call, cp, srx, gfp);
+	ret = rxrpc_get_client_conn(rx, call, cp, srx, gfp);
 	if (ret < 0)
 		goto out;
 
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 800f5b8a1baa..c5af9955665b 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -1156,12 +1156,12 @@ void rxrpc_data_ready(struct sock *udp_sk)
 	/* we'll probably need to checksum it (didn't call sock_recvmsg) */
 	if (skb_checksum_complete(skb)) {
 		rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
-		__UDP_INC_STATS(&init_net, UDP_MIB_INERRORS, 0);
+		__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INERRORS, 0);
 		_leave(" [CSUM failed]");
 		return;
 	}
 
-	__UDP_INC_STATS(&init_net, UDP_MIB_INDATAGRAMS, 0);
+	__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INDATAGRAMS, 0);
 
 	/* The UDP protocol already released all skb resources;
 	 * we are free to add our own data there.
diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c
index 01a9febfa367..2d39eaf19620 100644
--- a/net/rxrpc/peer_object.c
+++ b/net/rxrpc/peer_object.c
@@ -153,8 +153,10 @@ struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *local,
  * assess the MTU size for the network interface through which this peer is
  * reached
  */
-static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
+static void rxrpc_assess_MTU_size(struct rxrpc_sock *rx,
+				  struct rxrpc_peer *peer)
 {
+	struct net *net = sock_net(&rx->sk);
 	struct dst_entry *dst;
 	struct rtable *rt;
 	struct flowi fl;
@@ -169,7 +171,7 @@ static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
 	switch (peer->srx.transport.family) {
 	case AF_INET:
 		rt = ip_route_output_ports(
-			&init_net, fl4, NULL,
+			net, fl4, NULL,
 			peer->srx.transport.sin.sin_addr.s_addr, 0,
 			htons(7000), htons(7001), IPPROTO_UDP, 0, 0);
 		if (IS_ERR(rt)) {
@@ -188,7 +190,7 @@ static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
 		       sizeof(struct in6_addr));
 		fl6->fl6_dport = htons(7001);
 		fl6->fl6_sport = htons(7000);
-		dst = ip6_route_output(&init_net, NULL, fl6);
+		dst = ip6_route_output(net, NULL, fl6);
 		if (dst->error) {
 			_leave(" [route err %d]", dst->error);
 			return;
@@ -240,10 +242,11 @@ struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp)
 /*
  * Initialise peer record.
  */
-static void rxrpc_init_peer(struct rxrpc_peer *peer, unsigned long hash_key)
+static void rxrpc_init_peer(struct rxrpc_sock *rx, struct rxrpc_peer *peer,
+			    unsigned long hash_key)
 {
 	peer->hash_key = hash_key;
-	rxrpc_assess_MTU_size(peer);
+	rxrpc_assess_MTU_size(rx, peer);
 	peer->mtu = peer->if_mtu;
 	peer->rtt_last_req = ktime_get_real();
 
@@ -275,7 +278,8 @@ static void rxrpc_init_peer(struct rxrpc_peer *peer, unsigned long hash_key)
 /*
  * Set up a new peer.
  */
-static struct rxrpc_peer *rxrpc_create_peer(struct rxrpc_local *local,
+static struct rxrpc_peer *rxrpc_create_peer(struct rxrpc_sock *rx,
+					    struct rxrpc_local *local,
 					    struct sockaddr_rxrpc *srx,
 					    unsigned long hash_key,
 					    gfp_t gfp)
@@ -287,7 +291,7 @@ static struct rxrpc_peer *rxrpc_create_peer(struct rxrpc_local *local,
 	peer = rxrpc_alloc_peer(local, gfp);
 	if (peer) {
 		memcpy(&peer->srx, srx, sizeof(*srx));
-		rxrpc_init_peer(peer, hash_key);
+		rxrpc_init_peer(rx, peer, hash_key);
 	}
 
 	_leave(" = %p", peer);
@@ -299,14 +303,15 @@ static struct rxrpc_peer *rxrpc_create_peer(struct rxrpc_local *local,
  * since we've already done a search in the list from the non-reentrant context
  * (the data_ready handler) that is the only place we can add new peers.
  */
-void rxrpc_new_incoming_peer(struct rxrpc_local *local, struct rxrpc_peer *peer)
+void rxrpc_new_incoming_peer(struct rxrpc_sock *rx, struct rxrpc_local *local,
+			     struct rxrpc_peer *peer)
 {
 	struct rxrpc_net *rxnet = local->rxnet;
 	unsigned long hash_key;
 
 	hash_key = rxrpc_peer_hash_key(local, &peer->srx);
 	peer->local = local;
-	rxrpc_init_peer(peer, hash_key);
+	rxrpc_init_peer(rx, peer, hash_key);
 
 	spin_lock(&rxnet->peer_hash_lock);
 	hash_add_rcu(rxnet->peer_hash, &peer->hash_link, hash_key);
@@ -317,7 +322,8 @@ void rxrpc_new_incoming_peer(struct rxrpc_local *local, struct rxrpc_peer *peer)
 /*
  * obtain a remote transport endpoint for the specified address
  */
-struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_local *local,
+struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_sock *rx,
+				     struct rxrpc_local *local,
 				     struct sockaddr_rxrpc *srx, gfp_t gfp)
 {
 	struct rxrpc_peer *peer, *candidate;
@@ -337,7 +343,7 @@ struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_local *local,
 		/* The peer is not yet present in hash - create a candidate
 		 * for a new record and then redo the search.
 		 */
-		candidate = rxrpc_create_peer(local, srx, hash_key, gfp);
+		candidate = rxrpc_create_peer(rx, local, srx, hash_key, gfp);
 		if (!candidate) {
 			_leave(" = NULL [nomem]");
 			return NULL;



* [PATCH net 2/2] rxrpc: Fix the data_ready handler
  2018-10-05 13:42 [PATCH net 0/2] rxrpc: Fixes David Howells
  2018-10-05 13:43 ` [PATCH net 1/2] rxrpc: Fix some missed refs to init_net David Howells
@ 2018-10-05 13:43 ` David Howells
  2018-10-05 13:52   ` Eric Dumazet
  2018-10-05 14:18   ` David Howells
  1 sibling, 2 replies; 12+ messages in thread
From: David Howells @ 2018-10-05 13:43 UTC (permalink / raw)
  To: netdev; +Cc: dhowells, linux-afs, linux-kernel

Fix the rxrpc_data_ready() function to pick up all packets and to not miss
any.  There are two problems:

 (1) The sk_data_ready pointer on the UDP socket is set *after* it is
     bound.  This means that the socket is open for business before we're
     ready to dequeue packets, and a tiny window exists in which a packet
     can sneak onto the receive queue without us ever knowing about it.

     Fix this by setting the pointers on the socket prior to binding it.

 (2) skb_recv_udp() will return an error (such as ENETUNREACH) if there was
     an error on the transmission side, even though we set the
     sk_error_report hook.  Because rxrpc_data_ready() returns immediately
     in such a case, it never actually removes its packet from the receive
     queue.

     Fix this by abstracting out the UDP dequeuing and checksumming into a
     separate function that keeps hammering on skb_recv_udp() until it
     returns -EAGAIN, passing the packets extracted to the remainder of the
     function.

and two potential problems:

 (3) It might be possible, in some circumstances or in the future, for
     packets to be added to the UDP receive queue whilst rxrpc is busy
     consuming them, so the data_ready() handler might get called less
     often than once per packet.

     Allow for this by fully draining the queue on each call, as in (2).

 (4) If a packet fails the checksum check, the code currently returns after
     discarding the packet without checking for more.

     Allow for this by fully draining the queue on each call, as in (2).

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
---

 net/rxrpc/input.c        |   68 ++++++++++++++++++++++++++--------------------
 net/rxrpc/local_object.c |   11 ++++---
 2 files changed, 44 insertions(+), 35 deletions(-)

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index c5af9955665b..c3114fa66c92 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -1121,7 +1121,7 @@ int rxrpc_extract_header(struct rxrpc_skb_priv *sp, struct sk_buff *skb)
  * shut down and the local endpoint from going away, thus sk_user_data will not
  * be cleared until this function returns.
  */
-void rxrpc_data_ready(struct sock *udp_sk)
+void rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 {
 	struct rxrpc_connection *conn;
 	struct rxrpc_channel *chan;
@@ -1130,39 +1130,11 @@ void rxrpc_data_ready(struct sock *udp_sk)
 	struct rxrpc_local *local = udp_sk->sk_user_data;
 	struct rxrpc_peer *peer = NULL;
 	struct rxrpc_sock *rx = NULL;
-	struct sk_buff *skb;
 	unsigned int channel;
-	int ret, skew = 0;
+	int skew = 0;
 
 	_enter("%p", udp_sk);
 
-	ASSERT(!irqs_disabled());
-
-	skb = skb_recv_udp(udp_sk, 0, 1, &ret);
-	if (!skb) {
-		if (ret == -EAGAIN)
-			return;
-		_debug("UDP socket error %d", ret);
-		return;
-	}
-
-	if (skb->tstamp == 0)
-		skb->tstamp = ktime_get_real();
-
-	rxrpc_new_skb(skb, rxrpc_skb_rx_received);
-
-	_net("recv skb %p", skb);
-
-	/* we'll probably need to checksum it (didn't call sock_recvmsg) */
-	if (skb_checksum_complete(skb)) {
-		rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
-		__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INERRORS, 0);
-		_leave(" [CSUM failed]");
-		return;
-	}
-
-	__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INDATAGRAMS, 0);
-
 	/* The UDP protocol already released all skb resources;
 	 * we are free to add our own data there.
 	 */
@@ -1181,6 +1153,8 @@ void rxrpc_data_ready(struct sock *udp_sk)
 		}
 	}
 
+	if (skb->tstamp == 0)
+		skb->tstamp = ktime_get_real();
 	trace_rxrpc_rx_packet(sp);
 
 	switch (sp->hdr.type) {
@@ -1398,3 +1372,37 @@ void rxrpc_data_ready(struct sock *udp_sk)
 	rxrpc_reject_packet(local, skb);
 	_leave(" [badmsg]");
 }
+
+void rxrpc_data_ready(struct sock *udp_sk)
+{
+	struct sk_buff *skb;
+	int ret;
+
+	for (;;) {
+		skb = skb_recv_udp(udp_sk, 0, 1, &ret);
+		if (!skb) {
+			if (ret == -EAGAIN)
+				return;
+
+			/* If there was a transmission failure, we get an error
+			 * here that we need to ignore.
+			 */
+			_debug("UDP socket error %d", ret);
+			continue;
+		}
+
+		rxrpc_new_skb(skb, rxrpc_skb_rx_received);
+
+		/* we'll probably need to checksum it (didn't call sock_recvmsg) */
+		if (skb_checksum_complete(skb)) {
+			rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
+			__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INERRORS, 0);
+			_debug("csum failed");
+			continue;
+		}
+
+		__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INDATAGRAMS, 0);
+
+		rxrpc_input_packet(udp_sk, skb);
+	}
+}
diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c
index 94d234e9c685..30862f44c9f1 100644
--- a/net/rxrpc/local_object.c
+++ b/net/rxrpc/local_object.c
@@ -122,6 +122,12 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 		return ret;
 	}
 
+	/* set the socket up */
+	sock = local->socket->sk;
+	sock->sk_user_data	= local;
+	sock->sk_data_ready	= rxrpc_data_ready;
+	sock->sk_error_report	= rxrpc_error_report;
+
 	/* if a local address was supplied then bind it */
 	if (local->srx.transport_len > sizeof(sa_family_t)) {
 		_debug("bind");
@@ -191,11 +197,6 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 		BUG();
 	}
 
-	/* set the socket up */
-	sock = local->socket->sk;
-	sock->sk_user_data	= local;
-	sock->sk_data_ready	= rxrpc_data_ready;
-	sock->sk_error_report	= rxrpc_error_report;
 	_leave(" = 0");
 	return 0;
 



* Re: [PATCH net 2/2] rxrpc: Fix the data_ready handler
  2018-10-05 13:43 ` [PATCH net 2/2] rxrpc: Fix the data_ready handler David Howells
@ 2018-10-05 13:52   ` Eric Dumazet
  2018-10-05 14:18   ` David Howells
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2018-10-05 13:52 UTC (permalink / raw)
  To: David Howells, netdev; +Cc: linux-afs, linux-kernel



On 10/05/2018 06:43 AM, David Howells wrote:
> Fix the rxrpc_data_ready() function to pick up all packets and to not miss
> any.  There are two problems:
> 


> +	for (;;) {
> +		skb = skb_recv_udp(udp_sk, 0, 1, &ret);
> +		if (!skb) {
> +			if (ret == -EAGAIN)
> +				return;
> +
> +			/* If there was a transmission failure, we get an error
> +			 * here that we need to ignore.
> +			 */
> +			_debug("UDP socket error %d", ret);
> +			continue;
> +		}
> +
> +		rxrpc_new_skb(skb, rxrpc_skb_rx_received);
> +
> +		/* we'll probably need to checksum it (didn't call sock_recvmsg) */
> +		if (skb_checksum_complete(skb)) {
> +			rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
> +			__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INERRORS, 0);
> +			_debug("csum failed");
> +			continue;
> +		}
> +
> +		__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INDATAGRAMS, 0);
> +
> +		rxrpc_input_packet(udp_sk, skb);
> +	}
> +}


This looks like a potential infinite loop to me?

If not, please add a comment explaining why there is no apparent limit.





* Re: [PATCH net 2/2] rxrpc: Fix the data_ready handler
  2018-10-05 13:43 ` [PATCH net 2/2] rxrpc: Fix the data_ready handler David Howells
  2018-10-05 13:52   ` Eric Dumazet
@ 2018-10-05 14:18   ` David Howells
  2018-10-05 16:07     ` Eric Dumazet
  2018-10-05 16:33     ` David Howells
  1 sibling, 2 replies; 12+ messages in thread
From: David Howells @ 2018-10-05 14:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: dhowells, netdev, linux-afs, linux-kernel, Paolo Abeni

Eric Dumazet <eric.dumazet@gmail.com> wrote:

> This looks like a potential infinite loop to me?

I assume that you're talking about the case where the packets are coming in so
fast that rxrpc is processing them as fast as they're coming in - or failing
to keep up.  I'm not sure what the best thing to do in that case is.

If you're not talking about that case, shouldn't skb_recv_udp() eventually
empty the queue and return -EAGAIN, thereby ending the loop?

I have another patch (see attached) which I was going to submit to net-next,
but maybe I need to merge it into this one, in which I take the packets from
the encap_rcv hook.

Paolo has a couple of technical points on it that I need to look at dealing
with, but it speeds things up a bit and gets rid of the loop entirely.

David
---
commit 3784732a15631345c6d33b58985a9c2f30a2047f
Author: David Howells <dhowells@redhat.com>
Date:   Thu Oct 4 11:10:51 2018 +0100

    rxrpc: Use the UDP encap_rcv hook
    
    Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception
    in which a packet is placed onto the UDP receive queue and then immediately
    removed again by rxrpc.  Going via the queue in this manner seems like it
    should be unnecessary.
    
    This does, however, require the invention of a value to place in encap_type
    as that's one of the conditions to switch packets out to the encap_rcv
    hook.  Possibly the value doesn't actually matter for anything other than
    sockopts on the UDP socket, which aren't accessible outside of rxrpc
    anyway.
    
    This seems to cut a bit of time out of the time elapsed between each
    sk_buff being timestamped and turning up in rxrpc (the final number in the
    following trace excerpts).  I measured this by making the rxrpc_rx_packet
    trace point print the time elapsed between the skb being timestamped and
    the current time (in ns), e.g.:
    
            ... 424.278721: rxrpc_rx_packet: ...  ACK 25026
    
    So doing a 512MiB DIO read from my test server, with an unmodified kernel:
    
            N       min     max     sum             mean    stddev
            27605   2626    7581    7.83992e+07     2840.04 181.029
    
    and with the patch applied:
    
            N       min     max     sum             mean    stddev
            27547   1895    12165   6.77461e+07     2459.29 255.02
    
    Signed-off-by: David Howells <dhowells@redhat.com>

diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
index 09d00f8c442b..09502de447f5 100644
--- a/include/uapi/linux/udp.h
+++ b/include/uapi/linux/udp.h
@@ -40,5 +40,6 @@ struct udphdr {
 #define UDP_ENCAP_L2TPINUDP	3 /* rfc2661 */
 #define UDP_ENCAP_GTP0		4 /* GSM TS 09.60 */
 #define UDP_ENCAP_GTP1U		5 /* 3GPP TS 29.060 */
+#define UDP_ENCAP_RXRPC		6
 
 #endif /* _UAPI_LINUX_UDP_H */
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index b79d0413b5ed..b94f3bdc66ac 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -966,7 +966,7 @@ void rxrpc_unpublish_service_conn(struct rxrpc_connection *);
 /*
  * input.c
  */
-void rxrpc_data_ready(struct sock *);
+int rxrpc_input_packet(struct sock *, struct sk_buff *);
 
 /*
  * insecure.c
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index ab90e60a5237..c0af023b77f4 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -1121,7 +1121,7 @@ int rxrpc_extract_header(struct rxrpc_skb_priv *sp, struct sk_buff *skb)
  * shut down and the local endpoint from going away, thus sk_user_data will not
  * be cleared until this function returns.
  */
-void rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
+int rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 {
 	struct rxrpc_connection *conn;
 	struct rxrpc_channel *chan;
@@ -1136,6 +1136,24 @@ void rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 
 	_enter("%p", udp_sk);
 
+	if (skb->tstamp == 0)
+		skb->tstamp = ktime_get_real();
+
+	rxrpc_new_skb(skb, rxrpc_skb_rx_received);
+
+	_net("recv skb %p", skb);
+
+	if (sk_filter_trim_cap(udp_sk, skb, sizeof(struct udphdr))) {
+		__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INERRORS, IS_UDPLITE(udp_sk));
+		atomic_inc(&udp_sk->sk_drops);
+		rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
+		_leave(" [filtered]");
+		return 0;
+	}
+
+	udp_csum_pull_header(skb);
+	skb_orphan(skb);
+
 	/* The UDP protocol already released all skb resources;
 	 * we are free to add our own data there.
 	 */
@@ -1150,7 +1168,7 @@ void rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 		if ((lose++ & 7) == 7) {
 			trace_rxrpc_rx_lose(sp);
 			rxrpc_free_skb(skb, rxrpc_skb_rx_lost);
-			return;
+			return 0;
 		}
 	}
 
@@ -1334,7 +1352,7 @@ void rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 	rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
 out:
 	trace_rxrpc_rx_done(0, 0);
-	return;
+	return 0;
 
 out_unlock:
 	rcu_read_unlock();
@@ -1373,38 +1391,5 @@ void rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 	trace_rxrpc_rx_done(skb->mark, skb->priority);
 	rxrpc_reject_packet(local, skb);
 	_leave(" [badmsg]");
-}
-
-void rxrpc_data_ready(struct sock *udp_sk)
-{
-	struct sk_buff *skb;
-	int ret;
-
-	for (;;) {
-		skb = skb_recv_udp(udp_sk, 0, 1, &ret);
-		if (!skb) {
-			if (ret == -EAGAIN)
-				return;
-
-			/* If there was a transmission failure, we get an error
-			 * here that we need to ignore.
-			 */
-			_debug("UDP socket error %d", ret);
-			continue;
-		}
-
-		rxrpc_new_skb(skb, rxrpc_skb_rx_received);
-
-		/* we'll probably need to checksum it (didn't call sock_recvmsg) */
-		if (skb_checksum_complete(skb)) {
-			rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
-			__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INERRORS, 0);
-			_debug("csum failed");
-			continue;
-		}
-
-		__UDP_INC_STATS(sock_net(udp_sk), UDP_MIB_INDATAGRAMS, 0);
-
-		rxrpc_input_packet(udp_sk, skb);
-	}
+	return 0;
 }
diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c
index 30862f44c9f1..cad0691c2bb4 100644
--- a/net/rxrpc/local_object.c
+++ b/net/rxrpc/local_object.c
@@ -19,6 +19,7 @@
 #include <linux/ip.h>
 #include <linux/hashtable.h>
 #include <net/sock.h>
+#include <net/udp.h>
 #include <net/af_rxrpc.h>
 #include "ar-internal.h"
 
@@ -108,7 +109,7 @@ static struct rxrpc_local *rxrpc_alloc_local(struct rxrpc_net *rxnet,
  */
 static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 {
-	struct sock *sock;
+	struct sock *usk;
 	int ret, opt;
 
 	_enter("%p{%d,%d}",
@@ -123,10 +124,26 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 	}
 
 	/* set the socket up */
-	sock = local->socket->sk;
-	sock->sk_user_data	= local;
-	sock->sk_data_ready	= rxrpc_data_ready;
-	sock->sk_error_report	= rxrpc_error_report;
+	usk = local->socket->sk;
+	inet_sk(usk)->mc_loop = 0;
+
+	/* Enable CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion */
+	inet_inc_convert_csum(usk);
+
+	rcu_assign_sk_user_data(usk, local);
+
+	udp_sk(usk)->encap_type = UDP_ENCAP_RXRPC;
+	udp_sk(usk)->encap_rcv = rxrpc_input_packet;
+	udp_sk(usk)->encap_destroy = NULL;
+	udp_sk(usk)->gro_receive = NULL;
+	udp_sk(usk)->gro_complete = NULL;
+
+	udp_encap_enable();
+#if IS_ENABLED(CONFIG_IPV6)
+	if (local->srx.transport.family == AF_INET6)
+		udpv6_encap_enable();
+#endif
+	usk->sk_error_report = rxrpc_error_report;
 
 	/* if a local address was supplied then bind it */
 	if (local->srx.transport_len > sizeof(sa_family_t)) {

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net 2/2] rxrpc: Fix the data_ready handler
  2018-10-05 14:18   ` David Howells
@ 2018-10-05 16:07     ` Eric Dumazet
  2018-10-05 16:33     ` David Howells
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2018-10-05 16:07 UTC (permalink / raw)
  To: David Howells, Eric Dumazet; +Cc: netdev, linux-afs, linux-kernel, Paolo Abeni



On 10/05/2018 07:18 AM, David Howells wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> This looks like a potential infinite loop to me?
> 
> I assume that you're talking about the case where the packets are coming in so
> fast that rxrpc is processing them as fast as they're coming in - or failing
> to keep up.  I'm not sure what's the best thing to do in that case.

Well, this is exactly why we do not write such loops :/

sk_data_ready is not meant to process packets, it is meant to signal
to another entity (preferably running in process context and thus with proper
schedule points, and not blocking BH) that there is data ready to be consumed.

Under DoS, it is possible multiple cpus will call sk_data_ready in parallel.



* Re: [PATCH net 2/2] rxrpc: Fix the data_ready handler
  2018-10-05 14:18   ` David Howells
  2018-10-05 16:07     ` Eric Dumazet
@ 2018-10-05 16:33     ` David Howells
  2018-10-05 17:44       ` Eric Dumazet
  1 sibling, 1 reply; 12+ messages in thread
From: David Howells @ 2018-10-05 16:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: dhowells, netdev, linux-afs, linux-kernel, Paolo Abeni

Eric Dumazet <eric.dumazet@gmail.com> wrote:

> sk_data_ready is not meant to process packets, it is meant to signal
> to another entity (preferably running in process context and thus with proper
> schedule points, and not blocking BH) that there is data ready to be consumed.

The issue is that I need to route the packets to the appropriate call, and the
BH appears to be the right place to do this, especially as I can quickly parse
and discard certain types of packet right there.

If I move all of this to process context then that adds extra context switches
between the routing process and the destination process.

> Under DoS, it is possible multiple cpus will call sk_data_ready in parallel.

Ummm...  I've been led to believe that sk_data_ready will *not* be called in
parallel and that the code it calls can assume non-reentrancy.  Is this not
the case?

What about the patch I attached, whereby I use the encap_rcv() hook.  Do you
say that won't work?

David


* Re: [PATCH net 2/2] rxrpc: Fix the data_ready handler
  2018-10-05 16:33     ` David Howells
@ 2018-10-05 17:44       ` Eric Dumazet
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2018-10-05 17:44 UTC (permalink / raw)
  To: David Howells, Eric Dumazet; +Cc: netdev, linux-afs, linux-kernel, Paolo Abeni



On 10/05/2018 09:33 AM, David Howells wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> sk_data_ready is not meant to process packets, it is meant to signal
>> to another entity (preferably running in process context and thus with proper
>> schedule points, and not blocking BH) that there is data ready to be consumed.
> 
> The issue is that I need to route the packets to the appropriate call, and the
> BH appears to be the right place to do this, especially as I can quickly parse
> and discard certain types of packet right there.
> 
> If I move all of this to process context then that adds extra context switches
> between the routing process and the destination process.
> 
>> Under DoS, it is possible multiple cpus will call sk_data_ready in parallel.
> 
> Ummm...  I've been led to believe that sk_data_ready will *not* be called in
> parallel and that the code it calls can assume non-reentrancy.  Is this not
> the case?

Certainly not for UDP.

TCP input processing uses the socket lock, but UDP is fully parallel.

> 
> What about the patch I attached, whereby I use the encap_rcv() hook.  Do you
> say that won't work?

I am kind of busy today, will look at it next week.



* Re: [PATCH net 0/2] rxrpc: Fixes
  2019-08-09 16:05 David Howells
@ 2019-08-09 18:27 ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2019-08-09 18:27 UTC (permalink / raw)
  To: dhowells; +Cc: netdev, linux-afs, linux-kernel

From: David Howells <dhowells@redhat.com>
Date: Fri, 09 Aug 2019 17:05:46 +0100

> Here are a couple of fixes for rxrpc:
> 
>  (1) Fix refcounting of the local endpoint.
> 
>  (2) Don't calculate or report packet skew information.  This has been
>      obsolete since AFS 3.1 and so is a waste of resources.
> 
> The patches are tagged here:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
> 	rxrpc-fixes-20190809

Pulled, thanks David.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net 0/2] rxrpc: Fixes
@ 2019-08-09 16:05 David Howells
  2019-08-09 18:27 ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: David Howells @ 2019-08-09 16:05 UTC (permalink / raw)
  To: netdev; +Cc: dhowells, linux-afs, linux-kernel


Here are a couple of fixes for rxrpc:

 (1) Fix refcounting of the local endpoint.

 (2) Don't calculate or report packet skew information.  This has been
     obsolete since AFS 3.1 and so is a waste of resources.


The patches are tagged here:

	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
	rxrpc-fixes-20190809

and can also be found on the following branch:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes

David
---
David Howells (2):
      rxrpc: Fix local endpoint refcounting
      rxrpc: Don't bother generating maxSkew in the ACK packet


 net/rxrpc/af_rxrpc.c     |    6 ++-
 net/rxrpc/ar-internal.h  |    8 +++-
 net/rxrpc/call_event.c   |   15 +++-----
 net/rxrpc/input.c        |   59 +++++++++++++++-----------------
 net/rxrpc/local_object.c |   86 +++++++++++++++++++++++++++++-----------------
 net/rxrpc/output.c       |    3 +-
 net/rxrpc/recvmsg.c      |    6 ++-
 7 files changed, 100 insertions(+), 83 deletions(-)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net 0/2] rxrpc: Fixes
  2019-07-30 14:50 [PATCH net 0/2] rxrpc: Fixes David Howells
@ 2019-07-30 17:31 ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2019-07-30 17:31 UTC (permalink / raw)
  To: dhowells; +Cc: netdev, linux-afs, linux-kernel

From: David Howells <dhowells@redhat.com>
Date: Tue, 30 Jul 2019 15:50:11 +0100

> 
> Here are a couple of fixes for rxrpc:
> 
>  (1) Fix a potential deadlock in the peer keepalive dispatcher.
> 
>  (2) Fix a missing notification when a UDP sendmsg error occurs in rxrpc.
> 
> 
> The patches are tagged here:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
> 	rxrpc-fixes-20190730

Pulled, thanks David.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net 0/2] rxrpc: Fixes
@ 2019-07-30 14:50 David Howells
  2019-07-30 17:31 ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: David Howells @ 2019-07-30 14:50 UTC (permalink / raw)
  To: netdev; +Cc: dhowells, linux-afs, linux-kernel


Here are a couple of fixes for rxrpc:

 (1) Fix a potential deadlock in the peer keepalive dispatcher.

 (2) Fix a missing notification when a UDP sendmsg error occurs in rxrpc.


The patches are tagged here:

	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
	rxrpc-fixes-20190730

and can also be found on the following branch:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes

David
---
David Howells (2):
      rxrpc: Fix potential deadlock
      rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet


 net/rxrpc/ar-internal.h |    1 +
 net/rxrpc/peer_event.c  |    2 +-
 net/rxrpc/peer_object.c |   18 ++++++++++++++++++
 net/rxrpc/sendmsg.c     |    1 +
 4 files changed, 21 insertions(+), 1 deletion(-)


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-08-09 18:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-05 13:42 [PATCH net 0/2] rxrpc: Fixes David Howells
2018-10-05 13:43 ` [PATCH net 1/2] rxrpc: Fix some missed refs to init_net David Howells
2018-10-05 13:43 ` [PATCH net 2/2] rxrpc: Fix the data_ready handler David Howells
2018-10-05 13:52   ` Eric Dumazet
2018-10-05 14:18   ` David Howells
2018-10-05 16:07     ` Eric Dumazet
2018-10-05 16:33     ` David Howells
2018-10-05 17:44       ` Eric Dumazet
2019-07-30 14:50 [PATCH net 0/2] rxrpc: Fixes David Howells
2019-07-30 17:31 ` David Miller
2019-08-09 16:05 David Howells
2019-08-09 18:27 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).