From: John Fastabend <john.fastabend@gmail.com>
To: jakub@cloudflare.com, daniel@iogearbox.net
Cc: john.fastabend@gmail.com, bpf@vger.kernel.org,
	netdev@vger.kernel.org, edumazet@google.com, ast@kernel.org,
	andrii@kernel.org, will@isovalent.com
Subject: [PATCH bpf v10 05/14] bpf: sockmap, handle fin correctly
Date: Mon, 22 May 2023 19:56:09 -0700
Message-ID: <20230523025618.113937-6-john.fastabend@gmail.com>
In-Reply-To: <20230523025618.113937-1-john.fastabend@gmail.com>

The sockmap code returns EAGAIN after a FIN packet is received and no
more data is on the receive queue. The correct behavior is to return 0 to
the user, who can then close the socket. The EAGAIN causes many apps to
retry, which masks the problem. Eventually the socket is evicted from the
sockmap because it is released by the sockmap sock free handling. The
issue creates a delay and can cause errors on the application side.

To fix this, check on the sk_msg_recvmsg side whether the length is zero
and the FIN flag is set, and if so set the return value to zero. A
selftest will be added to check this condition.

Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
Tested-by: William Findlay <will@isovalent.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
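For reference, the recv() contract the commit message describes can be
observed from userspace on a plain TCP socket. The sketch below is not
the selftest from this series and does not attach the socket to a
sockmap; recv_after_fin() is a hypothetical helper written only for
illustration. It connects a loopback TCP pair, has the peer send an
optional payload followed by a FIN via shutdown(SHUT_WR), drains the
payload, and returns what the next blocking recv() reports. A socket in
a sockmap is expected to behave the same way once this fix is applied.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of this series). Returns the result of
 * the first recv() issued after the payload has been drained and the
 * peer has sent its FIN: 0 means EOF, -1 means setup or recv failure.
 */
static ssize_t recv_after_fin(size_t payload_len)
{
	int lfd = -1, cfd = -1, afd = -1;
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	size_t got = 0;
	ssize_t n = -1;
	char buf[64];

	assert(payload_len <= sizeof(buf));

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0 ||
	    connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	afd = accept(lfd, NULL, NULL);
	if (afd < 0)
		goto out;

	/* Peer side: optional payload, then FIN. */
	memset(buf, 'x', payload_len);
	if (payload_len &&
	    send(cfd, buf, payload_len, 0) != (ssize_t)payload_len)
		goto out;
	if (shutdown(cfd, SHUT_WR) < 0)
		goto out;

	/* Receiver side: drain the payload with blocking reads. */
	while (got < payload_len) {
		n = recv(afd, buf, sizeof(buf), 0);
		if (n <= 0) {
			n = -1;	/* EOF or error before all data arrived */
			goto out;
		}
		got += (size_t)n;
	}

	/* Queue is empty and FIN was sent: EOF should be reported as 0. */
	n = recv(afd, buf, sizeof(buf), 0);
out:
	if (afd >= 0)
		close(afd);
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	return n;
}
```

Calling recv_after_fin(0) or recv_after_fin(16) from a small main() and
checking for a return value of 0 exercises both the "FIN only" and the
"data then FIN" cases. An application that instead sees -1 with EAGAIN
at this point has no way to tell EOF from "no data yet", which is why
the retry loops mentioned above mask the problem.
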
 net/ipv4/tcp_bpf.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 2e9547467edb..73c13642d47f 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -174,6 +174,24 @@ static int tcp_msg_wait_data(struct sock *sk, struct sk_psock *psock,
 	return ret;
 }
 
+static bool is_next_msg_fin(struct sk_psock *psock)
+{
+	struct scatterlist *sge;
+	struct sk_msg *msg_rx;
+	int i;
+
+	msg_rx = sk_psock_peek_msg(psock);
+	i = msg_rx->sg.start;
+	sge = sk_msg_elem(msg_rx, i);
+	if (!sge->length) {
+		struct sk_buff *skb = msg_rx->skb;
+
+		if (skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
+			return true;
+	}
+	return false;
+}
+
 static int tcp_bpf_recvmsg_parser(struct sock *sk,
 				  struct msghdr *msg,
 				  size_t len,
@@ -196,6 +214,19 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
 	lock_sock(sk);
 msg_bytes_ready:
 	copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
+	/* The typical case for EFAULT is that the socket was gracefully
+	 * shut down with a FIN pkt. Check for that here; the other case
+	 * is an error in copy_page_to_iter, which would be unexpected.
+	 * On FIN, correct the return code to zero.
+	 */
+	if (copied == -EFAULT) {
+		bool is_fin = is_next_msg_fin(psock);
+
+		if (is_fin) {
+			copied = 0;
+			goto out;
+		}
+	}
 	if (!copied) {
 		long timeo;
 		int data;
-- 
2.33.0

