netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>,
	Dave Botsch <botsch@cnf.cornell.edu>,
	Sasha Levin <sashal@kernel.org>,
	linux-afs@lists.infradead.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 5.6 201/606] rxrpc: Fix ack discard
Date: Mon,  8 Jun 2020 19:05:26 -0400	[thread overview]
Message-ID: <20200608231211.3363633-201-sashal@kernel.org> (raw)
In-Reply-To: <20200608231211.3363633-1-sashal@kernel.org>

From: David Howells <dhowells@redhat.com>

[ Upstream commit 441fdee1eaf050ef0040bde0d7af075c1c6a6d8b ]

The Rx protocol has a "previousPacket" field in it that is not handled in
the same way by all protocol implementations.  Sometimes it contains the
serial number of the last DATA packet received, sometimes the sequence
number of the last DATA packet received and sometimes the highest sequence
number so far received.

AF_RXRPC is using this to weed out ACKs that are out of date (it's possible
for ACK packets to get reordered on the wire), but this does not work with
OpenAFS which will just stick the sequence number of the last packet seen
into previousPacket.

The issue being seen is that big AFS FS.StoreData RPC (eg. of ~256MiB) are
timing out when partly sent.  A trace was captured, with an additional
tracepoint to show ACKs being discarded in rxrpc_input_ack().  Here's an
excerpt showing the problem.

 52873.203230: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 0002449c q=00024499 fl=09

A DATA packet with sequence number 00024499 has been transmitted (the "q="
field).

 ...
 52873.243296: rxrpc_rx_ack: c=000004ae 00012a2b DLY r=00024499 f=00024497 p=00024496 n=0
 52873.243376: rxrpc_rx_ack: c=000004ae 00012a2c IDL r=0002449b f=00024499 p=00024498 n=0
 52873.243383: rxrpc_rx_ack: c=000004ae 00012a2d OOS r=0002449d f=00024499 p=0002449a n=2

The Out-Of-Sequence ACK indicates that the server didn't see DATA sequence
number 00024499, but did see seq 0002449a (previousPacket, shown as "p=",
skipped the number, but firstPacket, "f=", which shows the bottom of the
window is set at that point).

 52873.252663: rxrpc_retransmit: c=000004ae q=24499 a=02 xp=14581537
 52873.252664: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244bc q=00024499 fl=0b *RETRANS*

The packet has been retransmitted.  Retransmission recurs until the peer
says it got the packet.

 52873.271013: rxrpc_rx_ack: c=000004ae 00012a31 OOS r=000244a1 f=00024499 p=0002449e n=6

More OOS ACKs indicate that the other packets that are already in the
transmission pipeline are being received.  The specific-ACK list is up to 6
ACKs and NAKs.

 ...
 52873.284792: rxrpc_rx_ack: c=000004ae 00012a49 OOS r=000244b9 f=00024499 p=000244b6 n=30
 52873.284802: rxrpc_retransmit: c=000004ae q=24499 a=0a xp=63505500
 52873.284804: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244c2 q=00024499 fl=0b *RETRANS*
 52873.287468: rxrpc_rx_ack: c=000004ae 00012a4a OOS r=000244ba f=00024499 p=000244b7 n=31
 52873.287478: rxrpc_rx_ack: c=000004ae 00012a4b OOS r=000244bb f=00024499 p=000244b8 n=32

At this point, the server's receive window is full (n=32) with presumably 1
NAK'd packet and 31 ACK'd packets.  We can't transmit any more packets.

 52873.287488: rxrpc_retransmit: c=000004ae q=24499 a=0a xp=61327980
 52873.287489: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244c3 q=00024499 fl=0b *RETRANS*
 52873.293850: rxrpc_rx_ack: c=000004ae 00012a4c DLY r=000244bc f=000244a0 p=00024499 n=25

And now we've received an ACK indicating that a DATA retransmission was
received.  7 packets have been processed (the occupied part of the window
moved, as indicated by f= and n=).

 52873.293853: rxrpc_rx_discard_ack: c=000004ae r=00012a4c 000244a0<00024499 00024499<000244b8

However, the DLY ACK gets discarded because its previousPacket has gone
backwards (from p=000244b8, in the ACK at 52873.287478 to p=00024499 in the
ACK at 52873.293850).

We then end up in a continuous cycle of retransmit/discard.  kafs fails to
update its window because it's discarding the ACKs and can't transmit an
extra packet that would clear the issue because the window is full.
OpenAFS doesn't change the previousPacket value in the ACKs because no new
DATA packets are received with a different previousPacket number.

Fix this by altering the discard check to only discard an ACK based on
previousPacket if there was no advance in the firstPacket.  This allows us
to transmit a new packet which will cause previousPacket to advance in the
next ACK.

The check, however, needs to allow for the possibility that previousPacket
may actually have had the serial number placed in it instead - in which
case it will go outside the window and we should ignore it.

Fixes: 1a2391c30c0b ("rxrpc: Fix detection of out of order acks")
Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/rxrpc/input.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 2f22f082a66c..3be4177baf70 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -802,6 +802,30 @@ static void rxrpc_input_soft_acks(struct rxrpc_call *call, u8 *acks,
 	}
 }
 
+/*
+ * Return true if the ACK is valid - ie. it doesn't appear to have regressed
+ * with respect to the ack state conveyed by preceding ACKs.
+ */
+static bool rxrpc_is_ack_valid(struct rxrpc_call *call,
+			       rxrpc_seq_t first_pkt, rxrpc_seq_t prev_pkt)
+{
+	rxrpc_seq_t base = READ_ONCE(call->ackr_first_seq);
+
+	if (after(first_pkt, base))
+		return true; /* The window advanced */
+
+	if (before(first_pkt, base))
+		return false; /* firstPacket regressed */
+
+	if (after_eq(prev_pkt, call->ackr_prev_seq))
+		return true; /* previousPacket hasn't regressed. */
+
+	/* Some rx implementations put a serial number in previousPacket. */
+	if (after_eq(prev_pkt, base + call->tx_winsize))
+		return false;
+	return true;
+}
+
 /*
  * Process an ACK packet.
  *
@@ -865,8 +889,7 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
 	}
 
 	/* Discard any out-of-order or duplicate ACKs (outside lock). */
-	if (before(first_soft_ack, call->ackr_first_seq) ||
-	    before(prev_pkt, call->ackr_prev_seq)) {
+	if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
 		trace_rxrpc_rx_discard_ack(call->debug_id, sp->hdr.serial,
 					   first_soft_ack, call->ackr_first_seq,
 					   prev_pkt, call->ackr_prev_seq);
@@ -882,8 +905,7 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
 	spin_lock(&call->input_lock);
 
 	/* Discard any out-of-order or duplicate ACKs (inside lock). */
-	if (before(first_soft_ack, call->ackr_first_seq) ||
-	    before(prev_pkt, call->ackr_prev_seq)) {
+	if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
 		trace_rxrpc_rx_discard_ack(call->debug_id, sp->hdr.serial,
 					   first_soft_ack, call->ackr_first_seq,
 					   prev_pkt, call->ackr_prev_seq);
-- 
2.25.1


  parent reply	other threads:[~2020-06-09  0:29 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200608231211.3363633-1-sashal@kernel.org>
2020-06-08 23:02 ` [PATCH AUTOSEL 5.6 006/606] bpf: Fix bug in mmap() implementation for BPF array map Sasha Levin
2020-06-08 23:02 ` [PATCH AUTOSEL 5.6 009/606] net/rds: Use ERR_PTR for rds_message_alloc_sgs() Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 067/606] SUNRPC: Revert 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()") Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 068/606] bpf: Fix sk_psock refcnt leak when receiving message Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 075/606] bpf: Enforce returning 0 for fentry/fexit progs Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 076/606] selftests/bpf: Enforce returning 0 for fentry/fexit programs Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 077/606] bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 102/606] net: drop_monitor: use IS_REACHABLE() to guard net_dm_hw_report() Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 111/606] vhost/vsock: fix packet delivery order to monitoring devices Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 112/606] aquantia: Fix the media type of AQC100 ethernet controller in the driver Sasha Levin
2020-06-08 23:03 ` [PATCH AUTOSEL 5.6 114/606] net/ena: Fix build warning in ena_xdp_set() Sasha Levin
2020-06-08 23:04 ` [PATCH AUTOSEL 5.6 117/606] ibmvnic: Skip fatal error reset after passive init Sasha Levin
2020-06-08 23:04 ` [PATCH AUTOSEL 5.6 121/606] gtp: set NLM_F_MULTI flag in gtp_genl_dump_pdp() Sasha Levin
2020-06-08 23:04 ` [PATCH AUTOSEL 5.6 124/606] stmmac: fix pointer check after utilization in stmmac_interrupt Sasha Levin
2020-06-08 23:04 ` [PATCH AUTOSEL 5.6 141/606] bpf: Restrict bpf_probe_read{, str}() only to archs where they work Sasha Levin
2020-06-08 23:04 ` [PATCH AUTOSEL 5.6 142/606] bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range Sasha Levin
2020-06-08 23:04 ` [PATCH AUTOSEL 5.6 168/606] kbuild: Remove debug info from kallsyms linking Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 194/606] rxrpc: Fix the excessive initial retransmission timeout Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 195/606] rxrpc: Fix a memory leak in rxkad_verify_response() Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 198/606] flow_dissector: Drop BPF flow dissector prog ref on netns cleanup Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 200/606] rxrpc: Trace discarded ACKs Sasha Levin
2020-06-08 23:05 ` Sasha Levin [this message]
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 202/606] bpf: Prevent mmap()'ing read-only maps as writable Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 207/606] ax25: fix setsockopt(SO_BINDTODEVICE) Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 208/606] dpaa_eth: fix usage as DSA master, try 3 Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 209/606] ethtool: count header size in reply size estimate Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 210/606] felix: Fix initialization of ioremap resources Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 211/606] net: don't return invalid table id error when we fall back to PF_UNSPEC Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 212/606] net: dsa: mt7530: fix roaming from DSA user ports Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 213/606] net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 214/606] __netif_receive_skb_core: pass skb by reference Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 215/606] net: inet_csk: Fix so_reuseport bind-address cache in tb->fast* Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 216/606] net: ipip: fix wrong address family in init error path Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 217/606] net/mlx5: Add command entry handling completion Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 218/606] net: mvpp2: fix RX hashing for non-10G ports Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 219/606] net: nlmsg_cancel() if put fails for nhmsg Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 220/606] net: qrtr: Fix passing invalid reference to qrtr_local_enqueue() Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 221/606] net: revert "net: get rid of an signed integer overflow in ip_idents_reserve()" Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 222/606] net sched: fix reporting the first-time use timestamp Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 223/606] net/tls: fix race condition causing kernel panic Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 224/606] nexthop: Fix attribute checking for groups Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 225/606] r8152: support additional Microsoft Surface Ethernet Adapter variant Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 226/606] sctp: Don't add the shutdown timer if its already been added Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 227/606] sctp: Start shutdown on association restart if in SHUTDOWN-SENT state and socket is closed Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 228/606] tipc: block BH before using dst_cache Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 229/606] net/mlx5e: kTLS, Destroy key object after destroying the TIS Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 230/606] net/mlx5e: Fix inner tirs handling Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 231/606] net/mlx5: Fix memory leak in mlx5_events_init Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 232/606] net/mlx5e: Update netdev txq on completions during closure Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 233/606] net/mlx5: Fix error flow in case of function_setup failure Sasha Levin
2020-06-08 23:05 ` [PATCH AUTOSEL 5.6 234/606] wireguard: noise: read preshared key while taking lock Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 235/606] wireguard: queueing: preserve flow hash across packet scrubbing Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 236/606] wireguard: noise: separate receive counter from send counter Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 237/606] r8169: fix OCP access on RTL8117 Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 238/606] net/mlx5: Fix a race when moving command interface to events mode Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 239/606] net/mlx5: Fix cleaning unmanaged flow tables Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 241/606] net/mlx5: Avoid processing commands before cmdif is ready Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 242/606] net/mlx5: Annotate mutex destroy for root ns Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 243/606] net/tls: fix encryption error checking Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 244/606] net/tls: free record only on encryption error Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 245/606] net: sun: fix missing release regions in cas_init_one() Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 246/606] net/mlx4_core: fix a memory leak bug Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 247/606] net: sgi: ioc3-eth: Fix return value check in ioc3eth_probe() Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 248/606] mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload fails Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 249/606] net: mscc: ocelot: fix address ageing time (again) Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 261/606] net: microchip: encx24j600: add missed kthread_stop Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 272/606] net: freescale: select CONFIG_FIXED_PHY where needed Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 287/606] samples: bpf: Fix build error Sasha Levin
2020-06-08 23:06 ` [PATCH AUTOSEL 5.6 288/606] drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c Sasha Levin
2020-06-08 23:07 ` [PATCH AUTOSEL 5.6 325/606] libceph: ignore pool overlay and cache logic on redirects Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200608231211.3363633-201-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=botsch@cnf.cornell.edu \
    --cc=dhowells@redhat.com \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).