From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>,
	Marc Dionne <marc.dionne@auristor.com>,
	linux-afs@lists.infradead.org,
	"David S . Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 5.17 131/135] afs: Adjust ACK interpretation to try and cope with NAT
Date: Mon, 30 May 2022 09:31:29 -0400
Message-ID: <20220530133133.1931716-131-sashal@kernel.org>
In-Reply-To: <20220530133133.1931716-1-sashal@kernel.org>

From: David Howells <dhowells@redhat.com>

[ Upstream commit adc9613ff66c26ebaff9814973181ac178beb90b ]

If a client's address changes, say if it is NAT'd, this can disrupt an
in-progress operation.  For most operations, this is not much of a problem,
but StoreData can be different as some servers modify the target file as
the data comes in, so if a store request is disrupted, the file can get
corrupted on the server.

The problem is that the server doesn't recognise packets that come after
the change of address as belonging to the original client and will bounce
them, either by sending an OUT_OF_SEQUENCE ACK to the apparent new call if
the packet number falls within the initial sequence number window of a call
or by sending an EXCEEDS_WINDOW ACK if it falls outside and then aborting
it.  In both cases, firstPacket will be 1 and previousPacket will be 0 in
the ACK information.

Fix this by the following means:

 (1) If a client call receives an EXCEEDS_WINDOW ACK with firstPacket as 1
     and previousPacket as 0, assume this indicates that the server saw the
     incoming packets from a different peer and thus as a different call.
     Fail the call with error -ENETRESET.

 (2) Also fail the call if a similar OUT_OF_SEQUENCE ACK arrives after the
     first packet has been hard-ACK'd.  If it hasn't been hard-ACK'd, the
     ACK packet will cause it to get retransmitted, so the call will just
     be repeated.

 (3) Make afs_select_fileserver() treat -ENETRESET as a straight fail of
     the operation.

 (4) Prioritise the error code over things like -ECONNRESET as the server
     did actually respond.

 (5) Make writeback treat -ENETRESET as a retryable error and make it
     redirty all the pages involved in a write so that the VM will retry.
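
The ACK checks in (1) and (2) can be sketched in userspace C as below.  This
is only an illustration and not the kernel code: every sketch_* name, the
enum and both structs are hypothetical stand-ins for the rxrpc types.  The
conditions mirror the net/rxrpc/input.c hunk of this patch as written,
including its tx_hard_ack == 0 test for the OUT_OF_SEQUENCE case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rxrpc ACK reasons; not the wire values. */
enum sketch_ack_reason {
	SKETCH_ACK_OUT_OF_SEQUENCE,
	SKETCH_ACK_EXCEEDS_WINDOW,
	SKETCH_ACK_OTHER,
};

/* The fields of the ACK information that the checks look at. */
struct sketch_ack {
	enum sketch_ack_reason reason;
	uint32_t first_packet;		/* firstPacket in the ACK */
	uint32_t previous_packet;	/* previousPacket in the ACK */
};

/* The call state that the checks look at. */
struct sketch_call {
	bool is_client;
	uint32_t tx_hard_ack;		/* highest hard-ACK'd Tx sequence number */
};

/*
 * Return -ENETRESET if the ACK looks like the server treated our packets as
 * a new call from a different peer, or 0 if normal ACK processing should
 * continue.
 */
int sketch_check_nat_ack(const struct sketch_call *call,
			 const struct sketch_ack *ack)
{
	assert(call && ack);

	/* Only client calls are affected. */
	if (!call->is_client)
		return 0;

	/* Both bounce ACKs carry firstPacket 1 and previousPacket 0. */
	if (ack->first_packet != 1 || ack->previous_packet != 0)
		return 0;

	switch (ack->reason) {
	case SKETCH_ACK_EXCEEDS_WINDOW:
		return -ENETRESET;
	case SKETCH_ACK_OUT_OF_SEQUENCE:
		/* Same extra condition as the input.c hunk. */
		return call->tx_hard_ack == 0 ? -ENETRESET : 0;
	default:
		return 0;
	}
}
```

A caller would complete the call with the returned error when it is
non-zero and otherwise carry on to the usual ACK validity checks.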

Note that there is still a circumstance that I can't easily deal with: if
the operation is fully received and processed by the server, but the reply
is lost due to an address change.  There's no way to know if the op
happened.  We can examine the server, but a conflicting change could have
been made by a third party - and we can't tell the difference.  In such a
case, a message like:

    kAFS: vnode modified {100058:146266} b7->b8 YFS.StoreData64 (op=2646a)

will be logged to dmesg on the next op to touch the file and the client
will reset the inode state, including invalidating clean parts of the
pagecache.

Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Link: http://lists.infradead.org/pipermail/linux-afs/2021-December/004811.html # v1
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/afs/misc.c     |  5 ++++-
 fs/afs/rotate.c   |  4 ++++
 fs/afs/write.c    |  1 +
 net/rxrpc/input.c | 27 +++++++++++++++++++++++++++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/fs/afs/misc.c b/fs/afs/misc.c
index 1d1a8debe472..933e67fcdab1 100644
--- a/fs/afs/misc.c
+++ b/fs/afs/misc.c
@@ -163,8 +163,11 @@ void afs_prioritise_error(struct afs_error *e, int error, u32 abort_code)
 		return;
 
 	case -ECONNABORTED:
+		error = afs_abort_to_error(abort_code);
+		fallthrough;
+	case -ENETRESET: /* Responded, but we seem to have changed address */
 		e->responded = true;
-		e->error = afs_abort_to_error(abort_code);
+		e->error = error;
 		return;
 	}
 }
diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
index 79e1a5f6701b..a840c3588ebb 100644
--- a/fs/afs/rotate.c
+++ b/fs/afs/rotate.c
@@ -292,6 +292,10 @@ bool afs_select_fileserver(struct afs_operation *op)
 		op->error = error;
 		goto iterate_address;
 
+	case -ENETRESET:
+		pr_warn("kAFS: Peer reset %s (op=%x)\n",
+			op->type ? op->type->name : "???", op->debug_id);
+		fallthrough;
 	case -ECONNRESET:
 		_debug("call reset");
 		op->error = error;
diff --git a/fs/afs/write.c b/fs/afs/write.c
index f447c902318d..07454b1ed240 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -638,6 +638,7 @@ static ssize_t afs_write_back_from_locked_folio(struct address_space *mapping,
 	case -EKEYEXPIRED:
 	case -EKEYREJECTED:
 	case -EKEYREVOKED:
+	case -ENETRESET:
 		afs_redirty_pages(wbc, mapping, start, len);
 		mapping_set_error(mapping, ret);
 		break;
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index dc201363f2c4..67d3eba60dc7 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -903,6 +903,33 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
 				  rxrpc_propose_ack_respond_to_ack);
 	}
 
+	/* If we get an EXCEEDS_WINDOW ACK from the server, it probably
+	 * indicates that the client address changed due to NAT.  The server
+	 * lost the call because it switched to a different peer.
+	 */
+	if (unlikely(buf.ack.reason == RXRPC_ACK_EXCEEDS_WINDOW) &&
+	    first_soft_ack == 1 &&
+	    prev_pkt == 0 &&
+	    rxrpc_is_client_call(call)) {
+		rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
+					  0, -ENETRESET);
+		return;
+	}
+
+	/* If we get an OUT_OF_SEQUENCE ACK from the server, that can also
+	 * indicate a change of address.  However, we can retransmit the call
+	 * if we still have it buffered to the beginning.
+	 */
+	if (unlikely(buf.ack.reason == RXRPC_ACK_OUT_OF_SEQUENCE) &&
+	    first_soft_ack == 1 &&
+	    prev_pkt == 0 &&
+	    call->tx_hard_ack == 0 &&
+	    rxrpc_is_client_call(call)) {
+		rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
+					  0, -ENETRESET);
+		return;
+	}
+
 	/* Discard any out-of-order or duplicate ACKs (outside lock). */
 	if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
 		trace_rxrpc_rx_discard_ack(call->debug_id, ack_serial,
-- 
2.35.1