netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps
@ 2019-12-06 11:38 Guillaume Nault
  2019-12-06 11:38 ` [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps Guillaume Nault
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Guillaume Nault @ 2019-12-06 11:38 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Arnd Bergmann, John Stultz, Thomas Gleixner

The synflood timestamps (->ts_recent_stamp and ->synq_overflow_ts) are
only refreshed when the syncookie protection triggers. Therefore, their
value can become very far apart from jiffies if no synflood happens for
a long time.

If jiffies grows too much and wraps while the synflood timestamp isn't
refreshed, then time_after32() might consider the later to be in the
future. This can trick tcp_synq_no_recent_overflow() into returning
erroneous values and rejecting valid ACKs.

Patch 1 handles the case of ACKs using legitimate syncookies.
Patch 2 handles the case of stray ACKs.
Patch 3 annotates lockless timestamp operations with READ_ONCE() and
WRITE_ONCE().

Changes from v3:
  - Fix description of time_between32() (found by Eric Dumazet).
  - Use more accurate Fixes tag in patch 3 (suggested by Eric Dumazet).

Changes from v2:
  - Define and use time_between32() instead of a pair of
    time_before32/time_after32 (suggested by Eric Dumazet).
  - Use 'last_overflow - HZ' as lower bound in
    tcp_synq_no_recent_overflow(), to accommodate for concurrent
    timestamp updates (found by Eric Dumazet).
  - Add a third patch to annotate lockless accesses to .ts_recent_stamp.

Changes from v1:
  - Initialising timestamps at socket creation time is not enough
    because jiffies wraps in 24 days with HZ=1000 (Eric Dumazet).
    Handle stale timestamps in tcp_synq_overflow() and
    tcp_synq_no_recent_overflow() instead.
  - Rework commit description.
  - Add a second patch to handle the case of stray ACKs.

Guillaume Nault (3):
  tcp: fix rejected syncookies due to stale timestamps
  tcp: tighten acceptance of ACKs not matching a child socket
  tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()

 include/linux/time.h | 13 +++++++++++++
 include/net/tcp.h    | 27 +++++++++++++++++++--------
 2 files changed, 32 insertions(+), 8 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps
  2019-12-06 11:38 [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps Guillaume Nault
@ 2019-12-06 11:38 ` Guillaume Nault
  2019-12-07  1:49   ` Eric Dumazet
  2019-12-06 11:38 ` [PATCH net v4 2/3] tcp: tighten acceptance of ACKs not matching a child socket Guillaume Nault
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Guillaume Nault @ 2019-12-06 11:38 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Arnd Bergmann, John Stultz, Thomas Gleixner

If no synflood happens for a long enough period of time, then the
synflood timestamp isn't refreshed and jiffies can advance so much
that time_after32() can't accurately compare them any more.

Therefore, we can end up in a situation where time_after32(now,
last_overflow + HZ) returns false, just because these two values are
too far apart. In that case, the synflood timestamp isn't updated as
it should be, which can trick tcp_synq_no_recent_overflow() into
rejecting valid syncookies.

For example, let's consider the following scenario on a system
with HZ=1000:

  * The synflood timestamp is 0, either because that's the timestamp
    of the last synflood or, more commonly, because we're working with
    a freshly created socket.

  * We receive a new SYN, which triggers synflood protection. Let's say
    that this happens when jiffies == 2147484649 (that is,
    'synflood timestamp' + HZ + 2^31 + 1).

  * Then tcp_synq_overflow() doesn't update the synflood timestamp,
    because time_after32(2147484649, 1000) returns false.
    With:
      - 2147484649: the value of jiffies, aka. 'now'.
      - 1000: the value of 'last_overflow' + HZ.

  * A bit later, we receive the ACK completing the 3WHS. But
    cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
    says that we're not under synflood. That's because
    time_after32(2147484649, 120000) returns false.
    With:
      - 2147484649: the value of jiffies, aka. 'now'.
      - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.

    Of course, in reality jiffies would have increased a bit, but this
    condition will last for the next 119 seconds, which is far enough
    to accommodate for jiffie's growth.

Fix this by updating the overflow timestamp whenever jiffies isn't
within the [last_overflow, last_overflow + HZ] range. That shouldn't
have any performance impact since the update still happens at most once
per second.

Now we're guaranteed to have fresh timestamps while under synflood, so
tcp_synq_no_recent_overflow() can safely use it with time_after32() in
such situations.

Stale timestamps can still make tcp_synq_no_recent_overflow() return
the wrong verdict when not under synflood. This will be handled in the
next patch.

For 64 bits architectures, the problem was introduced with the
conversion of ->tw_ts_recent_stamp to 32 bits integer by commit
cca9bab1b72c ("tcp: use monotonic timestamps for PAWS").
The problem has always been there on 32 bits architectures.

Fixes: cca9bab1b72c ("tcp: use monotonic timestamps for PAWS")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 include/linux/time.h | 13 +++++++++++++
 include/net/tcp.h    |  5 +++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index 0760a4f5a15c..8e10b9dbd8c2 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -97,4 +97,17 @@ static inline bool itimerspec64_valid(const struct itimerspec64 *its)
  */
 #define time_after32(a, b)	((s32)((u32)(b) - (u32)(a)) < 0)
 #define time_before32(b, a)	time_after32(a, b)
+
+/**
+ * time_between32 - check if a 32-bit timestamp is within a given time range
+ * @t:	the time which may be within [l,h]
+ * @l:	the lower bound of the range
+ * @h:	the higher bound of the range
+ *
+ * time_before32(t, l, h) returns true if @l <= @t <= @h. All operands are
+ * treated as 32-bit integers.
+ *
+ * Equivalent to !(time_before32(@t, @l) || time_after32(@t, @h)).
+ */
+#define time_between32(t, l, h) ((u32)(h) - (u32)(l) >= (u32)(t) - (u32)(l))
 #endif
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 36f195fb576a..7d734ba391fc 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -494,14 +494,15 @@ static inline void tcp_synq_overflow(const struct sock *sk)
 		reuse = rcu_dereference(sk->sk_reuseport_cb);
 		if (likely(reuse)) {
 			last_overflow = READ_ONCE(reuse->synq_overflow_ts);
-			if (time_after32(now, last_overflow + HZ))
+			if (!time_between32(now, last_overflow,
+					    last_overflow + HZ))
 				WRITE_ONCE(reuse->synq_overflow_ts, now);
 			return;
 		}
 	}
 
 	last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
-	if (time_after32(now, last_overflow + HZ))
+	if (!time_between32(now, last_overflow, last_overflow + HZ))
 		tcp_sk(sk)->rx_opt.ts_recent_stamp = now;
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net v4 2/3] tcp: tighten acceptance of ACKs not matching a child socket
  2019-12-06 11:38 [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps Guillaume Nault
  2019-12-06 11:38 ` [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps Guillaume Nault
@ 2019-12-06 11:38 ` Guillaume Nault
  2019-12-06 11:38 ` [PATCH net v4 3/3] tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE() Guillaume Nault
  2019-12-07  5:06 ` [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps David Miller
  3 siblings, 0 replies; 6+ messages in thread
From: Guillaume Nault @ 2019-12-06 11:38 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet

When no synflood occurs, the synflood timestamp isn't updated.
Therefore it can be so old that time_after32() can consider it to be
in the future.

That's a problem for tcp_synq_no_recent_overflow() as it may report
that a recent overflow occurred while, in fact, it's just that jiffies
has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31.

Spurious detection of recent overflows lead to extra syncookie
verification in cookie_v[46]_check(). At that point, the verification
should fail and the packet dropped. But we should have dropped the
packet earlier as we didn't even send a syncookie.

Let's refine tcp_synq_no_recent_overflow() to report a recent overflow
only if jiffies is within the
[last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This
way, no spurious recent overflow is reported when jiffies wraps and
'last_overflow' becomes in the future from the point of view of
time_after32().

However, if jiffies wraps and enters the
[last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with
'last_overflow' being a stale synflood timestamp), then
tcp_synq_no_recent_overflow() still erroneously reports an
overflow. In such cases, we have to rely on syncookie verification
to drop the packet. We unfortunately have no way to differentiate
between a fresh and a stale syncookie timestamp.

In practice, using last_overflow as lower bound is problematic.
If the synflood timestamp is concurrently updated between the time
we read jiffies and the moment we store the timestamp in
'last_overflow', then 'now' becomes smaller than 'last_overflow' and
tcp_synq_no_recent_overflow() returns true, potentially dropping a
valid syncookie.

Reading jiffies after loading the timestamp could fix the problem,
but that'd require a memory barrier. Let's just accommodate for
potential timestamp growth instead and extend the interval using
'last_overflow - HZ' as lower bound.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7d734ba391fc..43e04e14c41e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -518,13 +518,23 @@ static inline bool tcp_synq_no_recent_overflow(const struct sock *sk)
 		reuse = rcu_dereference(sk->sk_reuseport_cb);
 		if (likely(reuse)) {
 			last_overflow = READ_ONCE(reuse->synq_overflow_ts);
-			return time_after32(now, last_overflow +
-					    TCP_SYNCOOKIE_VALID);
+			return !time_between32(now, last_overflow - HZ,
+					       last_overflow +
+					       TCP_SYNCOOKIE_VALID);
 		}
 	}
 
 	last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
-	return time_after32(now, last_overflow + TCP_SYNCOOKIE_VALID);
+
+	/* If last_overflow <= jiffies <= last_overflow + TCP_SYNCOOKIE_VALID,
+	 * then we're under synflood. However, we have to use
+	 * 'last_overflow - HZ' as lower bound. That's because a concurrent
+	 * tcp_synq_overflow() could update .ts_recent_stamp after we read
+	 * jiffies but before we store .ts_recent_stamp into last_overflow,
+	 * which could lead to rejecting a valid syncookie.
+	 */
+	return !time_between32(now, last_overflow - HZ,
+			       last_overflow + TCP_SYNCOOKIE_VALID);
 }
 
 static inline u32 tcp_cookie_time(void)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net v4 3/3] tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
  2019-12-06 11:38 [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps Guillaume Nault
  2019-12-06 11:38 ` [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps Guillaume Nault
  2019-12-06 11:38 ` [PATCH net v4 2/3] tcp: tighten acceptance of ACKs not matching a child socket Guillaume Nault
@ 2019-12-06 11:38 ` Guillaume Nault
  2019-12-07  5:06 ` [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps David Miller
  3 siblings, 0 replies; 6+ messages in thread
From: Guillaume Nault @ 2019-12-06 11:38 UTC (permalink / raw)
  To: David Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet

Syncookies borrow the ->rx_opt.ts_recent_stamp field to store the
timestamp of the last synflood. Protect them with READ_ONCE() and
WRITE_ONCE() since reads and writes aren't serialised.

Use of .rx_opt.ts_recent_stamp for storing the synflood timestamp was
introduced by a0f82f64e269 ("syncookies: remove last_synq_overflow from
struct tcp_sock"). But unprotected accesses were already there when
timestamp was stored in .last_synq_overflow.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 43e04e14c41e..86b9a8766648 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -501,9 +501,9 @@ static inline void tcp_synq_overflow(const struct sock *sk)
 		}
 	}
 
-	last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
+	last_overflow = READ_ONCE(tcp_sk(sk)->rx_opt.ts_recent_stamp);
 	if (!time_between32(now, last_overflow, last_overflow + HZ))
-		tcp_sk(sk)->rx_opt.ts_recent_stamp = now;
+		WRITE_ONCE(tcp_sk(sk)->rx_opt.ts_recent_stamp, now);
 }
 
 /* syncookies: no recent synqueue overflow on this listening socket? */
@@ -524,7 +524,7 @@ static inline bool tcp_synq_no_recent_overflow(const struct sock *sk)
 		}
 	}
 
-	last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
+	last_overflow = READ_ONCE(tcp_sk(sk)->rx_opt.ts_recent_stamp);
 
 	/* If last_overflow <= jiffies <= last_overflow + TCP_SYNCOOKIE_VALID,
 	 * then we're under synflood. However, we have to use
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps
  2019-12-06 11:38 ` [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps Guillaume Nault
@ 2019-12-07  1:49   ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2019-12-07  1:49 UTC (permalink / raw)
  To: Guillaume Nault, David Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Arnd Bergmann, John Stultz, Thomas Gleixner



On 12/6/19 3:38 AM, Guillaume Nault wrote:
> If no synflood happens for a long enough period of time, then the
> synflood timestamp isn't refreshed and jiffies can advance so much
> that time_after32() can't accurately compare them any more.
...
> 
> Fixes: cca9bab1b72c ("tcp: use monotonic timestamps for PAWS")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Guillaume Nault <gnault@redhat.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps
  2019-12-06 11:38 [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps Guillaume Nault
                   ` (2 preceding siblings ...)
  2019-12-06 11:38 ` [PATCH net v4 3/3] tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE() Guillaume Nault
@ 2019-12-07  5:06 ` David Miller
  3 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2019-12-07  5:06 UTC (permalink / raw)
  To: gnault; +Cc: jakub.kicinski, netdev, edumazet, arnd, john.stultz, tglx

From: Guillaume Nault <gnault@redhat.com>
Date: Fri, 6 Dec 2019 12:38:26 +0100

> The synflood timestamps (->ts_recent_stamp and ->synq_overflow_ts) are
> only refreshed when the syncookie protection triggers. Therefore, their
> value can become very far apart from jiffies if no synflood happens for
> a long time.
> 
> If jiffies grows too much and wraps while the synflood timestamp isn't
> refreshed, then time_after32() might consider the later to be in the
> future. This can trick tcp_synq_no_recent_overflow() into returning
> erroneous values and rejecting valid ACKs.
> 
> Patch 1 handles the case of ACKs using legitimate syncookies.
> Patch 2 handles the case of stray ACKs.
> Patch 3 annotates lockless timestamp operations with READ_ONCE() and
> WRITE_ONCE().
 ...

Series applied, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-12-07  5:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-06 11:38 [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps Guillaume Nault
2019-12-06 11:38 ` [PATCH net v4 1/3] tcp: fix rejected syncookies due to stale timestamps Guillaume Nault
2019-12-07  1:49   ` Eric Dumazet
2019-12-06 11:38 ` [PATCH net v4 2/3] tcp: tighten acceptance of ACKs not matching a child socket Guillaume Nault
2019-12-06 11:38 ` [PATCH net v4 3/3] tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE() Guillaume Nault
2019-12-07  5:06 ` [PATCH net v4 0/3] tcp: fix handling of stale syncookies timestamps David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).