* [RFCv3 00/15] tcp: Initial support for RFC5925 auth option
@ 2021-08-24 21:34 Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 01/15] tcp: authopt: Initial support and key management Leonard Crestez
                   ` (14 more replies)
  0 siblings, 15 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

This is similar to TCP MD5 in functionality but it's sufficiently
different that wire formats are incompatible. Compared to TCP-MD5
more algorithms are supported and multiple keys can be used on the
same connection but there is still no negotiation mechanism.
Expected use-case is protecting long-duration BGP/LDP connections
between routers using pre-shared keys.

This version is mostly functional: it incorporates ABI feedback from
previous versions and adds tests to kselftests. More discussion and
testing are required, and obvious optimizations were skipped in favor of
adding functionality. Known flaws:

* RST and TIMEWAIT are mostly unhandled
* Locking is lockdep-clean but needs to be revised
* Sequence Number Extension not implemented
* User is responsible for ensuring keys do not overlap
* Traffic key is not cached (reducing performance)

Not all ABI suggestions were incorporated; they can be discussed further.
However, I very much want to avoid supporting algorithms beyond RFC5926.

Test suite was added to tools/testing/selftests/tcp_authopt. Tests are written
in python using pytest and scapy; they check the API in some detail and
validate packet captures. Python code is already used in linux and in
kselftests, but virtualenvs are not common there. This test suite uses `tox`
to create a private virtualenv and hide dependencies. Let me know if this
is OK or how it can be improved.

Limited testing support is also included in nettest and fcnal-test.sh,
but those tests are slow and cover much less.

Changes for frr: https://github.com/FRRouting/frr/pull/9442
That PR was made early for ABI feedback; it has many issues.

Changes for yabgp: https://github.com/cdleonard/yabgp/commits/tcp_authopt
The patched version of yabgp can establish a BGP session protected by
TCP Authentication Option with a Cisco IOS-XR router. It is old now.

Changes since RFCv2:
* Removed local_id from ABI and match on send_id/recv_id/addr
* Add all relevant out-of-tree tests to tools/testing/selftests
* Return an error instead of ignoring unknown flags; hopefully this makes
it easier to extend.
* Check sk_family before __tcp_authopt_info_get_or_create in tcp_set_authopt_key
* Use sock_owned_by_me instead of WARN_ON(!lockdep_sock_is_held(sk))
* Fix some intermediate build failures reported by kbuild robot
* Improve documentation
Link: https://lore.kernel.org/netdev/cover.1628544649.git.cdleonard@gmail.com/
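
To illustrate the "return an error on unknown flags" change, here is a
minimal userspace sketch of the check. It is not code from this series:
the macro names and `check_key_flags` are hypothetical, and only the bit
values are copied from the uapi header in patch 01.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit values copied from enum tcp_authopt_key_flag in patch 01. */
#define SKETCH_KEY_DEL		(1u << 0)
#define SKETCH_KEY_EXCLUDE_OPTS	(1u << 1)
#define SKETCH_KEY_ADDR_BIND	(1u << 2)
#define SKETCH_KEY_KNOWN_FLAGS	(SKETCH_KEY_DEL | \
				 SKETCH_KEY_EXCLUDE_OPTS | \
				 SKETCH_KEY_ADDR_BIND)

/*
 * Hypothetical helper: returns 0 when every set bit is a known flag and
 * -EINVAL as soon as any unknown bit is present.
 */
int check_key_flags(uint32_t flags)
{
	if (flags & ~SKETCH_KEY_KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits means a future flag can be probed for: an old
kernel fails the call instead of silently ignoring the request.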
 
Changes since RFC:
* Split into per-topic commits for ease of review. The intermediate
commits compile with a few "unused function" warnings and don't do
anything useful by themselves.
* Add ABI documentation including kernel-doc on uapi
* Fix lockdep warnings from crypto by creating pools with one shash for
each cpu
* Accept short options to setsockopt by padding with zeros; this
approach allows increasing the size of the structs in the future.
* Support for aes-128-cmac-96
* Support for binding addresses to keys in a way similar to old tcp_md5
* Add support for retrieving received keyid/rnextkeyid and controlling
the keyid/rnextkeyid being sent.
Link: https://lore.kernel.org/netdev/01383a8751e97ef826ef2adf93bfde3a08195a43.1626693859.git.cdleonard@gmail.com/
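
The zero-padding of short setsockopt buffers mentioned above can be
modelled in userspace roughly as follows. This is only a sketch of the
idea under stated assumptions: `struct opt_v2` and `copy_opt_padded` are
hypothetical names, and plain memcpy stands in for the kernel's
copy_from_sockptr.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical option struct that gained a field in a later version. */
struct opt_v2 {
	uint32_t flags;
	uint32_t new_field;	/* absent from the older, shorter layout */
};

/*
 * Reject buffers larger than the struct we know about, zero-fill the
 * destination, then copy only the bytes the caller actually supplied.
 */
int copy_opt_padded(void *dst, size_t dst_size, const void *src, size_t src_len)
{
	assert(dst != NULL);
	if (src_len > dst_size)
		return -EINVAL;
	memset(dst, 0, dst_size);
	if (src_len)
		memcpy(dst, src, src_len);
	return 0;
}
```

A caller built against the old 4-byte layout then sees `new_field` read
as zero, so zero has to be the "feature not requested" value for every
field added later.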

Leonard Crestez (15):
  tcp: authopt: Initial support and key management
  docs: Add user documentation for tcp_authopt
  selftests: Initial tcp_authopt test module
  selftests: tcp_authopt: Initial sockopt manipulation
  tcp: authopt: Add crypto initialization
  tcp: authopt: Compute packet signatures
  tcp: authopt: Hook into tcp core
  tcp: authopt: Add snmp counters
  selftests: tcp_authopt: Test key address binding
  selftests: tcp_authopt: Capture and verify packets
  selftests: Initial tcp_authopt support for nettest
  selftests: Initial tcp_authopt support for fcnal-test
  selftests: Add -t tcp_authopt option for fcnal-test.sh
  tcp: authopt: Add key selection controls
  selftests: tcp_authopt: Add tests for rollover

 Documentation/networking/index.rst            |    1 +
 Documentation/networking/tcp_authopt.rst      |   69 +
 include/linux/tcp.h                           |    6 +
 include/net/tcp.h                             |    1 +
 include/net/tcp_authopt.h                     |  134 ++
 include/uapi/linux/snmp.h                     |    1 +
 include/uapi/linux/tcp.h                      |  110 ++
 net/ipv4/Kconfig                              |   14 +
 net/ipv4/Makefile                             |    1 +
 net/ipv4/proc.c                               |    1 +
 net/ipv4/tcp.c                                |   27 +
 net/ipv4/tcp_authopt.c                        | 1168 +++++++++++++++++
 net/ipv4/tcp_input.c                          |   17 +
 net/ipv4/tcp_ipv4.c                           |    5 +
 net/ipv4/tcp_minisocks.c                      |    2 +
 net/ipv4/tcp_output.c                         |   74 +-
 net/ipv6/tcp_ipv6.c                           |    4 +
 tools/testing/selftests/net/fcnal-test.sh     |   34 +
 tools/testing/selftests/net/nettest.c         |   34 +-
 tools/testing/selftests/tcp_authopt/Makefile  |    5 +
 .../testing/selftests/tcp_authopt/README.rst  |   15 +
 tools/testing/selftests/tcp_authopt/config    |    6 +
 tools/testing/selftests/tcp_authopt/run.sh    |   11 +
 tools/testing/selftests/tcp_authopt/setup.cfg |   17 +
 tools/testing/selftests/tcp_authopt/setup.py  |    5 +
 .../tcp_authopt/tcp_authopt_test/__init__.py  |    0
 .../tcp_authopt/tcp_authopt_test/conftest.py  |   21 +
 .../full_tcp_sniff_session.py                 |   53 +
 .../tcp_authopt_test/linux_tcp_authopt.py     |  198 +++
 .../tcp_authopt_test/netns_fixture.py         |   63 +
 .../tcp_authopt/tcp_authopt_test/server.py    |   82 ++
 .../tcp_authopt/tcp_authopt_test/sockaddr.py  |  101 ++
 .../tcp_authopt_test/tcp_authopt_alg.py       |  276 ++++
 .../tcp_authopt/tcp_authopt_test/test_bind.py |  143 ++
 .../tcp_authopt_test/test_rollover.py         |  181 +++
 .../tcp_authopt_test/test_sockopt.py          |   74 ++
 .../tcp_authopt_test/test_vectors.py          |  359 +++++
 .../tcp_authopt_test/test_verify_capture.py   |  123 ++
 .../tcp_authopt/tcp_authopt_test/utils.py     |  154 +++
 .../tcp_authopt/tcp_authopt_test/validator.py |  158 +++
 40 files changed, 3746 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/networking/tcp_authopt.rst
 create mode 100644 include/net/tcp_authopt.h
 create mode 100644 net/ipv4/tcp_authopt.c
 create mode 100644 tools/testing/selftests/tcp_authopt/Makefile
 create mode 100644 tools/testing/selftests/tcp_authopt/README.rst
 create mode 100644 tools/testing/selftests/tcp_authopt/config
 create mode 100755 tools/testing/selftests/tcp_authopt/run.sh
 create mode 100644 tools/testing/selftests/tcp_authopt/setup.cfg
 create mode 100644 tools/testing/selftests/tcp_authopt/setup.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/__init__.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/conftest.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/full_tcp_sniff_session.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/sockaddr.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/tcp_authopt_alg.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_rollover.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_sockopt.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_vectors.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_verify_capture.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/validator.py


base-commit: 3a62c333497b164868fdcd241842a1dd4e331825
-- 
2.25.1



* [RFCv3 01/15] tcp: authopt: Initial support and key management
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-31 19:04   ` Dmitry Safonov
  2021-08-24 21:34 ` [RFCv3 02/15] docs: Add user documentation for tcp_authopt Leonard Crestez
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

This commit adds support to add and remove keys but does not use them
further.

Similar to TCP MD5, a single pointer to a struct tcp_authopt_info is
added to struct tcp_sock; this avoids increasing memory usage. The
data structures related to tcp_authopt are initialized on setsockopt and
only freed on socket close.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
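
For reviewers who want to exercise the new sockopt, a userspace caller
might look roughly like the sketch below. The struct layout and option
number are copied from this patch, but they are not in released kernel
headers, so the sketch defines its own `sketch_`-prefixed names (all
hypothetical). On a kernel without this series the option number may be
unknown or mean something else.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Values copied from this patch; assumptions, not a released ABI. */
#define SKETCH_TCP_AUTHOPT_KEY		39
#define SKETCH_MAXKEYLEN		80
#define SKETCH_ALG_HMAC_SHA_1_96	1

/* Mirrors struct tcp_authopt_key as proposed in this patch. */
struct sketch_authopt_key {
	uint32_t flags;
	uint8_t  send_id;
	uint8_t  recv_id;
	uint8_t  alg;
	uint8_t  keylen;
	uint8_t  key[SKETCH_MAXKEYLEN];
	struct sockaddr_storage addr;
};

/* Fill a key request; returns -EINVAL if the secret does not fit. */
int sketch_key_fill(struct sketch_authopt_key *k, uint8_t send_id,
		    uint8_t recv_id, uint8_t alg,
		    const void *secret, size_t secret_len)
{
	assert(k != NULL && secret != NULL);
	if (secret_len > SKETCH_MAXKEYLEN)
		return -EINVAL;
	memset(k, 0, sizeof(*k));
	k->send_id = send_id;
	k->recv_id = recv_id;
	k->alg = alg;
	k->keylen = (uint8_t)secret_len;
	memcpy(k->key, secret, secret_len);
	return 0;
}

/* Install the key on a TCP socket; returns 0 or a negative errno. */
int sketch_key_add(int fd, const struct sketch_authopt_key *k)
{
	if (setsockopt(fd, IPPROTO_TCP, SKETCH_TCP_AUTHOPT_KEY,
		       k, sizeof(*k)) < 0)
		return -errno;
	return 0;
}
```

Typical use would be sketch_key_fill() followed by sketch_key_add() on
the socket before connect() or listen(). A key with the same
send_id/recv_id/addr replaces the old one, as described in the patch.
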
 include/linux/tcp.h       |   6 +
 include/net/tcp.h         |   1 +
 include/net/tcp_authopt.h |  65 +++++++++++
 include/uapi/linux/tcp.h  |  79 ++++++++++++++
 net/ipv4/Kconfig          |  14 +++
 net/ipv4/Makefile         |   1 +
 net/ipv4/tcp.c            |  27 +++++
 net/ipv4/tcp_authopt.c    | 223 ++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_ipv4.c       |   2 +
 9 files changed, 418 insertions(+)
 create mode 100644 include/net/tcp_authopt.h
 create mode 100644 net/ipv4/tcp_authopt.c

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 48d8a363319e..cfddfc720b00 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -140,10 +140,12 @@ struct tcp_request_sock {
 static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
 {
 	return (struct tcp_request_sock *)req;
 }
 
+struct tcp_authopt_info;
+
 struct tcp_sock {
 	/* inet_connection_sock has to be the first member of tcp_sock */
 	struct inet_connection_sock	inet_conn;
 	u16	tcp_header_len;	/* Bytes of tcp header to send		*/
 	u16	gso_segs;	/* Max number of segs per GSO packet	*/
@@ -403,10 +405,14 @@ struct tcp_sock {
 
 /* TCP MD5 Signature Option information */
 	struct tcp_md5sig_info	__rcu *md5sig_info;
 #endif
 
+#ifdef CONFIG_TCP_AUTHOPT
+	struct tcp_authopt_info	__rcu *authopt_info;
+#endif
+
 /* TCP fastopen related information */
 	struct tcp_fastopen_request *fastopen_req;
 	/* fastopen_rsk points to request_sock that resulted in this big
 	 * socket. Used to retransmit SYNACKs etc.
 	 */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3166dc15d7d6..bb76554e8fe5 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -182,10 +182,11 @@ void tcp_time_wait(struct sock *sk, int state, int timeo);
 #define TCPOPT_WINDOW		3	/* Window scaling */
 #define TCPOPT_SACK_PERM        4       /* SACK Permitted */
 #define TCPOPT_SACK             5       /* SACK Block */
 #define TCPOPT_TIMESTAMP	8	/* Better RTT estimations/PAWS */
 #define TCPOPT_MD5SIG		19	/* MD5 Signature (RFC2385) */
+#define TCPOPT_AUTHOPT		29	/* Auth Option (RFC5925) */
 #define TCPOPT_MPTCP		30	/* Multipath TCP (RFC6824) */
 #define TCPOPT_FASTOPEN		34	/* Fast open (RFC7413) */
 #define TCPOPT_EXP		254	/* Experimental */
 /* Magic number to be after the option value for sharing TCP
  * experimental options. See draft-ietf-tcpm-experimental-options-00.txt
diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
new file mode 100644
index 000000000000..b4277112b506
--- /dev/null
+++ b/include/net/tcp_authopt.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _LINUX_TCP_AUTHOPT_H
+#define _LINUX_TCP_AUTHOPT_H
+
+#include <uapi/linux/tcp.h>
+
+/**
+ * struct tcp_authopt_key_info - Representation of a Master Key Tuple as per RFC5925
+ *
+ * Key structure lifetime is only protected by RCU so readers need to hold a
+ * single rcu_read_lock until they're done with the key.
+ */
+struct tcp_authopt_key_info {
+	struct hlist_node node;
+	struct rcu_head rcu;
+	/* Local identifier */
+	u32 local_id;
+	u32 flags;
+	/* Wire identifiers */
+	u8 send_id, recv_id;
+	u8 alg_id;
+	u8 keylen;
+	u8 key[TCP_AUTHOPT_MAXKEYLEN];
+	struct sockaddr_storage addr;
+};
+
+/**
+ * struct tcp_authopt_info - Per-socket information regarding tcp_authopt
+ *
+ * This is lazy-initialized in order to avoid increasing memory usage for
+ * regular TCP sockets. Once created it is only destroyed on socket close.
+ */
+struct tcp_authopt_info {
+	/** @head: List of tcp_authopt_key_info */
+	struct hlist_head head;
+	struct rcu_head rcu;
+	u32 flags;
+	u32 src_isn;
+	u32 dst_isn;
+};
+
+#ifdef CONFIG_TCP_AUTHOPT
+void tcp_authopt_clear(struct sock *sk);
+int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen);
+int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
+int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen);
+#else
+static inline int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+	return -ENOPROTOOPT;
+}
+static inline int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key)
+{
+	return -ENOPROTOOPT;
+}
+static inline void tcp_authopt_clear(struct sock *sk)
+{
+}
+static inline int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+	return -ENOPROTOOPT;
+}
+#endif
+
+#endif /* _LINUX_TCP_AUTHOPT_H */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 8fc09e8638b3..575162e7e281 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -126,10 +126,12 @@ enum {
 #define TCP_INQ			36	/* Notify bytes available to read as a cmsg on read */
 
 #define TCP_CM_INQ		TCP_INQ
 
 #define TCP_TX_DELAY		37	/* delay outgoing packets by XX usec */
+#define TCP_AUTHOPT		38	/* TCP Authentication Option (RFC5925) */
+#define TCP_AUTHOPT_KEY		39	/* TCP Authentication Option Key (RFC5925) */
 
 
 #define TCP_REPAIR_ON		1
 #define TCP_REPAIR_OFF		0
 #define TCP_REPAIR_OFF_NO_WP	-1	/* Turn off without window probes */
@@ -340,10 +342,87 @@ struct tcp_diag_md5sig {
 	__u16	tcpm_keylen;
 	__be32	tcpm_addr[4];
 	__u8	tcpm_key[TCP_MD5SIG_MAXKEYLEN];
 };
 
+/**
+ * enum tcp_authopt_flag - flags for `tcp_authopt.flags`
+ */
+enum tcp_authopt_flag {
+	/**
+	 * @TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED:
+	 *	Configure behavior of segments with TCP-AO coming from hosts for which no
+	 *	key is configured. The default recommended by RFC is to silently accept
+	 *	such connections.
+	 */
+	TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED = (1 << 2),
+};
+
+/**
+ * struct tcp_authopt - Per-socket options related to TCP Authentication Option
+ */
+struct tcp_authopt {
+	/** @flags: Combination of &enum tcp_authopt_flag */
+	__u32	flags;
+};
+
+/**
+ * enum tcp_authopt_key_flag - flags for `tcp_authopt_key.flags`
+ *
+ * @TCP_AUTHOPT_KEY_DEL: Delete the key matching send_id/recv_id/addr and ignore all other fields.
+ * @TCP_AUTHOPT_KEY_EXCLUDE_OPTS: Exclude TCP options from signature.
+ * @TCP_AUTHOPT_KEY_ADDR_BIND: Key only valid for `tcp_authopt_key.addr`
+ */
+enum tcp_authopt_key_flag {
+	TCP_AUTHOPT_KEY_DEL = (1 << 0),
+	TCP_AUTHOPT_KEY_EXCLUDE_OPTS = (1 << 1),
+	TCP_AUTHOPT_KEY_ADDR_BIND = (1 << 2),
+};
+
+/**
+ * enum tcp_authopt_alg - Algorithms for TCP Authentication Option
+ */
+enum tcp_authopt_alg {
+	TCP_AUTHOPT_ALG_HMAC_SHA_1_96 = 1,
+	TCP_AUTHOPT_ALG_AES_128_CMAC_96 = 2,
+};
+
+/* for TCP_AUTHOPT_KEY socket option */
+#define TCP_AUTHOPT_MAXKEYLEN	80
+
+/**
+ * struct tcp_authopt_key - TCP Authentication KEY
+ *
+ * Keys are identified by the combination of:
+ * - send_id
+ * - recv_id
+ * - addr (iff TCP_AUTHOPT_KEY_ADDR_BIND)
+ *
+ * RFC5925 requires that key ids must not overlap for the same TCP connection.
+ * This is not enforced by linux.
+ */
+struct tcp_authopt_key {
+	/** @flags: Combination of &enum tcp_authopt_key_flag */
+	__u32	flags;
+	/** @send_id: keyid value for send */
+	__u8	send_id;
+	/** @recv_id: keyid value for receive */
+	__u8	recv_id;
+	/** @alg: One of &enum tcp_authopt_alg */
+	__u8	alg;
+	/** @keylen: Length of the key buffer */
+	__u8	keylen;
+	/** @key: Secret key */
+	__u8	key[TCP_AUTHOPT_MAXKEYLEN];
+	/**
+	 * @addr: Key is only valid for this address
+	 *
+	 * Ignored unless TCP_AUTHOPT_KEY_ADDR_BIND flag is set
+	 */
+	struct __kernel_sockaddr_storage addr;
+};
+
 /* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */
 
 #define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1
 struct tcp_zerocopy_receive {
 	__u64 address;		/* in: address of mapping */
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 87983e70f03f..6459f4ea6f1d 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -740,5 +740,19 @@ config TCP_MD5SIG
 	  RFC2385 specifies a method of giving MD5 protection to TCP sessions.
 	  Its main (only?) use is to protect BGP sessions between core routers
 	  on the Internet.
 
 	  If unsure, say N.
+
+config TCP_AUTHOPT
+	bool "TCP: Authentication Option support (RFC5925)"
+	select CRYPTO
+	select CRYPTO_SHA1
+	select CRYPTO_HMAC
+	select CRYPTO_AES
+	select CRYPTO_CMAC
+	help
+	  RFC5925 specifies a new method of giving protection to TCP sessions.
+	  Its intended use is to protect BGP sessions between core routers
+	  on the Internet. It obsoletes TCP MD5 (RFC2385) but is incompatible.
+
+	  If unsure, say N.
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index bbdd9c44f14e..d336f32ce177 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -59,10 +59,11 @@ obj-$(CONFIG_TCP_CONG_NV) += tcp_nv.o
 obj-$(CONFIG_TCP_CONG_VENO) += tcp_veno.o
 obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o
 obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o
 obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o
 obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
+obj-$(CONFIG_TCP_AUTHOPT) += tcp_authopt.o
 obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o
 obj-$(CONFIG_BPF_SYSCALL) += udp_bpf.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
 
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f931def6302e..fd90e80afa2c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -271,10 +271,11 @@
 
 #include <net/icmp.h>
 #include <net/inet_common.h>
 #include <net/tcp.h>
 #include <net/mptcp.h>
+#include <net/tcp_authopt.h>
 #include <net/xfrm.h>
 #include <net/ip.h>
 #include <net/sock.h>
 
 #include <linux/uaccess.h>
@@ -3573,10 +3574,16 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 	case TCP_MD5SIG:
 	case TCP_MD5SIG_EXT:
 		err = tp->af_specific->md5_parse(sk, optname, optval, optlen);
 		break;
 #endif
+	case TCP_AUTHOPT:
+		err = tcp_set_authopt(sk, optval, optlen);
+		break;
+	case TCP_AUTHOPT_KEY:
+		err = tcp_set_authopt_key(sk, optval, optlen);
+		break;
 	case TCP_USER_TIMEOUT:
 		/* Cap the max time in ms TCP will retry or probe the window
 		 * before giving up and aborting (ETIMEDOUT) a connection.
 		 */
 		if (val < 0)
@@ -4219,10 +4226,30 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
 		if (!err && copy_to_user(optval, &zc, len))
 			err = -EFAULT;
 		return err;
 	}
 #endif
+#ifdef CONFIG_TCP_AUTHOPT
+	case TCP_AUTHOPT: {
+		struct tcp_authopt info;
+
+		if (get_user(len, optlen))
+			return -EFAULT;
+
+		lock_sock(sk);
+		tcp_get_authopt_val(sk, &info);
+		release_sock(sk);
+
+		len = min_t(unsigned int, len, sizeof(info));
+		if (put_user(len, optlen))
+			return -EFAULT;
+		if (copy_to_user(optval, &info, len))
+			return -EFAULT;
+		return 0;
+	}
+#endif
+
 	default:
 		return -ENOPROTOOPT;
 	}
 
 	if (put_user(len, optlen))
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
new file mode 100644
index 000000000000..f6dddc5775ff
--- /dev/null
+++ b/net/ipv4/tcp_authopt.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <linux/kernel.h>
+#include <net/tcp.h>
+#include <net/tcp_authopt.h>
+#include <crypto/hash.h>
+#include <trace/events/tcp.h>
+
+/* checks that ipv4 or ipv6 addr matches. */
+static bool ipvx_addr_match(struct sockaddr_storage *a1,
+			    struct sockaddr_storage *a2)
+{
+	if (a1->ss_family != a2->ss_family)
+		return false;
+	if (a1->ss_family == AF_INET && memcmp(
+			&((struct sockaddr_in *)a1)->sin_addr,
+			&((struct sockaddr_in *)a2)->sin_addr,
+			sizeof(struct in_addr)))
+		return false;
+	if (a1->ss_family == AF_INET6 && memcmp(
+			&((struct sockaddr_in6 *)a1)->sin6_addr,
+			&((struct sockaddr_in6 *)a2)->sin6_addr,
+			sizeof(struct in6_addr)))
+		return false;
+	return true;
+}
+
+static bool tcp_authopt_key_match_exact(struct tcp_authopt_key_info *info,
+					struct tcp_authopt_key *key)
+{
+	if (info->send_id != key->send_id)
+		return false;
+	if (info->recv_id != key->recv_id)
+		return false;
+	if ((info->flags & TCP_AUTHOPT_KEY_ADDR_BIND) != (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND))
+		return false;
+	if (info->flags & TCP_AUTHOPT_KEY_ADDR_BIND)
+		if (!ipvx_addr_match(&info->addr, &key->addr))
+			return false;
+
+	return true;
+}
+
+static struct tcp_authopt_key_info *tcp_authopt_key_lookup_exact(const struct sock *sk,
+								 struct tcp_authopt_info *info,
+								 struct tcp_authopt_key *ukey)
+{
+	struct tcp_authopt_key_info *key_info;
+
+	hlist_for_each_entry_rcu(key_info, &info->head, node, lockdep_sock_is_held(sk))
+		if (tcp_authopt_key_match_exact(key_info, ukey))
+			return key_info;
+
+	return NULL;
+}
+
+static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcp_authopt_info *info;
+
+	info = rcu_dereference_check(tp->authopt_info, lockdep_sock_is_held(sk));
+	if (info)
+		return info;
+
+	info = kmalloc(sizeof(*info), GFP_KERNEL | __GFP_ZERO);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	sk_nocaps_add(sk, NETIF_F_GSO_MASK);
+	INIT_HLIST_HEAD(&info->head);
+	rcu_assign_pointer(tp->authopt_info, info);
+
+	return info;
+}
+
+#define TCP_AUTHOPT_KNOWN_FLAGS ( \
+	TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED)
+
+int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+	struct tcp_authopt opt;
+	struct tcp_authopt_info *info;
+
+	sock_owned_by_me(sk);
+
+	/* If userspace optlen is too short fill the rest with zeros */
+	if (optlen > sizeof(opt))
+		return -EINVAL;
+	memset(&opt, 0, sizeof(opt));
+	if (copy_from_sockptr(&opt, optval, optlen))
+		return -EFAULT;
+
+	if (opt.flags & ~TCP_AUTHOPT_KNOWN_FLAGS)
+		return -EINVAL;
+
+	info = __tcp_authopt_info_get_or_create(sk);
+	if (IS_ERR(info))
+		return PTR_ERR(info);
+
+	info->flags = opt.flags & TCP_AUTHOPT_KNOWN_FLAGS;
+
+	return 0;
+}
+
+int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcp_authopt_info *info;
+
+	sock_owned_by_me(sk);
+
+	memset(opt, 0, sizeof(*opt));
+	info = rcu_dereference_check(tp->authopt_info, lockdep_sock_is_held(sk));
+	if (!info)
+		return -EINVAL;
+
+	opt->flags = info->flags & TCP_AUTHOPT_KNOWN_FLAGS;
+
+	return 0;
+}
+
+static void tcp_authopt_key_del(struct sock *sk,
+				struct tcp_authopt_info *info,
+				struct tcp_authopt_key_info *key)
+{
+	hlist_del_rcu(&key->node);
+	atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
+	kfree_rcu(key, rcu);
+}
+
+/* free info and keys but don't touch tp->authopt_info */
+static void __tcp_authopt_info_free(struct sock *sk, struct tcp_authopt_info *info)
+{
+	struct hlist_node *n;
+	struct tcp_authopt_key_info *key;
+
+	hlist_for_each_entry_safe(key, n, &info->head, node)
+		tcp_authopt_key_del(sk, info, key);
+	kfree_rcu(info, rcu);
+}
+
+/* free everything and clear tcp_sock.authopt_info to NULL */
+void tcp_authopt_clear(struct sock *sk)
+{
+	struct tcp_authopt_info *info;
+
+	info = rcu_dereference_protected(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
+	if (info) {
+		__tcp_authopt_info_free(sk, info);
+		tcp_sk(sk)->authopt_info = NULL;
+	}
+}
+
+#define TCP_AUTHOPT_KEY_KNOWN_FLAGS ( \
+	TCP_AUTHOPT_KEY_DEL | \
+	TCP_AUTHOPT_KEY_EXCLUDE_OPTS | \
+	TCP_AUTHOPT_KEY_ADDR_BIND)
+
+int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+	struct tcp_authopt_key opt;
+	struct tcp_authopt_info *info;
+	struct tcp_authopt_key_info *key_info;
+
+	sock_owned_by_me(sk);
+
+	/* If userspace optlen is too short fill the rest with zeros */
+	if (optlen > sizeof(opt))
+		return -EINVAL;
+	memset(&opt, 0, sizeof(opt));
+	if (copy_from_sockptr(&opt, optval, optlen))
+		return -EFAULT;
+
+	if (opt.flags & ~TCP_AUTHOPT_KEY_KNOWN_FLAGS)
+		return -EINVAL;
+
+	if (opt.keylen > TCP_AUTHOPT_MAXKEYLEN)
+		return -EINVAL;
+
+	/* Delete is a special case: */
+	if (opt.flags & TCP_AUTHOPT_KEY_DEL) {
+		info = rcu_dereference_check(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
+		if (!info)
+			return -ENOENT;
+		key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
+		if (!key_info)
+			return -ENOENT;
+		tcp_authopt_key_del(sk, info, key_info);
+		return 0;
+	}
+
+	/* check key family */
+	if (opt.flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
+		if (sk->sk_family != opt.addr.ss_family)
+			return -EINVAL;
+	}
+
+	/* Initialize tcp_authopt_info if not already set */
+	info = __tcp_authopt_info_get_or_create(sk);
+	if (IS_ERR(info))
+		return PTR_ERR(info);
+
+	/* If an old key exists with exact ID then remove and replace.
+	 * RCU-protected readers might observe both and pick any.
+	 */
+	key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
+	if (key_info)
+		tcp_authopt_key_del(sk, info, key_info);
+	key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
+	if (!key_info)
+		return -ENOMEM;
+	key_info->flags = opt.flags & TCP_AUTHOPT_KEY_KNOWN_FLAGS;
+	key_info->send_id = opt.send_id;
+	key_info->recv_id = opt.recv_id;
+	key_info->alg_id = opt.alg;
+	key_info->keylen = opt.keylen;
+	memcpy(key_info->key, opt.key, opt.keylen);
+	memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
+	hlist_add_head_rcu(&key_info->node, &info->head);
+
+	return 0;
+}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2e62e0d6373a..1348615c7576 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -60,10 +60,11 @@
 
 #include <net/net_namespace.h>
 #include <net/icmp.h>
 #include <net/inet_hashtables.h>
 #include <net/tcp.h>
+#include <net/tcp_authopt.h>
 #include <net/transp_v6.h>
 #include <net/ipv6.h>
 #include <net/inet_common.h>
 #include <net/timewait_sock.h>
 #include <net/xfrm.h>
@@ -2256,10 +2257,11 @@ void tcp_v4_destroy_sock(struct sock *sk)
 		tcp_clear_md5_list(sk);
 		kfree_rcu(rcu_dereference_protected(tp->md5sig_info, 1), rcu);
 		tp->md5sig_info = NULL;
 	}
 #endif
+	tcp_authopt_clear(sk);
 
 	/* Clean up a referenced TCP bind bucket. */
 	if (inet_csk(sk)->icsk_bind_hash)
 		inet_put_port(sk);
 
-- 
2.25.1



* [RFCv3 02/15] docs: Add user documentation for tcp_authopt
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 01/15] tcp: authopt: Initial support and key management Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 03/15] selftests: Initial tcp_authopt test module Leonard Crestez
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

The .rst documentation contains a brief description of the user
interface and includes kernel-doc generated from uapi header.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
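
The key binding rules in the documentation below (full address must
match, ports are ignored) can be sketched in userspace as follows. This
is an illustration only: `sketch_make_v4` and `sketch_addr_match` are
hypothetical helpers, and unlike the kernel helper in patch 01 this
sketch never matches address families other than IPv4 and IPv6.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Build an IPv4 sockaddr_storage from a dotted-quad string and a port. */
int sketch_make_v4(struct sockaddr_storage *ss, const char *ip,
		   unsigned short port)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	return inet_pton(AF_INET, ip, &sin->sin_addr) == 1 ? 0 : -1;
}

/* Compare family and full address only; the port is never looked at. */
bool sketch_addr_match(const struct sockaddr_storage *a,
		       const struct sockaddr_storage *b)
{
	assert(a != NULL && b != NULL);
	if (a->ss_family != b->ss_family)
		return false;
	if (a->ss_family == AF_INET)
		return memcmp(&((const struct sockaddr_in *)a)->sin_addr,
			      &((const struct sockaddr_in *)b)->sin_addr,
			      sizeof(struct in_addr)) == 0;
	if (a->ss_family == AF_INET6)
		return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
			      &((const struct sockaddr_in6 *)b)->sin6_addr,
			      sizeof(struct in6_addr)) == 0;
	return false;	/* unknown families never match in this sketch */
}
```

With these rules a key bound to 10.0.0.1 applies to every connection
with that peer regardless of port, which is why overlapping key ids for
the same peer address become ambiguous.
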
 Documentation/networking/index.rst       |  1 +
 Documentation/networking/tcp_authopt.rst | 44 ++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 Documentation/networking/tcp_authopt.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 58bc8cd367c6..f5c324a060d8 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -100,10 +100,11 @@ Contents:
    strparser
    switchdev
    sysfs-tagging
    tc-actions-env-rules
    tcp-thin
+   tcp_authopt
    team
    timestamping
    tipc
    tproxy
    tuntap
diff --git a/Documentation/networking/tcp_authopt.rst b/Documentation/networking/tcp_authopt.rst
new file mode 100644
index 000000000000..484f66f41ad5
--- /dev/null
+++ b/Documentation/networking/tcp_authopt.rst
@@ -0,0 +1,44 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+TCP Authentication Option
+=========================
+
+The TCP Authentication option specified by RFC5925 replaces the TCP MD5
+Signature option. It is similar in goals but not compatible in either wire formats
+or ABI.
+
+Interface
+=========
+
+Individual keys can be added to or removed from a TCP socket by using
+TCP_AUTHOPT_KEY setsockopt and a ``struct tcp_authopt_key``. There is no
+support for reading back keys and updates always replace the old key. These
+structures represent "Master Key Tuples (MKTs)" as described by the RFC.
+
+Per-socket options can be set or read using the TCP_AUTHOPT sockopt and a ``struct
+tcp_authopt``. This is optional: doing setsockopt TCP_AUTHOPT_KEY is
+sufficient to enable the feature.
+
+Configuration associated with TCP Authentication is independently attached to
+each TCP socket. After listen and accept the newly returned socket gets an
+independent copy of relevant settings from the listen socket.
+
+Key binding
+-----------
+
+Keys can be bound to remote addresses in a way that is similar to TCP_MD5.
+
+ * The full address must match (/32 or /128)
+ * Ports are ignored
+ * Address binding is optional, by default keys match all addresses
+
+RFC5925 requires that key ids do not overlap when tcp identifiers (addr/port)
+overlap. This is not enforced by linux; configuring ambiguous keys will result
+in packet drops and lost connections.
+
+ABI Reference
+=============
+
+.. kernel-doc:: include/uapi/linux/tcp.h
+   :identifiers: tcp_authopt tcp_authopt_flag tcp_authopt_key tcp_authopt_key_flag tcp_authopt_alg
-- 
2.25.1



* [RFCv3 03/15] selftests: Initial tcp_authopt test module
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 01/15] tcp: authopt: Initial support and key management Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 02/15] docs: Add user documentation for tcp_authopt Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 04/15] selftests: tcp_authopt: Initial sockopt manipulation Leonard Crestez
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

This test suite is written as a standalone python3 package using
dependencies such as scapy.

The run.sh script wrapper called from kselftest infrastructure uses
"tox" to generate an isolated virtual environment just for running these
tests. The run.sh wrapper can be called from anywhere and does not rely
on kselftest infrastructure.

The python3 and tox packages must be installed manually, but no other
dependencies need to be.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 tools/testing/selftests/tcp_authopt/Makefile    |  5 +++++
 tools/testing/selftests/tcp_authopt/README.rst  | 15 +++++++++++++++
 tools/testing/selftests/tcp_authopt/config      |  6 ++++++
 tools/testing/selftests/tcp_authopt/run.sh      | 11 +++++++++++
 tools/testing/selftests/tcp_authopt/setup.cfg   | 17 +++++++++++++++++
 tools/testing/selftests/tcp_authopt/setup.py    |  5 +++++
 .../tcp_authopt/tcp_authopt_test/__init__.py    |  0
 7 files changed, 59 insertions(+)
 create mode 100644 tools/testing/selftests/tcp_authopt/Makefile
 create mode 100644 tools/testing/selftests/tcp_authopt/README.rst
 create mode 100644 tools/testing/selftests/tcp_authopt/config
 create mode 100755 tools/testing/selftests/tcp_authopt/run.sh
 create mode 100644 tools/testing/selftests/tcp_authopt/setup.cfg
 create mode 100644 tools/testing/selftests/tcp_authopt/setup.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/__init__.py

diff --git a/tools/testing/selftests/tcp_authopt/Makefile b/tools/testing/selftests/tcp_authopt/Makefile
new file mode 100644
index 000000000000..391412071875
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+include ../lib.mk
+
+TEST_PROGS += ./run.sh
+TEST_FILES := setup.py setup.cfg tcp_authopt_test
diff --git a/tools/testing/selftests/tcp_authopt/README.rst b/tools/testing/selftests/tcp_authopt/README.rst
new file mode 100644
index 000000000000..e9e4acc0a22a
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/README.rst
@@ -0,0 +1,15 @@
+=========================================
+Tests for linux TCP Authentication Option
+=========================================
+
+Test suite is written in python3 using pytest and scapy. The test suite is
+mostly self-contained as a python package.
+
+The recommended way to run this is the included `run.sh` script as root, this
+will automatically create a virtual environment with the correct dependencies
+using `tox`.
+
+An old separate version can be found here: https://github.com/cdleonard/tcp-authopt-test
+
+Integration with kselftest infrastructure is minimal: when in doubt just run
+this separately.
diff --git a/tools/testing/selftests/tcp_authopt/config b/tools/testing/selftests/tcp_authopt/config
new file mode 100644
index 000000000000..0d4e5d47fa72
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/config
@@ -0,0 +1,6 @@
+# RFC5925 TCP Authentication Option and all algorithms
+CONFIG_TCP_AUTHOPT=y
+CONFIG_CRYPTO_SHA1=M
+CONFIG_CRYPTO_HMAC=M
+CONFIG_CRYPTO_AES=M
+CONFIG_CRYPTO_CMAC=M
diff --git a/tools/testing/selftests/tcp_authopt/run.sh b/tools/testing/selftests/tcp_authopt/run.sh
new file mode 100755
index 000000000000..192ae094e3be
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/run.sh
@@ -0,0 +1,11 @@
+#! /bin/bash
+
+if ! command -v tox >/dev/null; then
+	echo >&2 "error: please install the python tox package"
+	exit 1
+fi
+if [[ $(id -u) -ne 0 ]]; then
+	echo >&2 "warning: running as non-root user is unlikely to work"
+fi
+cd "$(dirname "${BASH_SOURCE[0]}")"
+exec tox "$@"
diff --git a/tools/testing/selftests/tcp_authopt/setup.cfg b/tools/testing/selftests/tcp_authopt/setup.cfg
new file mode 100644
index 000000000000..373c5632b0a2
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/setup.cfg
@@ -0,0 +1,17 @@
+[options]
+install_requires=
+    cryptography
+    nsenter
+    pre-commit
+    pytest
+    scapy
+
+[tox:tox]
+envlist = py3
+
+[testenv]
+commands = pytest {posargs}
+
+[metadata]
+name = tcp-authopt-test
+version = 0.1
diff --git a/tools/testing/selftests/tcp_authopt/setup.py b/tools/testing/selftests/tcp_authopt/setup.py
new file mode 100644
index 000000000000..d5e50aa1ca5e
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/setup.py
@@ -0,0 +1,5 @@
+#! /usr/bin/env python3
+
+from setuptools import setup
+
+setup()
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/__init__.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/__init__.py
new file mode 100644
index 000000000000..e69de29bb2d1
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFCv3 04/15] selftests: tcp_authopt: Initial sockopt manipulation
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (2 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 03/15] selftests: Initial tcp_authopt test module Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 05/15] tcp: authopt: Add crypto initialization Leonard Crestez
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 .../tcp_authopt/tcp_authopt_test/conftest.py  |  21 ++
 .../tcp_authopt_test/linux_tcp_authopt.py     | 188 ++++++++++++++++++
 .../tcp_authopt/tcp_authopt_test/sockaddr.py  | 101 ++++++++++
 .../tcp_authopt_test/test_sockopt.py          |  74 +++++++
 4 files changed, 384 insertions(+)
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/conftest.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/sockaddr.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_sockopt.py

diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/conftest.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/conftest.py
new file mode 100644
index 000000000000..c17c8ea2a943
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/conftest.py
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0
+from tcp_authopt_test.linux_tcp_authopt import has_tcp_authopt
+import pytest
+import logging
+from contextlib import ExitStack
+
+logger = logging.getLogger(__name__)
+
+skipif_missing_tcp_authopt = pytest.mark.skipif(
+    not has_tcp_authopt(), reason="Need CONFIG_TCP_AUTHOPT"
+)
+
+
+@pytest.fixture
+def exit_stack():
+    """Return a contextlib.ExitStack as a pytest fixture
+
+    This reduces indentation making code more readable
+    """
+    with ExitStack() as exit_stack:
+        yield exit_stack
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
new file mode 100644
index 000000000000..41374f9851aa
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
@@ -0,0 +1,188 @@
+# SPDX-License-Identifier: GPL-2.0
+"""Python wrapper around linux TCP_AUTHOPT ABI"""
+
+from dataclasses import dataclass
+from ipaddress import IPv4Address, IPv6Address, ip_address
+import socket
+import errno
+import logging
+from .sockaddr import sockaddr_in, sockaddr_in6, sockaddr_storage, sockaddr_unpack
+import typing
+import struct
+
+logger = logging.getLogger(__name__)
+
+
+def BIT(x):
+    return 1 << x
+
+
+TCP_AUTHOPT = 38
+TCP_AUTHOPT_KEY = 39
+
+TCP_AUTHOPT_MAXKEYLEN = 80
+
+TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED = BIT(2)
+
+TCP_AUTHOPT_KEY_DEL = BIT(0)
+TCP_AUTHOPT_KEY_EXCLUDE_OPTS = BIT(1)
+TCP_AUTHOPT_KEY_BIND_ADDR = BIT(2)
+
+TCP_AUTHOPT_ALG_HMAC_SHA_1_96 = 1
+TCP_AUTHOPT_ALG_AES_128_CMAC_96 = 2
+
+
+@dataclass
+class tcp_authopt:
+    """Like linux struct tcp_authopt"""
+
+    flags: int = 0
+    sizeof = 4
+
+    def pack(self) -> bytes:
+        return struct.pack(
+            "I",
+            self.flags,
+        )
+
+    def __bytes__(self):
+        return self.pack()
+
+    @classmethod
+    def unpack(cls, b: bytes):
+        tup = struct.unpack("I", b)
+        return cls(*tup)
+
+
+def set_tcp_authopt(sock, opt: tcp_authopt):
+    return sock.setsockopt(socket.IPPROTO_TCP, TCP_AUTHOPT, bytes(opt))
+
+
+def get_tcp_authopt(sock: socket.socket) -> tcp_authopt:
+    b = sock.getsockopt(socket.IPPROTO_TCP, TCP_AUTHOPT, tcp_authopt.sizeof)
+    return tcp_authopt.unpack(b)
+
+
+class tcp_authopt_key:
+    """Like linux struct tcp_authopt_key"""
+
+    def __init__(
+        self,
+        flags: int = 0,
+        send_id: int = 0,
+        recv_id: int = 0,
+        alg=TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+        key: bytes = b"",
+        addr: bytes = b"",
+        include_options=None,
+    ):
+        self.flags = flags
+        self.send_id = send_id
+        self.recv_id = recv_id
+        self.alg = alg
+        self.key = key
+        self.addr = addr
+        if include_options is not None:
+            self.include_options = include_options
+
+    def pack(self):
+        if len(self.key) > TCP_AUTHOPT_MAXKEYLEN:
+            raise ValueError(f"Max key length is {TCP_AUTHOPT_MAXKEYLEN}")
+        data = struct.pack(
+            "IBBBB80s",
+            self.flags,
+            self.send_id,
+            self.recv_id,
+            self.alg,
+            len(self.key),
+            self.key,
+        )
+        data += bytes(self.addrbuf.ljust(sockaddr_storage.sizeof, b"\x00"))
+        return data
+
+    def __bytes__(self):
+        return self.pack()
+
+    @property
+    def key(self) -> bytes:
+        return self._key
+
+    @key.setter
+    def key(self, val: typing.Union[bytes, str]) -> bytes:
+        if isinstance(val, str):
+            val = val.encode("utf-8")
+        if len(val) > TCP_AUTHOPT_MAXKEYLEN:
+            raise ValueError(f"Max key length is {TCP_AUTHOPT_MAXKEYLEN}")
+        self._key = val
+        return val
+
+    @property
+    def addr(self):
+        if not self.addrbuf:
+            return None
+        else:
+            return sockaddr_unpack(bytes(self.addrbuf))
+
+    @addr.setter
+    def addr(self, val):
+        if isinstance(val, bytes):
+            if len(val) > sockaddr_storage.sizeof:
+                raise ValueError(f"Must be up to {sockaddr_storage.sizeof}")
+            self.addrbuf = val
+        elif val is None:
+            self.addrbuf = b""
+        elif isinstance(val, str):
+            self.addr = ip_address(val)
+        elif isinstance(val, IPv4Address):
+            self.addr = sockaddr_in(addr=val)
+        elif isinstance(val, IPv6Address):
+            self.addr = sockaddr_in6(addr=val)
+        elif (
+            isinstance(val, sockaddr_in)
+            or isinstance(val, sockaddr_in6)
+            or isinstance(val, sockaddr_storage)
+        ):
+            self.addr = bytes(val)
+        else:
+            raise TypeError(f"Can't handle addr {val}")
+        return self.addr
+
+    @property
+    def include_options(self) -> bool:
+        return (self.flags & TCP_AUTHOPT_KEY_EXCLUDE_OPTS) == 0
+
+    @include_options.setter
+    def include_options(self, value) -> bool:
+        if value:
+            self.flags &= ~TCP_AUTHOPT_KEY_EXCLUDE_OPTS
+        else:
+            self.flags |= TCP_AUTHOPT_KEY_EXCLUDE_OPTS
+
+    @property
+    def delete_flag(self) -> bool:
+        return bool(self.flags & TCP_AUTHOPT_KEY_DEL)
+
+    @delete_flag.setter
+    def delete_flag(self, value) -> bool:
+        if value:
+            self.flags |= TCP_AUTHOPT_KEY_DEL
+        else:
+            self.flags &= ~TCP_AUTHOPT_KEY_DEL
+
+
+def set_tcp_authopt_key(sock, key: tcp_authopt_key):
+    return sock.setsockopt(socket.IPPROTO_TCP, TCP_AUTHOPT_KEY, bytes(key))
+
+
+def has_tcp_authopt() -> bool:
+    """Check if TCP_AUTHOPT is implemented by the OS"""
+    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
+        try:
+            optbuf = bytes(4)
+            sock.setsockopt(socket.IPPROTO_TCP, TCP_AUTHOPT, optbuf)
+            return True
+        except OSError as e:
+            if e.errno == errno.ENOPROTOOPT:
+                return False
+            else:
+                raise
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/sockaddr.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/sockaddr.py
new file mode 100644
index 000000000000..f61d0f190a0c
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/sockaddr.py
@@ -0,0 +1,101 @@
+# SPDX-License-Identifier: GPL-2.0
+"""pack/unpack wrappers for sockaddr"""
+import socket
+import struct
+from dataclasses import dataclass
+from ipaddress import IPv4Address, IPv6Address
+
+
+@dataclass
+class sockaddr_in:
+    port: int
+    addr: IPv4Address
+    sizeof = 8
+
+    def __init__(self, port=0, addr=None):
+        self.port = port
+        if addr is None:
+            addr = IPv4Address(0)
+        self.addr = IPv4Address(addr)
+
+    def pack(self):
+        return struct.pack("HH4s", socket.AF_INET, self.port, self.addr.packed)
+
+    @classmethod
+    def unpack(cls, buffer):
+        family, port, addr_packed = struct.unpack("HH4s", buffer[:8])
+        if family != socket.AF_INET:
+            raise ValueError(f"Must be AF_INET not {family}")
+        return cls(port, addr_packed)
+
+    def __bytes__(self):
+        return self.pack()
+
+
+@dataclass
+class sockaddr_in6:
+    """Like sockaddr_in6 but for python. Always contains scope_id"""
+
+    port: int
+    addr: IPv6Address
+    flowinfo: int
+    scope_id: int
+    sizeof = 28
+
+    def __init__(self, port=0, addr=None, flowinfo=0, scope_id=0):
+        self.port = port
+        if addr is None:
+            addr = IPv6Address(0)
+        self.addr = IPv6Address(addr)
+        self.flowinfo = flowinfo
+        self.scope_id = scope_id
+
+    def pack(self):
+        return struct.pack(
+            "HHI16sI",
+            socket.AF_INET6,
+            self.port,
+            self.flowinfo,
+            self.addr.packed,
+            self.scope_id,
+        )
+
+    @classmethod
+    def unpack(cls, buffer):
+        family, port, flowinfo, addr_packed, scope_id = struct.unpack(
+            "HHI16sI", buffer[:28]
+        )
+        if family != socket.AF_INET6:
+            raise ValueError(f"Must be AF_INET6 not {family}")
+        return cls(port, addr_packed, flowinfo=flowinfo, scope_id=scope_id)
+
+    def __bytes__(self):
+        return self.pack()
+
+
+@dataclass
+class sockaddr_storage:
+    family: int
+    data: bytes
+    sizeof = 128
+
+    def pack(self):
+        return struct.pack("H126s", self.family, self.data)
+
+    def __bytes__(self):
+        return self.pack()
+
+    @classmethod
+    def unpack(cls, buffer):
+        return cls(*struct.unpack("H126s", buffer))
+
+
+def sockaddr_unpack(buffer: bytes):
+    """Unpack based on family"""
+    family = struct.unpack("H", buffer[:2])[0]
+    if family == socket.AF_INET:
+        return sockaddr_in.unpack(buffer)
+    elif family == socket.AF_INET6:
+        return sockaddr_in6.unpack(buffer)
+    else:
+        return sockaddr_storage.unpack(buffer)
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_sockopt.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_sockopt.py
new file mode 100644
index 000000000000..06a05bf8aeec
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_sockopt.py
@@ -0,0 +1,74 @@
+# SPDX-License-Identifier: GPL-2.0
+"""Test TCP_AUTHOPT sockopt API"""
+import errno
+import socket
+import struct
+from ipaddress import IPv4Address, IPv6Address
+
+import pytest
+
+from .linux_tcp_authopt import (
+    set_tcp_authopt,
+    set_tcp_authopt_key,
+    tcp_authopt,
+    tcp_authopt_key,
+)
+from .sockaddr import sockaddr_unpack
+from .conftest import skipif_missing_tcp_authopt
+
+pytestmark = skipif_missing_tcp_authopt
+
+
+def test_authopt_key_pack_noaddr():
+    b = bytes(tcp_authopt_key(key=b"a\x00b"))
+    assert b[7] == 3
+    assert b[8:13] == b"a\x00b\x00\x00"
+
+
+def test_authopt_key_pack_addr():
+    b = bytes(tcp_authopt_key(key=b"a\x00b", addr="10.0.0.1"))
+    assert struct.unpack("H", b[88:90])[0] == socket.AF_INET
+    assert sockaddr_unpack(b[88:]).addr == IPv4Address("10.0.0.1")
+
+
+def test_authopt_key_pack_addr6():
+    b = bytes(tcp_authopt_key(key=b"abc", addr="fd00::1"))
+    assert struct.unpack("H", b[88:90])[0] == socket.AF_INET6
+    assert sockaddr_unpack(b[88:]).addr == IPv6Address("fd00::1")
+
+
+def test_tcp_authopt_key_del_without_active(exit_stack):
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    exit_stack.push(sock)
+
+    # nothing happens:
+    key = tcp_authopt_key()
+    assert key.delete_flag is False
+    key.delete_flag = True
+    assert key.delete_flag is True
+    with pytest.raises(OSError) as e:
+        set_tcp_authopt_key(sock, key)
+    assert e.value.errno in [errno.EINVAL, errno.ENOENT]
+
+
+def test_tcp_authopt_key_setdel(exit_stack):
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    exit_stack.push(sock)
+    set_tcp_authopt(sock, tcp_authopt())
+
+    # delete returns ENOENT
+    key = tcp_authopt_key()
+    key.delete_flag = True
+    with pytest.raises(OSError) as e:
+        set_tcp_authopt_key(sock, key)
+    assert e.value.errno == errno.ENOENT
+
+    key = tcp_authopt_key(send_id=1, recv_id=2)
+    set_tcp_authopt_key(sock, key)
+    # First delete works fine:
+    key.delete_flag = True
+    set_tcp_authopt_key(sock, key)
+    # Duplicate delete returns ENOENT
+    with pytest.raises(OSError) as e:
+        set_tcp_authopt_key(sock, key)
+    assert e.value.errno == errno.ENOENT
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (3 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 04/15] selftests: tcp_authopt: Initial sockopt manipulation Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 23:02   ` Eric Dumazet
  2021-08-24 23:34   ` Eric Dumazet
  2021-08-24 21:34 ` [RFCv3 06/15] tcp: authopt: Compute packet signatures Leonard Crestez
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

The crypto_shash API is used in order to compute packet signatures. The
API comes with several unfortunate limitations:

1) Allocating a crypto_shash can sleep and must be done in user context.
2) Packet signatures must be computed in softirq context.
3) Packet signatures use dynamic "traffic keys" which require exclusive
access to crypto_shash for crypto_shash_setkey.

The solution is to allocate one crypto_shash for each possible cpu for
each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
softirq context, signatures are computed and the tfm is returned.

The pool for each algorithm is reference counted, initialized at
setsockopt time and released in tcp_authopt_key_info's rcu callback.
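
As a rough illustration of the reference counting only (this is not the
kernel code), here is a minimal Python sketch. It assumes a hypothetical
SharedResource class whose alloc/free callbacks stand in for
crypto_alloc_shash/crypto_free_shash. Allocation happens outside the lock
because it may sleep, so the count is checked again afterwards and the
loser of the race frees its own object.

```python
import threading


class SharedResource:
    """Hypothetical stand-in for the per-algorithm state (lock, ref_cnt, tfm)."""

    def __init__(self, alloc, free):
        self._alloc = alloc  # may block, so it is never called under the lock
        self._free = free
        self._lock = threading.Lock()
        self.ref_cnt = 0
        self.obj = None

    def require(self):
        # Fast path: somebody else already initialized the object.
        with self._lock:
            if self.ref_cnt > 0:
                self.ref_cnt += 1
                return
        # Slow path: allocate without holding the lock, then re-check.
        candidate = self._alloc()
        with self._lock:
            won = self.ref_cnt == 0
            if won:
                self.obj = candidate
            self.ref_cnt += 1
        if not won:
            # Another caller installed its object first; discard ours.
            self._free(candidate)

    def release(self):
        to_free = None
        with self._lock:
            self.ref_cnt -= 1
            if self.ref_cnt == 0:
                to_free, self.obj = self.obj, None
        # Free outside the lock, mirroring the allocation path.
        if to_free is not None:
            self._free(to_free)
```

In this sketch every require() is paired with exactly one release(); the
last release is the one that frees the shared object.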

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 include/net/tcp_authopt.h |   3 +
 net/ipv4/tcp_authopt.c    | 177 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 178 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index b4277112b506..c9ee2059b442 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -2,10 +2,12 @@
 #ifndef _LINUX_TCP_AUTHOPT_H
 #define _LINUX_TCP_AUTHOPT_H
 
 #include <uapi/linux/tcp.h>
 
+struct tcp_authopt_alg_imp;
+
 /**
  * struct tcp_authopt_key_info - Representation of a Master Key Tuple as per RFC5925
  *
  * Key structure lifetime is only protected by RCU so readers needs to hold a
  * single rcu_read_lock until they're done with the key.
@@ -20,10 +22,11 @@ struct tcp_authopt_key_info {
 	u8 send_id, recv_id;
 	u8 alg_id;
 	u8 keylen;
 	u8 key[TCP_AUTHOPT_MAXKEYLEN];
 	struct sockaddr_storage addr;
+	struct tcp_authopt_alg_imp *alg;
 };
 
 /**
  * struct tcp_authopt_info - Per-socket information regarding tcp_authopt
  *
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index f6dddc5775ff..ce560bd88903 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -4,10 +4,161 @@
 #include <net/tcp.h>
 #include <net/tcp_authopt.h>
 #include <crypto/hash.h>
 #include <trace/events/tcp.h>
 
+/* All current algorithms have a mac length of 12 but crypto API digestsize can be larger */
+#define TCP_AUTHOPT_MAXMACBUF	20
+#define TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN	20
+
+struct tcp_authopt_alg_imp {
+	/* Name of algorithm in crypto-api */
+	const char *alg_name;
+	/* One of the TCP_AUTHOPT_ALG_* constants from uapi */
+	u8 alg_id;
+	/* Length of traffic key */
+	u8 traffic_key_len;
+	/* Length of mac in TCP option */
+	u8 maclen;
+
+	/* shared crypto_shash */
+	spinlock_t lock;
+	int ref_cnt;
+	struct crypto_shash *tfm;
+};
+
+static struct tcp_authopt_alg_imp tcp_authopt_alg_list[] = {
+	{
+		.alg_id = TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+		.alg_name = "hmac(sha1)",
+		.traffic_key_len = 20,
+		.maclen = 12,
+		.lock = __SPIN_LOCK_UNLOCKED(tcp_authopt_alg_list[0].lock),
+	},
+	{
+		.alg_id = TCP_AUTHOPT_ALG_AES_128_CMAC_96,
+		.alg_name = "cmac(aes)",
+		.traffic_key_len = 16,
+		.maclen = 12,
+		.lock = __SPIN_LOCK_UNLOCKED(tcp_authopt_alg_list[1].lock),
+	},
+};
+
+/* get a pointer to the tcp_authopt_alg instance or NULL if id invalid */
+static inline struct tcp_authopt_alg_imp *tcp_authopt_alg_get(int alg_num)
+{
+	if (alg_num <= 0 || alg_num > 2)
+		return NULL;
+	return &tcp_authopt_alg_list[alg_num - 1];
+}
+
+/* Mark an algorithm as in-use from user context */
+static int tcp_authopt_alg_require(struct tcp_authopt_alg_imp *alg)
+{
+	struct crypto_shash *tfm = NULL;
+	bool need_init = false;
+
+	might_sleep();
+
+	/* If we're the first user then we need to initialize shash but we might lose the race. */
+	spin_lock_bh(&alg->lock);
+	WARN_ON(alg->ref_cnt < 0);
+	if (alg->ref_cnt == 0)
+		need_init = true;
+	else
+		++alg->ref_cnt;
+	spin_unlock_bh(&alg->lock);
+
+	/* Already initialized */
+	if (!need_init)
+		return 0;
+
+	tfm = crypto_alloc_shash(alg->alg_name, 0, 0);
+	if (IS_ERR(tfm))
+		return PTR_ERR(tfm);
+
+	spin_lock_bh(&alg->lock);
+	if (alg->ref_cnt == 0)
+		/* race won */
+		alg->tfm = tfm;
+	else
+		/* race lost, free tfm later */
+		need_init = false;
+	++alg->ref_cnt;
+	spin_unlock_bh(&alg->lock);
+
+	if (!need_init)
+		crypto_free_shash(tfm);
+	else
+		pr_info("initialized tcp-ao %s", alg->alg_name);
+
+	return 0;
+}
+
+static void tcp_authopt_alg_release(struct tcp_authopt_alg_imp *alg)
+{
+	struct crypto_shash *tfm_to_free = NULL;
+
+	spin_lock_bh(&alg->lock);
+	--alg->ref_cnt;
+	WARN_ON(alg->ref_cnt < 0);
+	if (alg->ref_cnt == 0) {
+		tfm_to_free = alg->tfm;
+		alg->tfm = NULL;
+	}
+	spin_unlock_bh(&alg->lock);
+
+	if (tfm_to_free) {
+		pr_info("released tcp-ao %s", alg->alg_name);
+		crypto_free_shash(tfm_to_free);
+	}
+}
+
+/* increase reference count on an algorithm that is already in use */
+static void tcp_authopt_alg_incref(struct tcp_authopt_alg_imp *alg)
+{
+	spin_lock_bh(&alg->lock);
+	WARN_ON(alg->ref_cnt <= 0);
+	++alg->ref_cnt;
+	spin_unlock_bh(&alg->lock);
+}
+
+static struct crypto_shash *tcp_authopt_alg_get_tfm(struct tcp_authopt_alg_imp *alg)
+{
+	spin_lock_bh(&alg->lock);
+	WARN_ON(alg->ref_cnt < 0);
+	return alg->tfm;
+}
+
+static void tcp_authopt_alg_put_tfm(struct tcp_authopt_alg_imp *alg, struct crypto_shash *tfm)
+{
+	WARN_ON(tfm != alg->tfm);
+	spin_unlock_bh(&alg->lock);
+}
+
+static struct crypto_shash *tcp_authopt_get_kdf_shash(struct tcp_authopt_key_info *key)
+{
+	return tcp_authopt_alg_get_tfm(key->alg);
+}
+
+static void tcp_authopt_put_kdf_shash(struct tcp_authopt_key_info *key,
+				      struct crypto_shash *tfm)
+{
+	return tcp_authopt_alg_put_tfm(key->alg, tfm);
+}
+
+static struct crypto_shash *tcp_authopt_get_mac_shash(struct tcp_authopt_key_info *key)
+{
+	return tcp_authopt_alg_get_tfm(key->alg);
+}
+
+static void tcp_authopt_put_mac_shash(struct tcp_authopt_key_info *key,
+				      struct crypto_shash *tfm)
+{
+	return tcp_authopt_alg_put_tfm(key->alg, tfm);
+}
+
 /* checks that ipv4 or ipv6 addr matches. */
 static bool ipvx_addr_match(struct sockaddr_storage *a1,
 			    struct sockaddr_storage *a2)
 {
 	if (a1->ss_family != a2->ss_family)
@@ -118,17 +269,25 @@ int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
 	opt->flags = info->flags & TCP_AUTHOPT_KNOWN_FLAGS;
 
 	return 0;
 }
 
+static void tcp_authopt_key_free_rcu(struct rcu_head *rcu)
+{
+	struct tcp_authopt_key_info *key = container_of(rcu, struct tcp_authopt_key_info, rcu);
+
+	tcp_authopt_alg_release(key->alg);
+	kfree(key);
+}
+
 static void tcp_authopt_key_del(struct sock *sk,
 				struct tcp_authopt_info *info,
 				struct tcp_authopt_key_info *key)
 {
 	hlist_del_rcu(&key->node);
 	atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
-	kfree_rcu(key, rcu);
+	call_rcu(&key->rcu, tcp_authopt_key_free_rcu);
 }
 
 /* free info and keys but don't touch tp->authopt_info */
 static void __tcp_authopt_info_free(struct sock *sk, struct tcp_authopt_info *info)
 {
@@ -160,10 +319,12 @@ void tcp_authopt_clear(struct sock *sk)
 int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
 {
 	struct tcp_authopt_key opt;
 	struct tcp_authopt_info *info;
 	struct tcp_authopt_key_info *key_info;
+	struct tcp_authopt_alg_imp *alg;
+	int err;
 
 	sock_owned_by_me(sk);
 
 	/* If userspace optlen is too short fill the rest with zeros */
 	if (optlen > sizeof(opt))
@@ -199,23 +360,35 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
 	/* Initialize tcp_authopt_info if not already set */
 	info = __tcp_authopt_info_get_or_create(sk);
 	if (IS_ERR(info))
 		return PTR_ERR(info);
 
+	/* check the algorithm */
+	alg = tcp_authopt_alg_get(opt.alg);
+	if (!alg)
+		return -EINVAL;
+	WARN_ON(alg->alg_id != opt.alg);
+	err = tcp_authopt_alg_require(alg);
+	if (err)
+		return err;
+
 	/* If an old key exists with exact ID then remove and replace.
 	 * RCU-protected readers might observe both and pick any.
 	 */
 	key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
 	if (key_info)
 		tcp_authopt_key_del(sk, info, key_info);
 	key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
-	if (!key_info)
+	if (!key_info) {
+		tcp_authopt_alg_release(alg);
 		return -ENOMEM;
+	}
 	key_info->flags = opt.flags & TCP_AUTHOPT_KEY_KNOWN_FLAGS;
 	key_info->send_id = opt.send_id;
 	key_info->recv_id = opt.recv_id;
 	key_info->alg_id = opt.alg;
+	key_info->alg = alg;
 	key_info->keylen = opt.keylen;
 	memcpy(key_info->key, opt.key, opt.keylen);
 	memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
 	hlist_add_head_rcu(&key_info->node, &info->head);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFCv3 06/15] tcp: authopt: Compute packet signatures
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (4 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 05/15] tcp: authopt: Add crypto initialization Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 07/15] tcp: authopt: Hook into tcp core Leonard Crestez
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Computing tcp authopt packet signatures is a two-step process:

* traffic key is computed based on tcp 4-tuple, initial sequence numbers
and the secret key.
* packet mac is computed based on traffic key and content of individual
packets.

The traffic key could be cached for established sockets but it is not.

A single code path exists for ipv4/ipv6 and input/output. This keeps the
code short but slightly slower due to lots of conditionals.

On output we read remote IP address from socket members; we can't use
skb network header because it's computed after TCP options.

On input we read remote IP address from skb network headers, we can't
use socket binding members because those are not available for SYN.
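
The two steps can be sketched in userspace for the HMAC-SHA-1-96 case.
This is a minimal Python sketch, assuming IPv4 and an already assembled
message. The function names are hypothetical, the KDF input layout follows
RFC5926 section 3.1.1.1, and the exact fields covered by the MAC
(pseudo-header, TCP header, options) are not reproduced here.

```python
import hashlib
import hmac
import socket
import struct


def kdf_hmac_sha1_traffic_key(master_key, saddr, daddr, sport, dport, sisn, disn):
    """Step 1: derive the traffic key from the 4-tuple, ISNs and secret key."""
    context = (
        socket.inet_aton(saddr)
        + socket.inet_aton(daddr)
        + struct.pack("!HHII", sport, dport, sisn, disn)
    )
    # Counter byte, label, context, then output length in bits (160 for SHA-1).
    data = b"\x01" + b"TCP-AO" + context + struct.pack("!H", 160)
    return hmac.new(master_key, data, hashlib.sha1).digest()


def hmac_sha1_96(traffic_key, message):
    """Step 2: MAC the message with the traffic key, truncated to 96 bits.

    The message is treated as opaque bytes; building it from a real
    packet is outside the scope of this sketch.
    """
    return hmac.new(traffic_key, message, hashlib.sha1).digest()[:12]
```

For a SYN the destination ISN is not known yet, so disn would be passed
as 0, matching the special case in the kernel code.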

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 net/ipv4/tcp_authopt.c | 467 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 467 insertions(+)

diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index ce560bd88903..2a3463ad6896 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -392,5 +392,472 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
 	memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
 	hlist_add_head_rcu(&key_info->node, &info->head);
 
 	return 0;
 }
+
+/* feed traffic key into shash */
+static int tcp_authopt_shash_traffic_key(struct shash_desc *desc,
+					 struct sock *sk,
+					 struct sk_buff *skb,
+					 bool input,
+					 bool ipv6)
+{
+	struct tcphdr *th = tcp_hdr(skb);
+	int err;
+	__be32 sisn, disn;
+	__be16 digestbits = htons(crypto_shash_digestsize(desc->tfm) * 8);
+
+	// RFC5926 section 3.1.1.1
+	err = crypto_shash_update(desc, "\x01TCP-AO", 7);
+	if (err)
+		return err;
+
+	/* Addresses from packet on input and from socket on output
+	 * This is because on output MAC is computed before prepending IP header
+	 */
+	if (input) {
+		if (ipv6)
+			err = crypto_shash_update(desc, (u8 *)&ipv6_hdr(skb)->saddr, 32);
+		else
+			err = crypto_shash_update(desc, (u8 *)&ip_hdr(skb)->saddr, 8);
+		if (err)
+			return err;
+	} else {
+		if (ipv6) {
+			struct in6_addr *saddr;
+			struct in6_addr *daddr;
+
+			saddr = &sk->sk_v6_rcv_saddr;
+			daddr = &sk->sk_v6_daddr;
+			err = crypto_shash_update(desc, (u8 *)saddr, 16);
+			if (err)
+				return err;
+			err = crypto_shash_update(desc, (u8 *)daddr, 16);
+			if (err)
+				return err;
+		} else {
+			err = crypto_shash_update(desc, (u8 *)&sk->sk_rcv_saddr, 4);
+			if (err)
+				return err;
+			err = crypto_shash_update(desc, (u8 *)&sk->sk_daddr, 4);
+			if (err)
+				return err;
+		}
+	}
+
+	/* TCP ports from header */
+	err = crypto_shash_update(desc, (u8 *)&th->source, 4);
+	if (err)
+		return err;
+
+	/* special cases for SYN and SYN/ACK */
+	if (th->syn && !th->ack) {
+		sisn = th->seq;
+		disn = 0;
+	} else if (th->syn && th->ack) {
+		sisn = th->seq;
+		disn = htonl(ntohl(th->ack_seq) - 1);
+	} else {
+		struct tcp_authopt_info *authopt_info;
+
+		/* Fetching authopt_info like this means it's possible that authopt_info
+		 * was deleted while we were hashing. If that happens we drop the packet
+		 * which should be fine.
+		 *
+		 * A better solution might be to always pass info as a parameter, or
+		 * compute traffic_key for established sockets separately.
+		 */
+		rcu_read_lock();
+		authopt_info = rcu_dereference(tcp_sk(sk)->authopt_info);
+		if (!authopt_info) {
+			rcu_read_unlock();
+			return -EINVAL;
+		}
+		/* Initial sequence numbers for ESTABLISHED connections from info */
+		if (input) {
+			sisn = htonl(authopt_info->dst_isn);
+			disn = htonl(authopt_info->src_isn);
+		} else {
+			sisn = htonl(authopt_info->src_isn);
+			disn = htonl(authopt_info->dst_isn);
+		}
+		rcu_read_unlock();
+	}
+
+	err = crypto_shash_update(desc, (u8 *)&sisn, 4);
+	if (err)
+		return err;
+	err = crypto_shash_update(desc, (u8 *)&disn, 4);
+	if (err)
+		return err;
+
+	err = crypto_shash_update(desc, (u8 *)&digestbits, 2);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+/* Convert a variable-length key to a 16-byte fixed-length key for AES-CMAC
+ * This is described in RFC5926 section 3.1.1.2
+ */
+static int aes_setkey_derived(struct crypto_shash *tfm, u8 *key, size_t keylen)
+{
+	static const u8 zeros[16] = {0};
+	u8 derived_key[16];
+	int err;
+
+	if (WARN_ON(crypto_shash_digestsize(tfm) != 16))
+		return -EINVAL;
+	err = crypto_shash_setkey(tfm, zeros, sizeof(zeros));
+	if (err)
+		return err;
+	err = crypto_shash_tfm_digest(tfm, key, keylen, derived_key);
+	if (err)
+		return err;
+	return crypto_shash_setkey(tfm, derived_key, sizeof(derived_key));
+}
+
+static int tcp_authopt_get_traffic_key(struct sock *sk,
+				       struct sk_buff *skb,
+				       struct tcp_authopt_key_info *key,
+				       bool input,
+				       bool ipv6,
+				       u8 *traffic_key)
+{
+	SHASH_DESC_ON_STACK(desc, kdf_tfm);
+	struct crypto_shash *kdf_tfm;
+	int err;
+
+	kdf_tfm = tcp_authopt_get_kdf_shash(key);
+	if (IS_ERR(kdf_tfm))
+		return PTR_ERR(kdf_tfm);
+	if (WARN_ON(crypto_shash_digestsize(kdf_tfm) != key->alg->traffic_key_len)) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	if (key->alg_id == TCP_AUTHOPT_ALG_AES_128_CMAC_96 && key->keylen != 16) {
+		err = aes_setkey_derived(kdf_tfm, key->key, key->keylen);
+		if (err)
+			goto out;
+	} else {
+		err = crypto_shash_setkey(kdf_tfm, key->key, key->keylen);
+		if (err)
+			goto out;
+	}
+
+	desc->tfm = kdf_tfm;
+	err = crypto_shash_init(desc);
+	if (err)
+		goto out;
+
+	err = tcp_authopt_shash_traffic_key(desc, sk, skb, input, ipv6);
+	if (err)
+		goto out;
+
+	err = crypto_shash_final(desc, traffic_key);
+	if (err)
+		goto out;
+	//printk("traffic_key: %*phN\n", 20, traffic_key);
+
+out:
+	tcp_authopt_put_kdf_shash(key, kdf_tfm);
+	return err;
+}
+
+static int crypto_shash_update_zero(struct shash_desc *desc, int len)
+{
+	u8 zero = 0;
+	int i, err;
+
+	for (i = 0; i < len; ++i) {
+		err = crypto_shash_update(desc, &zero, 1);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int tcp_authopt_hash_tcp4_pseudoheader(struct shash_desc *desc,
+					      __be32 saddr,
+					      __be32 daddr,
+					      int nbytes)
+{
+	struct tcp4_pseudohdr phdr = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.pad = 0,
+		.protocol = IPPROTO_TCP,
+		.len = htons(nbytes)
+	};
+	return crypto_shash_update(desc, (u8 *)&phdr, sizeof(phdr));
+}
+
+static int tcp_authopt_hash_tcp6_pseudoheader(struct shash_desc *desc,
+					      struct in6_addr *saddr,
+					      struct in6_addr *daddr,
+					      u32 plen)
+{
+	int err;
+	u32 buf[2];
+
+	buf[0] = htonl(plen);
+	buf[1] = htonl(IPPROTO_TCP);
+
+	err = crypto_shash_update(desc, (u8 *)saddr, sizeof(*saddr));
+	if (err)
+		return err;
+	err = crypto_shash_update(desc, (u8 *)daddr, sizeof(*daddr));
+	if (err)
+		return err;
+	return crypto_shash_update(desc, (u8 *)&buf, sizeof(buf));
+}
+
+/* TCP authopt as found in header */
+struct tcphdr_authopt {
+	u8 num;
+	u8 len;
+	u8 keyid;
+	u8 rnextkeyid;
+	u8 mac[0];
+};
+
+/* Find TCP_AUTHOPT in header.
+ *
+ * Returns pointer to TCP_AUTHOPT or NULL if not found.
+ */
+static u8 *tcp_authopt_find_option(struct tcphdr *th)
+{
+	int length = (th->doff << 2) - sizeof(*th);
+	u8 *ptr = (u8 *)(th + 1);
+
+	while (length >= 2) {
+		int opcode = *ptr++;
+		int opsize;
+
+		switch (opcode) {
+		case TCPOPT_EOL:
+			return NULL;
+		case TCPOPT_NOP:
+			length--;
+			continue;
+		default:
+			if (length < 2)
+				return NULL;
+			opsize = *ptr++;
+			if (opsize < 2)
+				return NULL;
+			if (opsize > length)
+				return NULL;
+			if (opcode == TCPOPT_AUTHOPT)
+				return ptr - 2;
+		}
+		ptr += opsize - 2;
+		length -= opsize;
+	}
+	return NULL;
+}
+
+/** Hash tcphdr options.
+ *  If include_options is false then only the TCPOPT_AUTHOPT option itself is hashed
+ *  Maybe we could skip option parsing by assuming the AUTHOPT header is at hash_location-4?
+ */
+static int tcp_authopt_hash_opts(struct shash_desc *desc,
+				 struct tcphdr *th,
+				 bool include_options)
+{
+	int err;
+	/* start of options */
+	u8 *tcp_opts = (u8 *)(th + 1);
+	/* end of options */
+	u8 *tcp_data = ((u8 *)th) + th->doff * 4;
+	/* pointer to TCPOPT_AUTHOPT */
+	u8 *authopt_ptr = tcp_authopt_find_option(th);
+	u8 authopt_len;
+
+	if (!authopt_ptr)
+		return -EINVAL;
+	authopt_len = *(authopt_ptr + 1);
+
+	if (include_options) {
+		err = crypto_shash_update(desc, tcp_opts, authopt_ptr - tcp_opts + 4);
+		if (err)
+			return err;
+		err = crypto_shash_update_zero(desc, authopt_len - 4);
+		if (err)
+			return err;
+		err = crypto_shash_update(desc,
+					  authopt_ptr + authopt_len,
+					  tcp_data - (authopt_ptr + authopt_len));
+		if (err)
+			return err;
+	} else {
+		err = crypto_shash_update(desc, authopt_ptr, 4);
+		if (err)
+			return err;
+		err = crypto_shash_update_zero(desc, authopt_len - 4);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int skb_shash_frags(struct shash_desc *desc,
+			   struct sk_buff *skb)
+{
+	struct sk_buff *frag_iter;
+	int err, i;
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		skb_frag_t *f = &skb_shinfo(skb)->frags[i];
+		u32 p_off, p_len, copied;
+		struct page *p;
+		u8 *vaddr;
+
+		skb_frag_foreach_page(f, skb_frag_off(f), skb_frag_size(f),
+				      p, p_off, p_len, copied) {
+			vaddr = kmap_atomic(p);
+			err = crypto_shash_update(desc, vaddr + p_off, p_len);
+			kunmap_atomic(vaddr);
+			if (err)
+				return err;
+		}
+	}
+
+	skb_walk_frags(skb, frag_iter) {
+		err = skb_shash_frags(desc, frag_iter);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int tcp_authopt_hash_packet(struct crypto_shash *tfm,
+				   struct sock *sk,
+				   struct sk_buff *skb,
+				   bool input,
+				   bool ipv6,
+				   bool include_options,
+				   u8 *macbuf)
+{
+	struct tcphdr *th = tcp_hdr(skb);
+	SHASH_DESC_ON_STACK(desc, tfm);
+	int err;
+
+	/* NOTE: SNE unimplemented */
+	__be32 sne = 0;
+
+	desc->tfm = tfm;
+	err = crypto_shash_init(desc);
+	if (err)
+		return err;
+
+	err = crypto_shash_update(desc, (u8 *)&sne, 4);
+	if (err)
+		return err;
+
+	if (ipv6) {
+		struct in6_addr *saddr;
+		struct in6_addr *daddr;
+
+		if (input) {
+			saddr = &ipv6_hdr(skb)->saddr;
+			daddr = &ipv6_hdr(skb)->daddr;
+		} else {
+			saddr = &sk->sk_v6_rcv_saddr;
+			daddr = &sk->sk_v6_daddr;
+		}
+		err = tcp_authopt_hash_tcp6_pseudoheader(desc, saddr, daddr, skb->len);
+		if (err)
+			return err;
+	} else {
+		__be32 saddr;
+		__be32 daddr;
+
+		if (input) {
+			saddr = ip_hdr(skb)->saddr;
+			daddr = ip_hdr(skb)->daddr;
+		} else {
+			saddr = sk->sk_rcv_saddr;
+			daddr = sk->sk_daddr;
+		}
+		err = tcp_authopt_hash_tcp4_pseudoheader(desc, saddr, daddr, skb->len);
+		if (err)
+			return err;
+	}
+
+	// TCP header with checksum set to zero
+	{
+		struct tcphdr hashed_th = *th;
+
+		hashed_th.check = 0;
+		err = crypto_shash_update(desc, (u8 *)&hashed_th, sizeof(hashed_th));
+		if (err)
+			return err;
+	}
+
+	// TCP options
+	err = tcp_authopt_hash_opts(desc, th, include_options);
+	if (err)
+		return err;
+
+	// Rest of SKB->data
+	err = crypto_shash_update(desc, (u8 *)th + th->doff * 4, skb_headlen(skb) - th->doff * 4);
+	if (err)
+		return err;
+
+	err = skb_shash_frags(desc, skb);
+	if (err)
+		return err;
+
+	return crypto_shash_final(desc, macbuf);
+}
+
+int __tcp_authopt_calc_mac(struct sock *sk,
+			   struct sk_buff *skb,
+			   struct tcp_authopt_key_info *key,
+			   bool input,
+			   char *macbuf)
+{
+	struct crypto_shash *mac_tfm;
+	u8 traffic_key[TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN];
+	int err;
+	bool ipv6 = (sk->sk_family != AF_INET);
+
+	if (sk->sk_family != AF_INET && sk->sk_family != AF_INET6)
+		return -EINVAL;
+	if (WARN_ON(key->alg->traffic_key_len > sizeof(traffic_key)))
+		return -ENOBUFS;
+
+	err = tcp_authopt_get_traffic_key(sk, skb, key, input, ipv6, traffic_key);
+	if (err)
+		return err;
+
+	mac_tfm = tcp_authopt_get_mac_shash(key);
+	if (IS_ERR(mac_tfm))
+		return PTR_ERR(mac_tfm);
+	if (crypto_shash_digestsize(mac_tfm) > TCP_AUTHOPT_MAXMACBUF) {
+		err = -EINVAL;
+		goto out;
+	}
+	err = crypto_shash_setkey(mac_tfm, traffic_key, key->alg->traffic_key_len);
+	if (err)
+		goto out;
+
+	err = tcp_authopt_hash_packet(mac_tfm,
+				      sk,
+				      skb,
+				      input,
+				      ipv6,
+				      !(key->flags & TCP_AUTHOPT_KEY_EXCLUDE_OPTS),
+				      macbuf);
+	//printk("mac: %*phN\n", key->maclen, macbuf);
+
+out:
+	tcp_authopt_put_mac_shash(key, mac_tfm);
+	return err;
+}
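
For readers following tcp_authopt_shash_traffic_key() above, the KDF input
from RFC5926 section 3.1.1.1 can be laid out in a flat buffer. This is only a
userspace sketch for the IPv4 case, with no crypto involved. The helper name
build_kdf_input_v4 and the KDF_INPUT_V4_LEN macro are hypothetical and not
part of the patch. Values are passed in host order and stored big-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes fed to the KDF for IPv4:
 * 1 (counter) + 6 (label) + 4 + 4 (addresses) + 2 + 2 (ports)
 * + 4 + 4 (ISNs) + 2 (output length in bits) = 29
 */
#define KDF_INPUT_V4_LEN 29

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
	put_be16(p, (uint16_t)(v >> 16));
	put_be16(p + 2, (uint16_t)(v & 0xffff));
}

/* Hypothetical helper: writes the KDF input in the same order in which the
 * kernel code calls crypto_shash_update(). Returns the number of bytes used.
 */
static size_t build_kdf_input_v4(uint8_t *out, size_t outlen,
				 uint32_t saddr, uint32_t daddr,
				 uint16_t sport, uint16_t dport,
				 uint32_t sisn, uint32_t disn,
				 uint16_t digest_bytes)
{
	uint8_t *p = out;

	assert(outlen >= KDF_INPUT_V4_LEN);
	memcpy(p, "\x01TCP-AO", 7);	/* counter i = 1, then the label */
	p += 7;
	put_be32(p, saddr);		/* source and destination address */
	p += 4;
	put_be32(p, daddr);
	p += 4;
	put_be16(p, sport);		/* source and destination port */
	p += 2;
	put_be16(p, dport);
	p += 2;
	put_be32(p, sisn);		/* source and destination ISN */
	p += 4;
	put_be32(p, disn);
	p += 4;
	put_be16(p, (uint16_t)(digest_bytes * 8));	/* output length in bits */
	p += 2;

	return (size_t)(p - out);
}
```

The kernel code never builds this buffer; it streams the same fields into the
shash one update at a time, which avoids a copy.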
-- 
2.25.1



* [RFCv3 07/15] tcp: authopt: Hook into tcp core
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (5 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 06/15] tcp: authopt: Compute packet signatures Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 22:59   ` Eric Dumazet
  2021-08-24 21:34 ` [RFCv3 08/15] tcp: authopt: Add snmp counters Leonard Crestez
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

The tcp_authopt feature exposes a minimal interface to the rest of the
TCP stack. Only a few functions are exposed and if the feature is
disabled they return neutral values, avoiding ifdefs in the rest of the
code.
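
The "neutral values" approach described above can be sketched in userspace as
follows. This is only an illustration of the pattern: struct sock and struct
sk_buff are opaque stand-ins, and demo_rcv is a hypothetical caller that does
not exist in the patch.

```c
#include <errno.h>
#include <stddef.h>

struct sock;		/* opaque stand-ins for the kernel types */
struct sk_buff;

#ifdef CONFIG_TCP_AUTHOPT
/* Real implementation lives elsewhere when the feature is built in. */
int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb);
#else
/* Feature disabled: the stub reports "nothing to check". */
static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb)
{
	(void)sk;
	(void)skb;
	return 0;
}
#endif

/* Hypothetical caller: no #ifdef needed at the call site. */
static int demo_rcv(struct sock *sk, struct sk_buff *skb)
{
	if (tcp_authopt_inbound_check(sk, skb))
		return -EINVAL;	/* would be "goto discard_and_relse" in the kernel */
	return 0;
}
```

With the option disabled the compiler can inline the stub and drop the branch
entirely, so callers pay nothing for the hook.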

Add calls into tcp authopt from send, receive and accept code.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 include/net/tcp_authopt.h |  56 +++++++++
 net/ipv4/tcp_authopt.c    | 246 ++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_input.c      |  17 +++
 net/ipv4/tcp_ipv4.c       |   3 +
 net/ipv4/tcp_minisocks.c  |   2 +
 net/ipv4/tcp_output.c     |  74 +++++++++++-
 net/ipv6/tcp_ipv6.c       |   4 +
 7 files changed, 401 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index c9ee2059b442..61db268f36f8 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -21,10 +21,11 @@ struct tcp_authopt_key_info {
 	/* Wire identifiers */
 	u8 send_id, recv_id;
 	u8 alg_id;
 	u8 keylen;
 	u8 key[TCP_AUTHOPT_MAXKEYLEN];
+	u8 maclen;
 	struct sockaddr_storage addr;
 	struct tcp_authopt_alg_imp *alg;
 };
 
 /**
@@ -41,15 +42,53 @@ struct tcp_authopt_info {
 	u32 src_isn;
 	u32 dst_isn;
 };
 
 #ifdef CONFIG_TCP_AUTHOPT
+struct tcp_authopt_key_info *tcp_authopt_select_key(const struct sock *sk,
+						    const struct sock *addr_sk,
+						    u8 *rnextkeyid);
 void tcp_authopt_clear(struct sock *sk);
 int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen);
 int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
 int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen);
+int tcp_authopt_hash(
+		char *hash_location,
+		struct tcp_authopt_key_info *key,
+		struct sock *sk, struct sk_buff *skb);
+int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req);
+static inline int tcp_authopt_openreq(
+		struct sock *newsk,
+		const struct sock *oldsk,
+		struct request_sock *req)
+{
+	if (!rcu_dereference(tcp_sk(oldsk)->authopt_info))
+		return 0;
+	else
+		return __tcp_authopt_openreq(newsk, oldsk, req);
+}
+int __tcp_authopt_inbound_check(
+		struct sock *sk,
+		struct sk_buff *skb,
+		struct tcp_authopt_info *info);
+static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb)
+{
+	struct tcp_authopt_info *info = rcu_dereference(tcp_sk(sk)->authopt_info);
+
+	if (info)
+		return __tcp_authopt_inbound_check(sk, skb, info);
+	else
+		return 0;
+}
 #else
+static inline struct tcp_authopt_key_info *tcp_authopt_select_key(
+		const struct sock *sk,
+		const struct sock *addr_sk,
+		u8 *rnextkeyid)
+{
+	return NULL;
+}
 static inline int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
 {
 	return -ENOPROTOOPT;
 }
 static inline int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key)
@@ -61,8 +100,25 @@ static inline void tcp_authopt_clear(struct sock *sk)
 }
 static inline int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
 {
 	return -ENOPROTOOPT;
 }
+static inline int tcp_authopt_hash(
+		char *hash_location,
+		struct tcp_authopt_key_info *key,
+		struct sock *sk, struct sk_buff *skb)
+{
+	return -EINVAL;
+}
+static inline int tcp_authopt_openreq(struct sock *newsk,
+				      const struct sock *oldsk,
+				      struct request_sock *req)
+{
+	return 0;
+}
+static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb)
+{
+	return 0;
+}
 #endif
 
 #endif /* _LINUX_TCP_AUTHOPT_H */
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 2a3463ad6896..af777244d098 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -203,10 +203,71 @@ static struct tcp_authopt_key_info *tcp_authopt_key_lookup_exact(const struct so
 			return key_info;
 
 	return NULL;
 }
 
+struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *info,
+						     const struct sock *addr_sk,
+						     int send_id)
+{
+	struct tcp_authopt_key_info *result = NULL;
+	struct tcp_authopt_key_info *key;
+
+	hlist_for_each_entry_rcu(key, &info->head, node, 0) {
+		if (send_id >= 0 && key->send_id != send_id)
+			continue;
+		if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
+			if (addr_sk->sk_family == AF_INET) {
+				struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
+				const struct in_addr *daddr =
+					(const struct in_addr *)&addr_sk->sk_daddr;
+
+				if (WARN_ON(key_addr->sin_family != AF_INET))
+					continue;
+				if (memcmp(daddr, &key_addr->sin_addr, sizeof(*daddr)))
+					continue;
+			}
+			if (addr_sk->sk_family == AF_INET6) {
+				struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
+				const struct in6_addr *daddr = &addr_sk->sk_v6_daddr;
+
+				if (WARN_ON(key_addr->sin6_family != AF_INET6))
+					continue;
+				if (memcmp(daddr, &key_addr->sin6_addr, sizeof(*daddr)))
+					continue;
+			}
+		}
+		if (result && net_ratelimit())
+			pr_warn("ambiguous tcp authentication keys configured for send\n");
+		result = key;
+	}
+
+	return result;
+}
+
+/**
+ * tcp_authopt_select_key - select key for sending
+ *
+ * addr_sk is the sock used for comparing daddr, it is only different from sk in
+ * the synack case.
+ *
+ * Result is protected by RCU and can't be stored, it may only be passed to
+ * tcp_authopt_hash and only under a single rcu_read_lock.
+ */
+struct tcp_authopt_key_info *tcp_authopt_select_key(const struct sock *sk,
+						    const struct sock *addr_sk,
+						    u8 *rnextkeyid)
+{
+	struct tcp_authopt_info *info;
+
+	info = rcu_dereference(tcp_sk(sk)->authopt_info);
+	if (!info)
+		return NULL;
+
+	return tcp_authopt_lookup_send(info, addr_sk, -1);
+}
+
 static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct tcp_authopt_info *info;
 
@@ -387,16 +448,69 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
 	key_info->recv_id = opt.recv_id;
 	key_info->alg_id = opt.alg;
 	key_info->alg = alg;
 	key_info->keylen = opt.keylen;
 	memcpy(key_info->key, opt.key, opt.keylen);
+	key_info->maclen = alg->maclen;
 	memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
 	hlist_add_head_rcu(&key_info->node, &info->head);
 
 	return 0;
 }
 
+static int tcp_authopt_clone_keys(struct sock *newsk,
+				  const struct sock *oldsk,
+				  struct tcp_authopt_info *new_info,
+				  struct tcp_authopt_info *old_info)
+{
+	struct tcp_authopt_key_info *old_key;
+	struct tcp_authopt_key_info *new_key;
+
+	hlist_for_each_entry_rcu(old_key, &old_info->head, node, lockdep_sock_is_held(oldsk)) {
+		new_key = sock_kmalloc(newsk, sizeof(*new_key), GFP_ATOMIC);
+		if (!new_key)
+			return -ENOMEM;
+		memcpy(new_key, old_key, sizeof(*new_key));
+		tcp_authopt_alg_incref(old_key->alg);
+		hlist_add_head_rcu(&new_key->node, &new_info->head);
+	}
+
+	return 0;
+}
+
+/** Called to create accepted sockets.
+ *
+ *  Need to copy authopt info from listen socket.
+ */
+int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req)
+{
+	struct tcp_authopt_info *old_info;
+	struct tcp_authopt_info *new_info;
+	int err;
+
+	old_info = rcu_dereference(tcp_sk(oldsk)->authopt_info);
+	if (!old_info)
+		return 0;
+
+	new_info = kmalloc(sizeof(*new_info), GFP_ATOMIC | __GFP_ZERO);
+	if (!new_info)
+		return -ENOMEM;
+
+	sk_nocaps_add(newsk, NETIF_F_GSO_MASK);
+	new_info->src_isn = tcp_rsk(req)->snt_isn;
+	new_info->dst_isn = tcp_rsk(req)->rcv_isn;
+	INIT_HLIST_HEAD(&new_info->head);
+	err = tcp_authopt_clone_keys(newsk, oldsk, new_info, old_info);
+	if (err) {
+		__tcp_authopt_info_free(newsk, new_info);
+		return err;
+	}
+	rcu_assign_pointer(tcp_sk(newsk)->authopt_info, new_info);
+
+	return 0;
+}
+
 /* feed traffic key into shash */
 static int tcp_authopt_shash_traffic_key(struct shash_desc *desc,
 					 struct sock *sk,
 					 struct sk_buff *skb,
 					 bool input,
@@ -815,10 +929,16 @@ static int tcp_authopt_hash_packet(struct crypto_shash *tfm,
 		return err;
 
 	return crypto_shash_final(desc, macbuf);
 }
 
+/**
+ * __tcp_authopt_calc_mac - Compute packet MAC using key
+ *
+ * @macbuf: output buffer. Must be large enough to fit the digestsize of the
+ *          underlying transform before truncation. Please use TCP_AUTHOPT_MAXMACBUF
+ */
 int __tcp_authopt_calc_mac(struct sock *sk,
 			   struct sk_buff *skb,
 			   struct tcp_authopt_key_info *key,
 			   bool input,
 			   char *macbuf)
@@ -859,5 +979,131 @@ int __tcp_authopt_calc_mac(struct sock *sk,
 
 out:
 	tcp_authopt_put_mac_shash(key, mac_tfm);
 	return err;
 }
+
+/**
+ * tcp_authopt_hash - fill in the mac
+ *
+ * The key must come from tcp_authopt_select_key.
+ */
+int tcp_authopt_hash(char *hash_location,
+		     struct tcp_authopt_key_info *key,
+		     struct sock *sk,
+		     struct sk_buff *skb)
+{
+	/* MAC inside option is truncated to 12 bytes but crypto API needs output
+	 * buffer to be large enough so we use a buffer on the stack.
+	 */
+	u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
+	int err;
+
+	if (WARN_ON(key->maclen > sizeof(macbuf)))
+		return -ENOBUFS;
+
+	err = __tcp_authopt_calc_mac(sk, skb, key, false, macbuf);
+	if (err) {
+		/* If mac calculation fails and caller doesn't handle the error
+		 * try to make it obvious inside the packet.
+		 */
+		memset(hash_location, 0, key->maclen);
+		return err;
+	}
+	memcpy(hash_location, macbuf, key->maclen);
+
+	return 0;
+}
+
+static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,
+							    struct sk_buff *skb,
+							    struct tcp_authopt_info *info,
+							    int recv_id)
+{
+	struct tcp_authopt_key_info *result = NULL;
+	struct tcp_authopt_key_info *key;
+
+	/* multiple matches will cause occasional failures */
+	hlist_for_each_entry_rcu(key, &info->head, node, 0) {
+		if (recv_id >= 0 && key->recv_id != recv_id)
+			continue;
+		if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
+			if (sk->sk_family == AF_INET) {
+				struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
+				struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
+
+				if (WARN_ON(key_addr->sin_family != AF_INET))
+					continue;
+				if (WARN_ON(iph->version != 4))
+					continue;
+				if (memcmp(&iph->saddr, &key_addr->sin_addr, sizeof(iph->saddr)))
+					continue;
+			}
+			if (sk->sk_family == AF_INET6) {
+				struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
+				struct ipv6hdr *iph = (struct ipv6hdr *)skb_network_header(skb);
+
+				if (WARN_ON(key_addr->sin6_family != AF_INET6))
+					continue;
+				if (WARN_ON(iph->version != 6))
+					continue;
+				if (memcmp(&iph->saddr, &key_addr->sin6_addr, sizeof(iph->saddr)))
+					continue;
+			}
+		}
+		if (result && net_ratelimit())
+			pr_warn("ambiguous tcp authentication keys configured for receive\n");
+		result = key;
+	}
+
+	return result;
+}
+
+int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp_authopt_info *info)
+{
+	struct tcphdr *th = (struct tcphdr *)skb_transport_header(skb);
+	struct tcphdr_authopt *opt;
+	struct tcp_authopt_key_info *key;
+	u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
+	int err;
+
+	opt = (struct tcphdr_authopt *)tcp_authopt_find_option(th);
+	key = tcp_authopt_lookup_recv(sk, skb, info, opt ? opt->keyid : -1);
+
+	/* nothing found or expected */
+	if (!opt && !key)
+		return 0;
+	if (!opt && key) {
+		net_info_ratelimited("TCP Authentication Missing\n");
+		return -EINVAL;
+	}
+	if (opt && !key) {
+		/* RFC5925 Section 7.3:
+		 * A TCP-AO implementation MUST allow for configuration of the behavior
+		 * of segments with TCP-AO but that do not match an MKT. The initial
+		 * default of this configuration SHOULD be to silently accept such
+		 * connections.
+		 */
+		if (info->flags & TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED) {
+			net_info_ratelimited("TCP Authentication Unexpected: Rejected\n");
+			return -EINVAL;
+		} else {
+			net_info_ratelimited("TCP Authentication Unexpected: Accepted\n");
+			return 0;
+		}
+	}
+
+	/* bad inbound key len */
+	if (key->maclen + 4 != opt->len)
+		return -EINVAL;
+
+	err = __tcp_authopt_calc_mac(sk, skb, key, true, macbuf);
+	if (err)
+		return err;
+
+	if (memcmp(macbuf, opt->mac, key->maclen)) {
+		net_info_ratelimited("TCP Authentication Failed\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
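
The accept/drop decisions made by __tcp_authopt_inbound_check() before any MAC
is computed reduce to a small truth table. The sketch below is a userspace
restatement of that logic only; the enum and the function name
authopt_inbound_verdict are hypothetical, and the length and MAC checks that
follow in the real code are represented by a single "verify" outcome.

```c
#include <stdbool.h>

enum authopt_verdict {
	AUTHOPT_ACCEPT,		/* deliver without checking a MAC */
	AUTHOPT_DROP,		/* discard the segment */
	AUTHOPT_VERIFY_MAC,	/* option and key both present: check the MAC */
};

/* Hypothetical helper mirroring the early returns of the inbound check.
 * has_option: segment carries a TCP-AO option
 * has_key: a configured key matches the segment
 * reject_unexpected: TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED is set
 */
static enum authopt_verdict authopt_inbound_verdict(bool has_option,
						    bool has_key,
						    bool reject_unexpected)
{
	if (!has_option && !has_key)
		return AUTHOPT_ACCEPT;	/* nothing found or expected */
	if (!has_option)
		return AUTHOPT_DROP;	/* key configured, segment unsigned */
	if (!has_key)
		return reject_unexpected ? AUTHOPT_DROP : AUTHOPT_ACCEPT;
	return AUTHOPT_VERIFY_MAC;
}
```

Accepting signed segments that match no key is the default here because
RFC5925 section 7.3 recommends it, as quoted in the code comment.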
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3f7bd7ae7d7a..e0b51b2f747f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -70,10 +70,11 @@
 #include <linux/sysctl.h>
 #include <linux/kernel.h>
 #include <linux/prefetch.h>
 #include <net/dst.h>
 #include <net/tcp.h>
+#include <net/tcp_authopt.h>
 #include <net/inet_common.h>
 #include <linux/ipsec.h>
 #include <asm/unaligned.h>
 #include <linux/errqueue.h>
 #include <trace/events/tcp.h>
@@ -5967,18 +5968,34 @@ void tcp_init_transfer(struct sock *sk, int bpf_op, struct sk_buff *skb)
 	if (!icsk->icsk_ca_initialized)
 		tcp_init_congestion_control(sk);
 	tcp_init_buffer_space(sk);
 }
 
+static void tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb)
+{
+#ifdef CONFIG_TCP_AUTHOPT
+	struct tcp_authopt_info *info;
+
+	info = rcu_dereference_protected(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
+	if (!info || !skb)
+		return;
+
+	info->src_isn = ntohl(tcp_hdr(skb)->ack_seq) - 1;
+	info->dst_isn = ntohl(tcp_hdr(skb)->seq);
+#endif
+}
+
 void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
 	tcp_set_state(sk, TCP_ESTABLISHED);
 	icsk->icsk_ack.lrcvtime = tcp_jiffies32;
 
+	tcp_authopt_finish_connect(sk, skb);
+
 	if (skb) {
 		icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
 		security_inet_conn_established(sk, skb);
 		sk_mark_napi_id(sk, skb);
 	}
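
The ISN handling is spread across tcp_authopt_shash_traffic_key() and
tcp_authopt_finish_connect(), so a compact restatement may help. The sketch
below is userspace C with all values in host byte order. The struct isn_pair
and the function authopt_pick_isns are hypothetical names used only for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

struct isn_pair {
	uint32_t sisn;	/* ISN of the segment's sender */
	uint32_t disn;	/* ISN of the segment's receiver */
};

/* Hypothetical helper: choose the ISNs that enter the traffic key.
 * seq and ack_seq come from the TCP header; src_isn and dst_isn are the
 * values stored for the local and remote side once they are known.
 */
static struct isn_pair authopt_pick_isns(bool syn, bool ack,
					 uint32_t seq, uint32_t ack_seq,
					 bool input,
					 uint32_t src_isn, uint32_t dst_isn)
{
	struct isn_pair r;

	if (syn && !ack) {
		/* SYN: the peer ISN is not known yet */
		r.sisn = seq;
		r.disn = 0;
	} else if (syn && ack) {
		/* SYN/ACK: the peer ISN is the acknowledged number minus one */
		r.sisn = seq;
		r.disn = ack_seq - 1;
	} else if (input) {
		/* received segment: the remote side is the sender */
		r.sisn = dst_isn;
		r.disn = src_isn;
	} else {
		r.sisn = src_isn;
		r.disn = dst_isn;
	}
	return r;
}
```

Unsigned arithmetic makes the "minus one" wrap correctly when ack_seq is 0.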
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1348615c7576..a1d39183908c 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2060,10 +2060,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		goto discard_and_relse;
 
 	if (tcp_v4_inbound_md5_hash(sk, skb, dif, sdif))
 		goto discard_and_relse;
 
+	if (tcp_authopt_inbound_check(sk, skb))
+		goto discard_and_relse;
+
 	nf_reset_ct(skb);
 
 	if (tcp_filter(sk, skb))
 		goto discard_and_relse;
 	th = (const struct tcphdr *)skb->data;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 0a4f3f16140a..4d7d86547b0e 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -24,10 +24,11 @@
 #include <linux/slab.h>
 #include <linux/sysctl.h>
 #include <linux/workqueue.h>
 #include <linux/static_key.h>
 #include <net/tcp.h>
+#include <net/tcp_authopt.h>
 #include <net/inet_common.h>
 #include <net/xfrm.h>
 #include <net/busy_poll.h>
 
 static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win)
@@ -539,10 +540,11 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 #ifdef CONFIG_TCP_MD5SIG
 	newtp->md5sig_info = NULL;	/*XXX*/
 	if (newtp->af_specific->md5_lookup(sk, newsk))
 		newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
 #endif
+	tcp_authopt_openreq(newsk, sk, req);
 	if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
 		newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
 	newtp->rx_opt.mss_clamp = req->mss;
 	tcp_ecn_openreq_child(newtp, req);
 	newtp->fastopen_req = NULL;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6d72f3ea48c4..6d73bee349c9 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -37,10 +37,11 @@
 
 #define pr_fmt(fmt) "TCP: " fmt
 
 #include <net/tcp.h>
 #include <net/mptcp.h>
+#include <net/tcp_authopt.h>
 
 #include <linux/compiler.h>
 #include <linux/gfp.h>
 #include <linux/module.h>
 #include <linux/static_key.h>
@@ -411,10 +412,11 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp)
 
 #define OPTION_SACK_ADVERTISE	(1 << 0)
 #define OPTION_TS		(1 << 1)
 #define OPTION_MD5		(1 << 2)
 #define OPTION_WSCALE		(1 << 3)
+#define OPTION_AUTHOPT		(1 << 4)
 #define OPTION_FAST_OPEN_COOKIE	(1 << 8)
 #define OPTION_SMC		(1 << 9)
 #define OPTION_MPTCP		(1 << 10)
 
 static void smc_options_write(__be32 *ptr, u16 *options)
@@ -435,16 +437,21 @@ static void smc_options_write(__be32 *ptr, u16 *options)
 struct tcp_out_options {
 	u16 options;		/* bit field of OPTION_* */
 	u16 mss;		/* 0 to disable */
 	u8 ws;			/* window scale, 0 to disable */
 	u8 num_sack_blocks;	/* number of SACK blocks to include */
-	u8 hash_size;		/* bytes in hash_location */
 	u8 bpf_opt_len;		/* length of BPF hdr option */
+#ifdef CONFIG_TCP_AUTHOPT
+	u8 authopt_rnextkeyid; /* rnextkey */
+#endif
 	__u8 *hash_location;	/* temporary pointer, overloaded */
 	__u32 tsval, tsecr;	/* need to include OPTION_TS */
 	struct tcp_fastopen_cookie *fastopen_cookie;	/* Fast open cookie */
 	struct mptcp_out_options mptcp;
+#ifdef CONFIG_TCP_AUTHOPT
+	struct tcp_authopt_key_info *authopt_key;
+#endif
 };
 
 static void mptcp_options_write(__be32 *ptr, const struct tcp_sock *tp,
 				struct tcp_out_options *opts)
 {
@@ -617,10 +624,24 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
 		/* overload cookie hash location */
 		opts->hash_location = (__u8 *)ptr;
 		ptr += 4;
 	}
 
+#ifdef CONFIG_TCP_AUTHOPT
+	if (unlikely(OPTION_AUTHOPT & options)) {
+		struct tcp_authopt_key_info *key = opts->authopt_key;
+
+		WARN_ON(!key);
+		*ptr++ = htonl((TCPOPT_AUTHOPT << 24) | ((4 + key->maclen) << 16) |
+			       (key->send_id << 8) | opts->authopt_rnextkeyid);
+		/* overload cookie hash location */
+		opts->hash_location = (__u8 *)ptr;
+		/* maclen is currently always 12 but try to align nicely anyway. */
+		ptr += (key->maclen + 3) / 4;
+	}
+#endif
+
 	if (unlikely(opts->mss)) {
 		*ptr++ = htonl((TCPOPT_MSS << 24) |
 			       (TCPOLEN_MSS << 16) |
 			       opts->mss);
 	}
@@ -752,10 +773,28 @@ static void mptcp_set_option_cond(const struct request_sock *req,
 			}
 		}
 	}
 }
 
+static int tcp_authopt_init_options(const struct sock *sk,
+				    const struct sock *addr_sk,
+				    struct tcp_out_options *opts)
+{
+#ifdef CONFIG_TCP_AUTHOPT
+	struct tcp_authopt_key_info *key;
+
+	key = tcp_authopt_select_key(sk, addr_sk, &opts->authopt_rnextkeyid);
+	if (key) {
+		opts->options |= OPTION_AUTHOPT;
+		opts->authopt_key = key;
+		return 4 + key->maclen;
+	}
+#endif
+
+	return 0;
+}
+
 /* Compute TCP options for SYN packets. This is not the final
  * network wire format yet.
  */
 static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 				struct tcp_out_options *opts,
@@ -774,10 +813,11 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 			opts->options |= OPTION_MD5;
 			remaining -= TCPOLEN_MD5SIG_ALIGNED;
 		}
 	}
 #endif
+	remaining -= tcp_authopt_init_options(sk, sk, opts);
 
 	/* We always get an MSS option.  The option bytes which will be seen in
 	 * normal data packets should timestamps be used, must be in the MSS
 	 * advertised.  But we subtract them from tp->mss_cache so that
 	 * calculations in tcp_sendmsg are simpler etc.  So account for this
@@ -862,10 +902,11 @@ static unsigned int tcp_synack_options(const struct sock *sk,
 		 */
 		if (synack_type != TCP_SYNACK_COOKIE)
 			ireq->tstamp_ok &= !ireq->sack_ok;
 	}
 #endif
+	remaining -= tcp_authopt_init_options(sk, req_to_sk(req), opts);
 
 	/* We always send an MSS option. */
 	opts->mss = mss;
 	remaining -= TCPOLEN_MSS_ALIGNED;
 
@@ -930,10 +971,11 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 			opts->options |= OPTION_MD5;
 			size += TCPOLEN_MD5SIG_ALIGNED;
 		}
 	}
 #endif
+	size += tcp_authopt_init_options(sk, sk, opts);
 
 	if (likely(tp->rx_opt.tstamp_ok)) {
 		opts->options |= OPTION_TS;
 		opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
 		opts->tsecr = tp->rx_opt.ts_recent;
@@ -1277,10 +1319,14 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
 
 	inet = inet_sk(sk);
 	tcb = TCP_SKB_CB(skb);
 	memset(&opts, 0, sizeof(opts));
 
+#ifdef CONFIG_TCP_AUTHOPT
+	/* for tcp_authopt_init_options inside tcp_syn_options or tcp_established_options */
+	rcu_read_lock();
+#endif
 	if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) {
 		tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
 	} else {
 		tcp_options_size = tcp_established_options(sk, skb, &opts,
 							   &md5);
@@ -1365,10 +1411,17 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
 		sk_nocaps_add(sk, NETIF_F_GSO_MASK);
 		tp->af_specific->calc_md5_hash(opts.hash_location,
 					       md5, sk, skb);
 	}
 #endif
+#ifdef CONFIG_TCP_AUTHOPT
+	if (opts.authopt_key) {
+		sk_nocaps_add(sk, NETIF_F_GSO_MASK);
+		tcp_authopt_hash(opts.hash_location, opts.authopt_key, sk, skb);
+	}
+	rcu_read_unlock();
+#endif
 
 	/* BPF prog is the last one writing header option */
 	bpf_skops_write_hdr_opt(sk, skb, NULL, NULL, 0, &opts);
 
 	INDIRECT_CALL_INET(icsk->icsk_af_ops->send_check,
@@ -1836,12 +1889,21 @@ unsigned int tcp_current_mss(struct sock *sk)
 		u32 mtu = dst_mtu(dst);
 		if (mtu != inet_csk(sk)->icsk_pmtu_cookie)
 			mss_now = tcp_sync_mss(sk, mtu);
 	}
 
+#ifdef CONFIG_TCP_AUTHOPT
+	/* Even if the result is not used rcu_read_lock is required when scanning for
+	 * tcp authentication keys. Otherwise lockdep will complain.
+	 */
+	rcu_read_lock();
+#endif
 	header_len = tcp_established_options(sk, NULL, &opts, &md5) +
 		     sizeof(struct tcphdr);
+#ifdef CONFIG_TCP_AUTHOPT
+	rcu_read_unlock();
+#endif
 	/* The mss_cache is sized based on tp->tcp_header_len, which assumes
 	 * some common options. If this is an odd packet (because we have SACK
 	 * blocks etc) then our calculated header_len will be different, and
 	 * we have to adjust mss_now correspondingly */
 	if (header_len != tp->tcp_header_len) {
@@ -3566,10 +3628,14 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	}
 
 #ifdef CONFIG_TCP_MD5SIG
 	rcu_read_lock();
 	md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
+#endif
+#ifdef CONFIG_TCP_AUTHOPT
+	/* for tcp_authopt_init_options inside tcp_synack_options */
+	rcu_read_lock();
 #endif
 	skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4);
 	/* bpf program will be interested in the tcp_flags */
 	TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK;
 	tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
@@ -3603,10 +3669,16 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 	if (md5)
 		tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location,
 					       md5, req_to_sk(req), skb);
 	rcu_read_unlock();
 #endif
+#ifdef CONFIG_TCP_AUTHOPT
+	/* If signature fails we do nothing */
+	if (opts.authopt_key)
+		tcp_authopt_hash(opts.hash_location, opts.authopt_key, req_to_sk(req), skb);
+	rcu_read_unlock();
+#endif
 
 	bpf_skops_write_hdr_opt((struct sock *)sk, skb, req, syn_skb,
 				synack_type, &opts);
 
 	skb->skb_mstamp_ns = now;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0ce52d46e4f8..51381a9c2bd5 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -40,10 +40,11 @@
 #include <linux/icmpv6.h>
 #include <linux/random.h>
 #include <linux/indirect_call_wrapper.h>
 
 #include <net/tcp.h>
+#include <net/tcp_authopt.h>
 #include <net/ndisc.h>
 #include <net/inet6_hashtables.h>
 #include <net/inet6_connection_sock.h>
 #include <net/ipv6.h>
 #include <net/transp_v6.h>
@@ -1733,10 +1734,13 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 		goto discard_and_relse;
 
 	if (tcp_v6_inbound_md5_hash(sk, skb, dif, sdif))
 		goto discard_and_relse;
 
+	if (tcp_authopt_inbound_check(sk, skb))
+		goto discard_and_relse;
+
 	if (tcp_filter(sk, skb))
 		goto discard_and_relse;
 	th = (const struct tcphdr *)skb->data;
 	hdr = ipv6_hdr(skb);
 	tcp_v6_fill_cb(skb, hdr, th);
-- 
2.25.1



* [RFCv3 08/15] tcp: authopt: Add snmp counters
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (6 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 07/15] tcp: authopt: Hook into tcp core Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 09/15] selftests: tcp_authopt: Test key address binding Leonard Crestez
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Add LINUX_MIB_TCPAUTHOPTFAILURE and increment it on failure. This can be
used by userspace to count the number of failed authentications.

All types of authentication failures are reported under a single
counter.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
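A userspace reader for the counter described above could look roughly
like the sketch below. It assumes the new item shows up as
TCPAuthOptFailure on the TcpExt lines of /proc/net/netstat, which is
where the other snmp4_net_list entries appear; the helper names
parse_netstat and read_authopt_failures are made up for illustration
and are not part of this series.

```python
from pathlib import Path


def parse_netstat(text: str) -> dict:
    """Parse /proc/net/netstat style text into {"Prefix": {"Name": value}}."""
    result = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: a header line with names, then a line with values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError(f"Mismatched lines: {prefix!r} vs {value_prefix!r}")
        result[prefix] = dict(zip(names.split(), map(int, numbers.split())))
    return result


def read_authopt_failures(path: str = "/proc/net/netstat") -> int:
    """Return TCPAuthOptFailure, or 0 if the kernel does not export it.

    Hypothetical helper: the counter name is taken from this patch.
    """
    stats = parse_netstat(Path(path).read_text())
    return stats.get("TcpExt", {}).get("TCPAuthOptFailure", 0)
```

The counter is incremented on sock_net(sk), so it is per network
namespace and has to be read from inside the namespace under test.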
 include/uapi/linux/snmp.h | 1 +
 net/ipv4/proc.c           | 1 +
 net/ipv4/tcp_authopt.c    | 3 +++
 3 files changed, 5 insertions(+)

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 904909d020e2..1d96030889a1 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -290,10 +290,11 @@ enum
 	LINUX_MIB_TCPDUPLICATEDATAREHASH,	/* TCPDuplicateDataRehash */
 	LINUX_MIB_TCPDSACKRECVSEGS,		/* TCPDSACKRecvSegs */
 	LINUX_MIB_TCPDSACKIGNOREDDUBIOUS,	/* TCPDSACKIgnoredDubious */
 	LINUX_MIB_TCPMIGRATEREQSUCCESS,		/* TCPMigrateReqSuccess */
 	LINUX_MIB_TCPMIGRATEREQFAILURE,		/* TCPMigrateReqFailure */
+	LINUX_MIB_TCPAUTHOPTFAILURE,		/* TCPAuthOptFailure */
 	__LINUX_MIB_MAX
 };
 
 /* linux Xfrm mib definitions */
 enum
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index b0d3a09dc84e..61dd06f8389c 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -295,10 +295,11 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TcpDuplicateDataRehash", LINUX_MIB_TCPDUPLICATEDATAREHASH),
 	SNMP_MIB_ITEM("TCPDSACKRecvSegs", LINUX_MIB_TCPDSACKRECVSEGS),
 	SNMP_MIB_ITEM("TCPDSACKIgnoredDubious", LINUX_MIB_TCPDSACKIGNOREDDUBIOUS),
 	SNMP_MIB_ITEM("TCPMigrateReqSuccess", LINUX_MIB_TCPMIGRATEREQSUCCESS),
 	SNMP_MIB_ITEM("TCPMigrateReqFailure", LINUX_MIB_TCPMIGRATEREQFAILURE),
+	SNMP_MIB_ITEM("TCPAuthOptFailure", LINUX_MIB_TCPAUTHOPTFAILURE),
 	SNMP_MIB_SENTINEL
 };
 
 static void icmpmsg_put_line(struct seq_file *seq, unsigned long *vals,
 			     unsigned short *type, int count)
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index af777244d098..08ca77f01c46 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -1071,10 +1071,11 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp
 
 	/* nothing found or expected */
 	if (!opt && !key)
 		return 0;
 	if (!opt && key) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
 		net_info_ratelimited("TCP Authentication Missing\n");
 		return -EINVAL;
 	}
 	if (opt && !key) {
 		/* RFC5925 Section 7.3:
@@ -1082,10 +1083,11 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp
 		 * of segments with TCP-AO but that do not match an MKT. The initial
 		 * default of this configuration SHOULD be to silently accept such
 		 * connections.
 		 */
 		if (info->flags & TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED) {
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
 			net_info_ratelimited("TCP Authentication Unexpected: Rejected\n");
 			return -EINVAL;
 		} else {
 			net_info_ratelimited("TCP Authentication Unexpected: Accepted\n");
 			return 0;
@@ -1099,10 +1101,11 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp
 	err = __tcp_authopt_calc_mac(sk, skb, key, true, macbuf);
 	if (err)
 		return err;
 
 	if (memcmp(macbuf, opt->mac, key->maclen)) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
 		net_info_ratelimited("TCP Authentication Failed\n");
 		return -EINVAL;
 	}
 
 	return 0;
-- 
2.25.1



* [RFCv3 09/15] selftests: tcp_authopt: Test key address binding
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (7 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 08/15] tcp: authopt: Add snmp counters Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-25  5:18   ` David Ahern
  2021-08-24 21:34 ` [RFCv3 10/15] selftests: tcp_authopt: Capture and verify packets Leonard Crestez
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

By default TCP-AO keys apply to all possible peers but it's possible to
have different keys for different remote hosts.

This patch adds initial tests for the behavior behind the
TCP_AUTHOPT_KEY_BIND_ADDR flag. Server rejection is tested via client
timeout so this can be slightly slow.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
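For reference, the address scheme used by the namespace fixture in this
patch can be reproduced standalone. The sketch below copies the
arithmetic from NamespaceFixture.get_ipv4_addr and get_ipv6_addr; the
free-function names ipv4_addr and ipv6_addr are only for illustration.

```python
from ipaddress import IPv4Address, IPv6Address, ip_network


def ipv4_addr(ns: int, index: int) -> IPv4Address:
    """Same arithmetic as NamespaceFixture.get_ipv4_addr in this patch."""
    return IPv4Address("10.10.0.0") + (ns << 8) + index


def ipv6_addr(ns: int, index: int) -> IPv6Address:
    """Same arithmetic as NamespaceFixture.get_ipv6_addr in this patch."""
    return IPv6Address("fd00::") + (ns << 16) + index


# The fixture configures /16 for IPv4 and /64 for IPv6 on both veth ends.
V4_NET = ip_network("10.10.0.0/16")
V6_NET = ip_network("fd00::/64")

for ns in (1, 2):
    for index in (1, 2, 3):
        a4 = ipv4_addr(ns, index)
        a6 = ipv6_addr(ns, index)
        # Every address of both namespaces falls inside the same subnet.
        print(ns, index, a4, a4 in V4_NET, a6, a6 in V6_NET)
```

Because all addresses share one subnet no routes are needed, and a
test can pick a specific source address with bind() before connect().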
 .../tcp_authopt_test/netns_fixture.py         |  63 +++++++
 .../tcp_authopt/tcp_authopt_test/server.py    |  82 ++++++++++
 .../tcp_authopt/tcp_authopt_test/test_bind.py | 143 ++++++++++++++++
 .../tcp_authopt/tcp_authopt_test/utils.py     | 154 ++++++++++++++++++
 4 files changed, 442 insertions(+)
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py

diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py
new file mode 100644
index 000000000000..20bb12c2aae2
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py
@@ -0,0 +1,63 @@
+# SPDX-License-Identifier: GPL-2.0
+import subprocess
+import socket
+from ipaddress import IPv4Address
+from ipaddress import IPv6Address
+
+
+class NamespaceFixture:
+    """Create a pair of namespaces connected by one veth pair
+
+    Each end of the pair has multiple addresses but everything is in the same subnet
+    """
+
+    ns1_name = "tcp_authopt_test_1"
+    ns2_name = "tcp_authopt_test_2"
+
+    @classmethod
+    def get_ipv4_addr(cls, ns=1, index=1) -> IPv4Address:
+        return IPv4Address("10.10.0.0") + (ns << 8) + index
+
+    @classmethod
+    def get_ipv6_addr(cls, ns=1, index=1) -> IPv6Address:
+        return IPv6Address("fd00::") + (ns << 16) + index
+
+    @classmethod
+    def get_addr(cls, address_family=socket.AF_INET, ns=1, index=1):
+        if address_family == socket.AF_INET:
+            return cls.get_ipv4_addr(ns, index)
+        elif address_family == socket.AF_INET6:
+            return cls.get_ipv6_addr(ns, index)
+        else:
+            raise ValueError(f"Bad address_family={address_family}")
+
+    def __init__(self, **kw):
+        for k, v in kw.items():
+            setattr(self, k, v)
+
+    def __enter__(self):
+        script = f"""
+set -e -x
+ip netns del {self.ns1_name} || true
+ip netns del {self.ns2_name} || true
+ip netns add {self.ns1_name}
+ip netns add {self.ns2_name}
+ip link add veth0 netns {self.ns1_name} type veth peer name veth0 netns {self.ns2_name}
+ip netns exec {self.ns1_name} ip link set veth0 up
+ip netns exec {self.ns2_name} ip link set veth0 up
+"""
+        for index in [1, 2, 3]:
+            script += f"ip -n {self.ns1_name} addr add {self.get_ipv4_addr(1, index)}/16 dev veth0\n"
+            script += f"ip -n {self.ns2_name} addr add {self.get_ipv4_addr(2, index)}/16 dev veth0\n"
+            script += f"ip -n {self.ns1_name} addr add {self.get_ipv6_addr(1, index)}/64 dev veth0 nodad\n"
+            script += f"ip -n {self.ns2_name} addr add {self.get_ipv6_addr(2, index)}/64 dev veth0 nodad\n"
+        subprocess.run(script, shell=True, check=True)
+        return self
+
+    def __exit__(self, *a):
+        script = f"""
+set -e -x
+ip netns del {self.ns1_name} || true
+ip netns del {self.ns2_name} || true
+"""
+        subprocess.run(script, shell=True, check=True)
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py
new file mode 100644
index 000000000000..c4cce8f5862a
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py
@@ -0,0 +1,82 @@
+# SPDX-License-Identifier: GPL-2.0
+import logging
+import os
+import selectors
+from contextlib import ExitStack
+from threading import Thread
+
+logger = logging.getLogger(__name__)
+
+
+class SimpleServerThread(Thread):
+    """Simple server thread for testing TCP sockets
+
+    All data is read in 1000-byte chunks and either echoed back or discarded.
+    """
+
+    def __init__(self, socket, mode="recv"):
+        self.listen_socket = socket
+        self.server_socket = []
+        self.mode = mode
+        super().__init__()
+
+    def _read(self, conn, events):
+        # logger.debug("events=%r", events)
+        data = conn.recv(1000)
+        # logger.debug("len(data)=%r", len(data))
+        if len(data) == 0:
+            # logger.info("closing %r", conn)
+            conn.close()
+            self.sel.unregister(conn)
+        else:
+            if self.mode == "echo":
+                conn.sendall(data)
+            elif self.mode == "recv":
+                pass
+            else:
+                raise ValueError(f"Unknown mode {self.mode}")
+
+    def _stop_pipe_read(self, conn, events):
+        self.should_loop = False
+
+    def start(self) -> None:
+        self.exit_stack = ExitStack()
+        self._stop_pipe_rfd, self._stop_pipe_wfd = os.pipe()
+        self.exit_stack.callback(lambda: os.close(self._stop_pipe_rfd))
+        self.exit_stack.callback(lambda: os.close(self._stop_pipe_wfd))
+        return super().start()
+
+    def _accept(self, conn, events):
+        assert conn == self.listen_socket
+        conn, _addr = self.listen_socket.accept()
+        conn = self.exit_stack.enter_context(conn)
+        conn.setblocking(False)
+        self.sel.register(conn, selectors.EVENT_READ, self._read)
+        self.server_socket.append(conn)
+
+    def run(self):
+        self.should_loop = True
+        self.sel = self.exit_stack.enter_context(selectors.DefaultSelector())
+        self.sel.register(
+            self._stop_pipe_rfd, selectors.EVENT_READ, self._stop_pipe_read
+        )
+        self.sel.register(self.listen_socket, selectors.EVENT_READ, self._accept)
+        # logger.debug("loop init")
+        while self.should_loop:
+            for key, events in self.sel.select(timeout=1):
+                callback = key.data
+                callback(key.fileobj, events)
+        # logger.debug("loop done")
+
+    def stop(self):
+        """Try to stop nicely"""
+        os.write(self._stop_pipe_wfd, b"Q")
+        self.join()
+        self.exit_stack.close()
+
+    def __enter__(self):
+        self.start()
+        return self
+
+    def __exit__(self, *args):
+        self.stop()
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py
new file mode 100644
index 000000000000..980954098e97
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py
@@ -0,0 +1,143 @@
+# SPDX-License-Identifier: GPL-2.0
+"""Test TCP-AO keys can be bound to specific remote addresses"""
+from contextlib import ExitStack
+import socket
+import pytest
+from .netns_fixture import NamespaceFixture
+from .utils import create_listen_socket
+from .server import SimpleServerThread
+from . import linux_tcp_authopt
+from .linux_tcp_authopt import (
+    tcp_authopt,
+    set_tcp_authopt,
+    set_tcp_authopt_key,
+    tcp_authopt_key,
+)
+from .utils import netns_context, DEFAULT_TCP_SERVER_PORT, check_socket_echo
+from .conftest import skipif_missing_tcp_authopt
+
+pytestmark = skipif_missing_tcp_authopt
+
+
+@pytest.mark.parametrize("address_family", [socket.AF_INET, socket.AF_INET6])
+def test_addr_server_bind(exit_stack: ExitStack, address_family):
+    """Server only accepts client2, check client1 fails"""
+    nsfixture = exit_stack.enter_context(NamespaceFixture())
+    server_addr = str(nsfixture.get_addr(address_family, 1, 1))
+    client_addr = str(nsfixture.get_addr(address_family, 2, 1))
+    client_addr2 = str(nsfixture.get_addr(address_family, 2, 2))
+
+    # create server:
+    listen_socket = exit_stack.push(
+        create_listen_socket(family=address_family, ns=nsfixture.ns1_name)
+    )
+    exit_stack.enter_context(SimpleServerThread(listen_socket, mode="echo"))
+
+    # set keys:
+    server_key = tcp_authopt_key(
+        alg=linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+        key="hello",
+        flags=linux_tcp_authopt.TCP_AUTHOPT_KEY_BIND_ADDR,
+        addr=client_addr2,
+    )
+    set_tcp_authopt(
+        listen_socket,
+        tcp_authopt(flags=linux_tcp_authopt.TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED),
+    )
+    set_tcp_authopt_key(listen_socket, server_key)
+
+    # create client socket:
+    def create_client_socket():
+        with netns_context(nsfixture.ns2_name):
+            client_socket = socket.socket(address_family, socket.SOCK_STREAM)
+        client_key = tcp_authopt_key(
+            alg=linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+            key="hello",
+        )
+        set_tcp_authopt_key(client_socket, client_key)
+        return client_socket
+
+    # addr match:
+    # with create_client_socket() as client_socket2:
+    #    client_socket2.bind((client_addr2, 0))
+    #    client_socket2.settimeout(1.0)
+    #    client_socket2.connect((server_addr, DEFAULT_TCP_SERVER_PORT))
+
+    # addr mismatch:
+    with create_client_socket() as client_socket1:
+        client_socket1.bind((client_addr, 0))
+        with pytest.raises(socket.timeout):
+            client_socket1.settimeout(1.0)
+            client_socket1.connect((server_addr, DEFAULT_TCP_SERVER_PORT))
+
+
+@pytest.mark.parametrize("address_family", [socket.AF_INET, socket.AF_INET6])
+def test_addr_client_bind(exit_stack: ExitStack, address_family):
+    """Client configures different keys with same id but different addresses"""
+    nsfixture = exit_stack.enter_context(NamespaceFixture())
+    server_addr1 = str(nsfixture.get_addr(address_family, 1, 1))
+    server_addr2 = str(nsfixture.get_addr(address_family, 1, 2))
+    client_addr = str(nsfixture.get_addr(address_family, 2, 1))
+
+    # create servers:
+    listen_socket1 = exit_stack.enter_context(
+        create_listen_socket(
+            family=address_family, ns=nsfixture.ns1_name, bind_addr=server_addr1
+        )
+    )
+    listen_socket2 = exit_stack.enter_context(
+        create_listen_socket(
+            family=address_family, ns=nsfixture.ns1_name, bind_addr=server_addr2
+        )
+    )
+    exit_stack.enter_context(SimpleServerThread(listen_socket1, mode="echo"))
+    exit_stack.enter_context(SimpleServerThread(listen_socket2, mode="echo"))
+
+    # set keys:
+    set_tcp_authopt_key(
+        listen_socket1,
+        tcp_authopt_key(
+            alg=linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+            key="11111",
+        ),
+    )
+    set_tcp_authopt_key(
+        listen_socket2,
+        tcp_authopt_key(
+            alg=linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+            key="22222",
+        ),
+    )
+
+    # create client socket:
+    def create_client_socket():
+        with netns_context(nsfixture.ns2_name):
+            client_socket = socket.socket(address_family, socket.SOCK_STREAM)
+        set_tcp_authopt_key(
+            client_socket,
+            tcp_authopt_key(
+                alg=linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+                key="11111",
+                flags=linux_tcp_authopt.TCP_AUTHOPT_KEY_BIND_ADDR,
+                addr=server_addr1,
+            ),
+        )
+        set_tcp_authopt_key(
+            client_socket,
+            tcp_authopt_key(
+                alg=linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+                key="22222",
+                flags=linux_tcp_authopt.TCP_AUTHOPT_KEY_BIND_ADDR,
+                addr=server_addr2,
+            ),
+        )
+        client_socket.settimeout(1.0)
+        client_socket.bind((client_addr, 0))
+        return client_socket
+
+    with create_client_socket() as client_socket1:
+        client_socket1.connect((server_addr1, DEFAULT_TCP_SERVER_PORT))
+        check_socket_echo(client_socket1)
+    with create_client_socket() as client_socket2:
+        client_socket2.connect((server_addr2, DEFAULT_TCP_SERVER_PORT))
+        check_socket_echo(client_socket2)
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py
new file mode 100644
index 000000000000..22bd3f0a142a
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py
@@ -0,0 +1,154 @@
+# SPDX-License-Identifier: GPL-2.0
+import json
+import random
+import subprocess
+import threading
+import typing
+import socket
+from dataclasses import dataclass
+from contextlib import nullcontext
+
+from nsenter import Namespace
+from scapy.sendrecv import AsyncSniffer
+
+
+# TCP port does not impact Authentication Option so define a single default
+DEFAULT_TCP_SERVER_PORT = 17971
+
+
+class SimpleWaitEvent(threading.Event):
+    @property
+    def value(self) -> bool:
+        return self.is_set()
+
+    @value.setter
+    def value(self, value: bool):
+        if value:
+            self.set()
+        else:
+            self.clear()
+
+    def wait(self, timeout=None):
+        """Like Event.wait except raise on timeout"""
+        super().wait(timeout)
+        if not self.is_set():
+            raise TimeoutError(f"Timed out timeout={timeout!r}")
+
+
+def recvall(sock, todo):
+    """Receive exactly todo bytes unless EOF"""
+    data = bytes()
+    while True:
+        chunk = sock.recv(todo)
+        if not len(chunk):
+            return data
+        data += chunk
+        todo -= len(chunk)
+        if todo == 0:
+            return data
+        assert todo > 0
+
+
+def randbytes(count) -> bytes:
+    """Return a random byte array"""
+    return bytes([random.randint(0, 255) for index in range(count)])
+
+
+def check_socket_echo(sock, size=1024):
+    """Send random bytes and check they are received"""
+    send_buf = randbytes(size)
+    sock.sendall(send_buf)
+    recv_buf = recvall(sock, size)
+    assert send_buf == recv_buf
+
+
+def nstat_json(command_prefix: str = ""):
+    """Parse nstat output into a python dict"""
+    runres = subprocess.run(
+        f"{command_prefix}nstat -a --zeros --json",
+        shell=True,
+        check=True,
+        stdout=subprocess.PIPE,
+        encoding="utf-8",
+    )
+    return json.loads(runres.stdout)
+
+
+def netns_context(ns: str = ""):
+    """Create context manager for a certain optional netns
+
+    If the ns argument is empty then just return a `nullcontext`
+    """
+    if ns:
+        return Namespace("/var/run/netns/" + ns, "net")
+    else:
+        return nullcontext()
+
+
+def create_listen_socket(
+    ns: str = "",
+    family=socket.AF_INET,
+    reuseaddr=True,
+    listen_depth=10,
+    bind_addr="",
+    bind_port=DEFAULT_TCP_SERVER_PORT,
+):
+    with netns_context(ns):
+        listen_socket = socket.socket(family, socket.SOCK_STREAM)
+    if reuseaddr:
+        listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+    listen_socket.bind((str(bind_addr), bind_port))
+    listen_socket.listen(listen_depth)
+    return listen_socket
+
+
+@dataclass
+class tcphdr_authopt:
+    """Representation of a TCP auth option as it appears in a TCP packet"""
+
+    keyid: int
+    rnextkeyid: int
+    mac: bytes
+
+    @classmethod
+    def unpack(cls, buf) -> "tcphdr_authopt":
+        return cls(buf[0], buf[1], buf[2:])
+
+    def __repr__(self):
+        return f"tcphdr_authopt({self.keyid}, {self.rnextkeyid}, bytes.fromhex({self.mac.hex(' ')!r}))"
+
+
+def scapy_tcp_get_authopt_val(tcp) -> typing.Optional[tcphdr_authopt]:
+    for optnum, optval in tcp.options:
+        if optnum == 29:
+            return tcphdr_authopt.unpack(optval)
+    return None
+
+
+def scapy_sniffer_start_block(sniffer: AsyncSniffer, timeout=1):
+    """Like AsyncSniffer.start except block until sniffing starts
+
+    This ensures no lost packets and no delays
+    """
+    if sniffer.kwargs.get("started_callback"):
+        raise ValueError("sniffer must not already have a started_callback")
+
+    e = SimpleWaitEvent()
+    sniffer.kwargs["started_callback"] = e.set
+    sniffer.start()
+    e.wait(timeout=timeout)
+
+
+def scapy_sniffer_stop(sniffer: AsyncSniffer):
+    """Like AsyncSniffer.stop except no error is raised if not running"""
+    if sniffer is not None and sniffer.running:
+        sniffer.stop()
+
+
+class AsyncSnifferContext(AsyncSniffer):
+    def __enter__(self):
+        scapy_sniffer_start_block(self)
+        return self
+
+    def __exit__(self, *a):
+        scapy_sniffer_stop(self)
-- 
2.25.1



* [RFCv3 10/15] selftests: tcp_authopt: Capture and verify packets
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (8 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 09/15] selftests: tcp_authopt: Test key address binding Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 11/15] selftests: Initial tcp_authopt support for nettest Leonard Crestez
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Tools like tcpdump and wireshark can parse the TCP Authentication Option
but they cannot yet verify that signatures are correct.

This patch implements TCP-AO signature verification using scapy and the
python cryptography package and applies it to captures of linux traffic
in multiple scenarios (ipv4, ipv6 etc).

The python code is itself verified against a subset of the IETF test
vectors from this page:
https://datatracker.ietf.org/doc/html/draft-touch-tcpm-ao-test-vectors-02

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
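As a quick illustration of the verification chain, the sketch below
derives a traffic key and a MAC for HMAC-SHA-1-96 using only the python
standard library. It follows the same steps as tcp_authopt_alg.py in
this patch (context from RFC5925 section 5.2, KDF and MAC from RFC5926).
The master key, addresses, ports, ISN and message bytes are made-up
values, not taken from the IETF test vectors.

```python
import hmac
import struct
from ipaddress import IPv4Address


def build_context(saddr, daddr, sport, dport, src_isn, dst_isn) -> bytes:
    """Connection context: addresses, ports and both initial sequence numbers."""
    return (
        saddr.packed
        + daddr.packed
        + struct.pack("!HHII", sport, dport, src_isn, dst_isn)
    )


def kdf_sha1(master_key: bytes, context: bytes) -> bytes:
    """Derive the traffic key: counter 1, label, context, output length 160."""
    kdf_input = b"\x01" + b"TCP-AO" + context + b"\x00\xa0"
    return hmac.digest(master_key, kdf_input, "sha1")


def mac_sha1(traffic_key: bytes, message: bytes) -> bytes:
    """HMAC-SHA1 over the message, truncated to 96 bits."""
    return hmac.digest(traffic_key, message, "sha1")[:12]


# Illustrative values only. For a SYN the destination ISN is zero.
master_key = b"hello"
context = build_context(
    IPv4Address("10.10.2.1"), IPv4Address("10.10.1.1"), 40000, 17971, 0x11223344, 0
)
traffic_key = kdf_sha1(master_key, context)
# The real message is built from the pseudo-header and TCP segment
# (RFC5925 section 5.1); a placeholder stands in for it here.
mac = mac_sha1(traffic_key, b"placeholder message bytes")
print(traffic_key.hex(), mac.hex())
```

The traffic key depends on both ISNs, so a capture has to include the
SYN and SYN/ACK before later segments can be checked.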
 .../full_tcp_sniff_session.py                 |  53 +++
 .../tcp_authopt_test/tcp_authopt_alg.py       | 276 ++++++++++++++
 .../tcp_authopt_test/test_vectors.py          | 359 ++++++++++++++++++
 .../tcp_authopt_test/test_verify_capture.py   | 123 ++++++
 .../tcp_authopt/tcp_authopt_test/validator.py | 158 ++++++++
 5 files changed, 969 insertions(+)
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/full_tcp_sniff_session.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/tcp_authopt_alg.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_vectors.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_verify_capture.py
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/validator.py

diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/full_tcp_sniff_session.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/full_tcp_sniff_session.py
new file mode 100644
index 000000000000..d709e83c8700
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/full_tcp_sniff_session.py
@@ -0,0 +1,53 @@
+# SPDX-License-Identifier: GPL-2.0
+import scapy.sessions
+from scapy.layers.inet import TCP
+
+from .utils import SimpleWaitEvent
+
+
+class FullTCPSniffSession(scapy.sessions.DefaultSession):
+    """Implementation of a scapy sniff session that can wait for a full TCP capture
+
+    Allows another thread to wait for a complete FIN handshake without polling or sleep.
+    """
+
+    found_syn = False
+    found_synack = False
+    found_fin = False
+    found_client_fin = False
+    found_server_fin = False
+
+    def __init__(self, server_port=None, **kw):
+        super().__init__(**kw)
+        self.server_port = server_port
+        self._close_event = SimpleWaitEvent()
+
+    def on_packet_received(self, p):
+        super().on_packet_received(p)
+        if not p or TCP not in p:
+            return
+        th = p[TCP]
+        # logger.debug("sport=%d dport=%d flags=%s", th.sport, th.dport, th.flags)
+        if th.flags.S and not th.flags.A:
+            if th.dport == self.server_port or self.server_port is None:
+                self.found_syn = True
+        if th.flags.S and th.flags.A:
+            if th.sport == self.server_port or self.server_port is None:
+                self.found_synack = True
+        if th.flags.F:
+            if self.server_port is None:
+                self.found_fin = True
+                self._close_event.set()
+            elif self.server_port == th.dport:
+                self.found_client_fin = True
+                self.found_fin = True
+                if self.found_server_fin and self.found_client_fin:
+                    self._close_event.set()
+            elif self.server_port == th.sport:
+                self.found_server_fin = True
+                self.found_fin = True
+                if self.found_server_fin and self.found_client_fin:
+                    self._close_event.set()
+
+    def wait_close(self, timeout=10):
+        self._close_event.wait(timeout=timeout)
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/tcp_authopt_alg.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/tcp_authopt_alg.py
new file mode 100644
index 000000000000..093cb4716184
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/tcp_authopt_alg.py
@@ -0,0 +1,276 @@
+# SPDX-License-Identifier: GPL-2.0
+"""Packet-processing utilities implementing RFC5925 and RFC5926"""
+
+import logging
+from dataclasses import dataclass
+from ipaddress import IPv4Address, IPv6Address
+from scapy.layers.inet import IP, TCP
+from scapy.layers.inet6 import IPv6
+from scapy.packet import Packet
+import socket
+import struct
+import typing
+import hmac
+
+logger = logging.getLogger(__name__)
+
+
+def kdf_sha1(master_key: bytes, context: bytes) -> bytes:
+    """RFC5926 section 3.1.1.1"""
+    input = b"\x01" + b"TCP-AO" + context + b"\x00\xa0"
+    return hmac.digest(master_key, input, "SHA1")
+
+
+def mac_sha1(traffic_key: bytes, message: bytes) -> bytes:
+    """RFC5926 section 3.2.1"""
+    return hmac.digest(traffic_key, message, "SHA1")[:12]
+
+
+def cmac_aes_digest(key: bytes, msg: bytes) -> bytes:
+    from cryptography.hazmat.primitives import cmac
+    from cryptography.hazmat.primitives.ciphers import algorithms
+    from cryptography.hazmat.backends import default_backend
+
+    backend = default_backend()
+    c = cmac.CMAC(algorithms.AES(key), backend=backend)
+    c.update(bytes(msg))
+    return c.finalize()
+
+
+def kdf_cmac_aes(master_key: bytes, context: bytes) -> bytes:
+    if len(master_key) == 16:
+        key = master_key
+    else:
+        key = cmac_aes_digest(b"\x00" * 16, master_key)
+    return cmac_aes_digest(key, b"\x01" + b"TCP-AO" + context + b"\x00\x80")
+
+
+def mac_cmac_aes(traffic_key: bytes, message: bytes) -> bytes:
+    return cmac_aes_digest(traffic_key, message)[:12]
+
+
+class TcpAuthOptAlg:
+    def kdf(self, master_key: bytes, context: bytes) -> bytes:
+        raise NotImplementedError()
+
+    def mac(self, traffic_key: bytes, message: bytes) -> bytes:
+        raise NotImplementedError()
+
+
+class TcpAuthOptAlg_HMAC_SHA1(TcpAuthOptAlg):
+    def kdf(self, master_key: bytes, context: bytes) -> bytes:
+        return kdf_sha1(master_key, context)
+
+    def mac(self, traffic_key: bytes, message: bytes) -> bytes:
+        return mac_sha1(traffic_key, message)
+
+
+class TcpAuthOptAlg_CMAC_AES(TcpAuthOptAlg):
+    def kdf(self, master_key: bytes, context: bytes) -> bytes:
+        return kdf_cmac_aes(master_key, context)
+
+    def mac(self, traffic_key: bytes, message: bytes) -> bytes:
+        return mac_cmac_aes(traffic_key, message)
+
+
+def get_alg(name) -> TcpAuthOptAlg:
+    if name.upper() == "HMAC-SHA-1-96":
+        return TcpAuthOptAlg_HMAC_SHA1()
+    elif name.upper() == "AES-128-CMAC-96":
+        return TcpAuthOptAlg_CMAC_AES()
+    else:
+        raise ValueError(f"Bad TCP AuthOpt algorithms {name}")
+
+
+IPvXAddress = typing.Union[IPv4Address, IPv6Address]
+
+
+def get_scapy_ipvx_src(p: Packet) -> IPvXAddress:
+    if IP in p:
+        return IPv4Address(p[IP].src)
+    elif IPv6 in p:
+        return IPv6Address(p[IPv6].src)
+    else:
+        raise Exception("Neither IP nor IPv6 found on packet")
+
+
+def get_scapy_ipvx_dst(p: Packet) -> IPvXAddress:
+    if IP in p:
+        return IPv4Address(p[IP].dst)
+    elif IPv6 in p:
+        return IPv6Address(p[IPv6].dst)
+    else:
+        raise Exception("Neither IP nor IPv6 found on packet")
+
+
+def build_context(
+    saddr: IPvXAddress, daddr: IPvXAddress, sport, dport, src_isn, dst_isn
+) -> bytes:
+    """Build context bytes as specified by RFC5925 section 5.2"""
+    return (
+        saddr.packed
+        + daddr.packed
+        + struct.pack(
+            "!HHII",
+            sport,
+            dport,
+            src_isn,
+            dst_isn,
+        )
+    )
+
+
+def build_context_from_scapy(p: Packet, src_isn: int, dst_isn: int) -> bytes:
+    """Build context based on a scapy Packet and src/dst initial-sequence numbers"""
+    return build_context(
+        get_scapy_ipvx_src(p),
+        get_scapy_ipvx_dst(p),
+        p[TCP].sport,
+        p[TCP].dport,
+        src_isn,
+        dst_isn,
+    )
+
+
+def build_context_from_scapy_syn(p: Packet) -> bytes:
+    """Build context for a scapy SYN packet"""
+    return build_context_from_scapy(p, p[TCP].seq, 0)
+
+
+def build_context_from_scapy_synack(p: Packet) -> bytes:
+    """Build context for a scapy SYN/ACK packet"""
+    return build_context_from_scapy(p, p[TCP].seq, p[TCP].ack - 1)
+
+
+def build_message_from_scapy(p: Packet, include_options=True, sne=0) -> bytearray:
+    """Build message bytes as described by RFC5925 section 5.1"""
+    result = bytearray()
+    result += struct.pack("!I", sne)
+    # ip pseudo-header:
+    if IP in p:
+        result += struct.pack(
+            "!4s4sHH",
+            IPv4Address(p[IP].src).packed,
+            IPv4Address(p[IP].dst).packed,
+            socket.IPPROTO_TCP,
+            p[TCP].dataofs * 4 + len(p[TCP].payload),
+        )
+        assert p[TCP].dataofs * 4 + len(p[TCP].payload) + p[IP].ihl * 4 == p[IP].len
+    elif IPv6 in p:
+        result += struct.pack(
+            "!16s16sII",
+            IPv6Address(p[IPv6].src).packed,
+            IPv6Address(p[IPv6].dst).packed,
+            p[IPv6].plen,
+            socket.IPPROTO_TCP,
+        )
+        assert p[TCP].dataofs * 4 + len(p[TCP].payload) == p[IPv6].plen
+    else:
+        raise Exception("Neither IP nor IPv6 found on packet")
+
+    # tcp header with checksum set to zero
+    th_bytes = bytes(p[TCP])
+    result += th_bytes[:16]
+    result += b"\x00\x00"
+    result += th_bytes[18:20]
+
+    # Even if include_options=False the TCP-AO option itself is still included
+    # with the MAC set to all-zeros. This means we need to parse TCP options.
+    pos = 20
+    tcphdr_optend = p[TCP].dataofs * 4
+    # logger.info("th_bytes: %s", th_bytes.hex(' '))
+    assert len(th_bytes) >= tcphdr_optend
+    while pos < tcphdr_optend:
+        optnum = th_bytes[pos]
+        pos += 1
+        if optnum == 0 or optnum == 1:
+            if include_options:
+                result += bytes([optnum])
+            continue
+
+        optlen = th_bytes[pos]
+        pos += 1
+        if pos + optlen - 2 > tcphdr_optend:
+            logger.info(
+                "bad tcp option %d optlen %d beyond end-of-header", optnum, optlen
+            )
+            break
+        if optlen < 2:
+            logger.info("bad tcp option %d optlen %d less than two", optnum, optlen)
+            break
+        if optnum == 29:
+            if optlen < 4:
+                logger.info("bad tcp option %d optlen %d", optnum, optlen)
+                break
+            result += bytes([optnum, optlen])
+            result += th_bytes[pos : pos + 2]
+            result += (optlen - 4) * b"\x00"
+        elif include_options:
+            result += bytes([optnum, optlen])
+            result += th_bytes[pos : pos + optlen - 2]
+        pos += optlen - 2
+    result += bytes(p[TCP].payload)
+    return result
+
+
+@dataclass
+class TCPAuthContext:
+    """Context for the TCP Authentication Option as defined in RFC5925 5.2"""
+
+    saddr: IPvXAddress = None
+    daddr: IPvXAddress = None
+    sport: int = 0
+    dport: int = 0
+    sisn: int = 0
+    disn: int = 0
+
+    def pack(self, syn=False, rev=False) -> bytes:
+        if rev:
+            return build_context(
+                self.daddr,
+                self.saddr,
+                self.dport,
+                self.sport,
+                self.disn if not syn else 0,
+                self.sisn,
+            )
+        else:
+            return build_context(
+                self.saddr,
+                self.daddr,
+                self.sport,
+                self.dport,
+                self.sisn,
+                self.disn if not syn else 0,
+            )
+
+    def rev(self) -> "TCPAuthContext":
+        """Return a copy with source and destination swapped"""
+        return TCPAuthContext(
+            saddr=self.daddr,
+            daddr=self.saddr,
+            sport=self.dport,
+            dport=self.sport,
+            sisn=self.disn,
+            disn=self.sisn,
+        )
+
+    def init_from_syn_packet(self, p):
+        """Init from a SYN packet (and set disn to zero)"""
+        assert p[TCP].flags.S and not p[TCP].flags.A and p[TCP].ack == 0
+        self.saddr = get_scapy_ipvx_src(p)
+        self.daddr = get_scapy_ipvx_dst(p)
+        self.sport = p[TCP].sport
+        self.dport = p[TCP].dport
+        self.sisn = p[TCP].seq
+        self.disn = 0
+
+    def update_from_synack_packet(self, p):
+        """Update disn and check everything else matches"""
+        assert p[TCP].flags.S and p[TCP].flags.A
+        assert self.saddr == get_scapy_ipvx_dst(p)
+        assert self.daddr == get_scapy_ipvx_src(p)
+        assert self.sport == p[TCP].dport
+        assert self.dport == p[TCP].sport
+        assert self.sisn == p[TCP].ack - 1
+        self.disn = p[TCP].seq
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_vectors.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_vectors.py
new file mode 100644
index 000000000000..f622bcf0dcbc
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_vectors.py
@@ -0,0 +1,359 @@
+# SPDX-License-Identifier: GPL-2.0
+import logging
+from ipaddress import IPv4Address, IPv6Address
+from scapy.layers.inet import IP, TCP
+from scapy.layers.inet6 import IPv6
+from .tcp_authopt_alg import get_alg, build_context_from_scapy, build_message_from_scapy
+from .utils import scapy_tcp_get_authopt_val, tcphdr_authopt
+import socket
+
+logger = logging.getLogger(__name__)
+
+
+class TestIETFVectors:
+    """Test python implementation of TCP-AO algorithms
+
+    Data is a subset of IETF test vectors:
+    https://datatracker.ietf.org/doc/html/draft-touch-tcpm-ao-test-vectors-02
+    """
+
+    master_key = b"testvector"
+    client_keyid = 61
+    server_keyid = 84
+    client_ipv4 = IPv4Address("10.11.12.13")
+    client_ipv6 = IPv6Address("FD00::1")
+    server_ipv4 = IPv4Address("172.27.28.29")
+    server_ipv6 = IPv6Address("FD00::2")
+
+    client_isn_41x = 0xFBFBAB5A
+    server_isn_41x = 0x11C14261
+    client_isn_42x = 0xCB0EFBEE
+    server_isn_42x = 0xACD5B5E1
+    client_isn_61x = 0x176A833F
+    server_isn_61x = 0x3F51994B
+    client_isn_62x = 0x020C1E69
+    server_isn_62x = 0xEBA3734D
+
+    def check(
+        self,
+        packet_hex: str,
+        traffic_key_hex: str,
+        mac_hex: str,
+        src_isn,
+        dst_isn,
+        include_options=True,
+        alg_name="HMAC-SHA-1-96",
+        sne=0,
+    ):
+        packet_bytes = bytes.fromhex(packet_hex)
+
+        # sanity check for ip version
+        ipv = packet_bytes[0] >> 4
+        if ipv == 4:
+            p = IP(bytes.fromhex(packet_hex))
+            assert p[IP].proto == socket.IPPROTO_TCP
+        elif ipv == 6:
+            p = IPv6(bytes.fromhex(packet_hex))
+            assert p[IPv6].nh == socket.IPPROTO_TCP
+        else:
+            raise ValueError(f"bad ipv={ipv}")
+
+        # sanity check for seq/ack in SYN and SYN/ACK packets
+        if p[TCP].flags.S and not p[TCP].flags.A:
+            assert p[TCP].seq == src_isn
+            assert p[TCP].ack == 0
+        if p[TCP].flags.S and p[TCP].flags.A:
+            assert p[TCP].seq == src_isn
+            assert p[TCP].ack == dst_isn + 1
+
+        # check traffic key
+        alg = get_alg(alg_name)
+        context_bytes = build_context_from_scapy(p, src_isn, dst_isn)
+        traffic_key = alg.kdf(self.master_key, context_bytes)
+        assert traffic_key.hex(" ") == traffic_key_hex
+
+        # check mac
+        message_bytes = build_message_from_scapy(
+            p, include_options=include_options, sne=sne
+        )
+        mac = alg.mac(traffic_key, message_bytes)
+        assert mac.hex(" ") == mac_hex
+
+        # check option bytes in header
+        opt = scapy_tcp_get_authopt_val(p[TCP])
+        assert opt is not None
+        assert opt.keyid in [self.client_keyid, self.server_keyid]
+        assert opt.rnextkeyid in [self.client_keyid, self.server_keyid]
+        assert opt.mac.hex(" ") == mac_hex
+
+    def test_4_1_1(self):
+        self.check(
+            """
+            45 e0 00 4c dd 0f 40 00 ff 06 bf 6b 0a 0b 0c 0d
+            ac 1b 1c 1d e9 d7 00 b3 fb fb ab 5a 00 00 00 00
+            e0 02 ff ff ca c4 00 00 02 04 05 b4 01 03 03 08
+            04 02 08 0a 00 15 5a b7 00 00 00 00 1d 10 3d 54
+            2e e4 37 c6 f8 ed e6 d7 c4 d6 02 e7
+            """,
+            "6d 63 ef 1b 02 fe 15 09 d4 b1 40 27 07 fd 7b 04 16 ab b7 4f",
+            "2e e4 37 c6 f8 ed e6 d7 c4 d6 02 e7",
+            self.client_isn_41x,
+            0,
+        )
+
+    def test_4_1_2(self):
+        self.check(
+            """
+            45 e0 00 4c 65 06 40 00 ff 06 37 75 ac 1b 1c 1d
+            0a 0b 0c 0d 00 b3 e9 d7 11 c1 42 61 fb fb ab 5b
+            e0 12 ff ff 37 76 00 00 02 04 05 b4 01 03 03 08
+            04 02 08 0a 84 a5 0b eb 00 15 5a b7 1d 10 54 3d
+            ee ab 0f e2 4c 30 10 81 51 16 b3 be
+            """,
+            "d9 e2 17 e4 83 4a 80 ca 2f 3f d8 de 2e 41 b8 e6 79 7f ea 96",
+            "ee ab 0f e2 4c 30 10 81 51 16 b3 be",
+            self.server_isn_41x,
+            self.client_isn_41x,
+        )
+
+    def test_4_1_3(self):
+        self.check(
+            """
+            45 e0 00 87 36 a1 40 00 ff 06 65 9f 0a 0b 0c 0d
+            ac 1b 1c 1d e9 d7 00 b3 fb fb ab 5b 11 c1 42 62
+            c0 18 01 04 a1 62 00 00 01 01 08 0a 00 15 5a c1
+            84 a5 0b eb 1d 10 3d 54 70 64 cf 99 8c c6 c3 15
+            c2 c2 e2 bf ff ff ff ff ff ff ff ff ff ff ff ff
+            ff ff ff ff 00 43 01 04 da bf 00 b4 0a 0b 0c 0d
+            26 02 06 01 04 00 01 00 01 02 02 80 00 02 02 02
+            00 02 02 42 00 02 06 41 04 00 00 da bf 02 08 40
+            06 00 64 00 01 01 00
+            """,
+            "d2 e5 9c 65 ff c7 b1 a3 93 47 65 64 63 b7 0e dc 24 a1 3d 71",
+            "70 64 cf 99 8c c6 c3 15 c2 c2 e2 bf",
+            self.client_isn_41x,
+            self.server_isn_41x,
+        )
+
+    def test_4_1_4(self):
+        self.check(
+            """
+            45 e0 00 87 1f a9 40 00 ff 06 7c 97 ac 1b 1c 1d
+            0a 0b 0c 0d 00 b3 e9 d7 11 c1 42 62 fb fb ab 9e
+            c0 18 01 00 40 0c 00 00 01 01 08 0a 84 a5 0b f5
+            00 15 5a c1 1d 10 54 3d a6 3f 0e cb bb 2e 63 5c
+            95 4d ea c7 ff ff ff ff ff ff ff ff ff ff ff ff
+            ff ff ff ff 00 43 01 04 da c0 00 b4 ac 1b 1c 1d
+            26 02 06 01 04 00 01 00 01 02 02 80 00 02 02 02
+            00 02 02 42 00 02 06 41 04 00 00 da c0 02 08 40
+            06 00 64 00 01 01 00
+            """,
+            "d9 e2 17 e4 83 4a 80 ca 2f 3f d8 de 2e 41 b8 e6 79 7f ea 96",
+            "a6 3f 0e cb bb 2e 63 5c 95 4d ea c7",
+            self.server_isn_41x,
+            self.client_isn_41x,
+        )
+
+    def test_4_2_1(self):
+        self.check(
+            """
+            45 e0 00 4c 53 99 40 00 ff 06 48 e2 0a 0b 0c 0d
+            ac 1b 1c 1d ff 12 00 b3 cb 0e fb ee 00 00 00 00
+            e0 02 ff ff 54 1f 00 00 02 04 05 b4 01 03 03 08
+            04 02 08 0a 00 02 4c ce 00 00 00 00 1d 10 3d 54
+            80 af 3c fe b8 53 68 93 7b 8f 9e c2
+            """,
+            "30 ea a1 56 0c f0 be 57 da b5 c0 45 22 9f b1 0a 42 3c d7 ea",
+            "80 af 3c fe b8 53 68 93 7b 8f 9e c2",
+            self.client_isn_42x,
+            0,
+            include_options=False,
+        )
+
+    def test_4_2_2(self):
+        self.check(
+            """
+            45 e0 00 4c 32 84 40 00 ff 06 69 f7 ac 1b 1c 1d
+            0a 0b 0c 0d 00 b3 ff 12 ac d5 b5 e1 cb 0e fb ef
+            e0 12 ff ff 38 8e 00 00 02 04 05 b4 01 03 03 08
+            04 02 08 0a 57 67 72 f3 00 02 4c ce 1d 10 54 3d
+            09 30 6f 9a ce a6 3a 8c 68 cb 9a 70
+            """,
+            "b5 b2 89 6b b3 66 4e 81 76 b0 ed c6 e7 99 52 41 01 a8 30 7f",
+            "09 30 6f 9a ce a6 3a 8c 68 cb 9a 70",
+            self.server_isn_42x,
+            self.client_isn_42x,
+            include_options=False,
+        )
+
+    def test_4_2_3(self):
+        self.check(
+            """
+            45 e0 00 87 a8 f5 40 00 ff 06 f3 4a 0a 0b 0c 0d
+            ac 1b 1c 1d ff 12 00 b3 cb 0e fb ef ac d5 b5 e2
+            c0 18 01 04 6c 45 00 00 01 01 08 0a 00 02 4c ce
+            57 67 72 f3 1d 10 3d 54 71 06 08 cc 69 6c 03 a2
+            71 c9 3a a5 ff ff ff ff ff ff ff ff ff ff ff ff
+            ff ff ff ff 00 43 01 04 da bf 00 b4 0a 0b 0c 0d
+            26 02 06 01 04 00 01 00 01 02 02 80 00 02 02 02
+            00 02 02 42 00 02 06 41 04 00 00 da bf 02 08 40
+            06 00 64 00 01 01 00
+            """,
+            "f3 db 17 93 d7 91 0e cd 80 6c 34 f1 55 ea 1f 00 34 59 53 e3",
+            "71 06 08 cc 69 6c 03 a2 71 c9 3a a5",
+            self.client_isn_42x,
+            self.server_isn_42x,
+            include_options=False,
+        )
+
+    def test_4_2_4(self):
+        self.check(
+            """
+            45 e0 00 87 54 37 40 00 ff 06 48 09 ac 1b 1c 1d
+            0a 0b 0c 0d 00 b3 ff 12 ac d5 b5 e2 cb 0e fc 32
+            c0 18 01 00 46 b6 00 00 01 01 08 0a 57 67 72 f3
+            00 02 4c ce 1d 10 54 3d 97 76 6e 48 ac 26 2d e9
+            ae 61 b4 f9 ff ff ff ff ff ff ff ff ff ff ff ff
+            ff ff ff ff 00 43 01 04 da c0 00 b4 ac 1b 1c 1d
+            26 02 06 01 04 00 01 00 01 02 02 80 00 02 02 02
+            00 02 02 42 00 02 06 41 04 00 00 da c0 02 08 40
+            06 00 64 00 01 01 00
+            """,
+            "b5 b2 89 6b b3 66 4e 81 76 b0 ed c6 e7 99 52 41 01 a8 30 7f",
+            "97 76 6e 48 ac 26 2d e9 ae 61 b4 f9",
+            self.server_isn_42x,
+            self.client_isn_42x,
+            include_options=False,
+        )
+
+    def test_5_1_1(self):
+        self.check(
+            """
+            45 e0 00 4c 7b 9f 40 00 ff 06 20 dc 0a 0b 0c 0d
+            ac 1b 1c 1d c4 fa 00 b3 78 7a 1d df 00 00 00 00
+            e0 02 ff ff 5a 0f 00 00 02 04 05 b4 01 03 03 08
+            04 02 08 0a 00 01 7e d0 00 00 00 00 1d 10 3d 54
+            e4 77 e9 9c 80 40 76 54 98 e5 50 91
+            """,
+            "f5 b8 b3 d5 f3 4f db b6 eb 8d 4a b9 66 0e 60 e3",
+            "e4 77 e9 9c 80 40 76 54 98 e5 50 91",
+            0x787A1DDF,
+            0,
+            include_options=True,
+            alg_name="AES-128-CMAC-96",
+        )
+
+    def test_6_1_1(self):
+        self.check(
+            """
+            6e 08 91 dc 00 38 06 40 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 01 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 02 f7 e4 00 b3 17 6a 83 3f
+            00 00 00 00 e0 02 ff ff 47 21 00 00 02 04 05 a0
+            01 03 03 08 04 02 08 0a 00 41 d0 87 00 00 00 00
+            1d 10 3d 54 90 33 ec 3d 73 34 b6 4c 5e dd 03 9f
+            """,
+            "62 5e c0 9d 57 58 36 ed c9 b6 42 84 18 bb f0 69 89 a3 61 bb",
+            "90 33 ec 3d 73 34 b6 4c 5e dd 03 9f",
+            self.client_isn_61x,
+            0,
+            include_options=True,
+        )
+
+    def test_6_1_2(self):
+        self.check(
+            """
+            6e 01 00 9e 00 38 06 40 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 02 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 01 00 b3 f7 e4 3f 51 99 4b
+            17 6a 83 40 e0 12 ff ff bf ec 00 00 02 04 05 a0
+            01 03 03 08 04 02 08 0a bd 33 12 9b 00 41 d0 87
+            1d 10 54 3d f1 cb a3 46 c3 52 61 63 f7 1f 1f 55
+            """,
+            "e4 a3 7a da 2a 0a fc a8 71 14 34 91 3f e1 38 c7 71 eb cb 4a",
+            "f1 cb a3 46 c3 52 61 63 f7 1f 1f 55",
+            self.server_isn_61x,
+            self.client_isn_61x,
+            include_options=True,
+        )
+
+    def test_6_2_2(self):
+        self.check(
+            """
+            6e 0a 7e 1f 00 38 06 40 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 02 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 01 00 b3 c6 cd eb a3 73 4d
+            02 0c 1e 6a e0 12 ff ff 77 4d 00 00 02 04 05 a0
+            01 03 03 08 04 02 08 0a 5e c9 9b 70 00 9d b9 5b
+            1d 10 54 3d 3c 54 6b ad 97 43 f1 2d f8 b8 01 0d
+            """,
+            "40 51 08 94 7f 99 65 75 e7 bd bc 26 d4 02 16 a2 c7 fa 91 bd",
+            "3c 54 6b ad 97 43 f1 2d f8 b8 01 0d",
+            self.server_isn_62x,
+            self.client_isn_62x,
+            include_options=False,
+        )
+
+    def test_6_2_4(self):
+        self.check(
+            """
+            6e 0a 7e 1f 00 73 06 40 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 02 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 01 00 b3 c6 cd eb a3 73 4e
+            02 0c 1e ad c0 18 01 00 71 6a 00 00 01 01 08 0a
+            5e c9 9b 7a 00 9d b9 65 1d 10 54 3d 55 9a 81 94
+            45 b4 fd e9 8d 9e 13 17 ff ff ff ff ff ff ff ff
+            ff ff ff ff ff ff ff ff 00 43 01 04 fd e8 00 b4
+            01 01 01 7a 26 02 06 01 04 00 01 00 01 02 02 80
+            00 02 02 02 00 02 02 42 00 02 06 41 04 00 00 fd
+            e8 02 08 40 06 00 64 00 01 01 00
+            """,
+            "40 51 08 94 7f 99 65 75 e7 bd bc 26 d4 02 16 a2 c7 fa 91 bd",
+            "55 9a 81 94 45 b4 fd e9 8d 9e 13 17",
+            self.server_isn_62x,
+            self.client_isn_62x,
+            include_options=False,
+        )
+
+    server_isn_71x = 0xA6744ECB
+    client_isn_71x = 0x193CCCEC
+
+    def test_7_1_2(self):
+        self.check(
+            """
+            6e 06 15 20 00 38 06 40 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 02 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 01 00 b3 f8 5a a6 74 4e cb
+            19 3c cc ed e0 12 ff ff ea bb 00 00 02 04 05 a0
+            01 03 03 08 04 02 08 0a 71 da ab c8 13 e4 ab 99
+            1d 10 54 3d dc 28 43 a8 4e 78 a6 bc fd c5 ed 80
+            """,
+            "cf 1b 1e 22 5e 06 a6 36 16 76 4a 06 7b 46 f4 b1",
+            "dc 28 43 a8 4e 78 a6 bc fd c5 ed 80",
+            self.server_isn_71x,
+            self.client_isn_71x,
+            alg_name="AES-128-CMAC-96",
+            include_options=True,
+        )
+
+    def test_7_1_4(self):
+        self.check(
+            """
+            6e 06 15 20 00 73 06 40 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 02 fd 00 00 00 00 00 00 00
+            00 00 00 00 00 00 00 01 00 b3 f8 5a a6 74 4e cc
+            19 3c cd 30 c0 18 01 00 52 f4 00 00 01 01 08 0a
+            71 da ab d3 13 e4 ab a3 1d 10 54 3d c1 06 9b 7d
+            fd 3d 69 3a 6d f3 f2 89 ff ff ff ff ff ff ff ff
+            ff ff ff ff ff ff ff ff 00 43 01 04 fd e8 00 b4
+            01 01 01 7a 26 02 06 01 04 00 01 00 01 02 02 80
+            00 02 02 02 00 02 02 42 00 02 06 41 04 00 00 fd
+            e8 02 08 40 06 00 64 00 01 01 00
+            """,
+            "cf 1b 1e 22 5e 06 a6 36 16 76 4a 06 7b 46 f4 b1",
+            "c1 06 9b 7d fd 3d 69 3a 6d f3 f2 89",
+            self.server_isn_71x,
+            self.client_isn_71x,
+            alg_name="AES-128-CMAC-96",
+            include_options=True,
+        )
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_verify_capture.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_verify_capture.py
new file mode 100644
index 000000000000..1bc0f05197b8
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_verify_capture.py
@@ -0,0 +1,123 @@
+# SPDX-License-Identifier: GPL-2.0
+"""Capture packets with TCP-AO and verify signatures"""
+
+import logging
+import os
+import socket
+
+import pytest
+
+from . import linux_tcp_authopt, tcp_authopt_alg
+from .full_tcp_sniff_session import FullTCPSniffSession
+from .linux_tcp_authopt import (
+    set_tcp_authopt_key,
+    tcp_authopt_key,
+)
+from .server import SimpleServerThread
+from .utils import (
+    DEFAULT_TCP_SERVER_PORT,
+    AsyncSnifferContext,
+    check_socket_echo,
+    create_listen_socket,
+    nstat_json,
+)
+from .validator import TcpAuthValidator, TcpAuthValidatorKey
+from .conftest import skipif_missing_tcp_authopt
+
+logger = logging.getLogger(__name__)
+pytestmark = skipif_missing_tcp_authopt
+
+
+def can_capture():
+    # This is too restrictive:
+    return os.geteuid() == 0
+
+
+skipif_cant_capture = pytest.mark.skipif(
+    not can_capture(), reason="run as root to capture packets"
+)
+
+
+def get_alg_id(alg_name) -> int:
+    if alg_name == "HMAC-SHA-1-96":
+        return linux_tcp_authopt.TCP_AUTHOPT_ALG_HMAC_SHA_1_96
+    elif alg_name == "AES-128-CMAC-96":
+        return linux_tcp_authopt.TCP_AUTHOPT_ALG_AES_128_CMAC_96
+    else:
+        raise ValueError(f"Bad TCP AuthOpt algorithm {alg_name}")
+
+
+@skipif_cant_capture
+@pytest.mark.parametrize(
+    "address_family,alg_name,include_options",
+    [
+        (socket.AF_INET, "HMAC-SHA-1-96", True),
+        (socket.AF_INET, "AES-128-CMAC-96", True),
+        (socket.AF_INET, "AES-128-CMAC-96", False),
+        (socket.AF_INET6, "HMAC-SHA-1-96", True),
+        (socket.AF_INET6, "HMAC-SHA-1-96", False),
+        (socket.AF_INET6, "AES-128-CMAC-96", True),
+    ],
+)
+def test_verify_capture(exit_stack, address_family, alg_name, include_options):
+    master_key = b"testvector"
+    alg_id = get_alg_id(alg_name)
+
+    session = FullTCPSniffSession(server_port=DEFAULT_TCP_SERVER_PORT)
+    sniffer = exit_stack.enter_context(
+        AsyncSnifferContext(
+            filter=f"tcp port {DEFAULT_TCP_SERVER_PORT}",
+            iface="lo",
+            session=session,
+        )
+    )
+
+    listen_socket = create_listen_socket(family=address_family)
+    listen_socket = exit_stack.enter_context(listen_socket)
+    exit_stack.enter_context(SimpleServerThread(listen_socket, mode="echo"))
+
+    client_socket = socket.socket(address_family, socket.SOCK_STREAM)
+    client_socket = exit_stack.push(client_socket)
+
+    set_tcp_authopt_key(
+        listen_socket,
+        tcp_authopt_key(alg=alg_id, key=master_key, include_options=include_options),
+    )
+    set_tcp_authopt_key(
+        client_socket,
+        tcp_authopt_key(alg=alg_id, key=master_key, include_options=include_options),
+    )
+
+    # even if one signature is incorrect keep processing the capture
+    old_nstat = nstat_json()
+    valkey = TcpAuthValidatorKey(
+        key=master_key, alg_name=alg_name, include_options=include_options
+    )
+    validator = TcpAuthValidator(keys=[valkey])
+
+    try:
+        client_socket.settimeout(1.0)
+        client_socket.connect(("localhost", DEFAULT_TCP_SERVER_PORT))
+        for _ in range(5):
+            check_socket_echo(client_socket)
+    except socket.timeout:
+        logger.warning("socket timeout", exc_info=True)
+        pass
+    client_socket.close()
+    session.wait_close()
+    sniffer.stop()
+
+    logger.info("capture: %r", sniffer.results)
+    for p in sniffer.results:
+        validator.handle_packet(p)
+
+    assert not validator.any_fail
+    assert not validator.any_unsigned
+    # Fails because of duplicate packets:
+    # assert not validator.any_incomplete
+    new_nstat = nstat_json()
+    assert (
+        0
+        == new_nstat["kernel"]["TcpExtTCPAuthOptFailure"]
+        - old_nstat["kernel"]["TcpExtTCPAuthOptFailure"]
+    )
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/validator.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/validator.py
new file mode 100644
index 000000000000..5f0e83421af5
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/validator.py
@@ -0,0 +1,158 @@
+# SPDX-License-Identifier: GPL-2.0
+from .utils import scapy_tcp_get_authopt_val
+import typing
+import logging
+
+from dataclasses import dataclass
+from scapy.packet import Packet
+from scapy.layers.inet import TCP
+from . import tcp_authopt_alg
+from .tcp_authopt_alg import IPvXAddress, TCPAuthContext
+from .tcp_authopt_alg import get_scapy_ipvx_src
+from .tcp_authopt_alg import get_scapy_ipvx_dst
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass(frozen=True)
+class TCPSocketPair:
+    """TCP connection identifier"""
+
+    saddr: IPvXAddress = None
+    daddr: IPvXAddress = None
+    sport: int = 0
+    dport: int = 0
+
+    def rev(self) -> "TCPSocketPair":
+        return TCPSocketPair(self.daddr, self.saddr, self.dport, self.sport)
+
+
+@dataclass
+class TcpAuthValidatorKey:
+    key: bytes
+    alg_name: str
+    include_options: bool = True
+    keyid: typing.Optional[int] = None
+    sport: typing.Optional[int] = None
+    dport: typing.Optional[int] = None
+
+    def match_packet(self, p: Packet):
+        if TCP not in p:
+            return False
+        authopt = scapy_tcp_get_authopt_val(p[TCP])
+        if authopt is None:
+            return False
+        if self.keyid is not None and authopt.keyid != self.keyid:
+            return False
+        if self.sport is not None and p[TCP].sport != self.sport:
+            return False
+        if self.dport is not None and p[TCP].dport != self.dport:
+            return False
+        return True
+
+    def get_alg_imp(self):
+        return tcp_authopt_alg.get_alg(self.alg_name)
+
+
+def is_init_syn(p: Packet) -> bool:
+    return p[TCP].flags.S and not p[TCP].flags.A
+
+
+class TcpAuthValidator:
+    """Validate TCP auth sessions inside a capture"""
+
+    keys: typing.List[TcpAuthValidatorKey]
+    conn_dict: typing.Dict[TCPSocketPair, TCPAuthContext]
+    any_incomplete: bool = False
+    any_unsigned: bool = False
+    any_fail: bool = False
+
+    def __init__(self, keys=None):
+        self.keys = keys or []
+        self.conn_dict = {}
+
+    def get_key_for_packet(self, p):
+        for k in self.keys:
+            if k.match_packet(p):
+                return k
+        return None
+
+    def handle_packet(self, p: Packet):
+        if TCP not in p:
+            logger.debug("skip non-TCP packet")
+            return
+        key = self.get_key_for_packet(p)
+        if not key:
+            self.any_unsigned = True
+            logger.debug("skip packet not matching any known keys: %r", p)
+            return
+        authopt = scapy_tcp_get_authopt_val(p[TCP])
+        if not authopt:
+            self.any_unsigned = True
+            logger.debug("skip packet without tcp authopt: %r", p)
+            return
+        captured_mac = authopt.mac
+
+        saddr = get_scapy_ipvx_src(p)
+        daddr = get_scapy_ipvx_dst(p)
+
+        conn_key = TCPSocketPair(saddr, daddr, p[TCP].sport, p[TCP].dport)
+        if p[TCP].flags.S:
+            conn = self.conn_dict.get(conn_key, None)
+            if conn is not None:
+                logger.warning("overwrite %r", conn)
+                self.any_incomplete = True
+            conn = TCPAuthContext()
+            conn.saddr = saddr
+            conn.daddr = daddr
+            conn.sport = p[TCP].sport
+            conn.dport = p[TCP].dport
+            self.conn_dict[conn_key] = conn
+
+            if not p[TCP].flags.A:
+                # SYN
+                conn.sisn = p[TCP].seq
+                conn.disn = 0
+                logger.info("Initialized for SYN: %r", conn)
+            else:
+                # SYN/ACK
+                conn.sisn = p[TCP].seq
+                conn.disn = p[TCP].ack - 1
+                logger.info("Initialized for SYNACK: %r", conn)
+
+                # Update opposite connection with dst_isn
+                rconn_key = conn_key.rev()
+                rconn = self.conn_dict.get(rconn_key, None)
+                if rconn is None:
+                    logger.warning("missing SYN for SYNACK: %s", rconn_key)
+                    self.any_incomplete = True
+                else:
+                    assert rconn.sisn == conn.disn
+                    assert rconn.disn == 0 or rconn.disn == conn.sisn
+                    rconn.disn = conn.sisn
+                    rconn.update_from_synack_packet(p)
+                    logger.info("Updated peer for SYNACK: %r", rconn)
+        else:
+            conn = self.conn_dict.get(conn_key, None)
+            if conn is None:
+                logger.warning("missing TCP syn for %r", conn_key)
+                self.any_incomplete = True
+                return
+        # logger.debug("conn %r found for packet %r", conn, p)
+
+        context_bytes = conn.pack(syn=is_init_syn(p))
+        alg = key.get_alg_imp()
+        traffic_key = alg.kdf(key.key, context_bytes)
+        message_bytes = tcp_authopt_alg.build_message_from_scapy(
+            p, include_options=key.include_options
+        )
+        computed_mac = alg.mac(traffic_key, message_bytes)
+        if computed_mac == captured_mac:
+            logger.debug("ok - mac %s", computed_mac.hex())
+        else:
+            self.any_fail = True
+            logger.error(
+                "not ok - captured %s computed %s",
+                captured_mac.hex(),
+                computed_mac.hex(),
+            )
-- 
2.25.1


* [RFCv3 11/15] selftests: Initial tcp_authopt support for nettest
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (9 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 10/15] selftests: tcp_authopt: Capture and verify packets Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 12/15] selftests: Initial tcp_authopt support for fcnal-test Leonard Crestez
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Add support for configuring the TCP Authentication Option. Only a single
key with default options is supported.
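
For context, the password installed here acts as the TCP-AO master key,
and per-connection traffic keys are derived from it. Below is a minimal
Python sketch of that derivation for HMAC-SHA-1-96, assuming the
KDF_HMAC_SHA1 construction from RFC 5926 section 3.1.1.1 and the context
layout used by build_context() in the Python tests of this series. The
names demo_context and demo_kdf_sha1 are made up for this illustration
and are not part of nettest or the kernel ABI:

```python
import hashlib
import hmac
import struct
from ipaddress import ip_address


def demo_context(saddr, daddr, sport, dport, sisn, disn) -> bytes:
    """Pack addresses, ports and initial sequence numbers in network order."""
    return (
        ip_address(saddr).packed
        + ip_address(daddr).packed
        + struct.pack("!HHII", sport, dport, sisn, disn)
    )


def demo_kdf_sha1(master_key: bytes, context: bytes) -> bytes:
    """HMAC-SHA1 over i || label || context || output length in bits."""
    data = b"\x01" + b"TCP-AO" + context + struct.pack("!H", 160)
    return hmac.new(master_key, data, hashlib.sha1).digest()


# Inputs taken from the SYN packet in test_4_1_1 (destination ISN is 0).
ctx = demo_context("10.11.12.13", "172.27.28.29", 0xE9D7, 179, 0xFBFBAB5A, 0)
traffic_key = demo_kdf_sha1(b"testvector", ctx)
print(traffic_key.hex(" "))
```

With the test_4_1_1 inputs this should reproduce the traffic key listed
in that test vector. Changing any field of the context, for example the
destination ISN once the SYN/ACK arrives, yields a different key.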

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 tools/testing/selftests/net/nettest.c | 34 ++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/nettest.c b/tools/testing/selftests/net/nettest.c
index bd6288302094..f04c6af79129 100644
--- a/tools/testing/selftests/net/nettest.c
+++ b/tools/testing/selftests/net/nettest.c
@@ -100,10 +100,12 @@ struct sock_args {
 		struct sockaddr_in v4;
 		struct sockaddr_in6 v6;
 	} md5_prefix;
 	unsigned int prefix_len;
 
+	const char *authopt_password;
+
 	/* expected addresses and device index for connection */
 	const char *expected_dev;
 	const char *expected_server_dev;
 	int expected_ifindex;
 
@@ -250,10 +252,27 @@ static int switch_ns(const char *ns)
 	close(fd);
 
 	return ret;
 }
 
+static int tcp_set_authopt(int sd, struct sock_args *args)
+{
+	struct tcp_authopt_key key;
+	int rc;
+
+	memset(&key, 0, sizeof(key));
+	strcpy((char *)key.key, args->authopt_password);
+	key.keylen = strlen(args->authopt_password);
+	key.alg = TCP_AUTHOPT_ALG_HMAC_SHA_1_96;
+
+	rc = setsockopt(sd, IPPROTO_TCP, TCP_AUTHOPT_KEY, &key, sizeof(key));
+	if (rc < 0)
+		log_err_errno("setsockopt(TCP_AUTHOPT_KEY)");
+
+	return rc;
+}
+
 static int tcp_md5sig(int sd, void *addr, socklen_t alen, struct sock_args *args)
 {
 	int keylen = strlen(args->password);
 	struct tcp_md5sig md5sig = {};
 	int opt = TCP_MD5SIG;
@@ -1508,10 +1527,15 @@ static int do_server(struct sock_args *args, int ipc_fd)
 	if (args->password && tcp_md5_remote(lsd, args)) {
 		close(lsd);
 		goto err_exit;
 	}
 
+	if (args->authopt_password && tcp_set_authopt(lsd, args)) {
+		close(lsd);
+		goto err_exit;
+	}
+
 	ipc_write(ipc_fd, 1);
 	while (1) {
 		log_msg("waiting for client connection.\n");
 		FD_ZERO(&rfds);
 		FD_SET(lsd, &rfds);
@@ -1630,10 +1654,13 @@ static int connectsock(void *addr, socklen_t alen, struct sock_args *args)
 		goto out;
 
 	if (args->password && tcp_md5sig(sd, addr, alen, args))
 		goto err;
 
+	if (args->authopt_password && tcp_set_authopt(sd, args))
+		goto err;
+
 	if (args->bind_test_only)
 		goto out;
 
 	if (connect(sd, addr, alen) < 0) {
 		if (errno != EINPROGRESS) {
@@ -1819,11 +1846,11 @@ static int ipc_parent(int cpid, int fd, struct sock_args *args)
 
 	wait(&status);
 	return client_status;
 }
 
-#define GETOPT_STR  "sr:l:c:p:t:g:P:DRn:M:X:m:d:I:BN:O:SCi6xL:0:1:2:3:Fbq"
+#define GETOPT_STR  "sr:l:c:p:t:g:P:DRn:M:X:m:A:d:I:BN:O:SCi6xL:0:1:2:3:Fbq"
 
 static void print_usage(char *prog)
 {
 	printf(
 	"usage: %s OPTS\n"
@@ -1856,10 +1883,12 @@ static void print_usage(char *prog)
 	"    -n num        number of times to send message\n"
 	"\n"
 	"    -M password   use MD5 sum protection\n"
 	"    -X password   MD5 password for client mode\n"
 	"    -m prefix/len prefix and length to use for MD5 key\n"
+	"    -A password   use RFC5925 TCP Authentication option\n"
+	"\n"
 	"    -g grp        multicast group (e.g., 239.1.1.1)\n"
 	"    -i            interactive mode (default is echo and terminate)\n"
 	"\n"
 	"    -0 addr       Expected local address\n"
 	"    -1 addr       Expected remote address\n"
@@ -1970,10 +1999,13 @@ int main(int argc, char *argv[])
 			args.client_pw = optarg;
 			break;
 		case 'm':
 			args.md5_prefix_str = optarg;
 			break;
+		case 'A':
+			args.authopt_password = optarg;
+			break;
 		case 'S':
 			args.use_setsockopt = 1;
 			break;
 		case 'C':
 			args.use_cmsg = 1;
-- 
2.25.1



* [RFCv3 12/15] selftests: Initial tcp_authopt support for fcnal-test
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (10 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 11/15] selftests: Initial tcp_authopt support for nettest Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 13/15] selftests: Add -t tcp_authopt option for fcnal-test.sh Leonard Crestez
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Just test that the connection succeeds when the correct password is
used and times out otherwise.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 tools/testing/selftests/net/fcnal-test.sh | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
index 162e5f1ac36b..ca3b90f6fecb 100755
--- a/tools/testing/selftests/net/fcnal-test.sh
+++ b/tools/testing/selftests/net/fcnal-test.sh
@@ -788,10 +788,31 @@ ipv4_ping()
 }
 
 ################################################################################
 # IPv4 TCP
 
+#
+# TCP Authentication Option Tests
+#
+ipv4_tcp_authopt()
+{
+	# basic use case
+	log_start
+	run_cmd nettest -s -A ${MD5_PW} &
+	sleep 1
+	run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+	log_test $? 0 "AO: Simple password"
+
+	# wrong password
+	log_start
+	show_hint "Should timeout since client uses wrong password"
+	run_cmd nettest -s -A ${MD5_PW} &
+	sleep 1
+	run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_WRONG_PW}
+	log_test $? 2 "AO: Client uses wrong password"
+}
+
 #
 # MD5 tests without VRF
 #
 ipv4_tcp_md5_novrf()
 {
@@ -1119,10 +1140,11 @@ ipv4_tcp_novrf()
 	show_hint "Should fail 'Connection refused'"
 	run_cmd nettest -d ${NSA_DEV} -r ${a}
 	log_test_addr ${a} $? 1 "No server, device client, local conn"
 
 	ipv4_tcp_md5_novrf
+	ipv4_tcp_authopt
 }
 
 ipv4_tcp_vrf()
 {
 	local a
-- 
2.25.1



* [RFCv3 13/15] selftests: Add -t tcp_authopt option for fcnal-test.sh
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (11 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 12/15] selftests: Initial tcp_authopt support for fcnal-test Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 14/15] tcp: authopt: Add key selection controls Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 15/15] selftests: tcp_authopt: Add tests for rollover Leonard Crestez
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Allow running only the TCP Authentication Option tests. The full script
is otherwise very slow to run!

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
 tools/testing/selftests/net/fcnal-test.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
index ca3b90f6fecb..3fa812789ac2 100755
--- a/tools/testing/selftests/net/fcnal-test.sh
+++ b/tools/testing/selftests/net/fcnal-test.sh
@@ -1328,10 +1328,21 @@ ipv4_tcp()
 	log_subsection "With VRF"
 	setup "yes"
 	ipv4_tcp_vrf
 }
 
+
+only_tcp_authopt()
+{
+	log_section "TCP Authentication"
+	setup
+	set_sysctl net.ipv4.tcp_l3mdev_accept=0
+	log_subsection "IPv4 no VRF"
+	ipv4_tcp_authopt
+}
+
+
 ################################################################################
 # IPv4 UDP
 
 ipv4_udp_novrf()
 {
@@ -4018,10 +4029,11 @@ do
 	ipv6_bind|bind6) ipv6_addr_bind;;
 	ipv6_runtime)    ipv6_runtime;;
 	ipv6_netfilter)  ipv6_netfilter;;
 
 	use_cases)       use_cases;;
+	tcp_authopt)     only_tcp_authopt;;
 
 	# setup namespaces and config, but do not run any tests
 	setup)		 setup; exit 0;;
 	vrf_setup)	 setup "yes"; exit 0;;
 
-- 
2.25.1



* [RFCv3 14/15] tcp: authopt: Add key selection controls
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (12 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 13/15] selftests: Add -t tcp_authopt option for fcnal-test.sh Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  2021-08-24 21:34 ` [RFCv3 15/15] selftests: tcp_authopt: Add tests for rollover Leonard Crestez
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

The RFC requires that TCP can report the keyid and rnextkeyid values
being sent or received; implement this via getsockopt values.

The RFC also requires that the user can select the sending key and that
the sending key is automatically switched based on rnextkeyid. These
requirements can conflict, so we implement both and add a flag which
specifies whether the user or the peer request takes priority.

Also add an option to control rnextkeyid explicitly from userspace.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
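
The selection rules described above can be sketched in userspace as follows.
This is only an illustrative model, not the kernel code: the `Key` and `Info`
classes and the `lookup_send` helper are hypothetical simplifications, and
address matching is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional

FLAG_LOCK_KEYID = 1 << 0
FLAG_LOCK_RNEXTKEYID = 1 << 1


@dataclass
class Key:
    """Hypothetical stand-in for a configured key (ids only)."""
    send_id: int
    recv_id: int


@dataclass
class Info:
    """Hypothetical stand-in for the per-socket state."""
    keys: List[Key] = field(default_factory=list)
    flags: int = 0
    send_key: Optional[Key] = None  # cached sending key
    send_keyid: int = 0             # user preference, used with LOCK_KEYID
    send_rnextkeyid: int = 0        # value placed on the wire
    recv_rnextkeyid: int = 0        # last value received from the peer


def lookup_send(info, send_id=None):
    """Return the first key with this send_id, or any key if send_id is None."""
    for key in info.keys:
        if send_id is None or key.send_id == send_id:
            return key
    return None


def select_key(info):
    """Return (key, rnextkeyid) for the next outbound segment."""
    if info.flags & FLAG_LOCK_KEYID:
        wanted = info.send_keyid       # user request takes priority
    else:
        wanted = info.recv_rnextkeyid  # peer request takes priority
    key, new_key = info.send_key, None
    if key is None or key.send_id != wanted:
        new_key = lookup_send(info, wanted)
    if key is None and new_key is None:
        new_key = lookup_send(info)    # fall back to any available key
    if new_key is not None:
        key = info.send_key = new_key  # update the cached key
    if key is None:
        return None, None
    if not info.flags & FLAG_LOCK_RNEXTKEYID:
        info.send_rnextkeyid = key.recv_id
    return key, info.send_rnextkeyid
```

In this model a cached key is kept when the requested id is not configured,
which matches the "only changes by user or remote request" rule.
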
 Documentation/networking/tcp_authopt.rst | 25 ++++++++++
 include/net/tcp_authopt.h                | 10 ++++
 include/uapi/linux/tcp.h                 | 31 ++++++++++++
 net/ipv4/tcp_authopt.c                   | 60 +++++++++++++++++++++++-
 4 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/tcp_authopt.rst b/Documentation/networking/tcp_authopt.rst
index 484f66f41ad5..cded87a70d05 100644
--- a/Documentation/networking/tcp_authopt.rst
+++ b/Documentation/networking/tcp_authopt.rst
@@ -35,10 +35,35 @@ Keys can be bound to remote addresses in a way that is similar to TCP_MD5.
 
 RFC5925 requires that key ids do not overlap when tcp identifiers (addr/port)
 overlap. This is not enforced by linux, configuring ambiguous keys will result
 in packet drops and lost connections.
 
+Key selection
+-------------
+
+On getsockopt(TCP_AUTHOPT) information is provided about keyid/rnextkeyid in
+the last sent packet and about the keyid/rnextkeyid in the last valid received
+packet.
+
+By default the sending keyid is selected to match the "rnextkeyid" value sent
+by the remote side. If that keyid is not available (or for new connections) a
+random matching key is selected.
+
+If `TCP_AUTHOPT_FLAG_LOCK_KEYID` is set then the sending key is selected by
+the `tcp_authopt.send_keyid` field and rnextkeyid is ignored. If no key with
+send_id == send_keyid is configured then a random matching key is
+selected.
+
+The current sending key is cached in the socket and will not change unless
+requested by remote rnextkeyid or by setsockopt.
+
+The rnextkeyid value sent on the wire is usually the recv_id of the current
+key used for sending. If the TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID flag is set in
+`tcp_authopt.flags` the value of `tcp_authopt.send_rnextkeyid` is sent
+instead.  This can be used to implement smooth rollover: the peer will switch
+its keyid to the received rnextkeyid when it is available.
+
 ABI Reference
 =============
 
 .. kernel-doc:: include/uapi/linux/tcp.h
    :identifiers: tcp_authopt tcp_authopt_flag tcp_authopt_key tcp_authopt_key_flag tcp_authopt_alg
diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index 61db268f36f8..92d0c2333272 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -36,11 +36,21 @@ struct tcp_authopt_key_info {
  */
 struct tcp_authopt_info {
 	/** @head: List of tcp_authopt_key_info */
 	struct hlist_head head;
 	struct rcu_head rcu;
+	/**
+	 * @send_key: Current key used for sending, cached.
+	 *
+	 * Once a key is found it only changes by user or remote request.
+	 */
+	struct tcp_authopt_key_info *send_key;
 	u32 flags;
+	u8 send_keyid;
+	u8 send_rnextkeyid;
+	u8 recv_keyid;
+	u8 recv_rnextkeyid;
 	u32 src_isn;
 	u32 dst_isn;
 };
 
 #ifdef CONFIG_TCP_AUTHOPT
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 575162e7e281..43df8e3cd4cc 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -346,10 +346,24 @@ struct tcp_diag_md5sig {
 
 /**
  * enum tcp_authopt_flag - flags for `tcp_authopt.flags`
  */
 enum tcp_authopt_flag {
+	/**
+	 * @TCP_AUTHOPT_FLAG_LOCK_KEYID: keyid controlled by sockopt
+	 *
+	 * If this is set `tcp_authopt.send_keyid` is used to determine the sending
+	 * key. Otherwise a key with send_id == recv_rnextkeyid is preferred.
+	 */
+	TCP_AUTHOPT_FLAG_LOCK_KEYID = (1 << 0),
+	/**
+	 * @TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID: Override rnextkeyid from userspace
+	 *
+	 * If this is set then `tcp_authopt.send_rnextkeyid` is sent on outbound
+	 * packets. Otherwise the recv_id of the current sending key is sent.
+	 */
+	TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID = (1 << 1),
 	/**
 	 * @TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED:
 	 *	Configure behavior of segments with TCP-AO coming from hosts for which no
 	 *	key is configured. The default recommended by RFC is to silently accept
 	 *	such connections.
@@ -361,10 +375,27 @@ enum tcp_authopt_flag {
  * struct tcp_authopt - Per-socket options related to TCP Authentication Option
  */
 struct tcp_authopt {
 	/** @flags: Combination of &enum tcp_authopt_flag */
 	__u32	flags;
+	/**
+	 * @send_keyid: `tcp_authopt_key.send_id` of preferred send key
+	 *
+	 * This is only used if `TCP_AUTHOPT_FLAG_LOCK_KEYID` is set.
+	 */
+	__u8	send_keyid;
+	/**
+	 * @send_rnextkeyid: The rnextkeyid to send in packets
+	 *
+	 * This is controlled by the user iff TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID is
+	 * set. Otherwise rnextkeyid is the recv_id of the current key.
+	 */
+	__u8	send_rnextkeyid;
+	/** @recv_keyid: A recently-received keyid value. Only for getsockopt. */
+	__u8	recv_keyid;
+	/** @recv_rnextkeyid: A recently-received rnextkeyid value. Only for getsockopt. */
+	__u8	recv_rnextkeyid;
 };
 
 /**
  * enum tcp_authopt_key_flag - flags for `tcp_authopt.flags`
  *
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 08ca77f01c46..1a80df739fd2 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -255,17 +255,44 @@ struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *in
  */
 struct tcp_authopt_key_info *tcp_authopt_select_key(const struct sock *sk,
 						    const struct sock *addr_sk,
 						    u8 *rnextkeyid)
 {
+	struct tcp_authopt_key_info *key, *new_key = NULL;
 	struct tcp_authopt_info *info;
 
 	info = rcu_dereference(tcp_sk(sk)->authopt_info);
 	if (!info)
 		return NULL;
 
-	return tcp_authopt_lookup_send(info, addr_sk, -1);
+	key = info->send_key;
+	if (info->flags & TCP_AUTHOPT_FLAG_LOCK_KEYID) {
+		int send_keyid = info->send_keyid;
+
+		if (!key || key->send_id != send_keyid)
+			new_key = tcp_authopt_lookup_send(info, addr_sk, send_keyid);
+	} else {
+		if (!key || key->send_id != info->recv_rnextkeyid)
+			new_key = tcp_authopt_lookup_send(info, addr_sk, info->recv_rnextkeyid);
+	}
+	if (!key && !new_key)
+		new_key = tcp_authopt_lookup_send(info, addr_sk, -1);
+
+	// Change current key.
+	if (key != new_key && new_key) {
+		key = new_key;
+		info->send_key = key;
+	}
+
+	if (key) {
+		if (info->flags & TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID)
+			*rnextkeyid = info->send_rnextkeyid;
+		else
+			*rnextkeyid = info->send_rnextkeyid = key->recv_id;
+	}
+
+	return key;
 }
 
 static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -285,10 +312,12 @@ static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk
 
 	return info;
 }
 
 #define TCP_AUTHOPT_KNOWN_FLAGS ( \
+	TCP_AUTHOPT_FLAG_LOCK_KEYID | \
+	TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID | \
 	TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED)
 
 int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
 {
 	struct tcp_authopt opt;
@@ -309,10 +338,14 @@ int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
 	info = __tcp_authopt_info_get_or_create(sk);
 	if (IS_ERR(info))
 		return PTR_ERR(info);
 
 	info->flags = opt.flags & TCP_AUTHOPT_KNOWN_FLAGS;
+	if (opt.flags & TCP_AUTHOPT_FLAG_LOCK_KEYID)
+		info->send_keyid = opt.send_keyid;
+	if (opt.flags & TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID)
+		info->send_rnextkeyid = opt.send_rnextkeyid;
 
 	return 0;
 }
 
 int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
@@ -326,10 +359,21 @@ int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
 	info = rcu_dereference_check(tp->authopt_info, lockdep_sock_is_held(sk));
 	if (!info)
 		return -EINVAL;
 
 	opt->flags = info->flags & TCP_AUTHOPT_KNOWN_FLAGS;
+	/* These keyids might be undefined, for example before connect.
+	 * Reporting zero is not strictly correct because there are no reserved
+	 * values.
+	 */
+	if (info->send_key)
+		opt->send_keyid = info->send_key->send_id;
+	else
+		opt->send_keyid = 0;
+	opt->send_rnextkeyid = info->send_rnextkeyid;
+	opt->recv_keyid = info->recv_keyid;
+	opt->recv_rnextkeyid = info->recv_rnextkeyid;
 
 	return 0;
 }
 
 static void tcp_authopt_key_free_rcu(struct rcu_head *rcu)
@@ -343,10 +387,12 @@ static void tcp_authopt_key_free_rcu(struct rcu_head *rcu)
 static void tcp_authopt_key_del(struct sock *sk,
 				struct tcp_authopt_info *info,
 				struct tcp_authopt_key_info *key)
 {
 	hlist_del_rcu(&key->node);
+	if (info->send_key == key)
+		info->send_key = NULL;
 	atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
 	call_rcu(&key->rcu, tcp_authopt_key_free_rcu);
 }
 
 /* free info and keys but don't touch tp->authopt_info */
@@ -496,10 +542,13 @@ int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct r
 		return -ENOMEM;
 
 	sk_nocaps_add(newsk, NETIF_F_GSO_MASK);
 	new_info->src_isn = tcp_rsk(req)->snt_isn;
 	new_info->dst_isn = tcp_rsk(req)->rcv_isn;
+	new_info->send_keyid = old_info->send_keyid;
+	new_info->send_rnextkeyid = old_info->send_rnextkeyid;
+	new_info->flags = old_info->flags;
 	INIT_HLIST_HEAD(&new_info->head);
 	err = tcp_authopt_clone_keys(newsk, oldsk, new_info, old_info);
 	if (err) {
 		__tcp_authopt_info_free(newsk, new_info);
 		return err;
@@ -1088,11 +1137,11 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
 			net_info_ratelimited("TCP Authentication Unexpected: Rejected\n");
 			return -EINVAL;
 		} else {
 			net_info_ratelimited("TCP Authentication Unexpected: Accepted\n");
-			return 0;
+			goto accept;
 		}
 	}
 
 	/* bad inbound key len */
 	if (key->maclen + 4 != opt->len)
@@ -1106,7 +1155,14 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
 		net_info_ratelimited("TCP Authentication Failed\n");
 		return -EINVAL;
 	}
 
+accept:
+	/* Doing this for all valid packets will result in keyids temporarily
+	 * flipping back and forth if packets are reordered or retransmitted.
+	 */
+	info->recv_keyid = opt->keyid;
+	info->recv_rnextkeyid = opt->rnextkeyid;
+
 	return 0;
 }
-- 
2.25.1



* [RFCv3 15/15] selftests: tcp_authopt: Add tests for rollover
  2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
                   ` (13 preceding siblings ...)
  2021-08-24 21:34 ` [RFCv3 14/15] tcp: authopt: Add key selection controls Leonard Crestez
@ 2021-08-24 21:34 ` Leonard Crestez
  14 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-24 21:34 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

RFC5925 requires that the user can examine or control the keys being
used. This is implemented in linux via fields on the TCP_AUTHOPT
sockopt.

Add socket-level tests for adjusting keyids on live connections and
checking that they are reflected on the peer.

Also check smooth transitions via rnextkeyid.

Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
---
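
The test helper relies on the extended `struct tcp_authopt` being 8 bytes: a
32-bit flags word followed by four one-byte key ids. A minimal standalone
sketch of that layout, assuming native alignment as used by the "IBBBB" format
in the helper (the function names here are hypothetical):

```python
import struct

# flags, send_keyid, send_rnextkeyid, recv_keyid, recv_rnextkeyid
TCP_AUTHOPT_FORMAT = "IBBBB"


def pack_tcp_authopt(flags=0, send_keyid=0, send_rnextkeyid=0,
                     recv_keyid=0, recv_rnextkeyid=0):
    """Pack the fields in the order they appear in the struct."""
    return struct.pack(TCP_AUTHOPT_FORMAT, flags, send_keyid,
                       send_rnextkeyid, recv_keyid, recv_rnextkeyid)


def unpack_tcp_authopt(buf):
    """Return the five fields as a tuple, in struct order."""
    return struct.unpack(TCP_AUTHOPT_FORMAT, buf)


if __name__ == "__main__":
    buf = pack_tcp_authopt(flags=1, send_keyid=12, send_rnextkeyid=21)
    print(len(buf), unpack_tcp_authopt(buf))
```

The four key ids fill the space after the flags word exactly, so no trailing
padding is expected in this layout.
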
 .../tcp_authopt_test/linux_tcp_authopt.py     |  16 +-
 .../tcp_authopt_test/test_rollover.py         | 181 ++++++++++++++++++
 2 files changed, 194 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_rollover.py

diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
index 41374f9851aa..23de148a4078 100644
--- a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/linux_tcp_authopt.py
@@ -20,10 +20,12 @@ def BIT(x):
 TCP_AUTHOPT = 38
 TCP_AUTHOPT_KEY = 39
 
 TCP_AUTHOPT_MAXKEYLEN = 80
 
+TCP_AUTHOPT_FLAG_LOCK_KEYID = BIT(0)
+TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID = BIT(1)
 TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED = BIT(2)
 
 TCP_AUTHOPT_KEY_DEL = BIT(0)
 TCP_AUTHOPT_KEY_EXCLUDE_OPTS = BIT(1)
 TCP_AUTHOPT_KEY_BIND_ADDR = BIT(2)
@@ -35,24 +37,32 @@ TCP_AUTHOPT_ALG_AES_128_CMAC_96 = 2
 @dataclass
 class tcp_authopt:
     """Like linux struct tcp_authopt"""
 
     flags: int = 0
-    sizeof = 4
+    send_keyid: int = 0
+    send_rnextkeyid: int = 0
+    recv_keyid: int = 0
+    recv_rnextkeyid: int = 0
+    sizeof = 8
 
     def pack(self) -> bytes:
         return struct.pack(
-            "I",
+            "IBBBB",
             self.flags,
+            self.send_keyid,
+            self.send_rnextkeyid,
+            self.recv_keyid,
+            self.recv_rnextkeyid,
         )
 
     def __bytes__(self):
         return self.pack()
 
     @classmethod
     def unpack(cls, b: bytes):
-        tup = struct.unpack("I", b)
+        tup = struct.unpack("IBBBB", b)
         return cls(*tup)
 
 
 def set_tcp_authopt(sock, opt: tcp_authopt):
     return sock.setsockopt(socket.IPPROTO_TCP, TCP_AUTHOPT, bytes(opt))
diff --git a/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_rollover.py b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_rollover.py
new file mode 100644
index 000000000000..68c59c6d1e33
--- /dev/null
+++ b/tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_rollover.py
@@ -0,0 +1,181 @@
+# SPDX-License-Identifier: GPL-2.0
+import typing
+import socket
+from .server import SimpleServerThread
+from .linux_tcp_authopt import (
+    TCP_AUTHOPT_FLAG_LOCK_KEYID,
+    TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID,
+    set_tcp_authopt_key,
+    tcp_authopt,
+    tcp_authopt_key,
+    set_tcp_authopt,
+    get_tcp_authopt,
+)
+from .utils import DEFAULT_TCP_SERVER_PORT, create_listen_socket, check_socket_echo
+from contextlib import ExitStack, contextmanager
+from .conftest import skipif_missing_tcp_authopt
+
+pytestmark = skipif_missing_tcp_authopt
+
+
+@contextmanager
+def make_tcp_authopt_socket_pair(
+    server_addr="127.0.0.1",
+    server_authopt: tcp_authopt = None,
+    server_key_list: typing.Iterable[tcp_authopt_key] = [],
+    client_authopt: tcp_authopt = None,
+    client_key_list: typing.Iterable[tcp_authopt_key] = [],
+) -> typing.Tuple[socket.socket, socket.socket]:
+    """Make a pair of connected sockets for key switching tests
+
+    Server runs in a background thread implementing echo protocol"""
+    with ExitStack() as exit_stack:
+        listen_socket = exit_stack.enter_context(
+            create_listen_socket(bind_addr=server_addr)
+        )
+        server_thread = exit_stack.enter_context(
+            SimpleServerThread(listen_socket, mode="echo")
+        )
+        client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+        client_socket.settimeout(1.0)
+
+        if server_authopt:
+            set_tcp_authopt(listen_socket, server_authopt)
+        for k in server_key_list:
+            set_tcp_authopt_key(listen_socket, k)
+        if client_authopt:
+            set_tcp_authopt(client_socket, client_authopt)
+        for k in client_key_list:
+            set_tcp_authopt_key(client_socket, k)
+
+        client_socket.connect((server_addr, DEFAULT_TCP_SERVER_PORT))
+        check_socket_echo(client_socket)
+        server_socket = server_thread.server_socket[0]
+
+        yield client_socket, server_socket
+
+
+def test_get_keyids(exit_stack: ExitStack):
+    """Check reading key ids"""
+    sk1 = tcp_authopt_key(send_id=11, recv_id=12, key="111")
+    sk2 = tcp_authopt_key(send_id=21, recv_id=22, key="222")
+    ck1 = tcp_authopt_key(send_id=12, recv_id=11, key="111")
+    client_socket, server_socket = exit_stack.enter_context(
+        make_tcp_authopt_socket_pair(
+            server_key_list=[sk1, sk2],
+            client_key_list=[ck1],
+        )
+    )
+
+    check_socket_echo(client_socket)
+    client_tcp_authopt = get_tcp_authopt(client_socket)
+    server_tcp_authopt = get_tcp_authopt(server_socket)
+    assert server_tcp_authopt.send_keyid == 11
+    assert server_tcp_authopt.send_rnextkeyid == 12
+    assert server_tcp_authopt.recv_keyid == 12
+    assert server_tcp_authopt.recv_rnextkeyid == 11
+    assert client_tcp_authopt.send_keyid == 12
+    assert client_tcp_authopt.send_rnextkeyid == 11
+    assert client_tcp_authopt.recv_keyid == 11
+    assert client_tcp_authopt.recv_rnextkeyid == 12
+
+
+def test_rollover_send_keyid(exit_stack: ExitStack):
+    """Check rollover by explicitly changing send_keyid"""
+    sk1 = tcp_authopt_key(send_id=11, recv_id=12, key="111")
+    sk2 = tcp_authopt_key(send_id=21, recv_id=22, key="222")
+    ck1 = tcp_authopt_key(send_id=12, recv_id=11, key="111")
+    ck2 = tcp_authopt_key(send_id=22, recv_id=21, key="222")
+    client_socket, server_socket = exit_stack.enter_context(
+        make_tcp_authopt_socket_pair(
+            server_key_list=[sk1, sk2],
+            client_key_list=[ck1, ck2],
+            client_authopt=tcp_authopt(
+                send_keyid=12, flags=TCP_AUTHOPT_FLAG_LOCK_KEYID
+            ),
+        )
+    )
+
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(client_socket).recv_keyid == 11
+    assert get_tcp_authopt(server_socket).recv_keyid == 12
+
+    # Explicit request for key2
+    set_tcp_authopt(
+        client_socket, tcp_authopt(send_keyid=22, flags=TCP_AUTHOPT_FLAG_LOCK_KEYID)
+    )
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(client_socket).recv_keyid == 21
+    assert get_tcp_authopt(server_socket).recv_keyid == 22
+
+
+def test_rollover_rnextkeyid(exit_stack: ExitStack):
+    """Check rollover requested via rnextkeyid"""
+    sk1 = tcp_authopt_key(send_id=11, recv_id=12, key="111")
+    sk2 = tcp_authopt_key(send_id=21, recv_id=22, key="222")
+    ck1 = tcp_authopt_key(send_id=12, recv_id=11, key="111")
+    ck2 = tcp_authopt_key(send_id=22, recv_id=21, key="222")
+    client_socket, server_socket = exit_stack.enter_context(
+        make_tcp_authopt_socket_pair(
+            server_key_list=[sk1],
+            client_key_list=[ck1, ck2],
+            client_authopt=tcp_authopt(
+                send_keyid=12, flags=TCP_AUTHOPT_FLAG_LOCK_KEYID
+            ),
+        )
+    )
+
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(server_socket).recv_rnextkeyid == 11
+
+    # request rnextkeyid=21 but server does not have it
+    set_tcp_authopt(
+        client_socket,
+        tcp_authopt(send_rnextkeyid=21, flags=TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID),
+    )
+    check_socket_echo(client_socket)
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(server_socket).recv_rnextkeyid == 21
+    assert get_tcp_authopt(server_socket).send_keyid == 11
+
+    # after adding k2 on server the key is switched
+    set_tcp_authopt_key(server_socket, sk2)
+    check_socket_echo(client_socket)
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(server_socket).send_keyid == 21
+
+
+def test_rollover_delkey(exit_stack: ExitStack):
+    sk1 = tcp_authopt_key(send_id=11, recv_id=12, key="111")
+    sk2 = tcp_authopt_key(send_id=21, recv_id=22, key="222")
+    ck1 = tcp_authopt_key(send_id=12, recv_id=11, key="111")
+    ck2 = tcp_authopt_key(send_id=22, recv_id=21, key="222")
+    client_socket, server_socket = exit_stack.enter_context(
+        make_tcp_authopt_socket_pair(
+            server_key_list=[sk1, sk2],
+            client_key_list=[ck1, ck2],
+            client_authopt=tcp_authopt(
+                send_keyid=12, flags=TCP_AUTHOPT_FLAG_LOCK_KEYID
+            ),
+        )
+    )
+
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(server_socket).recv_keyid == 12
+
+    # invalid send_keyid is just ignored
+    set_tcp_authopt(client_socket, tcp_authopt(send_keyid=7))
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(client_socket).send_keyid == 12
+    assert get_tcp_authopt(server_socket).recv_keyid == 12
+    assert get_tcp_authopt(client_socket).recv_keyid == 11
+
+    # If a key is removed it is replaced by anything that matches
+    ck1.delete_flag = True
+    set_tcp_authopt_key(client_socket, ck1)
+    check_socket_echo(client_socket)
+    check_socket_echo(client_socket)
+    assert get_tcp_authopt(client_socket).send_keyid == 22
+    assert get_tcp_authopt(server_socket).send_keyid == 21
+    assert get_tcp_authopt(server_socket).recv_keyid == 22
+    assert get_tcp_authopt(client_socket).recv_keyid == 21
-- 
2.25.1



* Re: [RFCv3 07/15] tcp: authopt: Hook into tcp core
  2021-08-24 21:34 ` [RFCv3 07/15] tcp: authopt: Hook into tcp core Leonard Crestez
@ 2021-08-24 22:59   ` Eric Dumazet
  2021-08-25 16:32     ` Leonard Crestez
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2021-08-24 22:59 UTC (permalink / raw)
  To: Leonard Crestez, Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel



On 8/24/21 2:34 PM, Leonard Crestez wrote:
> The tcp_authopt features exposes a minimal interface to the rest of the
> TCP stack. Only a few functions are exposed and if the feature is
> disabled they return neutral values, avoiding ifdefs in the rest of the
> code.
> 
> Add calls into tcp authopt from send, receive and accept code.
> 
> Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
> ---
>  include/net/tcp_authopt.h |  56 +++++++++
>  net/ipv4/tcp_authopt.c    | 246 ++++++++++++++++++++++++++++++++++++++
>  net/ipv4/tcp_input.c      |  17 +++
>  net/ipv4/tcp_ipv4.c       |   3 +
>  net/ipv4/tcp_minisocks.c  |   2 +
>  net/ipv4/tcp_output.c     |  74 +++++++++++-
>  net/ipv6/tcp_ipv6.c       |   4 +
>  7 files changed, 401 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
> index c9ee2059b442..61db268f36f8 100644
> --- a/include/net/tcp_authopt.h
> +++ b/include/net/tcp_authopt.h
> @@ -21,10 +21,11 @@ struct tcp_authopt_key_info {
>  	/* Wire identifiers */
>  	u8 send_id, recv_id;
>  	u8 alg_id;
>  	u8 keylen;
>  	u8 key[TCP_AUTHOPT_MAXKEYLEN];
> +	u8 maclen;

I do not see maclen being enforced to be 12, or a multiple of 4?

This means that later [2], tcp_authopt_hash() will leave up to 3
uninitialized bytes in the TCP options, sent to the wire.

This is a security issue, since we will leak kernel memory.
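
To illustrate the arithmetic only: assuming the option takes 4 header bytes
(kind, length, keyid, rnextkeyid) plus maclen, as the `key->maclen + 4`
length check elsewhere in the series suggests, and that TCP options are
padded to a 4-byte boundary, the number of pad bytes can be sketched as
below. The function name is hypothetical.

```python
def authopt_padding(maclen: int) -> int:
    """Pad bytes needed after a TCP-AO option carrying a MAC of maclen bytes."""
    optlen = 4 + maclen  # kind + length + keyid + rnextkeyid + MAC
    return -optlen % 4   # distance to the next multiple of 4


if __name__ == "__main__":
    for maclen in (10, 12, 13, 16):
        print(maclen, authopt_padding(maclen))
```

With a 12 or 16 byte MAC there is no padding; any other residue leaves 1 to
3 bytes that must be written explicitly.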

>  	struct sockaddr_storage addr;
>  	struct tcp_authopt_alg_imp *alg;
>  };
>  
>  /**
> @@ -41,15 +42,53 @@ struct tcp_authopt_info {
>  	u32 src_isn;
>  	u32 dst_isn;
>  };
>  
>  #ifdef CONFIG_TCP_AUTHOPT
> +struct tcp_authopt_key_info *tcp_authopt_select_key(const struct sock *sk,
> +						    const struct sock *addr_sk,
> +						    u8 *rnextkeyid);
>  void tcp_authopt_clear(struct sock *sk);
>  int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen);
>  int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
>  int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen);
> +int tcp_authopt_hash(
> +		char *hash_location,
> +		struct tcp_authopt_key_info *key,
> +		struct sock *sk, struct sk_buff *skb);
> +int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req);
> +static inline int tcp_authopt_openreq(
> +		struct sock *newsk,
> +		const struct sock *oldsk,
> +		struct request_sock *req)
> +{
> +	if (!rcu_dereference(tcp_sk(oldsk)->authopt_info))
> +		return 0;
> +	else
> +		return __tcp_authopt_openreq(newsk, oldsk, req);
> +}
> +int __tcp_authopt_inbound_check(
> +		struct sock *sk,
> +		struct sk_buff *skb,
> +		struct tcp_authopt_info *info);
> +static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb)
> +{
> +	struct tcp_authopt_info *info = rcu_dereference(tcp_sk(sk)->authopt_info);
> +
> +	if (info)
> +		return __tcp_authopt_inbound_check(sk, skb, info);
> +	else
> +		return 0;
> +}
>  #else
> +static inline struct tcp_authopt_key_info *tcp_authopt_select_key(
> +		const struct sock *sk,
> +		const struct sock *addr_sk,
> +		u8 *rnextkeyid)
> +{
> +	return NULL;
> +}
>  static inline int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
>  {
>  	return -ENOPROTOOPT;
>  }
>  static inline int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key)
> @@ -61,8 +100,25 @@ static inline void tcp_authopt_clear(struct sock *sk)
>  }
>  static inline int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
>  {
>  	return -ENOPROTOOPT;
>  }
> +static inline int tcp_authopt_hash(
> +		char *hash_location,
> +		struct tcp_authopt_key_info *key,
> +		struct sock *sk, struct sk_buff *skb)
> +{
> +	return -EINVAL;
> +}
> +static inline int tcp_authopt_openreq(struct sock *newsk,
> +				      const struct sock *oldsk,
> +				      struct request_sock *req)
> +{
> +	return 0;
> +}
> +static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb)
> +{
> +	return 0;
> +}
>  #endif
>  
>  #endif /* _LINUX_TCP_AUTHOPT_H */
> diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
> index 2a3463ad6896..af777244d098 100644
> --- a/net/ipv4/tcp_authopt.c
> +++ b/net/ipv4/tcp_authopt.c
> @@ -203,10 +203,71 @@ static struct tcp_authopt_key_info *tcp_authopt_key_lookup_exact(const struct so
>  			return key_info;
>  
>  	return NULL;
>  }
>  
> +struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *info,
> +						     const struct sock *addr_sk,
> +						     int send_id)
> +{
> +	struct tcp_authopt_key_info *result = NULL;
> +	struct tcp_authopt_key_info *key;
> +
> +	hlist_for_each_entry_rcu(key, &info->head, node, 0) {
> +		if (send_id >= 0 && key->send_id != send_id)
> +			continue;
> +		if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
> +			if (addr_sk->sk_family == AF_INET) {
> +				struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
> +				const struct in_addr *daddr =
> +					(const struct in_addr *)&addr_sk->sk_daddr;

Why is a cast needed? sk_daddr is a __be32, so there is no need to cast it to in_addr.
> +
> +				if (WARN_ON(key_addr->sin_family != AF_INET))

Why is WARN_ON() used? If we expect this to trigger, then at minimum use WARN_ON_ONCE(), please.

> +					continue;
> +				if (memcmp(daddr, &key_addr->sin_addr, sizeof(*daddr)))
> +					continue;

Using memcmp() to compare two __be32 is overkill.
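
To illustrate the three remarks above, here is a minimal userspace sketch,
not the patch's code. sketch_key_matches_v4() and its daddr_be parameter are
hypothetical names; daddr_be stands in for sk_daddr, which is already a
32-bit value in network byte order, so a plain == is enough.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>

/* daddr_be stands in for sk_daddr: an IPv4 address in network byte order. */
static bool sketch_key_matches_v4(const struct sockaddr_in *key_addr,
				  uint32_t daddr_be)
{
	/* A key bound to another family can never match an IPv4 peer. */
	if (key_addr->sin_family != AF_INET)
		return false;

	/* Two 32-bit values: compare directly, no cast and no memcmp(). */
	return key_addr->sin_addr.s_addr == daddr_be;
}
```

Both operands stay in network byte order, so no conversion is needed before
the comparison.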

> +			}
> +			if (addr_sk->sk_family == AF_INET6) {
> +				struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
> +				const struct in6_addr *daddr = &addr_sk->sk_v6_daddr;

Not sure why a variable is used; you only need it once.

> +
> +				if (WARN_ON(key_addr->sin6_family != AF_INET6))
> +					continue;
> +				if (memcmp(daddr, &key_addr->sin6_addr, sizeof(*daddr)))

ipv6_addr_equal() should be faster.
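
For reference, a rough userspace sketch of the idea behind a word-wise
128-bit comparison. sketch_ipv6_addr_equal() is a hypothetical stand-in,
not the kernel helper itself; it assumes only that struct in6_addr is
16 bytes.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static bool sketch_ipv6_addr_equal(const struct in6_addr *a,
				   const struct in6_addr *b)
{
	uint64_t a64[2], b64[2];

	/* memcpy() keeps the loads free of alignment and aliasing problems. */
	memcpy(a64, a, sizeof(a64));
	memcpy(b64, b, sizeof(b64));

	/* XOR is zero only for equal words; OR folds both halves together. */
	return ((a64[0] ^ b64[0]) | (a64[1] ^ b64[1])) == 0;
}
```

The result is a single branch on two 64-bit words instead of a generic
byte-wise compare.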

> +					continue;
> +			}
> +		}
> +		if (result && net_ratelimit())
> +			pr_warn("ambiguous tcp authentication keys configured for send\n");
> +		result = key;
> +	}
> +
> +	return result;
> +}
> +
> +/**
> + * tcp_authopt_select_key - select key for sending
> + *
> + * addr_sk is the sock used for comparing daddr, it is only different from sk in
> + * the synack case.
> + *
> + * Result is protected by RCU and can't be stored, it may only be passed to
> + * tcp_authopt_hash and only under a single rcu_read_lock.
> + */
> +struct tcp_authopt_key_info *tcp_authopt_select_key(const struct sock *sk,
> +						    const struct sock *addr_sk,
> +						    u8 *rnextkeyid)
> +{
> +	struct tcp_authopt_info *info;
> +
> +	info = rcu_dereference(tcp_sk(sk)->authopt_info);

Distro kernels will have CONFIG_TCP_AUTHOPT set, meaning
that we will add a cache line miss for every incoming TCP packet,
even on hosts not using any RFC5925 TCP flow.

For TCP MD5 we are using a static key, to avoid this extra cost.
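
The shape of such a fast path can be sketched in userspace as below. This
is only an analogue and an assumption on my side: a global atomic counter
stands in for the static key (a real static key patches the branch at
runtime), and every sketch_* name is hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_info { int nkeys; };
struct sketch_sock { struct sketch_info *authopt_info; };

/* Number of sockets that have configured authentication keys. */
static atomic_int sketch_authopt_users;

static void sketch_authopt_enable(void)
{
	atomic_fetch_add(&sketch_authopt_users, 1);
}

static void sketch_authopt_disable(void)
{
	atomic_fetch_sub(&sketch_authopt_users, 1);
}

static struct sketch_info *sketch_get_info(const struct sketch_sock *sk)
{
	/* Fast path: nobody uses the feature, so skip the per-socket load. */
	if (atomic_load_explicit(&sketch_authopt_users,
				 memory_order_relaxed) == 0)
		return NULL;

	return sk->authopt_info;
}
```

The per-socket pointer is only touched once at least one user exists, so
hosts that never configure a key do not pay for the extra load.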

> +	if (!info)
> +		return NULL;
> +
> +	return tcp_authopt_lookup_send(info, addr_sk, -1);
> +}
> +
>  static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
>  {
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	struct tcp_authopt_info *info;
>  
> @@ -387,16 +448,69 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
>  	key_info->recv_id = opt.recv_id;
>  	key_info->alg_id = opt.alg;
>  	key_info->alg = alg;
>  	key_info->keylen = opt.keylen;
>  	memcpy(key_info->key, opt.key, opt.keylen);
> +	key_info->maclen = alg->maclen;
>  	memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
>  	hlist_add_head_rcu(&key_info->node, &info->head);
>  
>  	return 0;
>  }
>  
> +static int tcp_authopt_clone_keys(struct sock *newsk,
> +				  const struct sock *oldsk,
> +				  struct tcp_authopt_info *new_info,
> +				  struct tcp_authopt_info *old_info)
> +{
> +	struct tcp_authopt_key_info *old_key;
> +	struct tcp_authopt_key_info *new_key;
> +
> +	hlist_for_each_entry_rcu(old_key, &old_info->head, node, lockdep_sock_is_held(sk)) {
> +		new_key = sock_kmalloc(newsk, sizeof(*new_key), GFP_ATOMIC);
> +		if (!new_key)
> +			return -ENOMEM;
> +		memcpy(new_key, old_key, sizeof(*new_key));
> +		tcp_authopt_alg_incref(old_key->alg);
> +		hlist_add_head_rcu(&new_key->node, &new_info->head);
> +	}
> +
> +	return 0;
> +}
> +
> +/** Called to create accepted sockets.
> + *
> + *  Need to copy authopt info from listen socket.
> + */
> +int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req)
> +{
> +	struct tcp_authopt_info *old_info;
> +	struct tcp_authopt_info *new_info;
> +	int err;
> +
> +	old_info = rcu_dereference(tcp_sk(oldsk)->authopt_info);
> +	if (!old_info)
> +		return 0;
> +
> +	new_info = kmalloc(sizeof(*new_info), GFP_ATOMIC | __GFP_ZERO);

kzalloc() is your friend. (The same remark applies to your other patches that use __GFP_ZERO.)
Also see the additional comment [1] below.

> +	if (!new_info)
> +		return -ENOMEM;
> +
> +	sk_nocaps_add(newsk, NETIF_F_GSO_MASK);
> +	new_info->src_isn = tcp_rsk(req)->snt_isn;
> +	new_info->dst_isn = tcp_rsk(req)->rcv_isn;
> +	INIT_HLIST_HEAD(&new_info->head);
> +	err = tcp_authopt_clone_keys(newsk, oldsk, new_info, old_info);
> +	if (err) {
> +		__tcp_authopt_info_free(newsk, new_info);

[1]
Are we leaving the old value of newsk->authopt_info in place?
If it is copied from the listener, I think you need
to add a tcp_sk(newsk)->authopt_info = NULL;
before the allocation done above.
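
A minimal userspace sketch of the ordering suggested in [1], under the
assumption that the child socket starts out as a byte copy of the listener.
All sketch_* names are hypothetical and calloc() stands in for kzalloc().

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_info { unsigned int src_isn, dst_isn; };
struct sketch_sock { struct sketch_info *authopt_info; };

static int sketch_openreq(struct sketch_sock *newsk,
			  const struct sketch_sock *oldsk)
{
	struct sketch_info *new_info;

	/* Drop the pointer inherited from the listener first, so that no
	 * early return leaves the child aliasing the parent's data.
	 */
	newsk->authopt_info = NULL;

	if (!oldsk->authopt_info)
		return 0;

	/* Zeroed allocation, like kzalloc() in the kernel. */
	new_info = calloc(1, sizeof(*new_info));
	if (!new_info)
		return -ENOMEM;

	newsk->authopt_info = new_info;
	return 0;
}
```

With the reset done up front, the allocation failure path returns with the
child holding NULL rather than the listener's pointer.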

			

> +		return err;
> +	}
> +	rcu_assign_pointer(tcp_sk(newsk)->authopt_info, new_info);
> +
> +	return 0;
> +}
> +
>  /* feed traffic key into shash */
>  static int tcp_authopt_shash_traffic_key(struct shash_desc *desc,
>  					 struct sock *sk,
>  					 struct sk_buff *skb,
>  					 bool input,
> @@ -815,10 +929,16 @@ static int tcp_authopt_hash_packet(struct crypto_shash *tfm,
>  		return err;
>  
>  	return crypto_shash_final(desc, macbuf);
>  }
>  
> +/**
> + * __tcp_authopt_calc_mac - Compute packet MAC using key
> + *
> + * @macbuf: output buffer. Must be large enough to fit the digestsize of the
> + * 			underlying transform before truncation. Please use TCP_AUTHOPT_MAXMACBUF
> + */
>  int __tcp_authopt_calc_mac(struct sock *sk,
>  			   struct sk_buff *skb,
>  			   struct tcp_authopt_key_info *key,
>  			   bool input,
>  			   char *macbuf)
> @@ -859,5 +979,131 @@ int __tcp_authopt_calc_mac(struct sock *sk,
>  
>  out:
>  	tcp_authopt_put_mac_shash(key, mac_tfm);
>  	return err;
>  }
> +
> +/**
> + * tcp_authopt_hash - fill in the mac
> + *
> + * The key must come from tcp_authopt_select_key.
> + */
> +int tcp_authopt_hash(char *hash_location,
> +		     struct tcp_authopt_key_info *key,
> +		     struct sock *sk,
> +		     struct sk_buff *skb)
> +{
> +	/* MAC inside option is truncated to 12 bytes but crypto API needs output
> +	 * buffer to be large enough so we use a buffer on the stack.
> +	 */
> +	u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
> +	int err;
> +
> +	if (WARN_ON(key->maclen > sizeof(macbuf)))
> +		return -ENOBUFS;
> +
> +	err = __tcp_authopt_calc_mac(sk, skb, key, false, macbuf);
> +	if (err) {
> +		/* If mac calculation fails and caller doesn't handle the error
> +		 * try to make it obvious inside the packet.
> +		 */
> +		memset(hash_location, 0, key->maclen);
> +		return err;
> +	}
> +	memcpy(hash_location, macbuf, key->maclen);


[2]
This is the place where we do not make sure to clear the padding bytes
(if key->maclen is not a multiple of 4).
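
One possible shape for the fix, as a userspace sketch rather than a
proposed patch. sketch_padded_len() and sketch_fill_mac() are hypothetical
names. The pad byte is a parameter because this is only an illustration:
zero would match "clear", while 1 is the TCP NOP option kind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round len up to a multiple of 4, matching the (maclen + 3) / 4 word
 * count used by the option writer in the quoted patch.
 */
static size_t sketch_padded_len(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Copy maclen bytes of MAC and overwrite the 0..3 trailing pad bytes, so
 * no stale memory reaches the wire. Returns the number of bytes written,
 * which is always a multiple of 4.
 */
static size_t sketch_fill_mac(uint8_t *hash_location, const uint8_t *macbuf,
			      size_t maclen, uint8_t pad)
{
	size_t padded = sketch_padded_len(maclen);

	assert(hash_location && macbuf);
	memcpy(hash_location, macbuf, maclen);
	memset(hash_location + maclen, pad, padded - maclen);
	return padded;
}
```

For maclen = 12 the memset() length is zero, so the common case costs
nothing extra; rejecting a maclen that is not a multiple of 4 at setsockopt
time would make the padding unnecessary altogether.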


> +
> +	return 0;
> +}
> +
> +static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,
> +							    struct sk_buff *skb,
> +							    struct tcp_authopt_info *info,
> +							    int recv_id)
> +{
> +	struct tcp_authopt_key_info *result = NULL;
> +	struct tcp_authopt_key_info *key;
> +
> +	/* multiple matches will cause occasional failures */
> +	hlist_for_each_entry_rcu(key, &info->head, node, 0) {
> +		if (recv_id >= 0 && key->recv_id != recv_id)
> +			continue;
> +		if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
> +			if (sk->sk_family == AF_INET) {
> +				struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
> +				struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
> +
> +				if (WARN_ON(key_addr->sin_family != AF_INET))
> +					continue;
> +				if (WARN_ON(iph->version != 4))
> +					continue;
> +				if (memcmp(&iph->saddr, &key_addr->sin_addr, sizeof(iph->saddr)))
> +					continue;
> +			}
> +			if (sk->sk_family == AF_INET6) {
> +				struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
> +				struct ipv6hdr *iph = (struct ipv6hdr *)skb_network_header(skb);
> +
> +				if (WARN_ON(key_addr->sin6_family != AF_INET6))
> +					continue;
> +				if (WARN_ON(iph->version != 6))
> +					continue;
> +				if (memcmp(&iph->saddr, &key_addr->sin6_addr, sizeof(iph->saddr)))
> +					continue;
> +			}
> +		}
> +		if (result && net_ratelimit())
> +			pr_warn("ambiguous tcp authentication keys configured for receive\n");
> +		result = key;
> +	}
> +
> +	return result;
> +}
> +
> +int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, struct tcp_authopt_info *info)
> +{
> +	struct tcphdr *th = (struct tcphdr *)skb_transport_header(skb);
> +	struct tcphdr_authopt *opt;
> +	struct tcp_authopt_key_info *key;
> +	u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
> +	int err;
> +
> +	opt = (struct tcphdr_authopt *)tcp_authopt_find_option(th);
> +	key = tcp_authopt_lookup_recv(sk, skb, info, opt ? opt->keyid : -1);
> +
> +	/* nothing found or expected */
> +	if (!opt && !key)
> +		return 0;
> +	if (!opt && key) {
> +		net_info_ratelimited("TCP Authentication Missing\n");
> +		return -EINVAL;
> +	}
> +	if (opt && !key) {
> +		/* RFC5925 Section 7.3:
> +		 * A TCP-AO implementation MUST allow for configuration of the behavior
> +		 * of segments with TCP-AO but that do not match an MKT. The initial
> +		 * default of this configuration SHOULD be to silently accept such
> +		 * connections.
> +		 */
> +		if (info->flags & TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED) {
> +			net_info_ratelimited("TCP Authentication Unexpected: Rejected\n");
> +			return -EINVAL;
> +		} else {
> +			net_info_ratelimited("TCP Authentication Unexpected: Accepted\n");
> +			return 0;
> +		}
> +	}
> +
> +	/* bad inbound key len */
> +	if (key->maclen + 4 != opt->len)
> +		return -EINVAL;
> +
> +	err = __tcp_authopt_calc_mac(sk, skb, key, true, macbuf);
> +	if (err)
> +		return err;
> +
> +	if (memcmp(macbuf, opt->mac, key->maclen)) {
> +		net_info_ratelimited("TCP Authentication Failed\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 3f7bd7ae7d7a..e0b51b2f747f 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -70,10 +70,11 @@
>  #include <linux/sysctl.h>
>  #include <linux/kernel.h>
>  #include <linux/prefetch.h>
>  #include <net/dst.h>
>  #include <net/tcp.h>
> +#include <net/tcp_authopt.h>
>  #include <net/inet_common.h>
>  #include <linux/ipsec.h>
>  #include <asm/unaligned.h>
>  #include <linux/errqueue.h>
>  #include <trace/events/tcp.h>
> @@ -5967,18 +5968,34 @@ void tcp_init_transfer(struct sock *sk, int bpf_op, struct sk_buff *skb)
>  	if (!icsk->icsk_ca_initialized)
>  		tcp_init_congestion_control(sk);
>  	tcp_init_buffer_space(sk);
>  }
>  
> +static void tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb)
> +{
> +#ifdef CONFIG_TCP_AUTHOPT
> +	struct tcp_authopt_info *info;
> +
> +	info = rcu_dereference_protected(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
> +	if (!info)
> +		return;
> +
> +	info->src_isn = ntohl(tcp_hdr(skb)->ack_seq) - 1;
> +	info->dst_isn = ntohl(tcp_hdr(skb)->seq);
> +#endif
> +}
> +
>  void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
>  {
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	struct inet_connection_sock *icsk = inet_csk(sk);
>  
>  	tcp_set_state(sk, TCP_ESTABLISHED);
>  	icsk->icsk_ack.lrcvtime = tcp_jiffies32;
>  
> +	tcp_authopt_finish_connect(sk, skb);
> +
>  	if (skb) {
>  		icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
>  		security_inet_conn_established(sk, skb);
>  		sk_mark_napi_id(sk, skb);
>  	}
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 1348615c7576..a1d39183908c 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2060,10 +2060,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		goto discard_and_relse;
>  
>  	if (tcp_v4_inbound_md5_hash(sk, skb, dif, sdif))
>  		goto discard_and_relse;
>  
> +	if (tcp_authopt_inbound_check(sk, skb))
> +		goto discard_and_relse;
> +
>  	nf_reset_ct(skb);
>  
>  	if (tcp_filter(sk, skb))
>  		goto discard_and_relse;
>  	th = (const struct tcphdr *)skb->data;
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 0a4f3f16140a..4d7d86547b0e 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -24,10 +24,11 @@
>  #include <linux/slab.h>
>  #include <linux/sysctl.h>
>  #include <linux/workqueue.h>
>  #include <linux/static_key.h>
>  #include <net/tcp.h>
> +#include <net/tcp_authopt.h>
>  #include <net/inet_common.h>
>  #include <net/xfrm.h>
>  #include <net/busy_poll.h>
>  
>  static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win)
> @@ -539,10 +540,11 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>  #ifdef CONFIG_TCP_MD5SIG
>  	newtp->md5sig_info = NULL;	/*XXX*/
>  	if (newtp->af_specific->md5_lookup(sk, newsk))
>  		newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
>  #endif
> +	tcp_authopt_openreq(newsk, sk, req);
>  	if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
>  		newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
>  	newtp->rx_opt.mss_clamp = req->mss;
>  	tcp_ecn_openreq_child(newtp, req);
>  	newtp->fastopen_req = NULL;
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 6d72f3ea48c4..6d73bee349c9 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -37,10 +37,11 @@
>  
>  #define pr_fmt(fmt) "TCP: " fmt
>  
>  #include <net/tcp.h>
>  #include <net/mptcp.h>
> +#include <net/tcp_authopt.h>
>  
>  #include <linux/compiler.h>
>  #include <linux/gfp.h>
>  #include <linux/module.h>
>  #include <linux/static_key.h>
> @@ -411,10 +412,11 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp)
>  
>  #define OPTION_SACK_ADVERTISE	(1 << 0)
>  #define OPTION_TS		(1 << 1)
>  #define OPTION_MD5		(1 << 2)
>  #define OPTION_WSCALE		(1 << 3)
> +#define OPTION_AUTHOPT		(1 << 4)
>  #define OPTION_FAST_OPEN_COOKIE	(1 << 8)
>  #define OPTION_SMC		(1 << 9)
>  #define OPTION_MPTCP		(1 << 10)
>  
>  static void smc_options_write(__be32 *ptr, u16 *options)
> @@ -435,16 +437,21 @@ static void smc_options_write(__be32 *ptr, u16 *options)
>  struct tcp_out_options {
>  	u16 options;		/* bit field of OPTION_* */
>  	u16 mss;		/* 0 to disable */
>  	u8 ws;			/* window scale, 0 to disable */
>  	u8 num_sack_blocks;	/* number of SACK blocks to include */
> -	u8 hash_size;		/* bytes in hash_location */
>  	u8 bpf_opt_len;		/* length of BPF hdr option */
> +#ifdef CONFIG_TCP_AUTHOPT
> +	u8 authopt_rnextkeyid; /* rnextkey */
> +#endif
>  	__u8 *hash_location;	/* temporary pointer, overloaded */
>  	__u32 tsval, tsecr;	/* need to include OPTION_TS */
>  	struct tcp_fastopen_cookie *fastopen_cookie;	/* Fast open cookie */
>  	struct mptcp_out_options mptcp;
> +#ifdef CONFIG_TCP_AUTHOPT
> +	struct tcp_authopt_key_info *authopt_key;
> +#endif
>  };
>  
>  static void mptcp_options_write(__be32 *ptr, const struct tcp_sock *tp,
>  				struct tcp_out_options *opts)
>  {
> @@ -617,10 +624,24 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
>  		/* overload cookie hash location */
>  		opts->hash_location = (__u8 *)ptr;
>  		ptr += 4;
>  	}
>  
> +#ifdef CONFIG_TCP_AUTHOPT
> +	if (unlikely(OPTION_AUTHOPT & options)) {
> +		struct tcp_authopt_key_info *key = opts->authopt_key;
> +
> +		WARN_ON(!key);
> +		*ptr++ = htonl((TCPOPT_AUTHOPT << 24) | ((4 + key->maclen) << 16) |
> +			       (key->send_id << 8) | opts->authopt_rnextkeyid);
> +		/* overload cookie hash location */
> +		opts->hash_location = (__u8 *)ptr;
> +		/* maclen is currently always 12 but try to align nicely anyway. */
> +		ptr += (key->maclen + 3) / 4;
> +	}
> +#endif
> +
>  	if (unlikely(opts->mss)) {
>  		*ptr++ = htonl((TCPOPT_MSS << 24) |
>  			       (TCPOLEN_MSS << 16) |
>  			       opts->mss);
>  	}
> @@ -752,10 +773,28 @@ static void mptcp_set_option_cond(const struct request_sock *req,
>  			}
>  		}
>  	}
>  }
>  
> +static int tcp_authopt_init_options(const struct sock *sk,
> +				    const struct sock *addr_sk,
> +				    struct tcp_out_options *opts)
> +{
> +#ifdef CONFIG_TCP_AUTHOPT
> +	struct tcp_authopt_key_info *key;
> +
> +	key = tcp_authopt_select_key(sk, addr_sk, &opts->authopt_rnextkeyid);
> +	if (key) {
> +		opts->options |= OPTION_AUTHOPT;
> +		opts->authopt_key = key;
> +		return 4 + key->maclen;
> +	}
> +#endif
> +
> +	return 0;
> +}
> +
>  /* Compute TCP options for SYN packets. This is not the final
>   * network wire format yet.
>   */
>  static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
>  				struct tcp_out_options *opts,
> @@ -774,10 +813,11 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
>  			opts->options |= OPTION_MD5;
>  			remaining -= TCPOLEN_MD5SIG_ALIGNED;
>  		}
>  	}
>  #endif
> +	remaining -= tcp_authopt_init_options(sk, sk, opts);
>  
>  	/* We always get an MSS option.  The option bytes which will be seen in
>  	 * normal data packets should timestamps be used, must be in the MSS
>  	 * advertised.  But we subtract them from tp->mss_cache so that
>  	 * calculations in tcp_sendmsg are simpler etc.  So account for this
> @@ -862,10 +902,11 @@ static unsigned int tcp_synack_options(const struct sock *sk,
>  		 */
>  		if (synack_type != TCP_SYNACK_COOKIE)
>  			ireq->tstamp_ok &= !ireq->sack_ok;
>  	}
>  #endif
> +	remaining -= tcp_authopt_init_options(sk, req_to_sk(req), opts);
>  
>  	/* We always send an MSS option. */
>  	opts->mss = mss;
>  	remaining -= TCPOLEN_MSS_ALIGNED;
>  
> @@ -930,10 +971,11 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
>  			opts->options |= OPTION_MD5;
>  			size += TCPOLEN_MD5SIG_ALIGNED;
>  		}
>  	}
>  #endif
> +	size += tcp_authopt_init_options(sk, sk, opts);
>  
>  	if (likely(tp->rx_opt.tstamp_ok)) {
>  		opts->options |= OPTION_TS;
>  		opts->tsval = skb ? tcp_skb_timestamp(skb) + tp->tsoffset : 0;
>  		opts->tsecr = tp->rx_opt.ts_recent;
> @@ -1277,10 +1319,14 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
>  
>  	inet = inet_sk(sk);
>  	tcb = TCP_SKB_CB(skb);
>  	memset(&opts, 0, sizeof(opts));
>  
> +#ifdef CONFIG_TCP_AUTHOPT
> +	/* for tcp_authopt_init_options inside tcp_syn_options or tcp_established_options */
> +	rcu_read_lock();
> +#endif
>  	if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) {
>  		tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
>  	} else {
>  		tcp_options_size = tcp_established_options(sk, skb, &opts,
>  							   &md5);
> @@ -1365,10 +1411,17 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
>  		sk_nocaps_add(sk, NETIF_F_GSO_MASK);
>  		tp->af_specific->calc_md5_hash(opts.hash_location,
>  					       md5, sk, skb);
>  	}
>  #endif
> +#ifdef CONFIG_TCP_AUTHOPT
> +	if (opts.authopt_key) {
> +		sk_nocaps_add(sk, NETIF_F_GSO_MASK);
> +		tcp_authopt_hash(opts.hash_location, opts.authopt_key, sk, skb);
> +	}
> +	rcu_read_unlock();
> +#endif
>  
>  	/* BPF prog is the last one writing header option */
>  	bpf_skops_write_hdr_opt(sk, skb, NULL, NULL, 0, &opts);
>  
>  	INDIRECT_CALL_INET(icsk->icsk_af_ops->send_check,
> @@ -1836,12 +1889,21 @@ unsigned int tcp_current_mss(struct sock *sk)
>  		u32 mtu = dst_mtu(dst);
>  		if (mtu != inet_csk(sk)->icsk_pmtu_cookie)
>  			mss_now = tcp_sync_mss(sk, mtu);
>  	}
>  
> +#ifdef CONFIG_TCP_AUTHOPT
> +	/* Even if the result is not used rcu_read_lock is required when scanning for
> +	 * tcp authentication keys. Otherwise lockdep will complain.
> +	 */
> +	rcu_read_lock();
> +#endif
>  	header_len = tcp_established_options(sk, NULL, &opts, &md5) +
>  		     sizeof(struct tcphdr);
> +#ifdef CONFIG_TCP_AUTHOPT
> +	rcu_read_unlock();
> +#endif
>  	/* The mss_cache is sized based on tp->tcp_header_len, which assumes
>  	 * some common options. If this is an odd packet (because we have SACK
>  	 * blocks etc) then our calculated header_len will be different, and
>  	 * we have to adjust mss_now correspondingly */
>  	if (header_len != tp->tcp_header_len) {
> @@ -3566,10 +3628,14 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
>  	}
>  
>  #ifdef CONFIG_TCP_MD5SIG
>  	rcu_read_lock();
>  	md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
> +#endif
> +#ifdef CONFIG_TCP_AUTHOPT
> +	/* for tcp_authopt_init_options inside tcp_synack_options */
> +	rcu_read_lock();
>  #endif
>  	skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4);
>  	/* bpf program will be interested in the tcp_flags */
>  	TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK;
>  	tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
> @@ -3603,10 +3669,16 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
>  	if (md5)
>  		tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location,
>  					       md5, req_to_sk(req), skb);
>  	rcu_read_unlock();
>  #endif
> +#ifdef CONFIG_TCP_AUTHOPT
> +	/* If signature fails we do nothing */
> +	if (opts.authopt_key)
> +		tcp_authopt_hash(opts.hash_location, opts.authopt_key, req_to_sk(req), skb);
> +	rcu_read_unlock();
> +#endif
>  
>  	bpf_skops_write_hdr_opt((struct sock *)sk, skb, req, syn_skb,
>  				synack_type, &opts);
>  
>  	skb->skb_mstamp_ns = now;
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 0ce52d46e4f8..51381a9c2bd5 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -40,10 +40,11 @@
>  #include <linux/icmpv6.h>
>  #include <linux/random.h>
>  #include <linux/indirect_call_wrapper.h>
>  
>  #include <net/tcp.h>
> +#include <net/tcp_authopt.h>
>  #include <net/ndisc.h>
>  #include <net/inet6_hashtables.h>
>  #include <net/inet6_connection_sock.h>
>  #include <net/ipv6.h>
>  #include <net/transp_v6.h>
> @@ -1733,10 +1734,13 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
>  		goto discard_and_relse;
>  
>  	if (tcp_v6_inbound_md5_hash(sk, skb, dif, sdif))
>  		goto discard_and_relse;
>  
> +	if (tcp_authopt_inbound_check(sk, skb))
> +		goto discard_and_relse;
> +
>  	if (tcp_filter(sk, skb))
>  		goto discard_and_relse;
>  	th = (const struct tcphdr *)skb->data;
>  	hdr = ipv6_hdr(skb);
>  	tcp_v6_fill_cb(skb, hdr, th);
> 


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-24 21:34 ` [RFCv3 05/15] tcp: authopt: Add crypto initialization Leonard Crestez
@ 2021-08-24 23:02   ` Eric Dumazet
  2021-08-24 23:34   ` Eric Dumazet
  1 sibling, 0 replies; 31+ messages in thread
From: Eric Dumazet @ 2021-08-24 23:02 UTC (permalink / raw)
  To: Leonard Crestez, Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel



On 8/24/21 2:34 PM, Leonard Crestez wrote:
> The crypto_shash API is used in order to compute packet signatures. The
> API comes with several unfortunate limitations:
> 
> 1) Allocating a crypto_shash can sleep and must be done in user context.
> 2) Packet signatures must be computed in softirq context
> 3) Packet signatures use dynamic "traffic keys" which require exclusive
> access to crypto_shash for crypto_setkey.
> 
> The solution is to allocate one crypto_shash for each possible cpu for
> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
> softirq context, signatures are computed and the tfm is returned.
> 
> The pool for each algorithm is reference counted, initialized at
> setsockopt time and released in tcp_authopt_key_info's rcu callback
> 
>

I don't know; why should we really care and try so hard to release
the per-cpu tfms?

I would simply allocate them at boot time.
This would avoid the expensive refcounting (and potential false sharing).



* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-24 21:34 ` [RFCv3 05/15] tcp: authopt: Add crypto initialization Leonard Crestez
  2021-08-24 23:02   ` Eric Dumazet
@ 2021-08-24 23:34   ` Eric Dumazet
  2021-08-25  8:08     ` Herbert Xu
  2021-08-25 16:35     ` Leonard Crestez
  1 sibling, 2 replies; 31+ messages in thread
From: Eric Dumazet @ 2021-08-24 23:34 UTC (permalink / raw)
  To: Leonard Crestez, Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel



On 8/24/21 2:34 PM, Leonard Crestez wrote:
> The crypto_shash API is used in order to compute packet signatures. The
> API comes with several unfortunate limitations:
> 
> 1) Allocating a crypto_shash can sleep and must be done in user context.
> 2) Packet signatures must be computed in softirq context
> 3) Packet signatures use dynamic "traffic keys" which require exclusive
> access to crypto_shash for crypto_setkey.
> 
> The solution is to allocate one crypto_shash for each possible cpu for
> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
> softirq context, signatures are computed and the tfm is returned.
> 

I could not see the per-cpu stuff that you mention in the changelog.




* Re: [RFCv3 09/15] selftests: tcp_authopt: Test key address binding
  2021-08-24 21:34 ` [RFCv3 09/15] selftests: tcp_authopt: Test key address binding Leonard Crestez
@ 2021-08-25  5:18   ` David Ahern
  2021-08-25 16:37     ` Leonard Crestez
  0 siblings, 1 reply; 31+ messages in thread
From: David Ahern @ 2021-08-25  5:18 UTC (permalink / raw)
  To: Leonard Crestez, Dmitry Safonov, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

On 8/24/21 2:34 PM, Leonard Crestez wrote:
> By default TCP-AO keys apply to all possible peers but it's possible to
> have different keys for different remote hosts.
> 
> This patch adds initial tests for the behavior behind the
> TCP_AUTHOPT_KEY_BIND_ADDR flag. Server rejection is tested via client
> timeout so this can be slightly slow.
> 
> Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
> ---
>  .../tcp_authopt_test/netns_fixture.py         |  63 +++++++
>  .../tcp_authopt/tcp_authopt_test/server.py    |  82 ++++++++++
>  .../tcp_authopt/tcp_authopt_test/test_bind.py | 143 ++++++++++++++++
>  .../tcp_authopt/tcp_authopt_test/utils.py     | 154 ++++++++++++++++++
>  4 files changed, 442 insertions(+)
>  create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py
>  create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py
>  create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py
>  create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py
> 

This should be under selftests/net as a single "tcp_authopt" directory
from what I can tell.


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-24 23:34   ` Eric Dumazet
@ 2021-08-25  8:08     ` Herbert Xu
  2021-08-25 14:55       ` Eric Dumazet
  2021-08-25 16:04       ` Ard Biesheuvel
  2021-08-25 16:35     ` Leonard Crestez
  1 sibling, 2 replies; 31+ messages in thread
From: Herbert Xu @ 2021-08-25  8:08 UTC (permalink / raw)
  To: Eric Dumazet, Ard Biesheuvel, Eric Biggers
  Cc: Leonard Crestez, Dmitry Safonov, David Ahern, Shuah Khan,
	Eric Dumazet, David S. Miller, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote:
> 
> On 8/24/21 2:34 PM, Leonard Crestez wrote:
> > The crypto_shash API is used in order to compute packet signatures. The
> > API comes with several unfortunate limitations:
> > 
> > 1) Allocating a crypto_shash can sleep and must be done in user context.
> > 2) Packet signatures must be computed in softirq context
> > 3) Packet signatures use dynamic "traffic keys" which require exclusive
> > access to crypto_shash for crypto_setkey.
> > 
> > The solution is to allocate one crypto_shash for each possible cpu for
> > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
> > softirq context, signatures are computed and the tfm is returned.
> > 
> 
> I could not see the per-cpu stuff that you mention in the changelog.

Perhaps it's time we moved the key information from the tfm into
the request structure for hashes? Or at least provide a way for
the key to be in the request structure in addition to the tfm as
the tfm model still works for IPsec.  Ard/Eric, what do you think
about that?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-25  8:08     ` Herbert Xu
@ 2021-08-25 14:55       ` Eric Dumazet
  2021-08-25 16:04       ` Ard Biesheuvel
  1 sibling, 0 replies; 31+ messages in thread
From: Eric Dumazet @ 2021-08-25 14:55 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Dumazet, Ard Biesheuvel, Eric Biggers, Leonard Crestez,
	Dmitry Safonov, David Ahern, Shuah Khan, David S. Miller,
	Kuniyuki Iwashima, Hideaki YOSHIFUJI, Jakub Kicinski,
	Yuchung Cheng, Francesco Ruggeri, Mat Martineau,
	Christoph Paasch, Ivan Delalande, Priyaranjan Jha, Menglong Dong,
	netdev, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	open list:KERNEL SELFTEST FRAMEWORK, LKML

On Wed, Aug 25, 2021 at 1:08 AM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote:
> >
> > On 8/24/21 2:34 PM, Leonard Crestez wrote:
> > > The crypto_shash API is used in order to compute packet signatures. The
> > > API comes with several unfortunate limitations:
> > >
> > > 1) Allocating a crypto_shash can sleep and must be done in user context.
> > > 2) Packet signatures must be computed in softirq context
> > > 3) Packet signatures use dynamic "traffic keys" which require exclusive
> > > access to crypto_shash for crypto_setkey.
> > >
> > > The solution is to allocate one crypto_shash for each possible cpu for
> > > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
> > > softirq context, signatures are computed and the tfm is returned.
> > >
> >
> > I could not see the per-cpu stuff that you mention in the changelog.
>
> Perhaps it's time we moved the key information from the tfm into
> the request structure for hashes? Or at least provide a way for
> the key to be in the request structure in addition to the tfm as
> the tfm model still works for IPsec.  Ard/Eric, what do you think
> about that?

What is the typical size of a 'tfm' and associated data?

per-cpu tfm might still make sense, if we had proper NUMA affinities.
AFAIK, currently we can not provide a numa node to crypto allocations.

So using a construct like this ends up allocating all data on one single NUMA node:

for_each_possible_cpu(cpu) {
    tfm = crypto_alloc_shash(algo->name, 0, 0);
    if (IS_ERR(tfm))
        return PTR_ERR(tfm);
    p_tfm = per_cpu_ptr(algo->tfms, cpu);
    *p_tfm = tfm;
}


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-25  8:08     ` Herbert Xu
  2021-08-25 14:55       ` Eric Dumazet
@ 2021-08-25 16:04       ` Ard Biesheuvel
  2021-08-25 16:31         ` Leonard Crestez
  1 sibling, 1 reply; 31+ messages in thread
From: Ard Biesheuvel @ 2021-08-25 16:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Dumazet, Eric Biggers, Leonard Crestez, Dmitry Safonov,
	David Ahern, Shuah Khan, Eric Dumazet, David S. Miller,
	Kuniyuki Iwashima, Hideaki YOSHIFUJI, Jakub Kicinski,
	Yuchung Cheng, Francesco Ruggeri, Mat Martineau,
	Christoph Paasch, Ivan Delalande, Priyaranjan Jha, Menglong Dong,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Linux Crypto Mailing List, open list:KERNEL SELFTEST FRAMEWORK,
	Linux Kernel Mailing List

On Wed, 25 Aug 2021 at 10:08, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote:
> >
> > On 8/24/21 2:34 PM, Leonard Crestez wrote:
> > > The crypto_shash API is used in order to compute packet signatures. The
> > > API comes with several unfortunate limitations:
> > >
> > > 1) Allocating a crypto_shash can sleep and must be done in user context.
> > > 2) Packet signatures must be computed in softirq context
> > > 3) Packet signatures use dynamic "traffic keys" which require exclusive
> > > access to crypto_shash for crypto_setkey.
> > >
> > > The solution is to allocate one crypto_shash for each possible cpu for
> > > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
> > > softirq context, signatures are computed and the tfm is returned.
> > >
> >
> > I could not see the per-cpu stuff that you mention in the changelog.
>
> Perhaps it's time we moved the key information from the tfm into
> the request structure for hashes? Or at least provide a way for
> the key to be in the request structure in addition to the tfm as
> the tfm model still works for IPsec.  Ard/Eric, what do you think
> about that?
>

I think it makes sense for a shash desc to have the ability to carry a
key, which will be used instead of the TFM key, but this seems like
quite a lot of work, given that all implementations will need to be
updated. Also, setkey() can currently sleep, so we need to check
whether the existing key manipulation code can actually execute during
init/update/final if sleeping is not permitted.


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-25 16:04       ` Ard Biesheuvel
@ 2021-08-25 16:31         ` Leonard Crestez
  0 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-25 16:31 UTC (permalink / raw)
  To: Ard Biesheuvel, Herbert Xu
  Cc: Eric Dumazet, Eric Biggers, Dmitry Safonov, David Ahern,
	Shuah Khan, Eric Dumazet, David S. Miller, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Linux Crypto Mailing List, open list:KERNEL SELFTEST FRAMEWORK,
	Linux Kernel Mailing List



On 8/25/21 7:04 PM, Ard Biesheuvel wrote:
> On Wed, 25 Aug 2021 at 10:08, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>>
>> On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote:
>>>
>>> On 8/24/21 2:34 PM, Leonard Crestez wrote:
>>>> The crypto_shash API is used in order to compute packet signatures. The
>>>> API comes with several unfortunate limitations:
>>>>
>>>> 1) Allocating a crypto_shash can sleep and must be done in user context.
>>>> 2) Packet signatures must be computed in softirq context
>>>> 3) Packet signatures use dynamic "traffic keys" which require exclusive
>>>> access to crypto_shash for crypto_setkey.
>>>>
>>>> The solution is to allocate one crypto_shash for each possible cpu for
>>>> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
>>>> softirq context, signatures are computed and the tfm is returned.
>>>>
>>>
>>> I could not see the per-cpu stuff that you mention in the changelog.
>>
>> Perhaps it's time we moved the key information from the tfm into
>> the request structure for hashes? Or at least provide a way for
>> the key to be in the request structure in addition to the tfm as
>> the tfm model still works for IPsec.  Ard/Eric, what do you think
>> about that?
>>
> 
> I think it makes sense for a shash desc to have the ability to carry a
> key, which will be used instead of the TFM key, but this seems like
> quite a lot of work, given that all implementations will need to be
> updated. Also, setkey() can currently sleep, so we need to check
> whether the existing key manipulation code can actually execute during
> init/update/final if sleeping is not permitted.

Are you sure that setkey can sleep? The documentation is not clear, 
maybe it only applies to certain hardware implementations?

The TCP Authentication Option needs dynamic keys for SYN and SYNACK 
packets, all of which happens in BH context.

--
Regards,
Leonard


* Re: [RFCv3 07/15] tcp: authopt: Hook into tcp core
  2021-08-24 22:59   ` Eric Dumazet
@ 2021-08-25 16:32     ` Leonard Crestez
  0 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-25 16:32 UTC (permalink / raw)
  To: Eric Dumazet, Dmitry Safonov, David Ahern
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel, Shuah Khan

On 25.08.2021 01:59, Eric Dumazet wrote:
> On 8/24/21 2:34 PM, Leonard Crestez wrote:
>> The tcp_authopt features exposes a minimal interface to the rest of the
>> TCP stack. Only a few functions are exposed and if the feature is
>> disabled they return neutral values, avoiding ifdefs in the rest of the
>> code.
>>
>> Add calls into tcp authopt from send, receive and accept code.
>>
>> Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
>> ---
>>   include/net/tcp_authopt.h |  56 +++++++++
>>   net/ipv4/tcp_authopt.c    | 246 ++++++++++++++++++++++++++++++++++++++
>>   net/ipv4/tcp_input.c      |  17 +++
>>   net/ipv4/tcp_ipv4.c       |   3 +
>>   net/ipv4/tcp_minisocks.c  |   2 +
>>   net/ipv4/tcp_output.c     |  74 +++++++++++-
>>   net/ipv6/tcp_ipv6.c       |   4 +
>>   7 files changed, 401 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
>> index c9ee2059b442..61db268f36f8 100644
>> --- a/include/net/tcp_authopt.h
>> +++ b/include/net/tcp_authopt.h
>> @@ -21,10 +21,11 @@ struct tcp_authopt_key_info {
>>   	/* Wire identifiers */
>>   	u8 send_id, recv_id;
>>   	u8 alg_id;
>>   	u8 keylen;
>>   	u8 key[TCP_AUTHOPT_MAXKEYLEN];
>> +	u8 maclen;
> 
> I do not see maclen being enforced to 12, or a multiple of 4 ?

For both current algorithms the maclen value is 12. I just implemented 
RFC5926, there is no way to control this from userspace.

> This means that later [2], tcp_authopt_hash() will leave up to 3
> uninitialized bytes in the TCP options, sent to the wire.
> 
> This is a  security issue, since we will leak kernel memory.

Filling the remainder with zeroes does make sense, or at least
WARN_ON(maclen % 4) so that it's obvious to anyone who attempts to
extend the algorithms.
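
To make the zero-fill idea concrete, here is a minimal userspace sketch. The helper name copy_mac_padded() is hypothetical and not part of the patch; it only assumes that the MAC is copied into an option buffer that the caller rounds up to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the patch: copy a MAC of maclen bytes into
 * the option buffer and zero every byte up to the next 4-byte boundary, so
 * that no stale memory reaches the wire. Returns the padded length.
 */
static size_t copy_mac_padded(uint8_t *dst, size_t dst_size,
			      const uint8_t *mac, size_t maclen)
{
	/* Round maclen up to a multiple of 4. */
	size_t padded = (maclen + 3) & ~(size_t)3;

	assert(dst && mac);
	assert(padded <= dst_size);

	memcpy(dst, mac, maclen);
	/* No-op when maclen is already aligned (e.g. 12). */
	memset(dst + maclen, 0, padded - maclen);
	return padded;
}
```

With maclen == 12, as for both RFC5926 algorithms, the memset is a no-op; the padding only matters if a future algorithm uses an unaligned length.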

>> +struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *info,
>> +						     const struct sock *addr_sk,
>> +						     int send_id)
>> +{
>> +	struct tcp_authopt_key_info *result = NULL;
>> +	struct tcp_authopt_key_info *key;
>> +
>> +	hlist_for_each_entry_rcu(key, &info->head, node, 0) {
>> +		if (send_id >= 0 && key->send_id != send_id)
>> +			continue;
>> +		if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
>> +			if (addr_sk->sk_family == AF_INET) {
>> +				struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
>> +				const struct in_addr *daddr =
>> +					(const struct in_addr *)&addr_sk->sk_daddr;
> 
> Why a cast is needed ? sk_daddr is a __be32, no need to cast it to in_addr
>> +
>> +				if (WARN_ON(key_addr->sin_family != AF_INET))
> 
> Why a WARN_ON() is used ? If we expect this to trigger, then at minimum WARN_ON_ONCE() please.
> 
>> +					continue;
>> +				if (memcmp(daddr, &key_addr->sin_addr, sizeof(*daddr)))
>> +					continue;
> 
> Using memcmp() to compare two __be32 is overkill.
> 
>> +			}
>> +			if (addr_sk->sk_family == AF_INET6) {
>> +				struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
>> +				const struct in6_addr *daddr = &addr_sk->sk_v6_daddr;
> 
> Not sure why a variable is used, you need it once.
> 
>> +
>> +				if (WARN_ON(key_addr->sin6_family != AF_INET6))
>> +					continue;
>> +				if (memcmp(daddr, &key_addr->sin6_addr, sizeof(*daddr)))
> 
> ipv6_addr_equal() should be faster.

OK, I will replace the comparisons.

Checking address family is mostly paranoia on my part, I don't know if a 
real scenario exists for AF mismatch. Still need to check ipv4-mapped 
ipv6 addresses, not sure if those can receive ipv4 skbs on an ipv6 socket.
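
For reference, a userspace sketch of the two comparisons being asked for. The function names are made up for illustration; the IPv6 variant only imitates the word-wise style of the kernel's ipv6_addr_equal() and is not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* IPv4: both sides are 32-bit values in network byte order, so a plain
 * integer comparison is enough and no memcmp() is needed.
 */
static bool v4_addr_equal(const struct in_addr *a, const struct in_addr *b)
{
	assert(a && b);
	return a->s_addr == b->s_addr;
}

/* IPv6: compare as two 64-bit words. The memcpy() calls avoid alignment
 * and aliasing problems in portable userspace code.
 */
static bool v6_addr_equal(const struct in6_addr *a, const struct in6_addr *b)
{
	uint64_t wa[2], wb[2];

	assert(a && b);
	memcpy(wa, a, sizeof(wa));
	memcpy(wb, b, sizeof(wb));
	return ((wa[0] ^ wb[0]) | (wa[1] ^ wb[1])) == 0;
}
```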

>> +struct tcp_authopt_key_info *tcp_authopt_select_key(const struct sock *sk,
>> +						    const struct sock *addr_sk,
>> +						    u8 *rnextkeyid)
>> +{
>> +	struct tcp_authopt_info *info;
>> +
>> +	info = rcu_dereference(tcp_sk(sk)->authopt_info);
> 
> distro kernels will have CONFIG_TCP_AUTHOPT set, meaning
> that we will add a cache line miss for every incoming TCP packet
> even on hosts not using any RFC5925 TCP flow.
> 
> For TCP MD5 we are using a static key, to avoid this extra cost.

OK, will add a static_key.

The check for "does socket have tcp_authopt" also belongs in an inline
wrapper, similar to the inbound check.
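
Roughly the shape this could take, as a userspace model only. All names below are hypothetical, and a plain bool stands in for the static key; in the kernel the test would be a patched branch rather than a memory load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; none of these names come from the patch. */
struct authopt_info_model {
	int nkeys;
};

struct sock_model {
	struct authopt_info_model *authopt_info;
};

/* Models a static key: false until the first socket enables the feature. */
static bool authopt_needed;

/* Counts how often the expensive path runs, for demonstration only. */
static int slow_path_calls;

static void authopt_enable(struct sock_model *sk,
			   struct authopt_info_model *info)
{
	assert(sk && info);
	sk->authopt_info = info;
	authopt_needed = true;
}

static int authopt_inbound_check_slow(const struct sock_model *sk)
{
	slow_path_calls++;
	/* Only here is the per-socket pointer touched. */
	if (!sk->authopt_info)
		return 0;
	return sk->authopt_info->nkeys > 0 ? 1 : 0;
}

/* Inline wrapper: callers pay almost nothing while the feature is unused. */
static inline int authopt_inbound_check(const struct sock_model *sk)
{
	if (!authopt_needed)
		return 0;
	return authopt_inbound_check_slow(sk);
}
```

The point of the wrapper is that hosts with no TCP-AO sockets never dereference authopt_info at all, which is what avoids the extra cache line miss.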

>> +int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req)
>> +{
>> +	struct tcp_authopt_info *old_info;
>> +	struct tcp_authopt_info *new_info;
>> +	int err;
>> +
>> +	old_info = rcu_dereference(tcp_sk(oldsk)->authopt_info);
>> +	if (!old_info)
>> +		return 0;
>> +
>> +	new_info = kmalloc(sizeof(*new_info), GFP_ATOMIC | __GFP_ZERO);
> 
> kzalloc() is your friend. (same remark for your other patches, where you are using __GFP_ZERO)
> Also see additional comment [1]

OK
> 
>> +	if (!new_info)
>> +		return -ENOMEM;
>> +
>> +	sk_nocaps_add(newsk, NETIF_F_GSO_MASK);
>> +	new_info->src_isn = tcp_rsk(req)->snt_isn;
>> +	new_info->dst_isn = tcp_rsk(req)->rcv_isn;
>> +	INIT_HLIST_HEAD(&new_info->head);
>> +	err = tcp_authopt_clone_keys(newsk, oldsk, new_info, old_info);
>> +	if (err) {
>> +		__tcp_authopt_info_free(newsk, new_info);
> 
> 		Are we leaving in place old value of newsk->authopt_info ?
> 		If this is copied from the listener, I think you need
> 		to add a tcp_sk(newsk)->authopt_info = NULL;
> 		before the kzalloc() call done above.

Yes, authopt_info should be set to NULL on error because keeping the 
listen socket's value is wrong and dangerous (double free).

Leaving authopt_info NULL on malloc failure is still potentially dangerous
because it means all keys are ignored and all packets are accepted. It's
not clear how we could cause tcp_create_openreq_child to fail instead.

This is a problem in a few other places: if cryptography fails the
outbound MAC is filled with zeros because there's no obvious way to
make TX fail at that point.

>> +	err = __tcp_authopt_calc_mac(sk, skb, key, false, macbuf);
>> +	if (err) {
>> +		/* If mac calculation fails and caller doesn't handle the error
>> +		 * try to make it obvious inside the packet.
>> +		 */
>> +		memset(hash_location, 0, key->maclen);
>> +		return err;
>> +	}
>> +	memcpy(hash_location, macbuf, key->maclen);
> 
> 
> [2]
> This is the place where we do not make sure to clear the padding bytes
> (if key->maclen is not a multiple of 4)

Yes. It might make sense to fix in caller because it's the caller which 
decides to align options.


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-24 23:34   ` Eric Dumazet
  2021-08-25  8:08     ` Herbert Xu
@ 2021-08-25 16:35     ` Leonard Crestez
  2021-08-25 17:55       ` Eric Dumazet
  1 sibling, 1 reply; 31+ messages in thread
From: Leonard Crestez @ 2021-08-25 16:35 UTC (permalink / raw)
  To: Eric Dumazet, Dmitry Safonov, David Ahern
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel, Shuah Khan

On 25.08.2021 02:34, Eric Dumazet wrote:
> On 8/24/21 2:34 PM, Leonard Crestez wrote:
>> The crypto_shash API is used in order to compute packet signatures. The
>> API comes with several unfortunate limitations:
>>
>> 1) Allocating a crypto_shash can sleep and must be done in user context.
>> 2) Packet signatures must be computed in softirq context
>> 3) Packet signatures use dynamic "traffic keys" which require exclusive
>> access to crypto_shash for crypto_setkey.
>>
>> The solution is to allocate one crypto_shash for each possible cpu for
>> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
>> softirq context, signatures are computed and the tfm is returned.
>>
> 
> I could not see the per-cpu stuff that you mention in the changelog.

That's a little embarrassing, I forgot to implement the actual per-cpu
stuff. tcp_authopt_alg_imp.tfm is meant to be an array up to NR_CPUS and 
tcp_authopt_alg_get_tfm needs no locking other than preempt_disable 
(which should already be the case).

The reference counting would still only happen from very few places: 
setsockopt, close and openreq. This would only impact request/response 
traffic and relatively little.

Performance was not a major focus so far. Preventing impact on non-AO 
connections is important but typical AO usecases are long-lived 
low-traffic connections.

--
Regards,
Leonard


* Re: [RFCv3 09/15] selftests: tcp_authopt: Test key address binding
  2021-08-25  5:18   ` David Ahern
@ 2021-08-25 16:37     ` Leonard Crestez
  0 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-25 16:37 UTC (permalink / raw)
  To: David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Dmitry Safonov, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

On 25.08.2021 08:18, David Ahern wrote:
> On 8/24/21 2:34 PM, Leonard Crestez wrote:
>> By default TCP-AO keys apply to all possible peers but it's possible to
>> have different keys for different remote hosts.
>>
>> This patch adds initial tests for the behavior behind the
>> TCP_AUTHOPT_KEY_BIND_ADDR flag. Server rejection is tested via client
>> timeout so this can be slightly slow.
>>
>> Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
>> ---
>>   .../tcp_authopt_test/netns_fixture.py         |  63 +++++++
>>   .../tcp_authopt/tcp_authopt_test/server.py    |  82 ++++++++++
>>   .../tcp_authopt/tcp_authopt_test/test_bind.py | 143 ++++++++++++++++
>>   .../tcp_authopt/tcp_authopt_test/utils.py     | 154 ++++++++++++++++++
>>   4 files changed, 442 insertions(+)
>>   create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/netns_fixture.py
>>   create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/server.py
>>   create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/test_bind.py
>>   create mode 100644 tools/testing/selftests/tcp_authopt/tcp_authopt_test/utils.py
>>
> 
> This should be under selftests/net as a single "tcp_authopt" directory
> from what I can tell.

Maybe? I found no clear guidelines for organizing tests by subsystem. I 
just did a grep for .py in selftests and placed mine next to tc-testing.

Having a tcp_authopt_test code directory under tcp_authopt is the 
standard pattern for python packages, otherwise all submodules with 
utilities of dubious generality are dumped at the global level. Removing 
the tcp_authopt/tcp_authopt_test structure is awkward in python.

One way to deal with this is to add my test code in 
tools/testing/selftests/net/tcp_authopt and my setup.cfg and similar 
directly in tools/testing/selftests/net. This would make "net" the root 
of the package and make it easy to add other networking pytests. This 
seems close to what you mean.

kselftest itself does not seem to offer any special support for python 
code, only some for C and shell. Maybe it could offer a "kselftest" 
package with common utilities that are used by multiple test packages 
and everything would be installed into a single virtualenv by makefiles.

--
Regards,
Leonard


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-25 16:35     ` Leonard Crestez
@ 2021-08-25 17:55       ` Eric Dumazet
  2021-08-25 18:56         ` Leonard Crestez
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2021-08-25 17:55 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Eric Dumazet, Dmitry Safonov, David Ahern, David S. Miller,
	Herbert Xu, Kuniyuki Iwashima, Hideaki YOSHIFUJI, Jakub Kicinski,
	Yuchung Cheng, Francesco Ruggeri, Mat Martineau,
	Christoph Paasch, Ivan Delalande, Priyaranjan Jha, Menglong Dong,
	netdev, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	open list:KERNEL SELFTEST FRAMEWORK, LKML, Shuah Khan

On Wed, Aug 25, 2021 at 9:35 AM Leonard Crestez <cdleonard@gmail.com> wrote:
>
> On 25.08.2021 02:34, Eric Dumazet wrote:
> > On 8/24/21 2:34 PM, Leonard Crestez wrote:
> >> The crypto_shash API is used in order to compute packet signatures. The
> >> API comes with several unfortunate limitations:
> >>
> >> 1) Allocating a crypto_shash can sleep and must be done in user context.
> >> 2) Packet signatures must be computed in softirq context
> >> 3) Packet signatures use dynamic "traffic keys" which require exclusive
> >> access to crypto_shash for crypto_setkey.
> >>
> >> The solution is to allocate one crypto_shash for each possible cpu for
> >> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
> >> softirq context, signatures are computed and the tfm is returned.
> >>
> >
> > I could not see the per-cpu stuff that you mention in the changelog.
>
> That's a little embarrassing, I forgot to implement the actual per-cpu
> stuff. tcp_authopt_alg_imp.tfm is meant to be an array up to NR_CPUS and
> tcp_authopt_alg_get_tfm needs no locking other than preempt_disable
> (which should already be the case).

Well, do not use arrays of NR_CPUS and instead use normal per_cpu
accessors (as in __tcp_alloc_md5sig_pool)

>
> The reference counting would still only happen from very few places:
> setsockopt, close and openreq. This would only impact request/response
> traffic and relatively little.

What I meant is that __tcp_alloc_md5sig_pool() allocates stuff one time,
we do not care about tcp_md5sig_pool_populated going back to false.

Otherwise, a single user application constantly allocating a socket,
enabling MD5 (or authopt), then closing the socket would incur
a big cost on hosts with a lot of cpus.
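
A userspace model of the "populate once, never tear down" pattern, for illustration only. The names and the fixed MODEL_NR_CPUS are assumptions; real kernel code would use per-cpu accessors and a mutex around the populate step, both omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* assumption: stands in for the possible-cpu set */

struct tfm_model {
	int cpu;
};

static struct tfm_model *pool[MODEL_NR_CPUS];
static bool pool_populated;
static int pool_alloc_count;

/* Allocate one object per cpu the first time this is called; later calls
 * are cheap no-ops. The flag never goes back to false, so a socket
 * open/close loop does not pay the allocation cost again.
 */
static int pool_populate_once(void)
{
	int cpu;

	if (pool_populated)
		return 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct tfm_model *t = calloc(1, sizeof(*t));

		if (!t)
			goto fail;
		t->cpu = cpu;
		pool[cpu] = t;
		pool_alloc_count++;
	}
	pool_populated = true;
	return 0;

fail:
	/* Undo the partial allocation so nothing is leaked. */
	while (cpu-- > 0) {
		free(pool[cpu]);
		pool[cpu] = NULL;
		pool_alloc_count--;
	}
	return -ENOMEM;
}

static struct tfm_model *pool_get(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return pool_populated ? pool[cpu] : NULL;
}
```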

>
> Performance was not a major focus so far. Preventing impact on non-AO
> connections is important but typical AO usecases are long-lived
> low-traffic connections.
>
> --
> Regards,
> Leonard


* Re: [RFCv3 05/15] tcp: authopt: Add crypto initialization
  2021-08-25 17:55       ` Eric Dumazet
@ 2021-08-25 18:56         ` Leonard Crestez
  0 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-08-25 18:56 UTC (permalink / raw)
  To: Eric Dumazet, Herbert Xu
  Cc: Eric Dumazet, Dmitry Safonov, David Ahern, David S. Miller,
	Kuniyuki Iwashima, Hideaki YOSHIFUJI, Jakub Kicinski,
	Yuchung Cheng, Francesco Ruggeri, Mat Martineau,
	Christoph Paasch, Ivan Delalande, Priyaranjan Jha, Menglong Dong,
	netdev, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	open list:KERNEL SELFTEST FRAMEWORK, LKML, Shuah Khan

On 8/25/21 8:55 PM, Eric Dumazet wrote:
> On Wed, Aug 25, 2021 at 9:35 AM Leonard Crestez <cdleonard@gmail.com> wrote:
>>
>> On 25.08.2021 02:34, Eric Dumazet wrote:
>>> On 8/24/21 2:34 PM, Leonard Crestez wrote:
>>>> The crypto_shash API is used in order to compute packet signatures. The
>>>> API comes with several unfortunate limitations:
>>>>
>>>> 1) Allocating a crypto_shash can sleep and must be done in user context.
>>>> 2) Packet signatures must be computed in softirq context
>>>> 3) Packet signatures use dynamic "traffic keys" which require exclusive
>>>> access to crypto_shash for crypto_setkey.
>>>>
>>>> The solution is to allocate one crypto_shash for each possible cpu for
>>>> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
>>>> softirq context, signatures are computed and the tfm is returned.
>>>>
>>>
>>> I could not see the per-cpu stuff that you mention in the changelog.
>>
>> That's a little embarrassing, I forgot to implement the actual per-cpu
>> stuff. tcp_authopt_alg_imp.tfm is meant to be an array up to NR_CPUS and
>> tcp_authopt_alg_get_tfm needs no locking other than preempt_disable
>> (which should already be the case).
> 
> Well, do not use arrays of NR_CPUS and instead use normal per_cpu
> accessors (as in __tcp_alloc_md5sig_pool)
> 
>>
>> The reference counting would still only happen from very few places:
>> setsockopt, close and openreq. This would only impact request/response
>> traffic and relatively little.
> 
> What I meant is that __tcp_alloc_md5sig_pool() allocates stuff one time,
> we do not care about tcp_md5sig_pool_populated going back to false.
> 
> Otherwise, a single user application constantly allocating a socket,
> enabling MD5 (or authopt), then closing the socket would incur
> a big cost on hosts with a lot of cpus.

Allocating only once would definitely simplify things.

I don't know if this might end up tying hardware resources forever if 
some accelerators are in play but for this feature software-only crypto 
is perfectly fine.

--
Regards,
Leonard


* Re: [RFCv3 01/15] tcp: authopt: Initial support and key management
  2021-08-24 21:34 ` [RFCv3 01/15] tcp: authopt: Initial support and key management Leonard Crestez
@ 2021-08-31 19:04   ` Dmitry Safonov
  2021-09-03 14:26     ` Leonard Crestez
  0 siblings, 1 reply; 31+ messages in thread
From: Dmitry Safonov @ 2021-08-31 19:04 UTC (permalink / raw)
  To: Leonard Crestez, David Ahern, Shuah Khan
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel

Hi Leonard,

On 8/24/21 10:34 PM, Leonard Crestez wrote:
[..]
> --- /dev/null
> +++ b/include/net/tcp_authopt.h
> @@ -0,0 +1,65 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _LINUX_TCP_AUTHOPT_H
> +#define _LINUX_TCP_AUTHOPT_H
> +
> +#include <uapi/linux/tcp.h>
> +
> +/**
> + * struct tcp_authopt_key_info - Representation of a Master Key Tuple as per RFC5925
> + *
> + * Key structure lifetime is only protected by RCU so readers needs to hold a
> + * single rcu_read_lock until they're done with the key.
> + */
> +struct tcp_authopt_key_info {
> +	struct hlist_node node;
> +	struct rcu_head rcu;
> +	/* Local identifier */
> +	u32 local_id;

It's unused now, can be removed.

[..]
> +
> +/**
> + * enum tcp_authopt_key_flag - flags for `tcp_authopt.flags`
> + *
> + * @TCP_AUTHOPT_KEY_DEL: Delete the key by local_id and ignore all other fields.
                                              ^
By send_id and recv_id.
Also, tcp_authopt_key_match_exact() seems to check
TCP_AUTHOPT_KEY_ADDR_BIND. I wonder if that makes sense to relax it in
case of TCP_AUTHOPT_KEY_DEL to match only send_id/recv_id if addr isn't
specified (no hard feelings about it, though).

[..]
> +#ifdef CONFIG_TCP_AUTHOPT
> +	case TCP_AUTHOPT: {
> +		struct tcp_authopt info;
> +
> +		if (get_user(len, optlen))
> +			return -EFAULT;
> +
> +		lock_sock(sk);
> +		tcp_get_authopt_val(sk, &info);
> +		release_sock(sk);
> +
> +		len = min_t(unsigned int, len, sizeof(info));
> +		if (put_user(len, optlen))
> +			return -EFAULT;
> +		if (copy_to_user(optval, &info, len))
> +			return -EFAULT;
> +		return 0;

Failed tcp_get_authopt_val() lookup in:
:       if (!info)
:               return -EINVAL;

will leak uninitialized kernel memory from stack.
ASLR guys defeated.

[..]
> +#define TCP_AUTHOPT_KNOWN_FLAGS ( \
> +	TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED)
> +
> +int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
> +{
> +	struct tcp_authopt opt;
> +	struct tcp_authopt_info *info;
> +
> +	sock_owned_by_me(sk);
> +
> +	/* If userspace optlen is too short fill the rest with zeros */
> +	if (optlen > sizeof(opt))
> +		return -EINVAL;

More like
:	if (unlikely(len > sizeof(opt))) {
:		err = check_zeroed_user(optval + sizeof(opt),
:					len - sizeof(opt));
:		if (err < 1)
:			return err == 0 ? -EINVAL : err;
:		len = sizeof(opt);
:		if (put_user(len, optlen))
:			return -EFAULT;
:	}

> +	memset(&opt, 0, sizeof(opt));
> +	if (copy_from_sockptr(&opt, optval, optlen))
> +		return -EFAULT;
> +
> +	if (opt.flags & ~TCP_AUTHOPT_KNOWN_FLAGS)
> +		return -EINVAL;
> +
> +	info = __tcp_authopt_info_get_or_create(sk);
> +	if (IS_ERR(info))
> +		return PTR_ERR(info);
> +
> +	info->flags = opt.flags & TCP_AUTHOPT_KNOWN_FLAGS;
> +
> +	return 0;
> +}

[..]
> +int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
> +{
> +	struct tcp_authopt_key opt;
> +	struct tcp_authopt_info *info;
> +	struct tcp_authopt_key_info *key_info;
> +
> +	sock_owned_by_me(sk);
> +
> +	/* If userspace optlen is too short fill the rest with zeros */
> +	if (optlen > sizeof(opt))
> +		return -EINVAL;

Ditto

> +	memset(&opt, 0, sizeof(opt));
> +	if (copy_from_sockptr(&opt, optval, optlen))
> +		return -EFAULT;
> +
> +	if (opt.flags & ~TCP_AUTHOPT_KEY_KNOWN_FLAGS)
> +		return -EINVAL;
> +
> +	if (opt.keylen > TCP_AUTHOPT_MAXKEYLEN)
> +		return -EINVAL;
> +
> +	/* Delete is a special case: */
> +	if (opt.flags & TCP_AUTHOPT_KEY_DEL) {
> +		info = rcu_dereference_check(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
> +		if (!info)
> +			return -ENOENT;
> +		key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
> +		if (!key_info)
> +			return -ENOENT;
> +		tcp_authopt_key_del(sk, info, key_info);

Doesn't seem to be safe together with tcp_authopt_select_key().
A key can be in use at this moment - you have to add checks for it.

> +		return 0;
> +	}
> +
> +	/* check key family */
> +	if (opt.flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
> +		if (sk->sk_family != opt.addr.ss_family)
> +			return -EINVAL;
> +	}
> +
> +	/* Initialize tcp_authopt_info if not already set */
> +	info = __tcp_authopt_info_get_or_create(sk);
> +	if (IS_ERR(info))
> +		return PTR_ERR(info);
> +
> +	/* If an old key exists with exact ID then remove and replace.
> +	 * RCU-protected readers might observe both and pick any.
> +	 */
> +	key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
> +	if (key_info)
> +		tcp_authopt_key_del(sk, info, key_info);
> +	key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
> +	if (!key_info)
> +		return -ENOMEM;

So, you may end up without any key.
Also, replacing a key is not at all safe: you may receive old segments
which you in turn will discard and reset the connection.

I think the limitation RFC puts on removing keys in use and replacing
existing keys are actually reasonable. Probably, it'd be better to
enforce "key in use => desired key is different (or key_outdated flag)
=> key not in use => key may be removed" life-cycle of MKT.
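
A small userspace model of that life-cycle, as a sketch under assumptions: the names, the fixed-size slot table and the -EBUSY/-EEXIST return codes are all hypothetical and not taken from the patch or the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MKT_MODEL_MAX 4

struct mkt_model {
	bool present;
	int send_id;
	int recv_id;
};

struct conn_model {
	struct mkt_model keys[MKT_MODEL_MAX];
	int current;	/* slot currently used for sending, -1 if none */
};

static bool slot_valid(int slot)
{
	return slot >= 0 && slot < MKT_MODEL_MAX;
}

/* Adding never replaces an existing key in place. */
static int mkt_add(struct conn_model *c, int slot, int send_id, int recv_id)
{
	assert(c);
	if (!slot_valid(slot))
		return -EINVAL;
	if (c->keys[slot].present)
		return -EEXIST;
	c->keys[slot] = (struct mkt_model){ true, send_id, recv_id };
	return 0;
}

/* Switching the desired key is what takes the old one out of use. */
static int mkt_select(struct conn_model *c, int slot)
{
	assert(c);
	if (!slot_valid(slot) || !c->keys[slot].present)
		return -ENOENT;
	c->current = slot;
	return 0;
}

/* A key may only be removed once it is no longer the one in use. */
static int mkt_del(struct conn_model *c, int slot)
{
	assert(c);
	if (!slot_valid(slot) || !c->keys[slot].present)
		return -ENOENT;
	if (c->current == slot)
		return -EBUSY;
	c->keys[slot].present = false;
	return 0;
}
```

In this model the only way to retire the active key is to select a different one first, which matches the "key in use => desired key is different => key not in use => key may be removed" ordering.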

Thanks,
            Dmitry


* Re: [RFCv3 01/15] tcp: authopt: Initial support and key management
  2021-08-31 19:04   ` Dmitry Safonov
@ 2021-09-03 14:26     ` Leonard Crestez
  0 siblings, 0 replies; 31+ messages in thread
From: Leonard Crestez @ 2021-09-03 14:26 UTC (permalink / raw)
  To: Dmitry Safonov, David Ahern, Eric Dumazet
  Cc: David S. Miller, Herbert Xu, Kuniyuki Iwashima,
	Hideaki YOSHIFUJI, Jakub Kicinski, Yuchung Cheng,
	Francesco Ruggeri, Mat Martineau, Christoph Paasch,
	Ivan Delalande, Priyaranjan Jha, Menglong Dong, netdev,
	linux-crypto, linux-kselftest, linux-kernel, Shuah Khan

On 31.08.2021 22:04, Dmitry Safonov wrote:
> Hi Leonard,
> On 8/24/21 10:34 PM, Leonard Crestez wrote:
>> +/**
>> + * struct tcp_authopt_key_info - Representation of a Master Key Tuple as per RFC5925
>> + *
>> + * Key structure lifetime is only protected by RCU so readers needs to hold a
>> + * single rcu_read_lock until they're done with the key.
>> + */
>> +struct tcp_authopt_key_info {
>> +	struct hlist_node node;
>> +	struct rcu_head rcu;
>> +	/* Local identifier */
>> +	u32 local_id;
> 
> It's unused now, can be removed.

Yes

>> +/**
>> + * enum tcp_authopt_key_flag - flags for `tcp_authopt.flags`
>> + *
>> + * @TCP_AUTHOPT_KEY_DEL: Delete the key by local_id and ignore all other fields.
>                                                ^
> By send_id and recv_id.

Yes. The identifying fields are documented on struct tcp_authopt_key so 
I will abbreviate this.

> Also, tcp_authopt_key_match_exact() seems to check
> TCP_AUTHOPT_KEY_ADDR_BIND. I wonder if that makes sense to relax it in
> case of TCP_AUTHOPT_KEY_DEL to match only send_id/recv_id if addr isn't
> specified (no hard feelings about it, though).

The same send_id/recv_id can be used for different remote peers, so
matching on the IDs alone could hit the wrong key.
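
To illustrate the overlap, here is a minimal userspace sketch. Every
demo_* name is hypothetical and the struct is a simplified stand-in, not
the kernel's struct tcp_authopt_key_info. It only shows why an exact
lookup has to compare the bound address as well as the IDs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified key record (IPv4 only, for brevity). */
struct demo_key {
	uint8_t send_id;
	uint8_t recv_id;
	bool addr_bind;     /* stands in for TCP_AUTHOPT_KEY_ADDR_BIND */
	uint32_t peer_addr; /* only meaningful when addr_bind is set */
};

/* Exact match: IDs must agree, and so must the address binding. */
static bool demo_key_match_exact(const struct demo_key *a,
				 const struct demo_key *b)
{
	assert(a && b);
	if (a->send_id != b->send_id || a->recv_id != b->recv_id)
		return false;
	if (a->addr_bind != b->addr_bind)
		return false;
	return !a->addr_bind || a->peer_addr == b->peer_addr;
}

/* Return the index of the key matching `want` exactly, or -1 if none. */
static int demo_key_lookup_exact(const struct demo_key *keys, size_t n,
				 const struct demo_key *want)
{
	for (size_t i = 0; i < n; i++)
		if (demo_key_match_exact(&keys[i], want))
			return (int)i;
	return -1;
}

/* Count keys sharing send_id/recv_id, ignoring the address. */
static size_t demo_key_count_by_id(const struct demo_key *keys, size_t n,
				   uint8_t send_id, uint8_t recv_id)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (keys[i].send_id == send_id && keys[i].recv_id == recv_id)
			count++;
	return count;
}
```

With two keys that share IDs but are bound to different peers, the
ID-only count is 2, so a delete that ignored the address would be
ambiguous, while the exact lookup still finds a single key.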

> [..]
>> +#ifdef CONFIG_TCP_AUTHOPT
>> +	case TCP_AUTHOPT: {
>> +		struct tcp_authopt info;
>> +
>> +		if (get_user(len, optlen))
>> +			return -EFAULT;
>> +
>> +		lock_sock(sk);
>> +		tcp_get_authopt_val(sk, &info);
>> +		release_sock(sk);
>> +
>> +		len = min_t(unsigned int, len, sizeof(info));
>> +		if (put_user(len, optlen))
>> +			return -EFAULT;
>> +		if (copy_to_user(optval, &info, len))
>> +			return -EFAULT;
>> +		return 0;
> 
> Failed tcp_get_authopt_val() lookup in:
> :       if (!info)
> :               return -EINVAL;
> 
> will leak uninitialized kernel memory from stack.
> ASLR guys defeated.

tcp_get_authopt_val clears *info before all checks so this will return 
zeros to userspace.

I do need to propagate the return value from tcp_get_authopt_val.
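
A minimal userspace sketch of both points follows. The demo_* names and
types are hypothetical stand-ins, not the patch's code: the getter clears
the output before any check, and the caller returns the getter's error
instead of copying zeros out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct tcp_authopt. */
struct demo_authopt {
	uint64_t flags;
};

/* Hypothetical socket: info_flags is NULL when authopt was never set up. */
struct demo_sock {
	const uint64_t *info_flags;
};

static int demo_get_authopt_val(const struct demo_sock *sk,
				struct demo_authopt *out)
{
	assert(sk && out);
	/* Clear first, so no early return can leave stale stack bytes. */
	memset(out, 0, sizeof(*out));
	if (!sk->info_flags)
		return -EINVAL;
	out->flags = *sk->info_flags;
	return 0;
}

static int demo_getsockopt(const struct demo_sock *sk, void *optval,
			   size_t *optlen)
{
	struct demo_authopt info;
	int err;

	err = demo_get_authopt_val(sk, &info);
	if (err)
		return err; /* propagate; nothing is copied out */

	if (*optlen > sizeof(info))
		*optlen = sizeof(info);
	memcpy(optval, &info, *optlen);
	return 0;
}
```

On the failure path the caller's buffer and length are left untouched,
which is one possible convention and is an assumption of this sketch.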

>> +#define TCP_AUTHOPT_KNOWN_FLAGS ( \
>> +	TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED)
>> +
>> +int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
>> +{
>> +	struct tcp_authopt opt;
>> +	struct tcp_authopt_info *info;
>> +
>> +	sock_owned_by_me(sk);
>> +
>> +	/* If userspace optlen is too short fill the rest with zeros */
>> +	if (optlen > sizeof(opt))
>> +		return -EINVAL;
> 
> More like
> :	if (unlikely(len > sizeof(opt))) {
> :		err = check_zeroed_user(optval + sizeof(opt),
> :					len - sizeof(opt));
> :		if (err < 1)
> :			return err == 0 ? -EINVAL : err;
> :		len = sizeof(opt);
> :		if (put_user(len, optlen))
> :			return -EFAULT;
> :	}

optlen > sizeof(opt) means userspace is attempting to use a newer
ABI. The current behavior is to return an error, which seems very
reasonable.

You seem to be suggesting that we check whether the rest of the option
is zeroes and continue if it is. That seems potentially dangerous, but
it could work if we forever ensure that zeroes always mean "no effect".

This would make it easier for new apps to run on old kernels: unless 
they specifically use new features they don't need to do anything.

Also, setsockopt can't report a new length back and there's no 
getsockopt for keys.
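
The "trailing zeroes mean no effect" rule can be sketched in userspace
as below. demo_copy_opt is a hypothetical helper, not a kernel API; it
uses plain memcpy where the kernel would use its user-copy helpers, and
it returns -EINVAL for non-zero trailing bytes as in the snippet quoted
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a caller-supplied option of usize bytes into a kernel-side
 * structure of ksize bytes.
 *  - shorter input: the remainder of dst is zero-filled
 *  - longer input:  accepted only if every byte past ksize is zero
 */
static int demo_copy_opt(void *dst, size_t ksize,
			 const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t n = usize < ksize ? usize : ksize;

	assert(dst && src);

	/* Reject fields this version does not know unless they are unset. */
	for (size_t i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -EINVAL;

	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}
```

A newer application passing a larger struct would keep working on an
older kernel as long as it leaves the new fields zeroed, and would get
an error as soon as it sets one of them.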

>> +	memset(&opt, 0, sizeof(opt));
>> +	if (copy_from_sockptr(&opt, optval, optlen))
>> +		return -EFAULT;
>> +
>> +	if (opt.flags & ~TCP_AUTHOPT_KNOWN_FLAGS)
>> +		return -EINVAL;

Here if the user requests unrecognized flags an error is reported. My 
intention is that new fields will be accompanied by new flags.

>> +	info = __tcp_authopt_info_get_or_create(sk);
>> +	if (IS_ERR(info))
>> +		return PTR_ERR(info);
>> +
>> +	info->flags = opt.flags & TCP_AUTHOPT_KNOWN_FLAGS;
>> +
>> +	return 0;
>> +}
> 
> [..]
>> +int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
>> +{
>> +	struct tcp_authopt_key opt;
>> +	struct tcp_authopt_info *info;
>> +	struct tcp_authopt_key_info *key_info;
>> +
>> +	sock_owned_by_me(sk);
>> +
>> +	/* If userspace optlen is too short fill the rest with zeros */
>> +	if (optlen > sizeof(opt))
>> +		return -EINVAL;
> 
> Ditto
> 
>> +	memset(&opt, 0, sizeof(opt));
>> +	if (copy_from_sockptr(&opt, optval, optlen))
>> +		return -EFAULT;
>> +
>> +	if (opt.flags & ~TCP_AUTHOPT_KEY_KNOWN_FLAGS)
>> +		return -EINVAL;
>> +
>> +	if (opt.keylen > TCP_AUTHOPT_MAXKEYLEN)
>> +		return -EINVAL;
>> +
>> +	/* Delete is a special case: */
>> +	if (opt.flags & TCP_AUTHOPT_KEY_DEL) {
>> +		info = rcu_dereference_check(tcp_sk(sk)->authopt_info, lockdep_sock_is_held(sk));
>> +		if (!info)
>> +			return -ENOENT;
>> +		key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
>> +		if (!key_info)
>> +			return -ENOENT;
>> +		tcp_authopt_key_del(sk, info, key_info);
> 
> Doesn't seem to be safe together with tcp_authopt_select_key().
> A key can be in use at this moment - you have to add checks for it.

tcp_authopt_key_del does kfree_rcu. As far as I understand this means 
that if select_key can see the key it is guaranteed to live until the 
next grace period, which shouldn't be until after the packet is signed.

I will attempt to document this restriction on tcp_authopt_select_key: 
you can't do anything with the key except give it to tcp_authopt_hash 
before an RCU grace period.

I'm not confident this is correct in all cases. It's inspired by what 
MD5 does but apparently those key lists are protected by a combination 
of sk_lock and rcu?

>> +		return 0;
>> +	}
>> +
>> +	/* check key family */
>> +	if (opt.flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
>> +		if (sk->sk_family != opt.addr.ss_family)
>> +			return -EINVAL;
>> +	}
>> +
>> +	/* Initialize tcp_authopt_info if not already set */
>> +	info = __tcp_authopt_info_get_or_create(sk);
>> +	if (IS_ERR(info))
>> +		return PTR_ERR(info);
>> +
>> +	/* If an old key exists with exact ID then remove and replace.
>> +	 * RCU-protected readers might observe both and pick any.
>> +	 */
>> +	key_info = tcp_authopt_key_lookup_exact(sk, info, &opt);
>> +	if (key_info)
>> +		tcp_authopt_key_del(sk, info, key_info);
>> +	key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
>> +	if (!key_info)
>> +		return -ENOMEM;
> 
> So, you may end up without any key.

Moving the sock_kmalloc above the delete should fix this: an allocation
failure would then leave the existing key untouched.
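
A minimal userspace sketch of that ordering follows. All demo_* names
are hypothetical, and free() stands in for the RCU-deferred free the
kernel code would use. The allocator is passed in only so that a
failure can be simulated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical key record; `secret` stands in for the key material. */
struct demo_mkt {
	int secret;
};

/* Allocator that always fails, to exercise the error path. */
static void *demo_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Replace the key in *slot. The new key is allocated before the old
 * one is removed, so -ENOMEM leaves *slot exactly as it was.
 */
static int demo_replace_key(struct demo_mkt **slot, int secret,
			    void *(*alloc)(size_t))
{
	struct demo_mkt *new_key;

	assert(slot && alloc);

	new_key = alloc(sizeof(*new_key));
	if (!new_key)
		return -ENOMEM; /* old key is still installed */

	new_key->secret = secret;
	free(*slot); /* the kernel would defer this past an RCU grace period */
	*slot = new_key;
	return 0;
}
```

Calling demo_replace_key(&slot, 2, demo_failing_alloc) on a slot that
already holds a key returns -ENOMEM and the slot still points at the
old key.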

> Also, replacing a key is not at all safe: you may receive old segments
> which you in turn will discard and reset the connection.
>
> I think the limitations the RFC puts on removing keys in use and replacing
> existing keys are actually reasonable. Probably, it'd be better to
> enforce "key in use => desired key is different (or key_outdated flag)
> => key not in use => key may be removed" life-cycle of MKT.

Userspace breaking its own connections seems fine; it can already do
this in many ways.

If the current key is removed the kernel will just switch to another
valid key. If no valid keys exist then I expect it will switch to
unsigned packets, which is possibly quite dangerous.

Maybe it should be possible to insert a "marker" key which just says 
"don't do any unsigned traffic with this peer"?

--
Regards,
Leonard


Thread overview: 31+ messages
2021-08-24 21:34 [RFCv3 00/15] tcp: Initial support for RFC5925 auth option Leonard Crestez
2021-08-24 21:34 ` [RFCv3 01/15] tcp: authopt: Initial support and key management Leonard Crestez
2021-08-31 19:04   ` Dmitry Safonov
2021-09-03 14:26     ` Leonard Crestez
2021-08-24 21:34 ` [RFCv3 02/15] docs: Add user documentation for tcp_authopt Leonard Crestez
2021-08-24 21:34 ` [RFCv3 03/15] selftests: Initial tcp_authopt test module Leonard Crestez
2021-08-24 21:34 ` [RFCv3 04/15] selftests: tcp_authopt: Initial sockopt manipulation Leonard Crestez
2021-08-24 21:34 ` [RFCv3 05/15] tcp: authopt: Add crypto initialization Leonard Crestez
2021-08-24 23:02   ` Eric Dumazet
2021-08-24 23:34   ` Eric Dumazet
2021-08-25  8:08     ` Herbert Xu
2021-08-25 14:55       ` Eric Dumazet
2021-08-25 16:04       ` Ard Biesheuvel
2021-08-25 16:31         ` Leonard Crestez
2021-08-25 16:35     ` Leonard Crestez
2021-08-25 17:55       ` Eric Dumazet
2021-08-25 18:56         ` Leonard Crestez
2021-08-24 21:34 ` [RFCv3 06/15] tcp: authopt: Compute packet signatures Leonard Crestez
2021-08-24 21:34 ` [RFCv3 07/15] tcp: authopt: Hook into tcp core Leonard Crestez
2021-08-24 22:59   ` Eric Dumazet
2021-08-25 16:32     ` Leonard Crestez
2021-08-24 21:34 ` [RFCv3 08/15] tcp: authopt: Add snmp counters Leonard Crestez
2021-08-24 21:34 ` [RFCv3 09/15] selftests: tcp_authopt: Test key address binding Leonard Crestez
2021-08-25  5:18   ` David Ahern
2021-08-25 16:37     ` Leonard Crestez
2021-08-24 21:34 ` [RFCv3 10/15] selftests: tcp_authopt: Capture and verify packets Leonard Crestez
2021-08-24 21:34 ` [RFCv3 11/15] selftests: Initial tcp_authopt support for nettest Leonard Crestez
2021-08-24 21:34 ` [RFCv3 12/15] selftests: Initial tcp_authopt support for fcnal-test Leonard Crestez
2021-08-24 21:34 ` [RFCv3 13/15] selftests: Add -t tcp_authopt option for fcnal-test.sh Leonard Crestez
2021-08-24 21:34 ` [RFCv3 14/15] tcp: authopt: Add key selection controls Leonard Crestez
2021-08-24 21:34 ` [RFCv3 15/15] selftests: tcp_authopt: Add tests for rollover Leonard Crestez
