From: David Ahern <dsahern@gmail.com>
To: Dmitry Safonov <0x7f454c46@gmail.com>,
Leonard Crestez <cdleonard@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>,
Herbert Xu <herbert@gondor.apana.org.au>,
Kuniyuki Iwashima <kuniyu@amazon.co.jp>,
David Ahern <dsahern@kernel.org>,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
Jakub Kicinski <kuba@kernel.org>,
Yuchung Cheng <ycheng@google.com>,
Francesco Ruggeri <fruggeri@arista.com>,
Mat Martineau <mathew.j.martineau@linux.intel.com>,
Christoph Paasch <cpaasch@apple.com>,
Ivan Delalande <colona@arista.com>,
Priyaranjan Jha <priyarjha@google.com>,
Menglong Dong <dong.menglong@zte.com.cn>,
open list <linux-kernel@vger.kernel.org>,
linux-crypto@vger.kernel.org,
Network Development <netdev@vger.kernel.org>,
Dmitry Safonov <dima@arista.com>
Subject: Re: [RFCv2 1/9] tcp: authopt: Initial support and key management
Date: Wed, 11 Aug 2021 11:15:52 -0600 [thread overview]
Message-ID: <ac911d47-eef7-c97b-9a77-f386546b56e8@gmail.com> (raw)
In-Reply-To: <68749e37-8e29-7a51-2186-7692f5fd6a79@gmail.com>
On 8/11/21 8:31 AM, Dmitry Safonov wrote:
> On 8/11/21 9:29 AM, Leonard Crestez wrote:
>> On 8/10/21 11:41 PM, Dmitry Safonov wrote:
> [..]
>>>> + u32 flags;
>>>> + /* Wire identifiers */
>>>> + u8 send_id, recv_id;
>>>> + u8 alg_id;
>>>> + u8 keylen;
>>>> + u8 key[TCP_AUTHOPT_MAXKEYLEN];
>>>> + struct rcu_head rcu;
>>>
>>> This is unaligned and will add padding.
>>
>> Not clear padding matters. Moving rcu_head higher would avoid it, is
>> that what you're suggesting.
>
> Yes.
>
>>> I wonder if it's also worth saving some bytes by something like
>>> struct tcp_ao_key *key;
>>>
>>> With
>>> struct tcp_ao_key {
>>> u8 keylen;
>>> u8 key[0];
>>> };
>>>
>>> Hmm?
>>
>> This increases memory management complexity for very minor gains. Very
>> few tcp_authopt_key will ever be created.
>
> The change doesn't seem to be big, like:
> --- a/net/ipv4/tcp_authopt.c
> +++ b/net/ipv4/tcp_authopt.c
> @@ -422,8 +422,16 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t
> optval, unsig>
> key_info = __tcp_authopt_key_info_lookup(sk, info, opt.local_id);
> if (key_info)
> tcp_authopt_key_del(sk, info, key_info);
> +
> + key = sock_kmalloc(sk, sizeof(*key) + opt.keylen, GFP_KERNEL |
> __GFP_ZERO);
> + if (!key) {
> + tcp_authopt_alg_release(alg);
> + return -ENOMEM;
> + }
> +
> key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL |
> __GFP_ZERO);
> if (!key_info) {
> + sock_kfree_s(sk, key, sizeof(*key) + opt.keylen);
> tcp_authopt_alg_release(alg);
> return -ENOMEM;
> }
>
> I don't know, probably it'll be enough for every user to limit their
> keys by length of 80, but if one day it won't be enough - this ABI will
> be painful to extend.
>
> [..]
>>>> +#define TCP_AUTHOPT 38 /* TCP Authentication
>>>> Option (RFC2385) */
>>>> +#define TCP_AUTHOPT_KEY 39 /* TCP Authentication
>>>> Option update key (RFC2385) */
>>>
>>> RFC2385 is md5 one.
>>> Also, should there be TCP_AUTHOPT_ADD_KEY, TCP_AUTHOPT_DELTE_KEY
>>> instead of using flags inside setsocketopt()? (no hard feelings)
>>
>> Fixed RFC reference.
>>
>> TCP_AUTHOPT_DELETE_KEY could be clearer, I just wanted to avoid bloating
>> the sockopt space. But there's not any scarcity.
>>
>> For reference tcp_md5 handles key deletion based on keylen == 0. This
>> seems wrong to me, an empty key is in fact valid though not realistic.
>>
>> If local_id is removed in favor of "full match on id and binding" then
>> the delete sockopt would still need most of a full struct
>> tcp_authopt_key anyway.
>
> Sounds like a plan.
>
> [..]>> I'm not sure what's the use of enum here, probably: > #define
>>> TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED BIT(2)
>>
>> This is an enum because it looks nice in kernel-doc. I couldn't find a
>> way to attach docs to a macro and include it somewhere else.
>
> Yeah, ok, seems like good justification.
>
>> BTW, the enum gains more members later.
>>
>> As for BIT() it doesn't see to be allowed in uapi and there were recent
>> changes removing such usage.
>
> Ok, I just saw it's still used in include/uapi, but not aware of the
> removal.
>
>>
>>> [..]
>>>> +struct tcp_authopt_key {
>>>> + /** @flags: Combination of &enum tcp_authopt_key_flag */
>>>> + __u32 flags;
>>>> + /** @local_id: Local identifier */
>>>> + __u32 local_id;
>>>> + /** @send_id: keyid value for send */
>>>> + __u8 send_id;
>>>> + /** @recv_id: keyid value for receive */
>>>> + __u8 recv_id;
>>>> + /** @alg: One of &enum tcp_authopt_alg */
>>>> + __u8 alg;
>>>> + /** @keylen: Length of the key buffer */
>>>> + __u8 keylen;
>>>> + /** @key: Secret key */
>>>> + __u8 key[TCP_AUTHOPT_MAXKEYLEN];
>>>> + /**
>>>> + * @addr: Key is only valid for this address
>>>> + *
>>>> + * Ignored unless TCP_AUTHOPT_KEY_ADDR_BIND flag is set
>>>> + */
>>>> + struct __kernel_sockaddr_storage addr;
>>>> +};
>>>
>>> It'll be an ABI if this is accepted. As it is - it can't support
>>> RFC5925 fully.
>>> Extending syscall ABI is painful. I think, even the initial ABI *must*
>>> support
>>> all possible features of the RFC.
>>> In other words, there must be src_addr, src_port, dst_addr and dst_port.
>>> I can see from docs you've written you don't want to support a mix of
>>> different
>>> addr/port MKTs, so you can return -EINVAL or -ENOTSUPP for any value
>>> but zero.
>>> This will create an ABI that can be later extended to fully support RFC.
>>
>> RFC states that MKT connection identifiers can be specified using ranges
>> and wildcards and the details are up to the implementation. Keys are
>> *NOT* just bound to a classical TCP 4-tuple.
>>
>> * src_addr and src_port is implicit in socket binding. Maybe in theory
>> src_addr they could apply for a server socket bound to 0.0.0.0:PORT but
>> userspace can just open more sockets.
>> * dst_port is not supported by MD5 and I can't think of any useful
>> usecase. This is either well known (179 for BGP) or auto-allocated.
>> * tcp_md5 was recently enhanced allow a "prefixlen" for addr and
>> "l3mdev" ifindex binding.
>>
>> This last point shows that the binding features people require can't be
>> easily predicted in advance so it's better to allow the rules to be
>> extended.
>
> Yeah, I see both changes you mention went on easy way as they reused
> existing paddings in the ABI structures.
> Ok, if you don't want to reserve src_addr/src_port/dst_addr/dst_port,
> than how about reserving some space for those instead?
>
>>> The same is about key: I don't think you need to define/use
>>> tcp_authopt_alg.
>>> Just use algo name - that way TCP-AO will automatically be able to use
>>> any algo supported by crypto engine.
>>> See how xfrm does it, e.g.:
>>> struct xfrm_algo_auth {
>>> char alg_name[64];
>>> unsigned int alg_key_len; /* in bits */
>>> unsigned int alg_trunc_len; /* in bits */
>>> char alg_key[0];
>>> };
>>>
>>> So you can let a user chose maclen instead of hard-coding it.
>>> Much more extendable than what you propose.
>>
>> This complicates ABI and implementation with features that are not
>> needed. I'd much rather only expose an enum of real-world tcp-ao
>> algorithms.
>
> I see it exactly the opposite way: a new enum unnecessary complicates
> ABI, instead of passing alg_name[] to crypto engine. No need to add any
> support in tcp-ao as the algorithms are already provided by kernel.
> That is how it transparently works for ipsec, why not for tcp-ao?
>
>>
>>> [..]
>>>> +#ifdef CONFIG_TCP_AUTHOPT
>>>> + case TCP_AUTHOPT: {
>>>> + struct tcp_authopt info;
>>>> +
>>>> + if (get_user(len, optlen))
>>>> + return -EFAULT;
>>>> +
>>>> + lock_sock(sk);
>>>> + tcp_get_authopt_val(sk, &info);
>>>> + release_sock(sk);
>>>> +
>>>> + len = min_t(unsigned int, len, sizeof(info));
>>>> + if (put_user(len, optlen))
>>>> + return -EFAULT;
>>>> + if (copy_to_user(optval, &info, len))
>>>> + return -EFAULT;
>>>> + return 0;
>>>> + }
>>>
>>> I'm pretty sure it's not a good choice to write partly tcp_authopt.
>>> And a user has no way to check what's the correct len on this kernel.
>>> Instead of len = min_t(unsigned int, len, sizeof(info)), it should be
>>> if (len != sizeof(info))
>>> return -EINVAL;
>>
>> Purpose is to allow sockopts to grow as md5 has grown.
>
> md5 has not grown. See above.
MD5 uapi has - e.g., 8917a777be3ba and 6b102db50cdde. We want similar
capabilities for growth with this API.
>
> Another issue with your approach
>
> + /* If userspace optlen is too short fill the rest with zeros */
> + if (optlen > sizeof(opt))
> + return -EINVAL;
> + memset(&opt, 0, sizeof(opt));
> + if (copy_from_sockptr(&opt, optval, optlen))
> + return -EFAULT;
>
> is that userspace compiled with updated/grew structure will fail on
> older kernel. So, no extension without breaking something is possible.
> Which also reminds me that currently you don't validate (opt.flags) for
> unknown by kernel flags.
>
> Extending syscalls is impossible without breaking userspace if ABI is
> not designed with extensibility in mind. That was quite a big problem,
> and still is. Please, read this carefully:
> https://lwn.net/Articles/830666/
>
> That is why I'm suggesting you all those changes that will be harder to
> fix when/if your patches get accepted.
> As an example how it should work see in copy_clone_args_from_user().
>
Look at how TCP_ZEROCOPY_RECEIVE has grown over releases as an example
of how to properly handle this.
next prev parent reply other threads:[~2021-08-11 17:16 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-09 21:35 [RFCv2 0/9] tcp: Initial support for RFC5925 auth option Leonard Crestez
2021-08-09 21:35 ` [RFCv2 1/9] tcp: authopt: Initial support and key management Leonard Crestez
2021-08-10 20:41 ` Dmitry Safonov
2021-08-11 8:29 ` Leonard Crestez
2021-08-11 13:42 ` David Ahern
2021-08-11 19:11 ` Leonard Crestez
2021-08-11 20:26 ` Dmitry Safonov
2021-08-11 20:26 ` David Ahern
2021-08-11 14:31 ` Dmitry Safonov
2021-08-11 17:15 ` David Ahern [this message]
2021-08-11 20:12 ` Dmitry Safonov
2021-08-11 20:23 ` David Ahern
2021-08-11 19:08 ` Leonard Crestez
2021-08-12 19:46 ` Leonard Crestez
2021-08-09 21:35 ` [RFCv2 2/9] docs: Add user documentation for tcp_authopt Leonard Crestez
2021-08-09 21:35 ` [RFCv2 3/9] tcp: authopt: Add crypto initialization Leonard Crestez
2021-08-09 21:35 ` [RFCv2 4/9] tcp: authopt: Compute packet signatures Leonard Crestez
2021-08-09 21:35 ` [RFCv2 5/9] tcp: authopt: Hook into tcp core Leonard Crestez
2021-08-09 21:35 ` [RFCv2 6/9] tcp: authopt: Add key selection controls Leonard Crestez
2021-08-09 21:35 ` [RFCv2 7/9] tcp: authopt: Add snmp counters Leonard Crestez
2021-08-09 21:35 ` [RFCv2 8/9] selftests: Initial TCP-AO support for nettest Leonard Crestez
2021-08-09 21:35 ` [RFCv2 9/9] selftests: Initial TCP-AO support for fcnal-test Leonard Crestez
2021-08-11 13:46 ` David Ahern
2021-08-11 19:09 ` Leonard Crestez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ac911d47-eef7-c97b-9a77-f386546b56e8@gmail.com \
--to=dsahern@gmail.com \
--cc=0x7f454c46@gmail.com \
--cc=cdleonard@gmail.com \
--cc=colona@arista.com \
--cc=cpaasch@apple.com \
--cc=davem@davemloft.net \
--cc=dima@arista.com \
--cc=dong.menglong@zte.com.cn \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=fruggeri@arista.com \
--cc=herbert@gondor.apana.org.au \
--cc=kuba@kernel.org \
--cc=kuniyu@amazon.co.jp \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathew.j.martineau@linux.intel.com \
--cc=netdev@vger.kernel.org \
--cc=priyarjha@google.com \
--cc=ycheng@google.com \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).