All of
 help / color / mirror / Atom feed
From: Florian Westphal <>
Subject: Revised mptcp_info sockopt and selftests
Date: Thu, 29 Jul 2021 13:50:00 +0200	[thread overview]
Message-ID: <YQKV6GJe/> (raw)


I have reworked the MPTCP_INFO getsockopt.

I think that for now its better to focus on the UAPI rather than the

Here is a minimal usage example to illustrate the feature:

UAPI structures:
struct mptcp_info_opt_ptr {
	__aligned_u64	address;			/* userspace address */
	__u32		nmemb_user;			/* number of elements in address[] */
	__u32		size_user;			/* size of one element in address[] */
	__u32		nmemb_kernel;			/* nmemb count kernel wanted to store */
	__u32		size_kernel;			/* structure size in kernel */

struct mptcp_info_opt {
	struct mptcp_info_opt_ptr mptcp_info;		/* struct mptcp_info */
	struct mptcp_info_opt_ptr tcp_info;		/* struct tcp_info[] of each subflow */
	struct mptcp_info_opt_ptr socknames;		/* struct sockaddr_storage[] */
	struct mptcp_info_opt_ptr peernames;		/* struct sockaddr_storage[] */

/* Sample call, error checks removed */
void test_info_getsockopt(int fd)
	struct sockaddr_storage remotes[16], locals[16];
	struct mptcp_info mptcp_info;
	struct tcp_info tcpinfo[16];
	struct mptcp_info_opt m;
	socklen_t optlen = sizeof(m);
	uint32_t i;
	int err;

	memset(&m, 0, sizeof(m));
	m.mptcp_info.size_user = sizeof(mptcp_info);
	m.mptcp_info.address = (uint64_t)&mptcp_info;

	m.tcp_info.address = (uint64_t)tcpinfo;
	m.tcp_info.size_user = sizeof(tcpinfo[0]);
	m.tcp_info.nmemb_user = 16;

	m.socknames.size_user = sizeof(locals[0]);
	m.socknames.nmemb_user = 16;
	m.peernames.size_user = sizeof(remotes[0]);
	m.peernames.nmemb_user = 16;
	m.socknames.address = (uint64_t)locals;
	m.peernames.address = (uint64_t)remotes;

	err = getsockopt(fd, SOL_MPTCP, MPTCP_INFO, &m, &optlen);

	printf("have %u subflows, token is %x\n", mptcp_info.mptcpi_subflows, mptcp_info.mptcpi_token);

	for (i = 0; i < m.socknames.nmemb_user; i++) {
		printf("subflow %02d: ", i);
		fprintf_mptcp_info(stdout, (void *)&locals[i], (void *)&remotes[i]);

	printf("... dumped %u of %u addresses.\n",
			m.socknames.nmemb_user, m.socknames.nmemb_kernel);

	for (i = 0; i < m.tcp_info.nmemb_user; i++)

As you can see, I removed 'byte' based sizes, i found this to be
unecessarily tedious as the kernel then has to subtract from a
'remaining buffer size', while making sure that this can't underflow.

With number-of-elements, its enough to 'remaining--' and "ptr += size_user".

Userspace passes the address of the structure(s), the size of a single
structure that it expects to get and the number of structures it has
allocated (16 in above example).

The kernel will:
1. not change, or shrink the number of structures.
   This means that, in above example, nmemb_user will be anywhere
   between 0 and 16 inclusive.

   0 would happen in the 'no subflows' case, when nothing was filled in.
2. Not change, or shrink, size_user.  This happens for instance
   if userspace has a larger 'struct tcp_info' than the kernel,
   in this case size_user is updated to reflect the real element size.

3. update nmemb_kernel to reflect the structure size it *WOULD* have
   filled. For example, if userspace tcp_info struct is smaller than the
   kernel, then size_user < size_kernel.

4. update size_kernel with the number of elements it WOULD have filled
   if there would have been more space.

So, in above example, if there are 17 subflows then
socknames.nmemb_user == 16 and socknames.nmemb_kernel == 17.

Or, in short:
The _kernel members are only filled by kernel, the value provided by
userspace are ignored.

The _user members are input/output: userspace tells kernel how much
space it provides, kernel updates it to reflect how much it ended up

Is that sane?  Any suggestions?

On a related note, how should the test infra look like?
kunit isn't useable as i would prefer to test the UAPI, so that means

I would prefer not to extend mptcp_connect again.
Would you be ok with a new test program?

Any preferences?  If not, i would go with mptcp_getsockopt.c +

The latter would then call the former with various possible
combinations, such as 'ipv4', 'ipv6', 'only ask for tcp_info', etc.


             reply	other threads:[~2021-07-29 11:50 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-29 11:50 Florian Westphal [this message]
2021-07-30 23:43 ` Revised mptcp_info sockopt and selftests Mat Martineau
2021-07-31 16:14   ` Mat Martineau
2021-08-02 14:51   ` Florian Westphal
2021-08-02 22:23     ` Mat Martineau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YQKV6GJe/ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.