linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fw@deneb.enyo.de>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	paulmck <paulmck@linux.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Paul Turner <pjt@google.com>,
	linux-api <linux-api@vger.kernel.org>, carlos <carlos@redhat.com>
Subject: Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
Date: Wed, 15 Jul 2020 11:10:47 -0400 (EDT)	[thread overview]
Message-ID: <1944484270.14303.1594825847550.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20200715123323.vrmblagnvkdp4pyy@wittgenstein>

----- On Jul 15, 2020, at 8:33 AM, Christian Brauner christian.brauner@ubuntu.com wrote:
[...]
> 
> So here's a very free-wheeling draft of roughly what I had in mind. Not
> even compile-tested just to illustrate what I'd change and sorry if that
> code will make you sob in your hands:
> 
[...]
> +	/*
> +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> +	 * statically initialized to offsetof(struct rseq, end).
> +	 */
> +	__u32 user_size;
> +	/*
> +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> +	 * extensible struct rseq ABI, the kernel_size field is populated by
> +	 * the kernel to the minimum between user_size and the offset of the
> +	 * "end" field within the struct rseq supported by the kernel on
> +	 * successful registration. Should be initialized to 0.
> +	 */
> +	__u32 kernel_size;

Moving from __u16 to __u32 for both fields don't achieve much, and increase
the size of struct rseq (excluding padding) from 24 bytes to 28 bytes.

Note that the struct rseq alignment is 32 bytes. At 24 bytes, it leaves room
for exactly one 8 bytes pointer, which can be useful for future extensions.
If the size is increased to 28 bytes or more, then we're done and cannot
add a pointer.

> +	__u32 active_size;

This additional field takes the very last bytes of padding we have in the
current layout.

> } __attribute__((aligned(4 * sizeof(__u64))));
> 
> +#define RSEQ_SIZE_VER0 24 /* sizeof first published struct */

This is incorrect. The sizeof(struct_rseq) with its 32 bytes alignment is 32,
not 24. The padding at the end of the structure is considered as part of its
size, but we cannot rely on its content being zero-initialized based on the
C standard.

> +#define RSEQ_SIZE_VER1 32 /* sizeof second published struct */
> +#define RSEQ_SIZE_LATEST RSEQ_SIZE_VER1 /* sizeof last published struct */
> +

[...]

> @@ -349,10 +363,58 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32,
> rseq_len,
> 	 * ensure the provided rseq is properly aligned and valid.
> 	 */
> 	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
> -	    rseq_len != sizeof(*rseq))
> +	    rseq_len < RSEQ_SIZE_VER0)

This could perhaps be changed for future kernels, but will break for existing
kernels as soon as rseq_len is increased. This is something we should have
planned for in the initial implementation of the system call, but here we are.

How do you envision that userspace would handle this failure from older kernels ?
Try again with a second system call passing RSEQ_SIZE_VER0 as argument ?

> 		return -EINVAL;
> 	if (!access_ok(rseq, rseq_len))
> 		return -EFAULT;
> +
> +	/* Handle extensible struct rseq ABI. */
> +	ret = get_user(tls_flags, &rseq->flags);
> +	if (ret)
> +		return ret;
> +	if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
> +		u32 user_size, kernel_size, active_size;
> +
> +		/* Can probably be made nicer by using check_zeroed_user(). */
> +		ret = get_user(user_size, &rseq->user_size);
> +		if (ret)
> +			return ret;
> +		if (user_size != 0)
> +			return -EINVAL;
> +
> +		ret = get_user(active_size, &rseq->active_size);
> +		if (ret)
> +			return ret;
> +		if (active_size != 0)
> +			return -EINVAL;
> +
> +		ret = get_user(active_size, &rseq->kernel_size);

I guess you mean kernel_size here.

> +		if (ret)
> +			return ret;
> +		if (kernel_size != 0)
> +			return -EINVAL;
> +
> +		/* Calculate the useable size. */
> +		active_size = min_t(u32, rseq_len, RSEQ_SIZE_LATEST);

Where is the rseq_len supposed to come from in userspace ? Should it be
that the code doing the registration uses sizeof(struct rseq), or offsetof(struct rseq, end),
or should it read the content of __rseq_abi.user_size ?

> +		ret = put_user(active_size, &rseq->active_size);
> +		if (ret)
> +			return ret;
> +
> +		/* Let other users know what userspace used to register. */
> +		ret = put_user(rseq_len, &rseq->user_size);
> +		if (ret)
> +			return -EFAULT;
> +
> +		/* Let other users know what size the kernel supports. */

I am not sure what those 3 __u32 fields (user_size, kernel_size, and active_size),
plus use of the rseq_len syscall parameter, accomplish which was not accomplished
by my __u16 user_size + kernel_size approach ? If anything, it seems to make support
of older kernels which do not support an extended rseq_len parameter more complex.

Thanks,

Mathieu


> +		ret = put_user(RSEQ_SIZE_LATEST, &rseq->kernel_size);
> +		if (ret)
> +			return -EFAULT;
> +
> +		current->rseq_size = active_size;
> +	} else {
> +		current->rseq_size = RSEQ_SIZE_VER0;
> +	}
> +
> 	current->rseq = rseq;
> 	current->rseq_sig = sig;
> 	/*
> --
> 2.27.0

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2020-07-15 15:10 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
2020-07-14  3:03 ` [RFC PATCH 1/4] selftests: rseq: Use fixed value as rseq_len parameter Mathieu Desnoyers
2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
2020-07-14  9:58   ` Florian Weimer
2020-07-14 12:50     ` Mathieu Desnoyers
2020-07-14 13:00       ` Florian Weimer
2020-07-14 13:19         ` Mathieu Desnoyers
2020-07-14 21:30           ` Carlos O'Donell
2020-07-15 13:12             ` Mathieu Desnoyers
2020-07-15 13:22               ` Florian Weimer
2020-07-15 13:31                 ` Mathieu Desnoyers
2020-07-15 13:42                   ` Florian Weimer
2020-07-15 13:55                     ` Christian Brauner
2020-07-15 14:20                       ` Mathieu Desnoyers
2020-07-15 14:54                     ` Mathieu Desnoyers
2020-07-15 14:58                       ` Florian Weimer
2020-07-15 15:26                         ` Mathieu Desnoyers
2020-07-14 17:24   ` Peter Oskolkov
2020-07-14 17:43     ` Mathieu Desnoyers
2020-07-14 18:33       ` Peter Oskolkov
2020-07-15  2:34         ` Chris Kennelly
2020-07-15  6:31           ` Florian Weimer
2020-07-15 10:59             ` Christian Brauner
2020-07-15 14:38             ` Mathieu Desnoyers
2020-07-15 14:50           ` Mathieu Desnoyers
2020-07-15 11:38   ` Christian Brauner
2020-07-15 12:33     ` Christian Brauner
2020-07-15 15:10       ` Mathieu Desnoyers [this message]
2020-07-15 15:33         ` Christian Brauner
2020-07-14  3:03 ` [RFC PATCH 3/4] selftests: rseq: define __rseq_abi with extensible size Mathieu Desnoyers
2020-07-14  3:03 ` [RFC PATCH 4/4] selftests: rseq: print rseq extensible size in basic test Mathieu Desnoyers
2020-07-14 20:55 ` [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Carlos O'Donell
2020-07-15 13:02   ` Mathieu Desnoyers
2020-07-16 13:39     ` Carlos O'Donell
2020-07-16 14:45       ` Mathieu Desnoyers
2020-07-15 15:12   ` Florian Weimer
2020-07-15 15:32     ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1944484270.14303.1594825847550.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=boqun.feng@gmail.com \
    --cc=carlos@redhat.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=fw@deneb.enyo.de \
    --cc=hpa@zytor.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).