All of lore.kernel.org
 help / color / mirror / Atom feed
From: Aleksa Sarai <cyphar@cyphar.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Daniel Colascione <dancol@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Arnd Bergmann <arnd@arndb.de>,
	David Howells <dhowells@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Christian Brauner <christian@brauner.io>,
	Eric Biederman <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Kees Cook <keescook@chromium.org>, Jann Horn <jannh@google.com>,
	Tycho Andersen <tycho@tycho.ws>,
	David Drysdale <drysdale@google.com>,
	Chanho Min <chanho.min@lge.com>, Oleg Nesterov <oleg@redhat.com>,
	Aleksa Sarai <asarai@suse.de>, Linus Torvalds <torvalds@linux-fo>
Subject: Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall
Date: Thu, 29 Aug 2019 13:19:04 +0000	[thread overview]
Message-ID: <20190829131904.bkbalbtqt6j3gwcp@yavin> (raw)
In-Reply-To: <899401fa-ff0a-2ce9-8826-09904efab2d2@rasmusvillemoes.dk>

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]

On 2019-08-29, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Aleksa Sarai <cyphar@cyphar.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Daniel Colascione <dancol@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Arnd Bergmann <arnd@arndb.de>,
	David Howells <dhowells@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Christian Brauner <christian@brauner.io>,
	Eric Biederman <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Kees Cook <keescook@chromium.org>, Jann Horn <jannh@google.com>,
	Tycho Andersen <tycho@tycho.ws>,
	David Drysdale <drysdale@google.com>,
	Chanho Min <chanho.min@lge.com>, Oleg Nesterov <oleg@redhat.com>,
	Aleksa Sarai <asarai@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-alpha@vger.kernel.org,
	Linux API <linux-api@vger.kernel.org>,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-ia64@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>,
	linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, sparclinux@vger.kernel.org
Subject: Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall
Date: Thu, 29 Aug 2019 23:19:04 +1000	[thread overview]
Message-ID: <20190829131904.bkbalbtqt6j3gwcp@yavin> (raw)
In-Reply-To: <899401fa-ff0a-2ce9-8826-09904efab2d2@rasmusvillemoes.dk>

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]

On 2019-08-29, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Aleksa Sarai <cyphar@cyphar.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Daniel Colascione <dancol@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Arnd Bergmann <arnd@arndb.de>,
	David Howells <dhowells@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Christian Brauner <christian@brauner.io>,
	Eric Biederman <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Kees Cook <keescook@chromium.org>, Jann Horn <jannh@google.com>,
	Tycho Andersen <tycho@tycho.ws>,
	David Drysdale <drysdale@google.com>,
	Chanho Min <chanho.min@lge.com>, Oleg Nesterov <oleg@redhat.com>,
	Aleksa Sarai <asarai@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org,
	linux-alpha@vger.kernel.org,
	Linux API <linux-api@vger.kernel.org>,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-ia64@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, sparclinux@vger.kernel.org
Subject: Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall
Date: Thu, 29 Aug 2019 23:19:04 +1000	[thread overview]
Message-ID: <20190829131904.bkbalbtqt6j3gwcp@yavin> (raw)
In-Reply-To: <899401fa-ff0a-2ce9-8826-09904efab2d2@rasmusvillemoes.dk>

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]

On 2019-08-29, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Aleksa Sarai <cyphar@cyphar.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Daniel Colascione <dancol@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Jeff Layton <jlayton@kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Arnd Bergmann <arnd@arndb.de>,
	David Howells <dhowells@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Christian Brauner <christian@brauner.io>,
	Eric Biederman <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Kees Cook <keescook@chromium.org>, Jann Horn <jannh@google.com>,
	Tycho Andersen <tycho@tycho.ws>,
	David Drysdale <drysdale@google.com>,
	Chanho Min <chanho.min@lge.com>, Oleg Nesterov <oleg@redhat.com>,
	Aleksa Sarai <asarai@suse.de>, Linus Torvalds <torvalds@linux-fo>
Subject: Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall
Date: Thu, 29 Aug 2019 23:19:04 +1000	[thread overview]
Message-ID: <20190829131904.bkbalbtqt6j3gwcp@yavin> (raw)
In-Reply-To: <899401fa-ff0a-2ce9-8826-09904efab2d2@rasmusvillemoes.dk>

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]

On 2019-08-29, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Aleksa Sarai <cyphar@cyphar.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	David Howells <dhowells@redhat.com>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	sparclinux@vger.kernel.org, Shuah Khan <shuah@kernel.org>,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	Daniel Colascione <dancol@google.com>,
	Aleksa Sarai <asarai@suse.de>,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, Kees Cook <keescook@chromium.org>,
	Arnd Bergmann <arnd@arndb.de>, Jann Horn <jannh@google.com>,
	Tycho Andersen <tycho@tycho.ws>,
	linux-m68k@lists.linux-m68k.org,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andy Lutomirski <luto@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	David Drysdale <drysdale@google.com>,
	Christian Brauner <christian@brauner.io>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	linux-parisc@vger.kernel.org,
	Linux API <linux-api@vger.kernel.org>,
	Chanho Min <chanho.min@lge.com>,
	linuxppc-dev@lists.ozlabs.org, Jeff Layton <jlayton@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	linux-alpha@vger.kernel.org,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org
Subject: Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall
Date: Thu, 29 Aug 2019 23:19:04 +1000	[thread overview]
Message-ID: <20190829131904.bkbalbtqt6j3gwcp@yavin> (raw)
In-Reply-To: <899401fa-ff0a-2ce9-8826-09904efab2d2@rasmusvillemoes.dk>

[-- Attachment #1: Type: text/plain, Size: 3601 bytes --]

On 2019-08-29, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Aleksa Sarai <cyphar@cyphar.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	David Howells <dhowells@redhat.com>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	sparclinux@vger.kernel.org, Shuah Khan <shuah@kernel.org>,
	linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
	Daniel Colascione <dancol@google.com>,
	Aleksa Sarai <asarai@suse.de>,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, Kees Cook <keescook@chromium.org>,
	Arnd Bergmann <arnd@arndb.de>, Jann Horn <jannh@google.com>,
	Tycho Andersen <tycho@tycho.ws>,
	linux-m68k@lists.linux-m68k.org,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andy Lutomirski <luto@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	David Drysdale <drysdale@google.com>,
	Christian Brauner <christian@brauner.io>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	linux-parisc@vger.kernel.org,
	Linux API <linux-api@vger.kernel.org>,
	Chanho Min <chanho.min@lge.com>,
	linuxppc-dev@lists.ozlabs.org, Jeff Layton <jlayton@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	linux-alpha@vger.kernel.org,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	containers@lists.linux-foundation.org
Subject: Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall
Date: Thu, 29 Aug 2019 23:19:04 +1000	[thread overview]
Message-ID: <20190829131904.bkbalbtqt6j3gwcp@yavin> (raw)
In-Reply-To: <899401fa-ff0a-2ce9-8826-09904efab2d2@rasmusvillemoes.dk>


[-- Attachment #1.1: Type: text/plain, Size: 3601 bytes --]

On 2019-08-29, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-08-29 13:19 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-20  3:33 [PATCH RESEND v11 0/8] openat2(2) Aleksa Sarai
2019-08-20  3:33 ` Aleksa Sarai
2019-08-20  3:33 ` Aleksa Sarai
2019-08-20  3:33 ` Aleksa Sarai
2019-08-20  3:33 ` Aleksa Sarai
2019-08-20  3:33 ` [PATCH RESEND v11 1/8] namei: obey trailing magic-link DAC permissions Aleksa Sarai
2019-08-20  3:33   ` Aleksa Sarai
2019-08-20  3:33   ` Aleksa Sarai
2019-08-20  3:33   ` Aleksa Sarai
2019-08-20  3:33   ` Aleksa Sarai
2019-08-20  3:33   ` Aleksa Sarai
2019-08-20  3:34 ` [PATCH RESEND v11 2/8] procfs: switch magic-link modes to be more sane Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34 ` [PATCH RESEND v11 3/8] open: O_EMPTYPATH: procfs-less file descriptor re-opening Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34 ` [PATCH RESEND v11 4/8] namei: O_BENEATH-style path resolution flags Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34 ` [PATCH RESEND v11 5/8] namei: LOOKUP_IN_ROOT: chroot-like path resolution Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34 ` [PATCH RESEND v11 6/8] namei: aggressively check for nd->root escape on ".." resolution Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34 ` [PATCH RESEND v11 7/8] open: openat2(2) syscall Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-24 20:17   ` Daniel Colascione
2019-08-24 20:17     ` Daniel Colascione
2019-08-24 20:17     ` Daniel Colascione
2019-08-24 20:17     ` Daniel Colascione
2019-08-24 20:17     ` Daniel Colascione
2019-08-24 20:17     ` Daniel Colascione
2019-08-29 12:15     ` Aleksa Sarai
2019-08-29 12:15       ` Aleksa Sarai
2019-08-29 12:15       ` Aleksa Sarai
2019-08-29 12:15       ` Aleksa Sarai
2019-08-29 12:15       ` Aleksa Sarai
2019-08-29 12:15       ` Aleksa Sarai
2019-08-29 13:05       ` Rasmus Villemoes
2019-08-29 13:05         ` Rasmus Villemoes
2019-08-29 13:05         ` Rasmus Villemoes
2019-08-29 13:05         ` Rasmus Villemoes
2019-08-29 13:05         ` Rasmus Villemoes
2019-08-29 13:05         ` Rasmus Villemoes
2019-08-29 13:19         ` Aleksa Sarai [this message]
2019-08-29 13:19           ` Aleksa Sarai
2019-08-29 13:19           ` Aleksa Sarai
2019-08-29 13:19           ` Aleksa Sarai
2019-08-29 13:19           ` Aleksa Sarai
2019-08-29 13:19           ` Aleksa Sarai
2019-08-26 19:50   ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-26 19:50     ` sbaugh
2019-08-28 15:55     ` Jeff Layton
2019-08-28 15:55       ` Jeff Layton
2019-08-28 15:55       ` Jeff Layton
2019-08-28 15:55       ` Jeff Layton
2019-08-28 20:17       ` Spencer Baugh
2019-08-28 20:17         ` Spencer Baugh
2019-08-28 20:17         ` Spencer Baugh
2019-08-20  3:34 ` [PATCH RESEND v11 8/8] selftests: add openat2(2) selftests Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai
2019-08-20  3:34   ` Aleksa Sarai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190829131904.bkbalbtqt6j3gwcp@yavin \
    --to=cyphar@cyphar.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=asarai@suse.de \
    --cc=ast@kernel.org \
    --cc=bfields@fieldses.org \
    --cc=chanho.min@lge.com \
    --cc=christian@brauner.io \
    --cc=dancol@google.com \
    --cc=dhowells@redhat.com \
    --cc=drysdale@google.com \
    --cc=ebiederm@xmission.com \
    --cc=jannh@google.com \
    --cc=jlayton@kernel.org \
    --cc=keescook@chromium.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=luto@kernel.org \
    --cc=oleg@redhat.com \
    --cc=shuah@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=torvalds@linux-fo \
    --cc=tycho@tycho.ws \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.