All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <christian@brauner.io>
To: Florian Weimer <fweimer@redhat.com>
Cc: viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	jannh@google.com, oleg@redhat.com, tglx@linutronix.de,
	torvalds@linux-foundation.org, arnd@arndb.de, shuah@kernel.org,
	dhowells@redhat.com, tkjos@android.com, ldv@altlinux.org,
	miklos@szeredi.hu, linux-alpha@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
	linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org,
	x86@kernel.org
Subject: Re: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 13:04:39 +0000	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

WARNING: multiple messages have this Message-ID (diff)
From: Christian Brauner <christian@brauner.io>
To: Florian Weimer <fweimer@redhat.com>
Cc: viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	jannh@google.com, oleg@redhat.com, tglx@linutronix.de,
	torvalds@linux-foundation.org, arnd@arndb.de, shuah@kernel.org,
	dhowells@redhat.com, tkjos@android.com, ldv@altlinux.org,
	miklos@szeredi.hu, linux-alpha@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
	linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org,
	x86@kernel.org
Subject: Re: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 15:04:39 +0200	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

WARNING: multiple messages have this Message-ID (diff)
From: christian at brauner.io (Christian Brauner)
Subject: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 15:04:39 +0200	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

WARNING: multiple messages have this Message-ID (diff)
From: christian@brauner.io (Christian Brauner)
Subject: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 15:04:39 +0200	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
Message-ID: <20190521130439.ZYX3SA-fJ-Ks5WA59bcohujkNDdU2V0Va56FRTI5Mbw@z> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019@02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

WARNING: multiple messages have this Message-ID (diff)
From: Christian Brauner <christian@brauner.io>
To: Florian Weimer <fweimer@redhat.com>
Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
	linux-kernel@vger.kernel.org, dhowells@redhat.com,
	linux-kselftest@vger.kernel.org, sparclinux@vger.kernel.org,
	shuah@kernel.org, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, miklos@szeredi.hu, x86@kernel.org,
	torvalds@linux-foundation.org, linux-mips@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, tkjos@android.com, arnd@arndb.de,
	jannh@google.com, linux-m68k@lists.linux-m68k.org,
	viro@zeniv.linux.org.uk, tglx@linutronix.de, ldv@altlinux.org,
	linux-arm-kernel@lists.infradead.org,
	linux-parisc@vger.kernel.org, linux-api@vger.kernel.org,
	oleg@redhat.com, linux-alpha@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 15:04:39 +0200	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

WARNING: multiple messages have this Message-ID (diff)
From: Christian Brauner <christian@brauner.io>
To: Florian Weimer <fweimer@redhat.com>
Cc: linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org,
	linux-kernel@vger.kernel.org, dhowells@redhat.com,
	linux-kselftest@vger.kernel.org, sparclinux@vger.kernel.org,
	shuah@kernel.org, linux-arch@vger.kernel.org,
	linux-s390@vger.kernel.org, miklos@szeredi.hu, x86@kernel.org,
	torvalds@linux-foundation.org, linux-mips@vger.kernel.org,
	linux-xtensa@linux-xtensa.org, tkjos@android.com, arnd@arndb.de,
	jannh@google.com, linux-m68k@lists.linux-m68k.org,
	viro@zeniv.linux.org.uk, tglx@linutronix.de, ldv@altlinux.org,
	linux-arm-kernel@lists.infradead.org,
	linux-parisc@vger.kernel.org, linux-api@vger.kernel.org,
	oleg@redhat.com, linux-alpha@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 15:04:39 +0200	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Christian Brauner <christian@brauner.io>
To: Florian Weimer <fweimer@redhat.com>
Cc: viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	jannh@google.com, oleg@redhat.com, tglx@linutronix.de,
	torvalds@linux-foundation.org, arnd@arndb.de, shuah@kernel.org,
	dhowells@redhat.com, tkjos@android.com, ldv@altlinux.org,
	miklos@szeredi.hu, linux-alpha@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
	linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org,
	x86@kernel.org
Subject: Re: [PATCH 1/2] open: add close_range()
Date: Tue, 21 May 2019 15:04:39 +0200	[thread overview]
Message-ID: <20190521130438.q3u4wvve7p6md6cm@brauner.io> (raw)
In-Reply-To: <87tvdoau12.fsf@oldenburg2.str.redhat.com>

On Tue, May 21, 2019 at 02:09:29PM +0200, Florian Weimer wrote:
> * Christian Brauner:
> 
> > +/**
> > + * __close_range() - Close all file descriptors in a given range.
> > + *
> > + * @fd:     starting file descriptor to close
> > + * @max_fd: last file descriptor to close
> > + *
> > + * This closes a range of file descriptors. All file descriptors
> > + * from @fd up to and including @max_fd are closed.
> > + */
> > +int __close_range(struct files_struct *files, unsigned fd, unsigned max_fd)
> > +{
> > +	unsigned int cur_max;
> > +
> > +	if (fd > max_fd)
> > +		return -EINVAL;
> > +
> > +	rcu_read_lock();
> > +	cur_max = files_fdtable(files)->max_fds;
> > +	rcu_read_unlock();
> > +
> > +	/* cap to last valid index into fdtable */
> > +	if (max_fd >= cur_max)
> > +		max_fd = cur_max - 1;
> > +
> > +	while (fd <= max_fd)
> > +		__close_fd(files, fd++);
> > +
> > +	return 0;
> > +}
> 
> This seems rather drastic.  How long does this block in kernel mode?
> Maybe it's okay as long as the maximum possible value for cur_max stays
> around 4 million or so.

That's probably valid concern when you reach very high numbers though I
wonder how relevant this is in practice.
Also, you would only be blocking yourself I imagine, i.e. you can't DOS
another task with this unless your multi-threaded.

> 
> Solaris has an fdwalk function:
> 
>   <https://docs.oracle.com/cd/E88353_01/html/E37843/closefrom-3c.html>
> 
> So a different way to implement this would expose a nextfd system call

Meh. If nextfd() then I would like it to be able to:
- get the nextfd(fd) >= fd
- get highest open fd e.g. nextfd(-1)

But then I wonder if nextfd() needs to be a syscall and isn't just
either:
fcntl(fd, F_GET_NEXT)?
or
prctl(PR_GET_NEXT)?

Technically, one could also do:

fd_range(unsigned fd, unsigend end_fd, unsigned flags);

fd_range(3, 50, FD_RANGE_CLOSE);

/* return highest fd within the range [3, 50] */
fd_range(3, 50, FD_RANGE_NEXT);

/* return highest fd */
fd_range(3, UINT_MAX, FD_RANGE_NEXT);

This syscall could also reasonably be extended.

> to userspace, so that we can use that to implement both fdwalk and
> closefrom.  But maybe fdwalk is just too obscure, given the existence of
> /proc.

Yeah we probably don't need fdwalk.

> 
> I'll happily implement closefrom on top of close_range in glibc (plus
> fallback for older kernels based on /proc—with an abort in case that
> doesn't work because the RLIMIT_NOFILE hack is unreliable
> unfortunately).
> 
> Thanks,
> Florian

  reply	other threads:[~2019-05-21 13:04 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-21 11:34 [PATCH 1/2] open: add close_range() Christian Brauner
2019-05-21 11:34 ` Christian Brauner
2019-05-21 11:34 ` Christian Brauner
2019-05-21 11:34 ` Christian Brauner
2019-05-21 11:34 ` Christian Brauner
2019-05-21 11:34 ` Christian Brauner
2019-05-21 11:34 ` christian
2019-05-21 11:34 ` Christian Brauner
2019-05-21 11:34 ` [PATCH 2/2] tests: add close_range() tests Christian Brauner
2019-05-21 11:34   ` Christian Brauner
2019-05-21 11:34   ` Christian Brauner
2019-05-21 11:34   ` Christian Brauner
2019-05-21 11:34   ` christian
2019-05-21 11:34   ` Christian Brauner
2019-05-21 12:09 ` [PATCH 1/2] open: add close_range() Florian Weimer
2019-05-21 12:09   ` Florian Weimer
2019-05-21 12:09   ` Florian Weimer
2019-05-21 12:09   ` Florian Weimer
2019-05-21 12:09   ` Florian Weimer
2019-05-21 12:09   ` fweimer
2019-05-21 12:09   ` Florian Weimer
2019-05-21 13:04   ` Christian Brauner [this message]
2019-05-21 13:04     ` Christian Brauner
2019-05-21 13:04     ` Christian Brauner
2019-05-21 13:04     ` Christian Brauner
2019-05-21 13:04     ` Christian Brauner
2019-05-21 13:04     ` christian
2019-05-21 13:04     ` Christian Brauner
2019-05-21 13:10     ` Florian Weimer
2019-05-21 13:10       ` Florian Weimer
2019-05-21 13:10       ` Florian Weimer
2019-05-21 13:10       ` Florian Weimer
2019-05-21 13:10       ` fweimer
2019-05-21 13:10       ` Florian Weimer
2019-05-21 13:18       ` Christian Brauner
2019-05-21 13:18         ` Christian Brauner
2019-05-21 13:18         ` Christian Brauner
2019-05-21 13:18         ` Christian Brauner
2019-05-21 13:18         ` christian
2019-05-21 13:18         ` Christian Brauner
2019-05-21 13:23       ` Christian Brauner
2019-05-21 13:23         ` Christian Brauner
2019-05-21 13:23         ` Christian Brauner
2019-05-21 13:23         ` christian
2019-05-21 13:23         ` Christian Brauner
2019-05-21 13:23         ` Christian Brauner
2019-05-21 13:07 ` Marc Gonzalez
2019-05-21 13:12   ` Christian Brauner
2019-05-21 13:39 ` Rasmus Villemoes
2019-05-21 15:00 ` Al Viro
2019-05-21 15:00   ` Al Viro
2019-05-21 15:00   ` Al Viro
2019-05-21 15:00   ` Al Viro
2019-05-21 15:00   ` viro
2019-05-21 15:00   ` Al Viro
2019-05-21 16:53   ` Christian Brauner
2019-05-21 16:53     ` Christian Brauner
2019-05-21 16:53     ` Christian Brauner
2019-05-21 16:53     ` Christian Brauner
2019-05-21 16:53     ` christian
2019-05-21 16:53     ` Christian Brauner
2019-05-21 16:30 ` David Howells
2019-05-21 16:30   ` David Howells
2019-05-21 16:30   ` David Howells
2019-05-21 16:30   ` David Howells
2019-05-21 16:30   ` dhowells
2019-05-21 16:30   ` David Howells
2019-05-21 16:41   ` Christian Brauner
2019-05-21 16:41     ` Christian Brauner
2019-05-21 16:41     ` Christian Brauner
2019-05-21 16:41     ` Christian Brauner
2019-05-21 16:41     ` christian
2019-05-21 16:41     ` Christian Brauner
2019-05-21 20:23     ` Linus Torvalds
2019-05-21 20:23       ` Linus Torvalds
2019-05-21 20:23       ` Linus Torvalds
2019-05-21 20:23       ` Linus Torvalds
2019-05-21 20:23       ` Linus Torvalds
2019-05-21 20:23       ` torvalds
2019-05-21 20:23       ` Linus Torvalds
2019-05-21 20:23       ` Linus Torvalds
2019-05-22  8:12       ` Christian Brauner
2019-05-22  8:12         ` Christian Brauner
2019-05-22  8:12         ` Christian Brauner
2019-05-22  8:12         ` Christian Brauner
2019-05-22  8:12         ` Christian Brauner
2019-05-22  8:12         ` Christian Brauner
2019-05-22  8:12         ` christian
2019-05-22  8:12         ` Christian Brauner
2019-05-21 19:20   ` Al Viro
2019-05-21 19:20     ` Al Viro
2019-05-21 19:20     ` Al Viro
2019-05-21 19:20     ` Al Viro
2019-05-21 19:20     ` viro
2019-05-21 19:20     ` Al Viro
2019-05-21 19:59     ` Matthew Wilcox
2019-05-21 19:59       ` Matthew Wilcox
2019-05-21 19:59       ` Matthew Wilcox
2019-05-21 19:59       ` willy
2019-05-21 19:59       ` Matthew Wilcox
2019-05-21 19:59       ` Matthew Wilcox
2019-05-24 20:32 ` Michael Tirado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190521130438.q3u4wvve7p6md6cm@brauner.io \
    --to=christian@brauner.io \
    --cc=arnd@arndb.de \
    --cc=dhowells@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=jannh@google.com \
    --cc=ldv@altlinux.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-m68k@lists.linux-m68k.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linux-xtensa@linux-xtensa.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=miklos@szeredi.hu \
    --cc=oleg@redhat.com \
    --cc=shuah@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tkjos@android.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.