From: Christian Brauner <christian.brauner@ubuntu.com>
To: "Alejandro Colomar (man-pages)" <alx.manpages@gmail.com>
Cc: Stephen Kitt <steve@sk2.org>,
	linux-man@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [patch] close_range.2: new page documenting close_range(2)
Date: Sat, 12 Dec 2020 13:14:19 +0100
Message-ID: <20201212121419.odpgbaigrjhpkjnm@wittgenstein>
In-Reply-To: <0ea38a7a-1c64-086e-3d64-38686f5b7856@gmail.com>

On Thu, Dec 10, 2020 at 03:36:42PM +0100, Alejandro Colomar (man-pages) wrote:
> Hi Christian,

Hi Alex,

> 
> Thanks for confirming that behavior.  Seems reasonable.
> 
> I was wondering...
> If this call is equivalent to unshare(2)+{close(2) in a loop},
> shouldn't it fail for the same reasons those syscalls can fail?
> 
> What about the following errors?:
> 
> From unshare(2):
> 
>        EPERM  The calling process did not have the required privileges
>               for this operation.

unshare(CLONE_FILES) doesn't require any privileges. Only the flags
handled by kernel/nsproxy.c:unshare_nsproxy_namespaces() require
privileges, i.e.:

	CLONE_NEWNS
	CLONE_NEWUTS
	CLONE_NEWIPC
	CLONE_NEWNET
	CLONE_NEWPID
	CLONE_NEWCGROUP
	CLONE_NEWTIME

close_range() with CLOSE_RANGE_UNSHARE only unshares the file descriptor
table, so its permission requirements are the same as for
unshare(CLONE_FILES).

> 
> From close(2):
>        EBADF  fd isn't a valid open file descriptor.
> 
> OK, this one can't happen with the current code.
> Let's say there are fds 1 to 10, and you call 'close_range(20,30,0)'.
> It's a no-op (although it will still unshare if the flag is set).
> But shouldn't it fail with EBADF?

CLOSE_RANGE_UNSHARE should always give you a private file descriptor
table independent of whether or not any file descriptors need to be
closed. That's also how we documented the flag:

/* Unshare the file descriptor table before closing file descriptors. */
#define CLOSE_RANGE_UNSHARE	(1U << 1)

A caller calling unshare(CLONE_FILES) and then an emulated close_range()
or the proper close_range() syscall wants to make sure that all unwanted
file descriptors are closed (if any) and that no new file descriptors
can be injected afterwards. If you skip the unshare(CLONE_FILES) because
there are no fds to be closed, you open up a race window. It would also
be annoying for userspace if they _may_ have received a private file
descriptor table, but only if any fds needed to be closed.
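
To make the ordering concrete, here is a rough userspace sketch of the
emulation described above. The name emulated_close_range(), its
unshare_first parameter and the sysconf() clamp are assumptions made for
illustration only; this is not the kernel implementation. The point is
that the unshare happens unconditionally when requested, before any fd
is looked at:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical userspace emulation of close_range(), for illustration only.
 * unshare_first plays the role of CLOSE_RANGE_UNSHARE.
 */
int emulated_close_range(unsigned int first, unsigned int last,
                         int unshare_first)
{
    long open_max;
    unsigned int fd;

    if (first > last) {
        errno = EINVAL;
        return -1;
    }

    /*
     * Unshare unconditionally when requested, even if no fd in the
     * range turns out to be open. Skipping this step would leave a
     * window in which another task sharing the table could inject fds.
     */
    if (unshare_first && unshare(CLONE_FILES) == -1)
        return -1;

    /*
     * Assumption: clamp the range to the soft RLIMIT_NOFILE value.
     * fds above a lowered limit would be missed by this sketch.
     */
    open_max = sysconf(_SC_OPEN_MAX);
    if (open_max <= 0)
        open_max = 1024; /* arbitrary fallback for the sketch */
    if ((unsigned long)first >= (unsigned long)open_max)
        return 0;
    if ((unsigned long)last >= (unsigned long)open_max)
        last = (unsigned int)(open_max - 1);

    /* Errors from close(), including EBADF, are deliberately ignored. */
    for (fd = first; fd <= last; fd++)
        (void)close((int)fd);

    return 0;
}
```

A caller that wants a private table would use something like
emulated_close_range(3, ~0U, 1). Note that the loop can be slow when the
fd limit is very large, which is one reason to prefer the real syscall.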

If people were really keen on skipping the unshare when no fd needs to
be closed, then this could become a new flag. But I really don't think
that's necessary, and it also doesn't make a lot of sense, imho.

> 
>        EINTR  The close() call was interrupted by a signal; see
>               signal(7).
> 
>        EIO    An I/O error occurred.
> 
>        ENOSPC, EDQUOT
>               On NFS, these errors are not normally reported against
>               the first write which exceeds the available storage
>               space, but instead against a subsequent write(2),
>               fsync(2), or close().

None of these will be seen by userspace because close_range() currently
ignores all errors after it has begun closing files.
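
A small check of the EBADF case can be sketched as follows. The helper
name close_range_ignores_ebadf() is made up for this example, and the
fallback syscall number 436 is an assumption that holds for most
architectures but not all (alpha differs). The raw syscall is used
because older libcs have no close_range() wrapper:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_close_range
#define SYS_close_range 436 /* assumption: generic syscall number */
#endif

/*
 * Hypothetical helper: compares close() and close_range() on an fd that
 * is known to be closed.
 *
 * Returns 1 if close() failed with EBADF while close_range() returned 0,
 * 0 if that was not the case, and -1 (with errno set) if the setup failed
 * or close_range() is unavailable, e.g. ENOSYS on kernels before 5.9.
 */
int close_range_ignores_ebadf(void)
{
    int fd, close_is_ebadf;
    long ret;

    fd = open("/dev/null", O_RDONLY);
    if (fd == -1)
        return -1;
    if (close(fd) == -1)
        return -1;

    /* fd is no longer valid, so a plain close() reports EBADF. */
    close_is_ebadf = (close(fd) == -1 && errno == EBADF);

    /* The same fd as a one-element range, with no flags set. */
    ret = syscall(SYS_close_range, (unsigned int)fd, (unsigned int)fd, 0U);
    if (ret == -1)
        return -1;

    return close_is_ebadf && ret == 0;
}
```

This is single-threaded on purpose: with other threads opening files,
the fd number could be reused between the two calls.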

Christian
