* [RFC] vfs: don't bother clearing close_on_exec bit for unused fds
From: Rasmus Villemoes @ 2015-11-03 9:41 UTC (permalink / raw)
To: Alexander Viro
Cc: Linus Torvalds, Rasmus Villemoes, linux-fsdevel, linux-kernel
In fc90888d07b8 (vfs: conditionally clear close-on-exec flag) a
conditional was added to __clear_close_on_exec to avoid dirtying a
cache line in the common case where the bit is already clear. However,
AFAICT, we don't rely on the close_on_exec bit being clear for unused
fds, except as an optimization in do_close_on_exec(); if I haven't
missed anything, __{set,clear}_close_on_exec is always called when a
new fd is allocated. At the expense of also reading through ->open_fds
in do_close_on_exec(), we can avoid accessing the close_on_exec bitmap
altogether in close(), which I think is a reasonable trade-off.
The conditional added in the commit above still makes sense to avoid
the dirtying on the allocation paths, but I also think it might make
sense in __set_close_on_exec: I suppose any given app handling a
non-trivial amount of fds uses O_CLOEXEC for either almost none or
almost all of them.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
I'm sure I've missed something, hence the RFC. But if not, there's
probably also a few memsets which become redundant. And the
__set_close_on_exec part should probably be its own patch...
fs/file.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/file.c b/fs/file.c
index c6986dce0334..93cfbcd450c3 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -231,7 +231,8 @@ repeat:
 
 static inline void __set_close_on_exec(int fd, struct fdtable *fdt)
 {
-	__set_bit(fd, fdt->close_on_exec);
+	if (!test_bit(fd, fdt->close_on_exec))
+		__set_bit(fd, fdt->close_on_exec);
 }
 
 static inline void __clear_close_on_exec(int fd, struct fdtable *fdt)
@@ -644,7 +645,6 @@ int __close_fd(struct files_struct *files, unsigned fd)
 	if (!file)
 		goto out_unlock;
 	rcu_assign_pointer(fdt->fd[fd], NULL);
-	__clear_close_on_exec(fd, fdt);
 	__put_unused_fd(files, fd);
 	spin_unlock(&files->file_lock);
 	return filp_close(file, files);
@@ -667,7 +667,7 @@ void do_close_on_exec(struct files_struct *files)
 		fdt = files_fdtable(files);
 		if (fd >= fdt->max_fds)
 			break;
-		set = fdt->close_on_exec[i];
+		set = fdt->close_on_exec[i] & fdt->open_fds[i];
 		if (!set)
 			continue;
 		fdt->close_on_exec[i] = 0;
--
2.6.1
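The trade-off described in the commit message can be modelled in userspace. The following is a minimal sketch, not kernel code: the model_* names and the fixed MODEL_MAX_FDS size are made up for illustration, and the two plain arrays merely stand in for fdt->open_fds and fdt->close_on_exec. It assumes, as the patch description does, that every fd allocation writes the close_on_exec bit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_FDS		128	/* hypothetical table size */
#define MODEL_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MODEL_WORDS		(MODEL_MAX_FDS / MODEL_BITS_PER_LONG)

struct model_fdtable {
	unsigned long open_fds[MODEL_WORDS];
	unsigned long close_on_exec[MODEL_WORDS];
};

static void model_init(struct model_fdtable *t)
{
	memset(t, 0, sizeof(*t));
}

static bool model_test(const unsigned long *map, unsigned fd)
{
	return (map[fd / MODEL_BITS_PER_LONG] >> (fd % MODEL_BITS_PER_LONG)) & 1UL;
}

static void model_assign(unsigned long *map, unsigned fd, bool value)
{
	unsigned long mask = 1UL << (fd % MODEL_BITS_PER_LONG);

	if (value)
		map[fd / MODEL_BITS_PER_LONG] |= mask;
	else
		map[fd / MODEL_BITS_PER_LONG] &= ~mask;
}

/* Allocation always writes the close_on_exec bit, so a stale value cannot leak. */
static void model_open(struct model_fdtable *t, unsigned fd, bool cloexec)
{
	assert(fd < MODEL_MAX_FDS);
	model_assign(t->open_fds, fd, true);
	model_assign(t->close_on_exec, fd, cloexec);
}

/* Close as proposed: only open_fds is touched; close_on_exec keeps its old bit. */
static void model_close(struct model_fdtable *t, unsigned fd)
{
	assert(fd < MODEL_MAX_FDS);
	model_assign(t->open_fds, fd, false);
}

/* exec: close fds whose bit is set in both bitmaps; returns how many were closed. */
static unsigned model_exec(struct model_fdtable *t)
{
	unsigned closed = 0;

	for (size_t i = 0; i < MODEL_WORDS; i++) {
		unsigned long set = t->close_on_exec[i] & t->open_fds[i];

		t->open_fds[i] &= ~set;
		for (; set; set >>= 1)
			closed += set & 1;
	}
	return closed;
}
```

In this model a stale close_on_exec bit on a closed fd is harmless: model_exec() masks it with open_fds, and model_open() overwrites it when the fd number is reused.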
* Re: [RFC] vfs: don't bother clearing close_on_exec bit for unused fds
From: Linus Torvalds @ 2015-11-03 22:45 UTC (permalink / raw)
To: Rasmus Villemoes; +Cc: Alexander Viro, linux-fsdevel, Linux Kernel Mailing List
On Tue, Nov 3, 2015 at 1:41 AM, Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> I'm sure I've missed something, hence the RFC. But if not, there's
> probably also a few memsets which become redundant. And the
> __set_close_on_exec part should probably be its own patch...
The patch looks fine to me. I'm not sure the __set_close_on_exec part
even makes sense, because if you set that bit, it usually really *is*
clear before, so testing it beforehand is just pointless. And if
somebody really keeps setting the bit, they are doing something stupid
anyway..
So I have nothing against the patch, but I do wonder how much it
matters. If there isn't a noticeable performance win, I'd almost
rather just keep the close-on-exec bitmap up-to-date. Hmm?
Linus
* Re: [RFC] vfs: don't bother clearing close_on_exec bit for unused fds
From: Rasmus Villemoes @ 2015-11-03 23:13 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Alexander Viro, linux-fsdevel, Linux Kernel Mailing List
On Tue, Nov 03 2015, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, Nov 3, 2015 at 1:41 AM, Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
>>
>> I'm sure I've missed something, hence the RFC. But if not, there's
>> probably also a few memsets which become redundant. And the
>> __set_close_on_exec part should probably be its own patch...
>
> The patch looks fine to me. I'm not sure the __set_close_on_exec part
> even makes sense, because if you set that bit, it usually really *is*
> clear before, so testing it beforehand is just pointless. And if
> somebody really keeps setting the bit, they are doing something stupid
> anyway..
So that's true for the lifetime of a single fd, where of course no one
does fcntl(fd, F_SETFD, FD_CLOEXEC) more than once. But the scenario I was
thinking of was when fds get recycled: open(, O_CLOEXEC) => 5, close(5),
open(, O_CLOEXEC) => 5. In that case, letting the close_on_exec bit keep
its value avoids dirtying the cache line on all subsequent allocations
of fd 5. For example, had Eric's app been using *_CLOEXEC for all its
open()s, socket()s etc., there wouldn't have been any gain from adding the
conditional to __clear_close_on_exec, but I'd expect to see a similar
gain from doing the symmetric thing. Again, this assumes that almost
all fd allocations either do or do not apply CLOEXEC; after a while,
->close_on_exec would reach a steady state where no bits get flipped
anymore.
The "usually really *is* clear" only holds when we do "bother clearing
close_on_exec bit for unused fds", which is what I suggest we don't :-)
I don't think either state of the bit in close_on_exec is more or less
'up-to-date' when its buddy in open_fds is not set.
Rasmus
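The steady-state argument above can be illustrated by counting stores. This is a rough sketch under stated assumptions: the sim_* names are hypothetical, one bool per fd stands in for a bit in ->close_on_exec, and each counted write is taken as a proxy for dirtying the cache line. It says nothing about real-world timing.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_FDS 64	/* hypothetical table size */

struct sim_table {
	bool cloexec[SIM_MAX_FDS];	/* one flag per fd, standing in for the bitmap */
	unsigned writes;		/* number of stores to the close_on_exec state */
};

/* Conditional update: test first, write only when the value changes. */
static void sim_assign_cloexec(struct sim_table *t, unsigned fd, bool value)
{
	assert(fd < SIM_MAX_FDS);
	if (t->cloexec[fd] != value) {
		t->cloexec[fd] = value;
		t->writes++;
	}
}

/*
 * Open and close the same fd `rounds` times, always with O_CLOEXEC.
 * clear_on_close selects the existing behaviour (close() clears the bit)
 * versus the proposed one (close() leaves the bit alone).
 */
static unsigned sim_recycle(unsigned fd, unsigned rounds, bool clear_on_close)
{
	struct sim_table t = { .writes = 0 };

	for (unsigned r = 0; r < rounds; r++) {
		sim_assign_cloexec(&t, fd, true);		/* open(, O_CLOEXEC) */
		if (clear_on_close)
			sim_assign_cloexec(&t, fd, false);	/* close() */
	}
	return t.writes;
}
```

With clearing in close(), every open/close cycle costs two stores; with the bit left alone and a conditional set, only the first allocation writes anything.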
* Re: [RFC] vfs: don't bother clearing close_on_exec bit for unused fds
From: Eric Dumazet @ 2015-11-04 1:31 UTC (permalink / raw)
To: Rasmus Villemoes
Cc: Alexander Viro, Linus Torvalds, linux-fsdevel, linux-kernel
On Tue, 2015-11-03 at 10:41 +0100, Rasmus Villemoes wrote:
> @@ -667,7 +667,7 @@ void do_close_on_exec(struct files_struct *files)
>  		fdt = files_fdtable(files);
>  		if (fd >= fdt->max_fds)
>  			break;
> -		set = fdt->close_on_exec[i];
> +		set = fdt->close_on_exec[i] & fdt->open_fds[i];
>  		if (!set)
>  			continue;
>  		fdt->close_on_exec[i] = 0;
If you don't bother, why leave this final fdt->close_on_exec[i] = 0?
* Re: [RFC] vfs: don't bother clearing close_on_exec bit for unused fds
From: Rasmus Villemoes @ 2015-11-04 10:59 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Alexander Viro, Linus Torvalds, linux-fsdevel, linux-kernel
On Wed, Nov 04 2015, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2015-11-03 at 10:41 +0100, Rasmus Villemoes wrote:
>
>> @@ -667,7 +667,7 @@ void do_close_on_exec(struct files_struct *files)
>>  		fdt = files_fdtable(files);
>>  		if (fd >= fdt->max_fds)
>>  			break;
>> -		set = fdt->close_on_exec[i];
>> +		set = fdt->close_on_exec[i] & fdt->open_fds[i];
>>  		if (!set)
>>  			continue;
>>  		fdt->close_on_exec[i] = 0;
>
> If you don't bother, why leave this final fdt->close_on_exec[i] = 0?
Thanks, it should go, along with the mentioned memsets. Updated patch below.
Reading dup_fd() I'm even more convinced that we're not relying on any
particular value for close_on_exec bits for unused fds. After
/*
* The fd may be claimed in the fd bitmap but not yet
* instantiated in the files array if a sibling thread
* is partway through open(). So make sure that this
* fd is available to the new process.
*/
we only __clear_open_fd(), so the close_on_exec bit may be left set in
the new process.
From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Date: Tue, 3 Nov 2015 09:43:53 +0100
Subject: [PATCH] vfs: don't bother clearing close_on_exec bit for unused fds
In fc90888d07b8 (vfs: conditionally clear close-on-exec flag) a
conditional was added to __clear_close_on_exec to avoid dirtying a
cache line in the common case where the bit is already clear. However,
AFAICT, we don't rely on the close_on_exec bit being clear for unused
fds, except as an optimization in do_close_on_exec(); if I haven't
missed anything, __{set,clear}_close_on_exec is always called when a
new fd is allocated. At the expense of also reading through ->open_fds
in do_close_on_exec(), we can avoid accessing the close_on_exec bitmap
altogether in close(), which I think is a reasonable trade-off.
The conditional added in the commit above still makes sense to avoid
the dirtying on the allocation paths, but I also think it might make
sense in __set_close_on_exec: I suppose any given app handling a
non-trivial amount of fds uses O_CLOEXEC for either almost none or
almost all of them, so after a while one would reach a sort of
steady-state where bits in ->close_on_exec are almost never flipped.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
fs/file.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/file.c b/fs/file.c
index c6986dce0334..1bb74923395c 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -79,7 +79,6 @@ static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
 	memcpy(nfdt->open_fds, ofdt->open_fds, cpy);
 	memset((char *)(nfdt->open_fds) + cpy, 0, set);
 	memcpy(nfdt->close_on_exec, ofdt->close_on_exec, cpy);
-	memset((char *)(nfdt->close_on_exec) + cpy, 0, set);
 
 	cpy = BITBIT_SIZE(ofdt->max_fds);
 	set = BITBIT_SIZE(nfdt->max_fds) - cpy;
@@ -231,7 +230,8 @@ repeat:
 
 static inline void __set_close_on_exec(int fd, struct fdtable *fdt)
 {
-	__set_bit(fd, fdt->close_on_exec);
+	if (!test_bit(fd, fdt->close_on_exec))
+		__set_bit(fd, fdt->close_on_exec);
 }
 
 static inline void __clear_close_on_exec(int fd, struct fdtable *fdt)
@@ -369,7 +369,6 @@ struct files_struct *dup_fd(struct files_struct *oldf, int *errorp)
 		int start = open_files / BITS_PER_LONG;
 
 		memset(&new_fdt->open_fds[start], 0, left);
-		memset(&new_fdt->close_on_exec[start], 0, left);
 	}
 
 	rcu_assign_pointer(newf->fdt, new_fdt);
@@ -644,7 +643,6 @@ int __close_fd(struct files_struct *files, unsigned fd)
 	if (!file)
 		goto out_unlock;
 	rcu_assign_pointer(fdt->fd[fd], NULL);
-	__clear_close_on_exec(fd, fdt);
 	__put_unused_fd(files, fd);
 	spin_unlock(&files->file_lock);
 	return filp_close(file, files);
@@ -667,10 +665,9 @@ void do_close_on_exec(struct files_struct *files)
 		fdt = files_fdtable(files);
 		if (fd >= fdt->max_fds)
 			break;
-		set = fdt->close_on_exec[i];
+		set = fdt->close_on_exec[i] & fdt->open_fds[i];
 		if (!set)
 			continue;
-		fdt->close_on_exec[i] = 0;
 		for ( ; set ; fd++, set >>= 1) {
 			struct file *file;
 			if (!(set & 1))
--
2.6.1
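Why the dropped memsets should be redundant can be sketched in userspace. This is an illustration under assumptions, not the kernel's copy_fdtable() or dup_fd(): grow_bitmaps() and effective_cloexec() are hypothetical helpers, sizes are counted in whole words, and the argument relies on the claim made in the patch description that every fd allocation writes the close_on_exec bit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Grow both bitmaps from old_words to new_words (hypothetical helper).
 * Only the open_fds tail is zeroed; the close_on_exec tail keeps whatever
 * the destination buffer already held.
 */
static void grow_bitmaps(unsigned long *new_open, unsigned long *new_cloexec,
			 const unsigned long *old_open,
			 const unsigned long *old_cloexec,
			 size_t old_words, size_t new_words)
{
	assert(new_words >= old_words);
	memcpy(new_open, old_open, old_words * sizeof(*new_open));
	memset(new_open + old_words, 0,
	       (new_words - old_words) * sizeof(*new_open));
	memcpy(new_cloexec, old_cloexec, old_words * sizeof(*new_cloexec));
	/* deliberately no memset of the close_on_exec tail */
}

/* Word that exec would act on: stale close_on_exec bits are masked off. */
static unsigned long effective_cloexec(const unsigned long *open_fds,
				       const unsigned long *close_on_exec,
				       size_t word)
{
	return close_on_exec[word] & open_fds[word];
}
```

Even if the new close_on_exec tail holds arbitrary bits, the masked value for those words is zero because the matching open_fds words were cleared.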
* Re: [RFC] vfs: don't bother clearing close_on_exec bit for unused fds
From: Eric Dumazet @ 2015-11-04 12:33 UTC (permalink / raw)
To: Rasmus Villemoes
Cc: Alexander Viro, Linus Torvalds, linux-fsdevel, linux-kernel
On Wed, 2015-11-04 at 11:59 +0100, Rasmus Villemoes wrote:
> @@ -667,10 +665,9 @@ void do_close_on_exec(struct files_struct *files)
>  		fdt = files_fdtable(files);
>  		if (fd >= fdt->max_fds)
>  			break;
> -		set = fdt->close_on_exec[i];
> +		set = fdt->close_on_exec[i] & fdt->open_fds[i];
>  		if (!set)
>  			continue;
Many processes have a big hole at the end of fdt->open_fds[], because
max_fds is rounded up to a power of two.
It makes sense to avoid pulling the close_on_exec[] part for that hole
into the CPU caches:
		set = fdt->open_fds[i];
		if (!set)
			continue;
		set &= fdt->close_on_exec[i];
		if (!set)
			continue;
Not sure if this is a net win due to branch prediction...
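The two-stage check suggested above can be tried out in isolation. The following is a small sketch with made-up names (struct scan_result, scan_two_stage()); it only counts how many close_on_exec words get loaded, as a stand-in for cache lines touched, and makes no claim about branch-prediction cost.

```c
#include <assert.h>
#include <stddef.h>

struct scan_result {
	unsigned long to_close;		/* number of fds that would be closed */
	size_t cloexec_words_read;	/* close_on_exec words actually loaded */
};

static struct scan_result scan_two_stage(const unsigned long *open_fds,
					 const unsigned long *close_on_exec,
					 size_t nwords)
{
	struct scan_result res = { 0, 0 };

	assert(open_fds && close_on_exec);
	for (size_t i = 0; i < nwords; i++) {
		unsigned long set = open_fds[i];

		if (!set)
			continue;	/* hole: close_on_exec[i] is never touched */
		res.cloexec_words_read++;
		set &= close_on_exec[i];
		for (; set; set >>= 1)
			res.to_close += set & 1;
	}
	return res;
}
```

For a table whose open fds all sit in the first word, only one close_on_exec word is read regardless of how long the trailing hole is.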