All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] pipe: increase minimum default pipe size to 2 pages
       [not found] <20210805000435.10833-1-alex_y_xu.ref@yahoo.ca>
@ 2021-08-05  0:04 ` Alex Xu (Hello71)
  2021-08-05  1:33   ` Alex Xu (Hello71)
  2021-08-05  8:35   ` Greg KH
  0 siblings, 2 replies; 6+ messages in thread
From: Alex Xu (Hello71) @ 2021-08-05  0:04 UTC (permalink / raw)
  To: torvalds
  Cc: linux-fsdevel, linux-api, linux-kernel, dhowells, linux, gregkh,
	peterz, nicolas.dichtel, raven, christian, Alex Xu (Hello71)

Before this patch, the following program prints 4096 and hangs.
Afterwards, it prints 8192 and exits successfully. Note that you may
need to increase your RLIMIT_NOFILE before running the program.

int main() {
    int pipefd[2];
    for (int i = 0; i < 1025; i++)
        if (pipe(pipefd) == -1)
            return 1;
    size_t bufsz = fcntl(pipefd[1], F_GETPIPE_SZ);
    printf("%zd\n", bufsz);
    char *buf = calloc(bufsz, 1);
    write(pipefd[1], buf, bufsz);
    read(pipefd[0], buf, bufsz-1);
    write(pipefd[1], buf, 1);
}

Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
---

See discussion at https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.

 fs/pipe.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 9ef4231cce61..8e6ef62aeb1c 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -31,6 +31,21 @@
 
 #include "internal.h"
 
+/*
+ * New pipe buffers will be restricted to this size while the user is exceeding
+ * their pipe buffer quota. The general pipe use case needs at least two
+ * buffers: one for data yet to be read, and one for new data. If this is less
+ * than two, then a write to a non-empty pipe may block even if the pipe is not
+ * full. This can occur with GNU make jobserver or similar uses of pipes as
+ * semaphores: multiple processes may be waiting to write tokens back to the
+ * pipe before reading tokens: https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.
+ *
+ * Users can reduce their pipe buffers with F_SETPIPE_SZ below this at their
+ * own risk, namely: pipe writes to non-full pipes may block until the pipe is
+ * emptied.
+ */
+#define PIPE_MIN_DEF_BUFFERS 2
+
 /*
  * The max size that a non-root user is allowed to grow the pipe. Can
  * be set by root in /proc/sys/fs/pipe-max-size
@@ -781,8 +796,8 @@ struct pipe_inode_info *alloc_pipe_info(void)
 	user_bufs = account_pipe_buffers(user, 0, pipe_bufs);
 
 	if (too_many_pipe_buffers_soft(user_bufs) && pipe_is_unprivileged_user()) {
-		user_bufs = account_pipe_buffers(user, pipe_bufs, 1);
-		pipe_bufs = 1;
+		user_bufs = account_pipe_buffers(user, pipe_bufs, PIPE_MIN_DEF_BUFFERS);
+		pipe_bufs = PIPE_MIN_DEF_BUFFERS;
 	}
 
 	if (too_many_pipe_buffers_hard(user_bufs) && pipe_is_unprivileged_user())
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] pipe: increase minimum default pipe size to 2 pages
  2021-08-05  0:04 ` [PATCH] pipe: increase minimum default pipe size to 2 pages Alex Xu (Hello71)
@ 2021-08-05  1:33   ` Alex Xu (Hello71)
  2021-08-05  8:35   ` Greg KH
  1 sibling, 0 replies; 6+ messages in thread
From: Alex Xu (Hello71) @ 2021-08-05  1:33 UTC (permalink / raw)
  To: linux-kernel

Before this patch, the following program prints 4096 and hangs.
Afterwards, it prints 8192 and exits successfully. Note that you may
need to increase your RLIMIT_NOFILE before running the program.

int main() {
    int pipefd[2];
    for (int i = 0; i < 1025; i++)
        if (pipe(pipefd) == -1)
            return 1;
    size_t bufsz = fcntl(pipefd[1], F_GETPIPE_SZ);
    printf("%zd\n", bufsz);
    char *buf = calloc(bufsz, 1);
    write(pipefd[1], buf, bufsz);
    read(pipefd[0], buf, bufsz-1);
    write(pipefd[1], buf, 1);
}

Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
---

See discussion at https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.

Patch resend to linux-kernel@vger.kernel.org (typoed email last time). 
Sorry if this messes up your mail clients.

 fs/pipe.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 9ef4231cce61..8e6ef62aeb1c 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -31,6 +31,21 @@
 
 #include "internal.h"
 
+/*
+ * New pipe buffers will be restricted to this size while the user is exceeding
+ * their pipe buffer quota. The general pipe use case needs at least two
+ * buffers: one for data yet to be read, and one for new data. If this is less
+ * than two, then a write to a non-empty pipe may block even if the pipe is not
+ * full. This can occur with GNU make jobserver or similar uses of pipes as
+ * semaphores: multiple processes may be waiting to write tokens back to the
+ * pipe before reading tokens: https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.
+ *
+ * Users can reduce their pipe buffers with F_SETPIPE_SZ below this at their
+ * own risk, namely: pipe writes to non-full pipes may block until the pipe is
+ * emptied.
+ */
+#define PIPE_MIN_DEF_BUFFERS 2
+
 /*
  * The max size that a non-root user is allowed to grow the pipe. Can
  * be set by root in /proc/sys/fs/pipe-max-size
@@ -781,8 +796,8 @@ struct pipe_inode_info *alloc_pipe_info(void)
 	user_bufs = account_pipe_buffers(user, 0, pipe_bufs);
 
 	if (too_many_pipe_buffers_soft(user_bufs) && pipe_is_unprivileged_user()) {
-		user_bufs = account_pipe_buffers(user, pipe_bufs, 1);
-		pipe_bufs = 1;
+		user_bufs = account_pipe_buffers(user, pipe_bufs, PIPE_MIN_DEF_BUFFERS);
+		pipe_bufs = PIPE_MIN_DEF_BUFFERS;
 	}
 
 	if (too_many_pipe_buffers_hard(user_bufs) && pipe_is_unprivileged_user())
-- 
2.32.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] pipe: increase minimum default pipe size to 2 pages
  2021-08-05  0:04 ` [PATCH] pipe: increase minimum default pipe size to 2 pages Alex Xu (Hello71)
  2021-08-05  1:33   ` Alex Xu (Hello71)
@ 2021-08-05  8:35   ` Greg KH
  2021-08-05 14:18     ` Alex Xu (Hello71)
  1 sibling, 1 reply; 6+ messages in thread
From: Greg KH @ 2021-08-05  8:35 UTC (permalink / raw)
  To: Alex Xu (Hello71)
  Cc: torvalds, linux-fsdevel, linux-api, linux-kernel, dhowells,
	linux, peterz, nicolas.dichtel, raven, christian

On Wed, Aug 04, 2021 at 08:04:35PM -0400, Alex Xu (Hello71) wrote:
> Before this patch, the following program prints 4096 and hangs.
> Afterwards, it prints 8192 and exits successfully. Note that you may
> need to increase your RLIMIT_NOFILE before running the program.
> 
> int main() {
>     int pipefd[2];
>     for (int i = 0; i < 1025; i++)
>         if (pipe(pipefd) == -1)
>             return 1;
>     size_t bufsz = fcntl(pipefd[1], F_GETPIPE_SZ);
>     printf("%zd\n", bufsz);
>     char *buf = calloc(bufsz, 1);
>     write(pipefd[1], buf, bufsz);
>     read(pipefd[0], buf, bufsz-1);
>     write(pipefd[1], buf, 1);
> }
> 
> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
> ---

Is this due to the changes that happened in 5.5?  If so, a cc: stable
and a fixes tag would be nice to have :)

> See discussion at https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.

This can go up in the changelog text too.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pipe: increase minimum default pipe size to 2 pages
  2021-08-05  8:35   ` Greg KH
@ 2021-08-05 14:18     ` Alex Xu (Hello71)
  2021-08-05 17:41       ` Linus Torvalds
  0 siblings, 1 reply; 6+ messages in thread
From: Alex Xu (Hello71) @ 2021-08-05 14:18 UTC (permalink / raw)
  To: Greg KH
  Cc: christian, dhowells, linux-api, linux-fsdevel, linux-kernel,
	linux, nicolas.dichtel, peterz, raven, torvalds

Excerpts from Greg KH's message of August 5, 2021 4:35 am:
> On Wed, Aug 04, 2021 at 08:04:35PM -0400, Alex Xu (Hello71) wrote:
>> Before this patch, the following program prints 4096 and hangs.
>> Afterwards, it prints 8192 and exits successfully. Note that you may
>> need to increase your RLIMIT_NOFILE before running the program.
>> 
>> int main() {
>>     int pipefd[2];
>>     for (int i = 0; i < 1025; i++)
>>         if (pipe(pipefd) == -1)
>>             return 1;
>>     size_t bufsz = fcntl(pipefd[1], F_GETPIPE_SZ);
>>     printf("%zd\n", bufsz);
>>     char *buf = calloc(bufsz, 1);
>>     write(pipefd[1], buf, bufsz);
>>     read(pipefd[0], buf, bufsz-1);
>>     write(pipefd[1], buf, 1);
>> }
>> 
>> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
>> ---
> 
> Is this due to the changes that happened in 5.5?  If so, a cc: stable
> and a fixes tag would be nice to have :)
> 
>> See discussion at https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.
> 
> This can go up in the changelog text too.
> 
> thanks,
> 
> greg k-h
> 

I tested 5.4 and it exhibits the same problem as master using this 
non-racy program. I think the problem goes back to v4.5, the first 
release with 759c01142a ("pipe: limit the per-user amount of pages 
allocated in pipes"). The issue likely become more apparent with the 
improvement in pipe performance from v5.5, whereas before that, pipes 
were too slow for the issue to manifest in racy environments.

I'll send a new patch with #include lines and a Fixes: 759c01142a. I'm 
not 100% sure that it actually goes back that far, but the worst thing 
that can plausibly happen is that applications opening very large 
numbers of pipes suddenly use slightly more memory. I certainly hope 
nobody is relying on pipes randomly blocking roughly 1/4096 of the 
time.

Regards,
Alex.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] pipe: increase minimum default pipe size to 2 pages
  2021-08-05 14:18     ` Alex Xu (Hello71)
@ 2021-08-05 17:41       ` Linus Torvalds
  0 siblings, 0 replies; 6+ messages in thread
From: Linus Torvalds @ 2021-08-05 17:41 UTC (permalink / raw)
  To: Alex Xu (Hello71)
  Cc: Greg KH, Christian Brauner, David Howells, Linux API,
	linux-fsdevel, Linux Kernel Mailing List, Rasmus Villemoes,
	Nicolas Dichtel, Peter Zijlstra, Ian Kent

On Thu, Aug 5, 2021 at 7:18 AM Alex Xu (Hello71) <alex_y_xu@yahoo.ca> wrote:
>
> I tested 5.4 and it exhibits the same problem as master using this
> non-racy program. I think the problem goes back to v4.5, the first
> release with 759c01142a ("pipe: limit the per-user amount of pages
> allocated in pipes").

Yeah, our pipe buffer allocation strategy has been fairly consistent,
although the exact locking details etc have certainly changed over
time.

I do think the behavior goes back all the way to that "limit to one
single buffer if you hit the pipe size soft limit" commit, because the
thing that example program tests has been true for the whole time,
afaik: first fill up the first pipe buffer completely, then (a) read
everything but one byte, and then (b) try to write another byte.

Doing (a) will leave the pipe buffer still allocated and in use, and
then (b) will fundamentally want to allocate a new buffer for the new
write.  Which will obviously not then be allowed if we have said "one
pipe buffer only".

So a lot of the code around pipe buffers has changed over the years,
and the exact patterns and timing and wakeups has been completely
rewritten, but that buffer allocation pattern is pretty fundamental
and I don't think that has changed at all.

(A long LONG time ago, we had only one pipe buffer, and it was a
single circular queue, and you never had this kind of "used up one
buffer, need to allocate a new one" issue, so it's not like this goes
back to Linux 0.01, but the pipe buffers go back a _loong_ time).

Allowing two buffers obviously doesn't change the basic pattern at all
- but it means that we will always allow having at least PIPE_BUF
bytes in the pipe. So you can obviously still trigger that "cannot
write any more, will block any future writes", but at that point it's
a clear user bug in thinking that pipes have some infinite buffer
size.

In contrast, expecting pipes to be able to hold 2 bytes at a time is
quite reasonable, with POSIX guaranteeing PIPE_BUF of at least 512
bytes.

I've applied Alex's patch.

                 Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] pipe: increase minimum default pipe size to 2 pages
       [not found] <20210805144047.13518-1-alex_y_xu.ref@yahoo.ca>
@ 2021-08-05 14:40 ` Alex Xu (Hello71)
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Xu (Hello71) @ 2021-08-05 14:40 UTC (permalink / raw)
  To: torvalds
  Cc: linux-fsdevel, linux-api, linux-kernel, dhowells, linux, gregkh,
	peterz, nicolas.dichtel, raven, christian, Alex Xu (Hello71),
	stable

This program always prints 4096 and hangs before the patch, and always
prints 8192 and exits successfully after:

int main() {
    int pipefd[2];
    for (int i = 0; i < 1025; i++)
        if (pipe(pipefd) == -1)
            return 1;
    size_t bufsz = fcntl(pipefd[1], F_GETPIPE_SZ);
    printf("%zd\n", bufsz);
    char *buf = calloc(bufsz, 1);
    write(pipefd[1], buf, bufsz);
    read(pipefd[0], buf, bufsz-1);
    write(pipefd[1], buf, 1);
}

Note that you may need to increase your RLIMIT_NOFILE before running the
program.

Fixes: 759c01142a ("pipe: limit the per-user amount of pages allocated in pipes")
Cc: <stable@vger.kernel.org>

Link: https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/
Link: https://lore.kernel.org/lkml/1628127094.lxxn016tj7.none@localhost/

Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
---
 fs/pipe.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 9ef4231cce61..8e6ef62aeb1c 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -31,6 +31,21 @@
 
 #include "internal.h"
 
+/*
+ * New pipe buffers will be restricted to this size while the user is exceeding
+ * their pipe buffer quota. The general pipe use case needs at least two
+ * buffers: one for data yet to be read, and one for new data. If this is less
+ * than two, then a write to a non-empty pipe may block even if the pipe is not
+ * full. This can occur with GNU make jobserver or similar uses of pipes as
+ * semaphores: multiple processes may be waiting to write tokens back to the
+ * pipe before reading tokens: https://lore.kernel.org/lkml/1628086770.5rn8p04n6j.none@localhost/.
+ *
+ * Users can reduce their pipe buffers with F_SETPIPE_SZ below this at their
+ * own risk, namely: pipe writes to non-full pipes may block until the pipe is
+ * emptied.
+ */
+#define PIPE_MIN_DEF_BUFFERS 2
+
 /*
  * The max size that a non-root user is allowed to grow the pipe. Can
  * be set by root in /proc/sys/fs/pipe-max-size
@@ -781,8 +796,8 @@ struct pipe_inode_info *alloc_pipe_info(void)
 	user_bufs = account_pipe_buffers(user, 0, pipe_bufs);
 
 	if (too_many_pipe_buffers_soft(user_bufs) && pipe_is_unprivileged_user()) {
-		user_bufs = account_pipe_buffers(user, pipe_bufs, 1);
-		pipe_bufs = 1;
+		user_bufs = account_pipe_buffers(user, pipe_bufs, PIPE_MIN_DEF_BUFFERS);
+		pipe_bufs = PIPE_MIN_DEF_BUFFERS;
 	}
 
 	if (too_many_pipe_buffers_hard(user_bufs) && pipe_is_unprivileged_user())
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-05 17:41 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210805000435.10833-1-alex_y_xu.ref@yahoo.ca>
2021-08-05  0:04 ` [PATCH] pipe: increase minimum default pipe size to 2 pages Alex Xu (Hello71)
2021-08-05  1:33   ` Alex Xu (Hello71)
2021-08-05  8:35   ` Greg KH
2021-08-05 14:18     ` Alex Xu (Hello71)
2021-08-05 17:41       ` Linus Torvalds
     [not found] <20210805144047.13518-1-alex_y_xu.ref@yahoo.ca>
2021-08-05 14:40 ` Alex Xu (Hello71)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.