[PATCH] pipe: Make a partially-satisfied blocking read wait for more

From: David Howells <dhowells@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: dhowells@redhat.com, Franck Grosjean <fgrosjea@redhat.com>,
	Phil Auld <pauld@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] pipe: Make a partially-satisfied blocking read wait for more
Date: Fri, 23 Jun 2023 23:34:28 +0100	[thread overview]
Message-ID: <2730511.1687559668@warthog.procyon.org.uk> (raw)

Hi Linus,

Can you consider merging something like the attached patch?  Unfortunately,
there are applications out there that depend on a read from pipe() waiting
until the buffer is full under some circumstances.  Patch a28c8b9db8a1
removed the conditionality on there being an attached writer.

I'm not sure this is the best solution though as it goes over the other way
and will now block reads for which there isn't an active writer - and I'm
sure that, somewhere, there's an app that will break on tht.

Thanks,
David
---
pipe: Make a partially-satisfied blocking read wait for more data

A read on a pipe may return short after reading some data from a pipe, even
though the pipe isn't non-blocking.  This is stated in the read(2) manual
page:

    ... It is not an error if this number is smaller than the number of
    bytes requested; this may happen for example because fewer bytes are
    actually available right now (maybe because we were close to
    end-of-file, or because we are reading from a pipe, or from a
    terminal)...

However, some applications depend on a blocking read on a pipe not
returning until it fills the buffer unless it hits EOF or a signal occurs -
at least as long as there's an active writer on the other end.

Fix the pipe reader to restore this behaviour by only breaking out with a
short read in the non-block (and signal) cases.

Here's a reproducer for it:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <sys/uio.h>

    #define F_GETPIPE_SZ 1032

    int main(int argc, char *argv[])
    {
       int fildes[2];
       if (pipe(fildes) == -1) {
	       perror("in pipe");
	       return -1;
       }
       printf("%d %d\n",
              fcntl(fildes[0], F_GETPIPE_SZ),
              fcntl(fildes[1], F_GETPIPE_SZ));
       if (fork() != 0) {
	       void *tata = malloc(100000);
	       int res = read(fildes[0], tata, 100000);
	       printf("could read %d bytes\n", res);
	       return -1;
       }
       void *toto = malloc(100000);
       struct iovec iov;
       iov.iov_base = toto;
       iov.iov_len = 100000;
       int d = writev(fildes[1], &iov, 1);
       if (d == -1) {
	       perror("in writev");
	       return -1;
       }
       printf("could write %d bytes\n", d);
       sleep(1);
       return 0;
    }

It should show the same amount read as written, but shows a short read because
the pipe capacity isn't sufficient.

Fixes: a28c8b9db8a1 ("pipe: remove 'waiting_writers' merging logic")
Reported-by: Franck Grosjean <fgrosjea@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Phil Auld <pauld@redhat.com>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Christian Brauner <brauner@kernel.org>
cc: linux-fsdevel@vger.kernel.org
---
 fs/pipe.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 2d88f73f585a..c5c992f19d28 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -340,11 +340,10 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)

 		if (!pipe->writers)
 			break;
-		if (ret)
-			break;
 		if ((filp->f_flags & O_NONBLOCK) ||
 		    (iocb->ki_flags & IOCB_NOWAIT)) {
-			ret = -EAGAIN;
+			if (!ret)
+				ret = -EAGAIN;
 			break;
 		}
 		__pipe_unlock(pipe);