Linux-Fsdevel Archive on lore.kernel.org
 help / Atom feed
* [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
@ 2019-02-07 15:45 Slavomir Kaslev
  2019-02-13 14:44 ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Slavomir Kaslev @ 2019-02-07 15:45 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Jens Axboe, Steven Rostedt, linux-fsdevel, linux-kernel

The current implementation of splice() and tee() ignores O_NONBLOCK set on pipe
file descriptors and checks only the SPLICE_F_NONBLOCK flag for blocking on pipe
arguments. This is inconsistent since splice()-ing from/to non-pipe file
descriptors does take O_NONBLOCK into consideration.

Fix this by promoting O_NONBLOCK, when set on a pipe, to SPLICE_F_NONBLOCK.

Some context for how the current implementation of splice() leads to
inconsistent behavior. In the ongoing work[1] to add VM tracing capability to
trace-cmd we stream tracing data over named FIFOs or vsockets from guests back
to the host.

When we receive SIGINT from user to stop tracing, we set O_NONBLOCK on the input
file descriptor and set SPLICE_F_NONBLOCK for the next call to splice(). If
splice() was blocked waiting on data from the input FIFO, after SIGINT splice()
restarts with the same arguments (no SPLICE_F_NONBLOCK) and blocks again instead
of returning -EAGAIN when no data is available.

This differs from the splice() behavior when reading from a vsocket or when
we're doing a traditional read()/write() loop (trace-cmd's --nosplice argument).

With this patch applied we get the same behavior in all situations after setting
O_NONBLOCK which also matches the behavior of doing a read()/write() loop
instead of splice().

This change does have potential of breaking users who don't expect EAGAIN from
splice() when SPLICE_F_NONBLOCK is not set. OTOH programs that set O_NONBLOCK
and don't anticipate EAGAIN are arguably buggy[2].

[1] https://github.com/skaslev/trace-cmd/tree/vsock
[2] https://github.com/torvalds/linux/blob/d47e3da1759230e394096fd742aad423c291ba48/fs/read_write.c#L1425

Signed-off-by: Slavomir Kaslev <kaslevs@vmware.com>
---
 fs/splice.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/fs/splice.c b/fs/splice.c
index de2ede048473..6a1761b74f8d 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1123,6 +1123,9 @@ static long do_splice(struct file *in, loff_t __user *off_in,
 		if (ipipe == opipe)
 			return -EINVAL;
 
+		if ((in->f_flags | out->f_flags) & O_NONBLOCK)
+			flags |= SPLICE_F_NONBLOCK;
+
 		return splice_pipe_to_pipe(ipipe, opipe, len, flags);
 	}
 
@@ -1148,6 +1151,9 @@ static long do_splice(struct file *in, loff_t __user *off_in,
 		if (unlikely(ret < 0))
 			return ret;
 
+		if (in->f_flags & O_NONBLOCK)
+			flags |= SPLICE_F_NONBLOCK;
+
 		file_start_write(out);
 		ret = do_splice_from(ipipe, out, &offset, len, flags);
 		file_end_write(out);
@@ -1172,6 +1178,9 @@ static long do_splice(struct file *in, loff_t __user *off_in,
 			offset = in->f_pos;
 		}
 
+		if (out->f_flags & O_NONBLOCK)
+			flags |= SPLICE_F_NONBLOCK;
+
 		pipe_lock(opipe);
 		ret = wait_for_space(opipe, flags);
 		if (!ret)
@@ -1717,6 +1726,9 @@ static long do_tee(struct file *in, struct file *out, size_t len,
 	 * copying the data.
 	 */
 	if (ipipe && opipe && ipipe != opipe) {
+		if ((in->f_flags | out->f_flags) & O_NONBLOCK)
+			flags |= SPLICE_F_NONBLOCK;
+
 		/*
 		 * Keep going, unless we encounter an error. The ipipe/opipe
 		 * ordering doesn't really matter.
-- 
2.19.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
  2019-02-07 15:45 [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes Slavomir Kaslev
@ 2019-02-13 14:44 ` Steven Rostedt
  2019-02-19 17:12   ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2019-02-13 14:44 UTC (permalink / raw)
  To: Slavomir Kaslev; +Cc: Alexander Viro, Jens Axboe, linux-fsdevel, linux-kernel

On Thu,  7 Feb 2019 17:45:19 +0200
Slavomir Kaslev <kaslevs@vmware.com> wrote:

> The current implementation of splice() and tee() ignores O_NONBLOCK set on pipe
> file descriptors and checks only the SPLICE_F_NONBLOCK flag for blocking on pipe
> arguments. This is inconsistent since splice()-ing from/to non-pipe file
> descriptors does take O_NONBLOCK into consideration.
> 
> Fix this by promoting O_NONBLOCK, when set on a pipe, to SPLICE_F_NONBLOCK.
> 
> Some context for how the current implementation of splice() leads to
> inconsistent behavior. In the ongoing work[1] to add VM tracing capability to
> trace-cmd we stream tracing data over named FIFOs or vsockets from guests back
> to the host.
> 
> When we receive SIGINT from user to stop tracing, we set O_NONBLOCK on the input
> file descriptor and set SPLICE_F_NONBLOCK for the next call to splice(). If
> splice() was blocked waiting on data from the input FIFO, after SIGINT splice()
> restarts with the same arguments (no SPLICE_F_NONBLOCK) and blocks again instead
> of returning -EAGAIN when no data is available.
> 
> This differs from the splice() behavior when reading from a vsocket or when
> we're doing a traditional read()/write() loop (trace-cmd's --nosplice argument).
> 
> With this patch applied we get the same behavior in all situations after setting
> O_NONBLOCK which also matches the behavior of doing a read()/write() loop
> instead of splice().
> 
> This change does have potential of breaking users who don't expect EAGAIN from
> splice() when SPLICE_F_NONBLOCK is not set. OTOH programs that set O_NONBLOCK
> and don't anticipate EAGAIN are arguably buggy[2].

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Does anyone have any issues or comments about this patch?

Thanks!

-- Steve

> 
> [1] https://github.com/skaslev/trace-cmd/tree/vsock
> [2] https://github.com/torvalds/linux/blob/d47e3da1759230e394096fd742aad423c291ba48/fs/read_write.c#L1425
> 
> Signed-off-by: Slavomir Kaslev <kaslevs@vmware.com>
> ---
>  fs/splice.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index de2ede048473..6a1761b74f8d 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -1123,6 +1123,9 @@ static long do_splice(struct file *in, loff_t __user *off_in,
>  		if (ipipe == opipe)
>  			return -EINVAL;
>  
> +		if ((in->f_flags | out->f_flags) & O_NONBLOCK)
> +			flags |= SPLICE_F_NONBLOCK;
> +
>  		return splice_pipe_to_pipe(ipipe, opipe, len, flags);
>  	}
>  
> @@ -1148,6 +1151,9 @@ static long do_splice(struct file *in, loff_t __user *off_in,
>  		if (unlikely(ret < 0))
>  			return ret;
>  
> +		if (in->f_flags & O_NONBLOCK)
> +			flags |= SPLICE_F_NONBLOCK;
> +
>  		file_start_write(out);
>  		ret = do_splice_from(ipipe, out, &offset, len, flags);
>  		file_end_write(out);
> @@ -1172,6 +1178,9 @@ static long do_splice(struct file *in, loff_t __user *off_in,
>  			offset = in->f_pos;
>  		}
>  
> +		if (out->f_flags & O_NONBLOCK)
> +			flags |= SPLICE_F_NONBLOCK;
> +
>  		pipe_lock(opipe);
>  		ret = wait_for_space(opipe, flags);
>  		if (!ret)
> @@ -1717,6 +1726,9 @@ static long do_tee(struct file *in, struct file *out, size_t len,
>  	 * copying the data.
>  	 */
>  	if (ipipe && opipe && ipipe != opipe) {
> +		if ((in->f_flags | out->f_flags) & O_NONBLOCK)
> +			flags |= SPLICE_F_NONBLOCK;
> +
>  		/*
>  		 * Keep going, unless we encounter an error. The ipipe/opipe
>  		 * ordering doesn't really matter.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
  2019-02-13 14:44 ` Steven Rostedt
@ 2019-02-19 17:12   ` Steven Rostedt
  2019-03-04 15:58     ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2019-02-19 17:12 UTC (permalink / raw)
  To: Alexander Viro, Jens Axboe; +Cc: Slavomir Kaslev, linux-fsdevel, linux-kernel

On Wed, 13 Feb 2019 09:44:45 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> Does anyone have any issues or comments about this patch?

I'm going to start pinging people once a week, looking for comments on
this patch ;-)

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
  2019-02-19 17:12   ` Steven Rostedt
@ 2019-03-04 15:58     ` Steven Rostedt
  2019-03-05  0:04       ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2019-03-04 15:58 UTC (permalink / raw)
  To: Alexander Viro, Jens Axboe
  Cc: Slavomir Kaslev, linux-fsdevel, linux-kernel, Andrew Morton,
	Linus Torvalds

On Tue, 19 Feb 2019 12:12:12 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 13 Feb 2019 09:44:45 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > 
> > Does anyone have any issues or comments about this patch?  
> 
> I'm going to start pinging people once a week, looking for comments on
> this patch ;-)
> 

I'm still just hearing crickets about this? Should I just send this
directly to Linus or Andrew?

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
  2019-03-04 15:58     ` Steven Rostedt
@ 2019-03-05  0:04       ` Linus Torvalds
  2019-03-05  0:09         ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2019-03-05  0:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alexander Viro, Jens Axboe, Slavomir Kaslev, linux-fsdevel,
	Linux List Kernel Mailing, Andrew Morton

On Mon, Mar 4, 2019 at 7:58 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I'm still just hearing crickets about this? Should I just send this
> directly to Linus or Andrew?

I'll take it. It looks sane.

Of course, if it turns out that this breaks something that assumes
that splice blocks purely based on the SPLICE_F_NONBLOCK flag, we'll
have to revert it. Looking at the history of splice, it does look like
it has always ignored O_NONBLOCK.

                 Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
  2019-03-05  0:04       ` Linus Torvalds
@ 2019-03-05  0:09         ` Linus Torvalds
  2019-03-05  1:54           ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2019-03-05  0:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alexander Viro, Jens Axboe, Slavomir Kaslev, linux-fsdevel,
	Linux List Kernel Mailing, Andrew Morton

On Mon, Mar 4, 2019 at 4:04 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Of course, if it turns out that this breaks something that assumes
> that splice blocks purely based on the SPLICE_F_NONBLOCK flag, we'll
> have to revert it. Looking at the history of splice, it does look like
> it has always ignored O_NONBLOCK.

Note that the "arguably buggy" argument is not an actual real argument
in the presence of regressions. It's more a "I wish reality wasn't
that way" argument, but it doesn't actually _change_ reality.

Of course, in the case of sendfile() (which is where that comment is),
I don't think we ever really even tested it either way.

                 Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes
  2019-03-05  0:09         ` Linus Torvalds
@ 2019-03-05  1:54           ` Steven Rostedt
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2019-03-05  1:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexander Viro, Jens Axboe, Slavomir Kaslev, linux-fsdevel,
	Linux List Kernel Mailing, Andrew Morton


[ Replying again, but without Cc'ing Trump due to a bug in claws-mail ]

On Mon, 4 Mar 2019 16:09:22 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> Note that the "arguably buggy" argument is not an actual real argument

Note that "arguably buggy" just means that you can argue that it is
buggy. It doesn't actually mean that it *is* buggy ;-)

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-07 15:45 [RFC PATCH] fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes Slavomir Kaslev
2019-02-13 14:44 ` Steven Rostedt
2019-02-19 17:12   ` Steven Rostedt
2019-03-04 15:58     ` Steven Rostedt
2019-03-05  0:04       ` Linus Torvalds
2019-03-05  0:09         ` Linus Torvalds
2019-03-05  1:54           ` Steven Rostedt

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org linux-fsdevel@archiver.kernel.org
	public-inbox-index linux-fsdevel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox