From: Linus Torvalds <torvalds@linux-foundation.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <willy@infradead.org>,
	Stefan Metzmacher <metze@samba.org>, Jens Axboe <axboe@kernel.dk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API Mailing List <linux-api@vger.kernel.org>,
	io-uring <io-uring@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Samba Technical <samba-technical@lists.samba.org>
Subject: Re: copy on write for splice() from file to pipe?
Date: Fri, 10 Feb 2023 11:18:05 -0800
Message-ID: <CAHk-=wjuXvF1cA=gJod=-6k4ypbEmOczFFDKriUpOVKy9dTJWQ@mail.gmail.com>
In-Reply-To: <CALCETrWuRHWh5XFn8M8qx5z0FXAGHH=ysb+c6J+cqbYyTAHvhw@mail.gmail.com>

On Fri, Feb 10, 2023 at 11:02 AM Andy Lutomirski <luto@kernel.org> wrote:
>
> Second, either make splice more strict or add a new "strict splice"
> variant.  Strict splice only completes when it can promise that writes
> to the source that start after strict splice's completion won't change
> what gets written to the destination.

The thing is, I think your "strict splice" is pointless and wrong.

It's pointless, because it simply means that it won't perform well.

And since the whole point of splice was performance, it's wrong.

I really think the whole "source needs to be stable" is barking up the
wrong tree.

You are pointing fingers at splice().

And I think that's wrong.

We should point the fingers at either the _user_ of splice - as Jeremy
Allison has done a couple of times - or we should point it at the sink
that cannot deal with unstable sources.

Because that whole "source is unstable" is what allows for that higher
performance. The moment you start requiring stability, you _will_ lose
it. You will have to lock the page, you'll have to unmap it from any
shared mappings, etc etc.  And even if there are no writers, or no
current mappers, all that effort to make sure that is the case is
actually fairly expensive.

So I would instead suggest a different approach entirely, with several
different steps:

 - make sure people are *aware* of this all.

   Maybe this thread raised some awareness of it for some people, but
   more realistically - maybe we can really document this whole issue
   somewhere much more clearly.

 - it sounds like the particular user in question (samba) already very
   much has a reasonable model for "I have exclusive access to this" that
   just wasn't used

 - and finally, I do think it might make sense for the networking
   people to look at how the networking side works with 'sendpage()'.

Because I really think that your "strict splice" model would just mean
that now the kernel would have to add not just a memcpy, but also a
new allocation for that new stable buffer for the memcpy, and that
would all just be very very pointless.

Alternatively, it would require some kind of nasty hard locking
together with other limitations on what can be done by non-splice
users.
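
[Editor's illustration, not part of the original message.] The
"allocation plus memcpy" cost can be pictured with a userspace analogue:
a caller that needs a stable snapshot reads into a private buffer and
writes that buffer to the pipe. This is only a sketch of the trade-off,
not of how the kernel would implement anything; the helper name
copy_then_overwrite and the PAGE constant are hypothetical.

```python
import os
import tempfile

PAGE = 4096  # assumed page size, for illustration only


def copy_then_overwrite(directory=None):
    """Copy one page from a file into a pipe via a private buffer,
    overwrite the file afterwards, and return the pipe contents."""
    fd, path = tempfile.mkstemp(dir=directory)
    rfd, wfd = os.pipe()
    try:
        os.pwrite(fd, b"A" * PAGE, 0)
        # New allocation: pread() returns a fresh bytes object.
        stable = os.pread(fd, PAGE, 0)
        # Second copy: write() copies the buffer into the pipe's pages.
        os.write(wfd, stable)
        # A later write to the file can no longer reach the pipe.
        os.pwrite(fd, b"B" * PAGE, 0)
        return os.read(rfd, PAGE)
    finally:
        for f in (fd, rfd, wfd):
            os.close(f)
        os.unlink(path)


if __name__ == "__main__":
    assert copy_then_overwrite() == b"A" * PAGE
    print("pipe kept the original data")
```

The pipe keeps the original bytes here, at the price of one extra buffer
and two copies per page, which is the overhead splice() exists to avoid.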

                Linus

Thread overview: 64+ messages
2023-02-09 13:55 copy on write for splice() from file to pipe? Stefan Metzmacher
2023-02-09 14:11 ` Matthew Wilcox
2023-02-09 14:29   ` Stefan Metzmacher
2023-02-09 16:41 ` Linus Torvalds
2023-02-09 19:17   ` Stefan Metzmacher
2023-02-09 19:36     ` Linus Torvalds
2023-02-09 19:48       ` Linus Torvalds
2023-02-09 20:33         ` Jeremy Allison
2023-02-10 20:45         ` Stefan Metzmacher
2023-02-10 20:51           ` Linus Torvalds
2023-02-10  2:16   ` Dave Chinner
2023-02-10  4:06     ` Dave Chinner
2023-02-10  4:44       ` Matthew Wilcox
2023-02-10  6:57         ` Dave Chinner
2023-02-10 15:14           ` Andy Lutomirski
2023-02-10 16:33             ` Linus Torvalds
2023-02-10 17:57               ` Andy Lutomirski
2023-02-10 18:19                 ` Jeremy Allison
2023-02-10 19:29                   ` Stefan Metzmacher
2023-02-10 18:37                 ` Linus Torvalds
2023-02-10 19:01                   ` Andy Lutomirski
2023-02-10 19:18                     ` Linus Torvalds [this message]
2023-02-10 19:27                       ` Jeremy Allison
2023-02-10 19:42                         ` Stefan Metzmacher
2023-02-10 19:42                         ` Linus Torvalds
2023-02-10 19:54                           ` Stefan Metzmacher
2023-02-10 19:29                       ` Linus Torvalds
2023-02-13  9:07                         ` Herbert Xu
2023-02-10 19:55                       ` Andy Lutomirski
2023-02-10 20:27                         ` Linus Torvalds
2023-02-10 20:32                           ` Jens Axboe
2023-02-10 20:36                             ` Linus Torvalds
2023-02-10 20:39                               ` Jens Axboe
2023-02-10 20:44                                 ` Linus Torvalds
2023-02-10 20:50                                   ` Jens Axboe
2023-02-10 21:14                                     ` Andy Lutomirski
2023-02-10 21:27                                       ` Jens Axboe
2023-02-10 21:51                                         ` Jens Axboe
2023-02-10 22:08                                           ` Linus Torvalds
2023-02-10 22:16                                             ` Jens Axboe
2023-02-10 22:17                                             ` Linus Torvalds
2023-02-10 22:25                                               ` Jens Axboe
2023-02-10 22:35                                                 ` Linus Torvalds
2023-02-10 22:51                                                   ` Jens Axboe
2023-02-11  3:18                                             ` Ming Lei
2023-02-11  6:17                                               ` Ming Lei
2023-02-11 14:13                                               ` Jens Axboe
2023-02-11 15:05                                                 ` Ming Lei
2023-02-11 15:33                                                   ` Jens Axboe
2023-02-11 18:57                                                     ` Linus Torvalds
2023-02-12  2:46                                                       ` Jens Axboe
2023-02-10  4:47       ` Linus Torvalds
2023-02-10  6:19         ` Dave Chinner
2023-02-10 17:23           ` Linus Torvalds
2023-02-10 17:47             ` Linus Torvalds
2023-02-13  9:28               ` Herbert Xu
2023-02-10 22:41             ` David Laight
2023-02-10 22:51               ` Jens Axboe
2023-02-13  9:30               ` Herbert Xu
2023-02-13  9:25           ` Herbert Xu
2023-02-13 18:01             ` Andy Lutomirski
2023-02-14  1:22               ` Herbert Xu
2023-02-17 23:13                 ` Andy Lutomirski
2023-02-20  4:54                   ` Herbert Xu
