linux-btrfs.vger.kernel.org archive mirror
From: Omar Sandoval <osandov@osandov.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>,
	Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH RESEND x3 v9 1/9] iov_iter: add copy_struct_from_iter()
Date: Wed, 23 Jun 2021 13:46:50 -0700
Message-ID: <YNOdunP+Fvhbsixb@relinquished.localdomain>
In-Reply-To: <YNOPdy14My+MHmy8@zeniv-ca.linux.org.uk>

On Wed, Jun 23, 2021 at 07:45:59PM +0000, Al Viro wrote:
> On Wed, Jun 23, 2021 at 10:49:51AM -0700, Omar Sandoval wrote:
> 
> > > Fair summary. The only other thing that I'd add is this is an IO
> > > interface that requires issuing physical IO. So if someone wants
> > > high throughput for encoded IO, we really need AIO and/or io_uring
> > > support, and we get that for free if we use readv2/writev2
> > > interfaces.
> > > 
> > > Yes, it could be an ioctl() interface, but I think that this sort of
> > > functionality is exactly what extensible syscalls like
> > > preadv2/pwritev2 should be used for. It's a slight variant on normal
> > > IO, and that's exactly what the RWF_* flags are intended to be used
> > > for - allowing interesting per-IO variant behaviour without having
> > > to completely re-implement the IO path via custom ioctls every time
> > > we want slightly different functionality...
> > 
> > Al, Linus, what do you think? Is there a path forward for this series as
> > is? I'd be happy to have this functionality merged in any form, but I do
> > think that this approach with preadv2/pwritev2 using iov_len is decent
> > relative to the alternatives.
> 
> IMO we might be better off with explicit ioctl - this magical mystery shite
> with special meaning of the first iovec length is, IMO, more than enough
> to make it a bad fit for read/write family.
> 
> It's *not* just a "slightly different functionality" - it's very different
> calling conventions.  And the deeper one needs to dig into the interface
> details to parse what's going on, the less it differs from ioctl() mess.
> 
> Said that, why do you need a variable-length header on the read side,
> in the first place?

Suppose we add a new field representing a new type of encoding to the
end of encoded_iov. On the write side, the caller might want to specify
that the data is encoded in that new way, of course. But on the read
side, if the data is encoded in that new way, then the kernel will want
to return that. The kernel needs to know whether the user's structure
includes the new field; otherwise, when it copies the full struct out,
it will write into what the user thinks is the data.
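
To make the read-side problem concrete, here is a rough userspace sketch.
It is not the code from this series: struct demo_hdr_v1, struct
demo_hdr_v2, and demo_copy_hdr_out() are hypothetical names, and the
E2BIG policy is only one way a kernel could handle a caller whose
structure is older than its own.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layouts for illustration; not the real encoded_iov. */
struct demo_hdr_v1 {
	uint64_t len;
	uint32_t compression;
	uint32_t pad;
};

struct demo_hdr_v2 {
	uint64_t len;
	uint32_t compression;
	uint32_t pad;
	uint64_t new_encoding;	/* field appended in a later version */
};

/*
 * Copy the "kernel" header k into a caller buffer whose header area is
 * usize bytes long and is immediately followed by the caller's data.
 * Never writes more than usize bytes, so the data area is not clobbered.
 */
static int demo_copy_hdr_out(void *ubuf, size_t usize,
			     const struct demo_hdr_v2 *k)
{
	const unsigned char *src = (const unsigned char *)k;
	size_t ksize = sizeof(*k);
	size_t n = usize < ksize ? usize : ksize;
	size_t i;

	/* Older caller: refuse if a field it cannot see is actually in use. */
	for (i = n; i < ksize; i++) {
		if (src[i] != 0)
			return -E2BIG;
	}

	memcpy(ubuf, src, n);

	/* Newer caller: zero the tail that this "kernel" knows nothing about. */
	if (usize > ksize)
		memset((unsigned char *)ubuf + ksize, 0, usize - ksize);
	return 0;
}
```

Without usize, the copy would have to assume sizeof(struct demo_hdr_v2)
and would overwrite the first 8 bytes of a v1 caller's data.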

As I mentioned in my reply to Linus, maybe we can stick with
preadv2/pwritev2, but make struct encoded_iov a fixed size with some
reserved space for future expansion. That makes this a lot less
special: just copy a fixed-size structure, then read/write the rest.
And then we don't need to reinvent the rest of the preadv2/pwritev2
path for an ioctl.
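
For comparison, a minimal sketch of the fixed-size alternative. The
layout, the 64-byte size, the field names, and demo_check_fixed_hdr()
are all assumptions made up for this example, not a proposed ABI.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header with room to grow. */
struct demo_fixed_hdr {
	uint64_t len;
	uint64_t orig_len;
	uint32_t compression;
	uint32_t encryption;
	uint64_t reserved[5];	/* must be zero until given a meaning */
};

/* The size is part of the interface, so pin it at compile time. */
_Static_assert(sizeof(struct demo_fixed_hdr) == 64,
	       "demo_fixed_hdr must stay 64 bytes");

/*
 * Reject headers that set reserved space. A later version can then assign
 * a meaning to a reserved word without changing the struct size.
 */
static int demo_check_fixed_hdr(const struct demo_fixed_hdr *h)
{
	size_t i;

	for (i = 0; i < sizeof(h->reserved) / sizeof(h->reserved[0]); i++) {
		if (h->reserved[i] != 0)
			return -EINVAL;
	}
	return 0;
}
```

Both sides always copy exactly sizeof(struct demo_fixed_hdr) bytes, so
no length needs to be passed alongside the header. The cost is that
expansion is capped by the reserved space.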

Between a fixed size structure and an ioctl, what would you prefer?


Thread overview: 50+ messages
2021-06-17 23:51 [PATCH RESEND x3 v9 0/9] fs: interface for directly reading/writing compressed data Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 1/9] iov_iter: add copy_struct_from_iter() Omar Sandoval
2021-06-18 18:50   ` Linus Torvalds
2021-06-18 19:42     ` Al Viro
2021-06-18 19:49       ` Al Viro
2021-06-18 20:33         ` Omar Sandoval
2021-06-18 20:32       ` Omar Sandoval
2021-06-18 20:58         ` Al Viro
2021-06-18 21:10           ` Linus Torvalds
2021-06-18 21:32             ` Al Viro
2021-06-18 21:40               ` Linus Torvalds
2021-06-18 22:10                 ` Omar Sandoval
2021-06-18 22:32                   ` Al Viro
2021-06-19  0:43                     ` Omar Sandoval
2021-06-21 18:46                       ` Omar Sandoval
2021-06-21 19:33                         ` Linus Torvalds
2021-06-21 20:46                           ` Omar Sandoval
2021-06-21 20:53                             ` Omar Sandoval
2021-06-21 20:55                             ` Omar Sandoval
2021-06-22 22:06                               ` Dave Chinner
2021-06-23 17:49                                 ` Omar Sandoval
2021-06-23 18:28                                   ` Linus Torvalds
2021-06-23 19:33                                     ` Omar Sandoval
2021-06-23 19:45                                   ` Al Viro
2021-06-23 20:46                                     ` Omar Sandoval [this message]
2021-06-23 21:39                                       ` Al Viro
2021-06-23 21:58                                         ` Omar Sandoval
2021-06-23 22:26                                           ` Al Viro
2021-06-24  2:00                                           ` Matthew Wilcox
2021-06-24  6:14                                             ` Omar Sandoval
2021-06-24 17:52                                               ` Linus Torvalds
2021-06-24 18:28                                                 ` Omar Sandoval
2021-06-24 21:07                                                   ` Linus Torvalds
2021-06-24 22:41                                                     ` Martin K. Petersen
2021-06-25  3:38                                                       ` Matthew Wilcox
2021-06-25 16:16                                                         ` Linus Torvalds
2021-06-25 21:07                                                           ` Omar Sandoval
2021-07-07 17:59                                                             ` Omar Sandoval
2021-07-19 15:44                                                               ` Josef Bacik
2021-06-24  6:41                                             ` Christoph Hellwig
2021-06-24  7:50                                               ` Omar Sandoval
2021-06-18 22:14                 ` Al Viro
2021-06-17 23:51 ` [PATCH RESEND x3 v9 2/9] fs: add O_ALLOW_ENCODED open flag Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 3/9] fs: add RWF_ENCODED for reading/writing compressed data Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 4/9] btrfs: don't advance offset for compressed bios in btrfs_csum_one_bio() Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 5/9] btrfs: add ram_bytes and offset to btrfs_ordered_extent Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 6/9] btrfs: support different disk extent size for delalloc Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 7/9] btrfs: optionally extend i_size in cow_file_range_inline() Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 8/9] btrfs: implement RWF_ENCODED reads Omar Sandoval
2021-06-17 23:51 ` [PATCH RESEND x3 v9 9/9] btrfs: implement RWF_ENCODED writes Omar Sandoval
