From: Omar Sandoval <osandov@osandov.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
	linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-api@vger.kernel.org
Subject: Re: [PATCH v10 07/14] btrfs: add definitions + documentation for encoded I/O ioctls
Date: Fri, 20 Aug 2021 10:48:27 -0700
Message-ID: <YR/q69Tiz6PFqFJN@relinquished.localdomain>
In-Reply-To: <1e9a95f4-01a3-c356-a348-2992d63c867f@suse.com>

On Fri, Aug 20, 2021 at 11:56:37AM +0300, Nikolay Borisov wrote:
> 
> 
> On 18.08.21 г. 0:06, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > In order to allow sending and receiving compressed data without
> > decompressing it, we need an interface to write pre-compressed data
> > directly to the filesystem and the matching interface to read compressed
> > data without decompressing it. This adds the definitions for ioctls to
> > do that and detailed explanations of how to use them.
> > 
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> >  include/uapi/linux/btrfs.h | 132 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 132 insertions(+)
> > 
> > diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> > index d7d3cfead056..95da52955894 100644
> > --- a/include/uapi/linux/btrfs.h
> > +++ b/include/uapi/linux/btrfs.h
> > @@ -861,6 +861,134 @@ struct btrfs_ioctl_get_subvol_rootref_args {
> >  		__u8 align[7];
> >  };
> >  
> > +/*
> > + * Data and metadata for an encoded read or write.
> > + *
> > + * Encoded I/O bypasses any encoding automatically done by the filesystem (e.g.,
> > + * compression). This can be used to read the compressed contents of a file or
> > + * write pre-compressed data directly to a file.
> > + *
> > + * BTRFS_IOC_ENCODED_READ and BTRFS_IOC_ENCODED_WRITE are essentially
> > + * preadv/pwritev with additional metadata about how the data is encoded and the
> > + * size of the unencoded data.
> > + *
> > + * BTRFS_IOC_ENCODED_READ fills the given iovecs with the encoded data, fills
> > + * the metadata fields, and returns the size of the encoded data. It reads one
> > + * extent per call. It can also read data which is not encoded.
> > + *
> > + * BTRFS_IOC_ENCODED_WRITE uses the metadata fields, writes the encoded data
> > + * from the iovecs, and returns the size of the encoded data. Note that the
> > + * encoded data is not validated when it is written; if it is not valid (e.g.,
> > + * it cannot be decompressed), then a subsequent read may return an error.
> > + *
> > + * Since the filesystem page cache contains decoded data, encoded I/O bypasses
> > + * the page cache. Encoded I/O requires CAP_SYS_ADMIN.
> > + */
> > +struct btrfs_ioctl_encoded_io_args {
> > +	/* Input parameters for both reads and writes. */
> > +
> > +	/*
> > +	 * iovecs containing encoded data.
> > +	 *
> > +	 * For reads, if the size of the encoded data is larger than the sum of
> > +	 * iov[n].iov_len for 0 <= n < iovcnt, then the ioctl fails with
> > +	 * ENOBUFS.
> > +	 *
> > +	 * For writes, the size of the encoded data is the sum of iov[n].iov_len
> > +	 * for 0 <= n < iovcnt. This must be less than 128 KiB (this limit may
> > +	 * increase in the future). This must also be less than or equal to
> > +	 * unencoded_len.
> > +	 */
> > +	const struct iovec __user *iov;
> > +	/* Number of iovecs. */
> > +	unsigned long iovcnt;
> > +	/*
> > +	 * Offset in file.
> > +	 *
> > +	 * For writes, must be aligned to the sector size of the filesystem.
> > +	 */
> > +	__s64 offset;
> > +	/* Currently must be zero. */
> > +	__u64 flags;
> > +
> > +	/*
> > +	 * For reads, the following members are filled in with the metadata for
> > +	 * the encoded data.
> > +	 * For writes, the following members must be set to the metadata for the
> > +	 * encoded data.
> > +	 */
> > +
> > +	/*
> > +	 * Length of the data in the file.
> > +	 *
> > +	 * Must be less than or equal to unencoded_len - unencoded_offset. For
> > +	 * writes, must be aligned to the sector size of the filesystem unless
> > +	 * the data ends at or beyond the current end of the file.
> > +	 */
> > +	__u64 len;
> > +	/*
> > +	 * Length of the unencoded (i.e., decrypted and decompressed) data.
> > +	 *
> > +	 * For writes, must be no more than 128 KiB (this limit may increase in
> > +	 * the future). If the unencoded data is actually longer than
> > +	 * unencoded_len, then it is truncated; if it is shorter, then it is
> > +	 * extended with zeroes.
> > +	 */
> > +	__u64 unencoded_len;
> > +	/*
> > +	 * Offset from the first byte of the unencoded data to the first byte of
> > +	 * logical data in the file.
> > +	 *
> > +	 * Must be less than unencoded_len.
> > +	 */
> > +	__u64 unencoded_offset;
> > +	/*
> > +	 * BTRFS_ENCODED_IO_COMPRESSION_* type.
> > +	 *
> > +	 * For writes, must not be BTRFS_ENCODED_IO_COMPRESSION_NONE.
> > +	 */
> > +	__u32 compression;
> > +	/* Currently always BTRFS_ENCODED_IO_ENCRYPTION_NONE. */
> > +	__u32 encryption;
> > +	/*
> > +	 * Reserved for future expansion.
> > +	 *
> > +	 * For reads, always returned as zero. Users should check for non-zero
> > +	 * bytes. If there are any, then the kernel has a newer version of this
> > +	 * structure with additional information that the user definition is
> > +	 * missing.
> > +	 *
> > +	 * For writes, must be zeroed.
> > +	 */
> > +	__u8 reserved[32];
> > +};
> > +
> > +/* Data is not compressed. */
> > +#define BTRFS_ENCODED_IO_COMPRESSION_NONE 0
> > +/* Data is compressed as a single zlib stream. */
> > +#define BTRFS_ENCODED_IO_COMPRESSION_ZLIB 1
> > +/*
> > + * Data is compressed as a single zstd frame with the windowLog compression
> > + * parameter set to no more than 17.
> > + */
> > +#define BTRFS_ENCODED_IO_COMPRESSION_ZSTD 2
> > +/*
> > + * Data is compressed page by page (using the page size indicated by the name of
> > + * the constant) with LZO1X and wrapped in the format documented in
> > + * fs/btrfs/lzo.c. For writes, the compression page size must match the
> > + * filesystem page size.
> > + */
> > +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_4K 3
> > +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_8K 4
> > +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_16K 5
> > +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_32K 6
> > +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_64K 7
> > +#define BTRFS_ENCODED_IO_COMPRESSION_TYPES 8
> > +
> > +/* Data is not encrypted. */
> > +#define BTRFS_ENCODED_IO_ENCRYPTION_NONE 0
> > +#define BTRFS_ENCODED_IO_ENCRYPTION_TYPES 1
> 
> How about enums for encryption/compression?

With #define, the user can use #ifdef to check if the constants are
defined and provide their own definitions if not (that's what I did in
the xfstests example programs). Another option is the enum+#define
pattern:

enum {
	BTRFS_ENCODED_IO_COMPRESSION_NONE,
#define BTRFS_ENCODED_IO_COMPRESSION_NONE BTRFS_ENCODED_IO_COMPRESSION_NONE
	BTRFS_ENCODED_IO_COMPRESSION_ZLIB,
#define BTRFS_ENCODED_IO_COMPRESSION_ZLIB BTRFS_ENCODED_IO_COMPRESSION_ZLIB
	BTRFS_ENCODED_IO_COMPRESSION_ZSTD,
#define BTRFS_ENCODED_IO_COMPRESSION_ZSTD BTRFS_ENCODED_IO_COMPRESSION_ZSTD
	BTRFS_ENCODED_IO_COMPRESSION_LZO_4K,
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_4K BTRFS_ENCODED_IO_COMPRESSION_LZO_4K
	BTRFS_ENCODED_IO_COMPRESSION_LZO_8K,
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_8K BTRFS_ENCODED_IO_COMPRESSION_LZO_8K
	BTRFS_ENCODED_IO_COMPRESSION_LZO_16K,
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_16K BTRFS_ENCODED_IO_COMPRESSION_LZO_16K
	BTRFS_ENCODED_IO_COMPRESSION_LZO_32K,
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_32K BTRFS_ENCODED_IO_COMPRESSION_LZO_32K
	BTRFS_ENCODED_IO_COMPRESSION_LZO_64K,
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_64K BTRFS_ENCODED_IO_COMPRESSION_LZO_64K
	BTRFS_ENCODED_IO_COMPRESSION_TYPES,
};

But that seems to confuse people. I don't feel strongly one way or
another.
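
For illustration, here is a minimal sketch of the #ifdef fallback
mentioned above, as a userspace program might carry it when building
against headers that predate this patch. The values are copied from the
quoted patch. The helper encoded_io_lzo_page_size() is hypothetical and
not part of the patch; it only shows the constants being used. The same
checks keep working with the enum+#define pattern, since each name is
still visible to the preprocessor.

```c
#include <assert.h>	/* static_assert (C11) */

/*
 * Fallback definitions for older kernel headers. If <linux/btrfs.h>
 * already provides a name (as a plain #define or via enum+#define),
 * the #ifndef is false and the header's definition is used.
 */
#ifndef BTRFS_ENCODED_IO_COMPRESSION_NONE
#define BTRFS_ENCODED_IO_COMPRESSION_NONE 0
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_ZLIB
#define BTRFS_ENCODED_IO_COMPRESSION_ZLIB 1
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_ZSTD
#define BTRFS_ENCODED_IO_COMPRESSION_ZSTD 2
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_LZO_4K
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_4K 3
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_LZO_8K
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_8K 4
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_LZO_16K
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_16K 5
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_LZO_32K
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_32K 6
#endif
#ifndef BTRFS_ENCODED_IO_COMPRESSION_LZO_64K
#define BTRFS_ENCODED_IO_COMPRESSION_LZO_64K 7
#endif

/* The helper below assumes the LZO types are numbered consecutively. */
static_assert(BTRFS_ENCODED_IO_COMPRESSION_LZO_64K -
	      BTRFS_ENCODED_IO_COMPRESSION_LZO_4K == 4,
	      "LZO compression types must be consecutive");

/*
 * Hypothetical helper (not in the patch): return the compression page
 * size in bytes implied by an LZO compression type, or 0 if the type
 * is not one of the LZO variants.
 */
static unsigned int encoded_io_lzo_page_size(unsigned int compression)
{
	if (compression < BTRFS_ENCODED_IO_COMPRESSION_LZO_4K ||
	    compression > BTRFS_ENCODED_IO_COMPRESSION_LZO_64K)
		return 0;
	/* 4K, 8K, 16K, 32K, 64K: each step doubles the page size. */
	return 4096u << (compression - BTRFS_ENCODED_IO_COMPRESSION_LZO_4K);
}
```

A plain enum without the matching #defines would make every #ifndef
above true and redefine the names, which is why the enum-only form does
not support this kind of fallback.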
