From: Dave Chinner <david@fromorbit.com>
To: Omar Sandoval <osandov@osandov.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data
Date: Thu, 5 Sep 2019 12:10:12 +1000 [thread overview]
Message-ID: <20190905021012.GL7777@dread.disaster.area> (raw)
In-Reply-To: <8eae56abb90c0fe87c350322485ce8674e135074.1567623877.git.osandov@fb.com>
On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> This adds an API for writing compressed data directly to the filesystem.
> The use case that I have in mind is send/receive: currently, when
> sending data from one compressed filesystem to another, the sending side
> decompresses the data and the receiving side recompresses it before
> writing it out. This is wasteful and can be avoided if we can just send
> and write compressed extents. The send part will be implemented in a
> separate series, as this ioctl can stand alone.
>
> The interface is essentially pwrite(2) with some extra information:
>
> - The input buffer contains the compressed data.
> - Both the compressed and decompressed sizes of the data are given.
> - The compression type (zlib, lzo, or zstd) is given.
So why can't you do this with pwritev2()? Heaps of flags, and
use a second iovec to hold the decompressed size of the previous
iovec. i.e.
iov[0].iov_base = compressed_data;
iov[0].iov_len = compressed_size;
iov[1].iov_base = NULL;
iov[1].iov_len = uncompressed_size;
pwritev2(fd, iov, 2, offset, RWF_COMPRESSED_ZLIB);
And you don't need to reinvent pwritev() with some whacky ioctl that
is bound to be completely screwed up is ways not noticed until
someone else tries to use it...
I'd also suggest atht if we are going to be able to write compressed
data directly, then we should be able to read them as well directly
via preadv2()....
> The interface is general enough that it can be extended to encrypted or
> otherwise encoded extents in the future. A more detailed description,
> including restrictions and edge cases, is included in
> include/uapi/linux/btrfs.h.
No thanks, that bit us on the arse -hard- with the clone interfaces
we lifted to the VFS from btrfs. Let's do it through the existing IO
paths and write a bunch of fstests to exercise it and verify the
interface's utility and the filesystem implementation correctness
before anything is merged.
> The implementation is similar to direct I/O: we have to flush any
> ordered extents, invalidate the page cache, and do the io
> tree/delalloc/extent map/ordered extent dance.
Which, to me, says that this should be a small bit of extra code
in the direct IO path that skips the compression/decompression code
and sets a few extra flags in the iocb that is passed down to the
direct IO code.
We don't need a whole new IO path just to skip a data transformation
step in the direct IO path....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2019-09-05 2:10 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-04 19:13 [PATCH 0/2] Btrfs: add interface for writing compressed extents directly Omar Sandoval
2019-09-04 19:13 ` [PATCH 1/2] fs: export rw_verify_area() Omar Sandoval
2019-09-04 19:13 ` [PATCH 2/2] btrfs: add ioctl for directly writing compressed data Omar Sandoval
2019-09-05 2:10 ` Dave Chinner [this message]
2019-09-05 12:16 ` Johannes Thumshirn
2019-09-05 12:51 ` Johannes Thumshirn
2019-09-06 7:46 ` Dave Chinner
2019-09-06 18:19 ` Omar Sandoval
2019-09-06 21:07 ` Dave Chinner
2019-09-06 21:27 ` Omar Sandoval
2019-09-05 10:33 ` Nikolay Borisov
2019-09-19 6:14 ` Omar Sandoval
2019-09-19 7:46 ` Nikolay Borisov
2019-09-19 7:59 ` Omar Sandoval
2019-09-24 10:29 ` Nikolay Borisov
2019-09-10 11:39 ` Filipe Manana
2019-09-19 6:23 ` Omar Sandoval
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190905021012.GL7777@dread.disaster.area \
--to=david@fromorbit.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=osandov@osandov.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).