linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Gao Xiang <gaoxiang25@huawei.com>
Cc: Dave Chinner <david@fromorbit.com>,
	Goldwyn Rodrigues <RGoldwyn@suse.com>, "hch@lst.de" <hch@lst.de>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"ruansy.fnst@cn.fujitsu.com" <ruansy.fnst@cn.fujitsu.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	linux-erofs@lists.ozlabs.org, miaoxie@huawei.com
Subject: Re: [PATCH 10/13] iomap: use a function pointer for dio submits
Date: Wed, 7 Aug 2019 22:49:36 -0700	[thread overview]
Message-ID: <20190808054936.GA5319@sol.localdomain> (raw)
In-Reply-To: <20190808042640.GA28630@138>

On Thu, Aug 08, 2019 at 12:26:42PM +0800, Gao Xiang wrote:
> > 
> > > > That's why I don't like this hook - I think hiding data operations
> > > > and/or custom bio manipulations in opaque filesystem callouts is
> > > > completely the wrong approach to be taking. We need to do these
> > > > things in a generic manner so that all filesystems (and block
> > > > devices!) that use the iomap infrastructure can take advantage of
> > > > them, not just one of them.
> > > > 
> > > > Quite frankly, I don't care if it takes more time and work up front,
> > > > I'm tired of expedient hacks to merge code quickly repeatedly biting
> > > > us on the arse and wasting far more time sorting out than we would
> > > > have spent getting it right in the first place.
> > > 
> > > Sure. I am open to ideas. What are you proposing?
> > 
> > That you think about how to normalise the btrfs IO path to fit into
> > the standard iomap/blockdev model, rather than adding special hacks
> > to iomap to allow an opaque, custom, IO model to be shoe-horned into
> > the generic code.
> > 
> > For example, post-read validation requires end-io processing,
> > whether it be encryption, decompression, CRC/T10 validation, etc. The
> > iomap end-io completion has all the information needed to run these
> > things, whether it be a callout to the filesystem for custom
> > processing checking, or a generic "decrypt into supplied data page"
> > sort of thing. These all need to be done in the same place, so we
> > should have common support for this. And I suspect the iomap should
> > also state in a flag that something like this is necessary (e.g.
> > IOMAP_FL_ENCRYPTED indicates post-IO decryption needs to be run).
> 
> Add some word to this topic, I think introducing a generic full approach
> to IOMAP for encryption, decompression, verification is hard to meet all
> filesystems, and seems unnecessary, especially data compression is involved.
> 
> Since the data decompression will expand the data, therefore the logical
> data size is not same as the physical data size:
> 
> 1) IO submission should be applied to all physical data, but data
>    decompression will be eventually applied to logical mapping.
>    As for EROFS, it submits all physical pages with page->private
>    points to management structure which maintain all logical pages
>    as well for further decompression. And time-sharing approach is
>    used to save the L2P mapping array in these allocated pages itself.
> 
>    In addition, IOMAP also needs to consider fixed-sized output/input
>    difference which is filesystem specific and I have no idea whether
>    involveing too many code for each requirement is really good for IOMAP;
> 
> 2) The post-read processing order is another negotiable stuff.
>    Although there is no benefit to select verity->decrypt rather than
>    decrypt->verity; but when compression is involved, the different
>    orders could be selected by different filesystem users:
> 
>     1. decrypt->verity->decompress
> 
>     2. verity->decompress->decrypt
> 
>     3. decompress->decrypt->verity
> 
>    1. and 2. could cause less computation since it processes
>    compressed data, and the security is good enough since
>    the behavior of decompression algorithm is deterministic.
>    3 could cause more computation.
> 
> All I want to say is the post process is so complicated since we have
> many selection if encryption, decompression, verification are all involved.
> 
> Maybe introduce a core subset to IOMAP is better for long-term
> maintainment and better performance. And we should consider it
> more carefully.
> 

FWIW, the only order that actually makes sense is decrypt->decompress->verity.

Decrypt before decompress, i.e. encrypt after compress, because only the
plaintext can be compressible; the ciphertext isn't.

Verity last, on the original data, because otherwise the file hash that
fs-verity reports would be specific to that particular inode on-disk and
therefore would be useless for authenticating the file's user-visible contents.

[By "verity" I mean specifically fs-verity.  Integrity-only block checksums are
a different case; those can be done at any point, but doing them on the
compressed data would make sense as then there would be less to checksum.

And yes, compression+encryption leaks information about the original data, so
may not be advisable.  My point is just that if the two are nevertheless
combined, it only makes sense to compress the plaintext.]

- Eric

  parent reply	other threads:[~2019-08-08  5:49 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-02 22:00 [PATCH v2 0/13] Btrfs iomap Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 01/13] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O Goldwyn Rodrigues
2019-08-03  0:39   ` Darrick J. Wong
2019-08-05  0:06   ` Dave Chinner
2019-08-02 22:00 ` [PATCH 02/13] iomap: Read page from srcmap for IOMAP_COW Goldwyn Rodrigues
2019-08-03  0:23   ` Darrick J. Wong
2019-08-04 23:52   ` Dave Chinner
2019-08-02 22:00 ` [PATCH 03/13] btrfs: Eliminate PagePrivate for btrfs data pages Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 04/13] btrfs: Add a simple buffered iomap write Goldwyn Rodrigues
2019-08-05  0:11   ` Dave Chinner
2019-08-22 15:05     ` Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 05/13] btrfs: Add CoW in iomap based writes Goldwyn Rodrigues
2019-08-05  0:13   ` Dave Chinner
2019-08-22 15:01     ` Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 06/13] btrfs: remove buffered write code made unnecessary Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 07/13] btrfs: basic direct read operation Goldwyn Rodrigues
2019-08-12 12:32   ` RITESH HARJANI
2019-08-22 15:00     ` Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 08/13] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 09/13] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 10/13] iomap: use a function pointer for dio submits Goldwyn Rodrigues
2019-08-03  0:21   ` Darrick J. Wong
2019-08-05 16:08     ` Goldwyn Rodrigues
2019-08-04 23:43   ` Dave Chinner
2019-08-05 16:08     ` Goldwyn Rodrigues
2019-08-05 21:54       ` Dave Chinner
2019-08-08  4:26         ` Gao Xiang
2019-08-08  4:52           ` Gao Xiang
2019-08-08  5:49           ` Eric Biggers [this message]
2019-08-08  6:28             ` Gao Xiang
2019-08-08  8:16             ` Dave Chinner
2019-08-08  8:57               ` Gao Xiang
2019-08-08  9:29               ` Gao Xiang
2019-08-08 11:21                 ` Gao Xiang
2019-08-08 13:11                   ` Gao Xiang
2019-08-09 20:45             ` Matthew Wilcox
2019-08-09 23:45               ` Gao Xiang
2019-08-10  0:31                 ` Eric Biggers
2019-08-10  0:50                   ` Eric Biggers
2019-08-10  1:34                     ` Gao Xiang
2019-08-10  1:13                   ` Gao Xiang
2019-08-10  0:17               ` Eric Biggers
2019-08-02 22:00 ` [PATCH 11/13] btrfs: Use iomap_dio_rw for performing direct I/O writes Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 12/13] btrfs: Remove btrfs_dio_data and __btrfs_direct_write Goldwyn Rodrigues
2019-08-02 22:00 ` [PATCH 13/13] btrfs: update inode size during bio completion Goldwyn Rodrigues

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190808054936.GA5319@sol.localdomain \
    --to=ebiggers@kernel.org \
    --cc=RGoldwyn@suse.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=gaoxiang25@huawei.com \
    --cc=hch@lst.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miaoxie@huawei.com \
    --cc=ruansy.fnst@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).