linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Andrey Albershteyn <aalbersh@redhat.com>,
	fsverity@lists.linux.dev, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, chandan.babu@oracle.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	Eric Biggers <ebiggers@kernel.org>
Subject: Re: [PATCH v5 06/24] fsverity: pass tree_blocksize to end_enable_verity()
Date: Mon, 11 Mar 2024 15:38:15 -0700	[thread overview]
Message-ID: <20240311223815.GW1927156@frogsfrogsfrogs> (raw)
In-Reply-To: <20240308044017.GC8111@sol.localdomain>

[add willy and linux-mm]

On Thu, Mar 07, 2024 at 08:40:17PM -0800, Eric Biggers wrote:
> On Thu, Mar 07, 2024 at 07:46:50PM -0800, Darrick J. Wong wrote:
> > > BTW, is xfs_repair planned to do anything about any such extra blocks?
> > 
> > Sorry to answer your question with a question, but how much checking is
> > $filesystem expected to do for merkle trees?
> > 
> > In theory xfs_repair could learn how to interpret the verity descriptor,
> > walk the merkle tree blocks, and even read the file data to confirm
> > intactness.  If the descriptor specifies the highest block address then
> > we could certainly trim off excess blocks.  But I don't know how much of
> > libfsverity actually lets you do that; I haven't looked into that
> > deeply. :/
> > 
> > For xfs_scrub I guess the job is theoretically simpler, since we only
> > need to stream reads of the verity files through the page cache and let
> > verity tell us if the file data are consistent.
> > 
> > For both tools, if something finds errors in the merkle tree structure
> > itself, do we turn off verity?  Or do we do something nasty like
> > truncate the file?
> 
> As far as I know (I haven't been following btrfs-progs, but I'm familiar with
> e2fsprogs and f2fs-tools), there isn't yet any precedent for fsck actually
> validating the data of verity inodes against their Merkle trees.
> 
> e2fsck does delete the verity metadata of inodes that don't have the verity flag
> enabled.  That handles cleaning up after a crash during FS_IOC_ENABLE_VERITY.
> 
> I suppose that ideally, if an inode's verity metadata is invalid, then fsck
> should delete that inode's verity metadata and remove the verity flag from the
> inode.  Checking for a missing or obviously corrupt fsverity_descriptor would be
> fairly straightforward, but it probably wouldn't catch much compared to actually
> validating the data against the Merkle tree.  And actually validating the data
> against the Merkle tree would be complex and expensive.  Note, none of this
> would work on files that are encrypted.
> 
> Re: libfsverity, I think it would be possible to validate a Merkle tree using
> libfsverity_compute_digest() and the callbacks that it supports.  But that's not
> quite what it was designed for.
> 
> > Is there an ioctl or something that allows userspace to validate an
> > entire file's contents?  Sort of like what BLKVERIFY would have done for
> > block devices, except that we might believe its answers?
> 
> Just reading the whole file and seeing whether you get an error would do it.
> 
> Though if you want to make sure it's really re-reading the on-disk data, it's
> necessary to drop the file's pagecache first.

I tried a straight pagecache read and it worked like a charm!

But then I thought to myself, do I really want to waste memory bandwidth
copying a bunch of data?  No.  I don't even want to incur system call
overhead from reading a single byte every $pagesize bytes.

So I created 2M mmap areas and read a byte every $pagesize bytes.  That
worked too, insofar as SIGBUSes are annoying to handle.  But it's
annoying to take signals like that.

Then I started looking at madvise.  MADV_POPULATE_READ looked exactly
like what I wanted -- it prefaults in the pages, and "If populating
fails, a SIGBUS signal is not generated; instead, an error is returned."

But then I tried rigging up a test to see if I could catch an EIO, and
instead I had to SIGKILL the process!  It looks filemap_fault returns
VM_FAULT_RETRY to __xfs_filemap_fault, which propagates up through
__do_fault -> do_read_fault -> do_fault -> handle_pte_fault ->
handle_mm_fault -> faultin_page -> __get_user_pages.  At faultin_pages,
the VM_FAULT_RETRY is translated to -EBUSY.

__get_user_pages squashes -EBUSY to 0, so faultin_vma_page_range returns
that to madvise_populate.  Unfortunately, madvise_populate increments
its loop counter by the return value (still 0) so it runs in an
infinite loop.  The only way out is SIGKILL.

So I don't know what the correct behavior is here, other than the
infinite loop seems pretty suspect.  Is it the correct behavior that
madvise_populate returns EIO if __get_user_pages ever returns zero?
That doesn't quite sound right if it's the case that a zero return could
also happen if memory is tight.

I suppose filemap_fault could return VM_FAULT_SIGBUS in this one
scenario so userspace would get an -EFAULT.  That would solve this one
case of weird behavior.  But I think that doesn't happen in the
page_not_uptodate case because fpin is non-null?

As for xfs_scrub validating data files, I suppose it's not /so/
terrible to read one byte every $fsblocksize so that we can report
exactly where fsverity and the file data became inconsistent.  The
POPULATE_READ interface doesn't tell you how many pages it /did/ manage
to load, so perhaps MADV_POPULATE_READ isn't workable anyway.

(and now I'm just handwaving wildly about pagecache behaviors ;))

--D

> > Also -- inconsistencies between the file data and the merkle tree aren't
> > something that xfs can self-heal, right?
> 
> Similar to file data itself, only way to self-heal would be via mechanisms that
> provide redundancy.  There's been some interest in adding support forward error
> correction (FEC) to fsverity similar to what dm-verity has, but this would be
> complex, and it's not something that anyone has gotten around to yet.
> 
> - Eric
> 

  reply	other threads:[~2024-03-11 22:38 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-04 19:10 [PATCH v5 00/24] fs-verity support for XFS Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 01/24] fsverity: remove hash page spin lock Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 02/24] xfs: add parent pointer support to attribute code Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 03/24] xfs: define parent pointer ondisk extended attribute format Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 04/24] xfs: add parent pointer validator functions Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 05/24] fs: add FS_XFLAG_VERITY for verity files Andrey Albershteyn
2024-03-04 22:35   ` Eric Biggers
2024-03-07 21:39     ` Darrick J. Wong
2024-03-07 22:06       ` Eric Biggers
2024-03-04 19:10 ` [PATCH v5 06/24] fsverity: pass tree_blocksize to end_enable_verity() Andrey Albershteyn
2024-03-05  0:52   ` Eric Biggers
2024-03-06 16:30     ` Darrick J. Wong
2024-03-07 22:02       ` Eric Biggers
2024-03-08  3:46         ` Darrick J. Wong
2024-03-08  4:40           ` Eric Biggers
2024-03-11 22:38             ` Darrick J. Wong [this message]
2024-03-12 15:13               ` David Hildenbrand
2024-03-12 15:33                 ` David Hildenbrand
2024-03-12 16:44                   ` Darrick J. Wong
2024-03-13 12:29                     ` David Hildenbrand
2024-03-13 17:19                       ` Darrick J. Wong
2024-03-13 19:10                         ` David Hildenbrand
2024-03-13 21:03                           ` David Hildenbrand
2024-03-08 21:34           ` Dave Chinner
2024-03-09 16:19             ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 07/24] fsverity: support block-based Merkle tree caching Andrey Albershteyn
2024-03-06  3:56   ` Eric Biggers
2024-03-07 21:54     ` Darrick J. Wong
2024-03-07 22:49       ` Eric Biggers
2024-03-08  3:50         ` Darrick J. Wong
2024-03-09 16:24           ` Darrick J. Wong
2024-03-11 23:22   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 08/24] fsverity: add per-sb workqueue for post read processing Andrey Albershteyn
2024-03-05  1:08   ` Eric Biggers
2024-03-07 21:58     ` Darrick J. Wong
2024-03-07 22:26       ` Eric Biggers
2024-03-08  3:53         ` Darrick J. Wong
2024-03-07 22:55       ` Dave Chinner
2024-03-04 19:10 ` [PATCH v5 09/24] fsverity: add tracepoints Andrey Albershteyn
2024-03-05  0:33   ` Eric Biggers
2024-03-04 19:10 ` [PATCH v5 10/24] iomap: integrate fs-verity verification into iomap's read path Andrey Albershteyn
2024-03-04 23:39   ` Eric Biggers
2024-03-07 22:06     ` Darrick J. Wong
2024-03-07 22:19       ` Eric Biggers
2024-03-07 23:38     ` Dave Chinner
2024-03-07 23:45       ` Darrick J. Wong
2024-03-08  0:47         ` Dave Chinner
2024-03-07 23:59       ` Eric Biggers
2024-03-08  1:20         ` Dave Chinner
2024-03-08  3:16           ` Eric Biggers
2024-03-08  3:57             ` Darrick J. Wong
2024-03-08  3:22           ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 11/24] xfs: add XBF_VERITY_SEEN xfs_buf flag Andrey Albershteyn
2024-03-07 22:46   ` Darrick J. Wong
2024-03-08  1:59     ` Dave Chinner
2024-03-08  3:31       ` Darrick J. Wong
2024-03-09 16:28         ` Darrick J. Wong
2024-03-11  0:26           ` Dave Chinner
2024-03-11 15:25             ` Darrick J. Wong
2024-03-12  2:43               ` Eric Biggers
2024-03-12  4:15                 ` Darrick J. Wong
2024-03-12  2:45               ` Darrick J. Wong
2024-03-12  7:01                 ` Dave Chinner
2024-03-12 20:04                   ` Darrick J. Wong
2024-03-12 21:45                     ` Dave Chinner
2024-03-04 19:10 ` [PATCH v5 12/24] xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 13/24] xfs: add attribute type for fs-verity Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 14/24] xfs: make xfs_buf_get() to take XBF_* flags Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 15/24] xfs: add XBF_DOUBLE_ALLOC to increase size of the buffer Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 16/24] xfs: add fs-verity ro-compat flag Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 17/24] xfs: add inode on-disk VERITY flag Andrey Albershteyn
2024-03-07 22:06   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 18/24] xfs: initialize fs-verity on file open and cleanup on inode destruction Andrey Albershteyn
2024-03-07 22:09   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 19/24] xfs: don't allow to enable DAX on fs-verity sealsed inode Andrey Albershteyn
2024-03-07 22:09   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 20/24] xfs: disable direct read path for fs-verity files Andrey Albershteyn
2024-03-07 22:11   ` Darrick J. Wong
2024-03-12 12:02     ` Andrey Albershteyn
2024-03-12 16:36       ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 21/24] xfs: add fs-verity support Andrey Albershteyn
2024-03-06  4:55   ` Eric Biggers
2024-03-06  5:01     ` Eric Biggers
2024-03-07 23:10   ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 22/24] xfs: make scrub aware of verity dinode flag Andrey Albershteyn
2024-03-07 22:18   ` Darrick J. Wong
2024-03-12 12:10     ` Andrey Albershteyn
2024-03-12 16:38       ` Darrick J. Wong
2024-03-13  1:35         ` Darrick J. Wong
2024-03-04 19:10 ` [PATCH v5 23/24] xfs: add fs-verity ioctls Andrey Albershteyn
2024-03-07 22:14   ` Darrick J. Wong
2024-03-12 12:42     ` Andrey Albershteyn
2024-03-04 19:10 ` [PATCH v5 24/24] xfs: enable ro-compat fs-verity flag Andrey Albershteyn
2024-03-07 22:16   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240311223815.GW1927156@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=aalbersh@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=chandan.babu@oracle.com \
    --cc=ebiggers@kernel.org \
    --cc=fsverity@lists.linux.dev \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).