* [PATCHBOMB v5.3] fs-verity support for XFS
@ 2024-03-17 16:19 Darrick J. Wong
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:19 UTC (permalink / raw)
  To: aalbersh, ebiggers; +Cc: linux-fsdevel, fsverity, linux-xfs

Hi everyone,

I've mostly finished rounding out xfs_db and both fsck support for
fsverity.  I'm now sending out a full set of patches for everything I've
got, which is quite a bit more than the v5.2 stuff from the other day.

This time around I've applied some more optimizations to the
implementation, including getting rid of the incore validation bitmap,
not storing trailing zeroes to reduce overhead, and eliding merkle tree
blocks that contain hashes of zeroed data blocks.  This last one is very
useful for reducing overhead of gold master disk images on vm farms.

Note that metadump is kinda broken and xfs_scrub media scans do not yet
know how to read verity files.  All that is actually fixed in the
version that's lodged in my development trees, but since Andrey's base
is the 6.9 for-next branch plus only a few of the parent pointers
patches, none of that stuff was easy to port to make a short dev branch.

Full versions are here:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fsverity
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=fsverity

--D


* [PATCHSET v5.3] fs-verity support for XFS
  2024-03-17 16:19 [PATCHBOMB v5.3] fs-verity support for XFS Darrick J. Wong
@ 2024-03-17 16:22 ` Darrick J. Wong
  2024-03-17 16:23   ` [PATCH 01/40] fsverity: remove hash page spin lock Darrick J. Wong
                     ` (40 more replies)
  2024-03-17 16:23 ` Darrick J. Wong
                   ` (2 subsequent siblings)
  3 siblings, 41 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:22 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh
  Cc: Eric Biggers, Mark Tinguely, Allison Henderson,
	Christoph Hellwig, Dave Chinner, linux-fsdevel, fsverity,
	linux-xfs

Hi all,

From Darrick J. Wong:

This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement
fsverity for XFS.

The biggest thing that I didn't like in the v5 patchset is the abuse of
the data device's buffer cache to store the incore version of the merkle
tree blocks.  Not only do verity state flags end up in xfs_buf, but the
double-alloc flag wastes memory and doesn't remain internally consistent
if the xattrs shift around.

I replaced all of that with a per-inode xarray that indexes incore
merkle tree blocks.  For cache hits, this dramatically reduces the
amount of work that xfs has to do to feed fsverity.  The per-block
overhead is much lower (8 bytes instead of ~300 for xfs_bufs), and we no
longer have to entertain layering violations in the buffer cache.  I
also added a per-filesystem shrinker so that reclaim can cull cached
merkle tree blocks, starting with the leaf tree nodes.
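
To make the caching and reclaim policy above a little more concrete, here is a minimal userspace sketch. It is not code from the patchset: a small linear array stands in for the kernel xarray, and every name (merkle_cache, merkle_blob, merkle_cache_shrink) is hypothetical. It also assumes that level 0 denotes the leaf level of the tree, so "leaf first" means evicting the lowest level first.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_CACHED 64	/* arbitrary capacity for this sketch */

/* Hypothetical cached merkle tree block. */
struct merkle_blob {
	uint64_t pos;		/* byte offset of the block in the merkle tree */
	unsigned int level;	/* assumed: 0 = leaf level, higher = nearer root */
	void *data;		/* block contents, owned by the cache */
};

/* Hypothetical per-inode cache; a linear array stands in for an xarray. */
struct merkle_cache {
	struct merkle_blob slots[MAX_CACHED];
	size_t nr;
};

/* Return the cached block at @pos, or NULL on a cache miss. */
static void *merkle_cache_lookup(const struct merkle_cache *mc, uint64_t pos)
{
	for (size_t i = 0; i < mc->nr; i++)
		if (mc->slots[i].pos == pos)
			return mc->slots[i].data;
	return NULL;
}

/* Add a block to the cache; returns 0 on success, -1 if the cache is full. */
static int merkle_cache_store(struct merkle_cache *mc, uint64_t pos,
			      unsigned int level, void *data)
{
	if (mc->nr == MAX_CACHED)
		return -1;
	mc->slots[mc->nr++] = (struct merkle_blob){
		.pos = pos, .level = level, .data = data,
	};
	return 0;
}

/* Evict up to @nr_to_free blocks, lowest level first; returns number freed. */
static size_t merkle_cache_shrink(struct merkle_cache *mc, size_t nr_to_free)
{
	size_t freed = 0;

	while (freed < nr_to_free && mc->nr > 0) {
		size_t victim = 0;

		for (size_t i = 1; i < mc->nr; i++)
			if (mc->slots[i].level < mc->slots[victim].level)
				victim = i;
		free(mc->slots[victim].data);
		/* Fill the hole with the last slot to keep the array dense. */
		mc->slots[victim] = mc->slots[--mc->nr];
		freed++;
	}
	return freed;
}
```

Evicting leaves first keeps the interior nodes around, which are the ones shared by the most data blocks; whether the real shrinker picks victims this way is an assumption of the sketch.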

I've also rolled in some changes recommended by the fsverity maintainer,
fixed some organization and naming problems in the xfs code, fixed a
collision in the xfs_inode iflags, and improved dead merkle tree cleanup
per the discussion of the v5 series.  At this point I'm happy enough
with this code to start integrating and testing it in my trees, so it's
time to send out a coherent patchset for comments.

For v5.3, I've added bits and pieces of online and offline repair
support, reduced the size of partially filled merkle tree blocks by
removing trailing zeroes, changed the xattr hash function to better
avoid collisions between merkle tree keys, made the fsverity
invalidation bitmap unnecessary, and made it so that we can save space
on sparse verity files by not storing merkle tree blocks that hash
totally zeroed data blocks.
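
As a rough illustration of the trailing-zero trimming mentioned above (a sketch only, with hypothetical helper names that do not come from the patchset): the writer stores only the bytes up to the last nonzero byte of a merkle tree block, and the reader zero-fills the rest.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return how many bytes of a merkle tree block need
 * to be stored once trailing zero bytes are dropped.  A block that is
 * entirely zero yields 0.
 */
static size_t merkle_block_stored_len(const uint8_t *block, size_t blocksize)
{
	size_t len = blocksize;

	while (len > 0 && block[len - 1] == 0)
		len--;
	return len;
}

/*
 * Hypothetical read-side counterpart: rebuild the full block by copying
 * the stored bytes and zero-filling the remainder.  The caller must
 * ensure stored_len <= blocksize.
 */
static void merkle_block_expand(uint8_t *out, size_t blocksize,
				const uint8_t *stored, size_t stored_len)
{
	memcpy(out, stored, stored_len);
	memset(out + stored_len, 0, blocksize - stored_len);
}
```

A partially filled block at the end of a tree level is the case that benefits: only the populated hashes at the front of the block take up xattr space.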

From Andrey Albershteyn:

Here's v5 of my patchset adding fs-verity support to XFS.

This implementation uses extended attributes to store fs-verity
metadata. The Merkle tree blocks are stored in the remote extended
attributes. The names are offsets into the tree.
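
One way to picture "names are offsets" is the sketch below. The exact on-disk key layout is not stated in this message, so the 8-byte big-endian encoding and the names merkle_key_encode/merkle_key_decode are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Assumed name length for this sketch: one 64-bit offset. */
#define MERKLE_KEY_LEN 8

/*
 * Hypothetical encoder: write a byte offset into the merkle tree as a
 * fixed-width big-endian binary name, so that names compare bytewise in
 * the same order as the offsets they encode.
 */
static void merkle_key_encode(uint8_t name[MERKLE_KEY_LEN], uint64_t offset)
{
	for (int i = MERKLE_KEY_LEN - 1; i >= 0; i--) {
		name[i] = (uint8_t)(offset & 0xff);
		offset >>= 8;
	}
}

/* Hypothetical decoder: recover the tree offset from a binary name. */
static uint64_t merkle_key_decode(const uint8_t name[MERKLE_KEY_LEN])
{
	uint64_t offset = 0;

	for (int i = 0; i < MERKLE_KEY_LEN; i++)
		offset = (offset << 8) | name[i];
	return offset;
}
```

Names like this are binary rather than printable strings, which is why the series depends on the parent pointer patches that add binary xattr name support.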

A few key points of this patchset:
- fs-verity can work with Merkle tree block-based caching (xfs) and
  PAGE caching (ext4, f2fs, btrfs)
- iomap does fs-verity verification
- In XFS, fs-verity metadata is stored in extended attributes
- per-sb workqueue for verification processing
- Inodes with fs-verity have new on-disk diflag
- xfs_attr_get() can return a buffer with an extended attribute
- xfs_buf can allocate double space for Merkle tree blocks. Part of
  the space is used to store the extended attribute data without
  leaf headers
- xfs_buf tracks verified status of merkle tree blocks

The patchset consists of six parts:
- [1]: fs-verity spinlock removal pending in fsverity/for-next
- [2..4]: Parent pointers adding binary xattr names
- [5]: Expose FS_XFLAG_VERITY for fs-verity files
- [6..9]: Changes to fs-verity core
- [10]: Integrate fs-verity to iomap
- [11..24]: Add fs-verity support to XFS

Testing:
The patchset is tested with xfstests -g verity on xfs_1k, xfs_4k,
xfs_1k_quota, xfs_4k_quota, ext4_4k, and ext4_4k_quota, with
KMEMLEAK and KASAN enabled. More testing is on the way.

Changes from V4:
- Mainly fs-verity changes; removed unnecessary functions
- Replace XFS workqueue with per-sb workqueue created in
  fsverity_set_ops()
- Drop patch with readahead calculation in bytes
Changes from V3:
- Redid the changes to fs-verity core, as the previous version had an
  issue on ext4
- Add blocks invalidation interface to fs-verity
- Move memory ordering primitives out of block status check to fs
  read block function
- Add fs-verity verification to iomap instead of general post read
  processing
Changes from V2:
- FS_XFLAG_VERITY extended attribute flag
- Change fs-verity to use Merkle tree blocks instead of expecting
  PAGE references from filesystem
- Change approach in iomap to filesystem provided bio_set and
  submit_io instead of just callouts to filesystem
- Add possibility for xfs_buf to allocate more space for fs-verity
  extended attributes
- Make the xfs_attr module copy fs-verity blocks inside the xfs_buf,
  so XFS can get data without leaf headers
- Add Merkle tree removal for error path
- Make scrub aware of new dinode flag
Changes from V1:
- Added parent pointer patches for easier testing
- Many issues and refactoring points fixed from the V1 review
- Adjusted for recent changes in fs-verity core (folios, non-4k)
- Dropped disabling of large folios
- Completely new fsverity patches (fix, callout, log_blocksize)
- Change approach to verification in iomap to the same one as in
  write path. Callouts to fs instead of direct fs-verity use.
- New XFS workqueue for post read folio verification
- xfs_attr_get() can return underlying xfs_buf
- xfs_bufs are marked with XBF_VERITY_CHECKED to track verified
  blocks

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fsverity-xfs

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=fsverity-xfs
---
Commits in this patchset:
 * fsverity: remove hash page spin lock
 * xfs: add parent pointer support to attribute code
 * xfs: define parent pointer ondisk extended attribute format
 * xfs: add parent pointer validator functions
 * fs: add FS_XFLAG_VERITY for verity files
 * fsverity: pass tree_blocksize to end_enable_verity()
 * fsverity: support block-based Merkle tree caching
 * fsverity: add per-sb workqueue for post read processing
 * fsverity: add tracepoints
 * fsverity: fix "support block-based Merkle tree caching"
 * fsverity: send the level of the merkle tree block to ->read_merkle_tree_block
 * fsverity: pass the new tree size and block size to ->begin_enable_verity
 * fsverity: expose merkle tree geometry to callers
 * fsverity: rely on cached block callers to retain verified state
 * fsverity: box up the write_merkle_tree_block parameters too
 * fsverity: pass the zero-hash value to the implementation
 * fsverity: report validation errors back to the filesystem
 * iomap: integrate fs-verity verification into iomap's read path
 * xfs: add attribute type for fs-verity
 * xfs: add fs-verity ro-compat flag
 * xfs: add inode on-disk VERITY flag
 * xfs: initialize fs-verity on file open and cleanup on inode destruction
 * xfs: don't allow to enable DAX on fs-verity sealed inode
 * xfs: disable direct read path for fs-verity files
 * xfs: widen flags argument to the xfs_iflags_* helpers
 * xfs: add fs-verity support
 * xfs: create a per-mount shrinker for verity inodes merkle tree blocks
 * xfs: create an icache tag for files with cached merkle tree blocks
 * xfs: shrink verity blob cache
 * xfs: clean up stale fsverity metadata before starting
 * xfs: better reporting and error handling in xfs_drop_merkle_tree
 * xfs: make scrub aware of verity dinode flag
 * xfs: add fs-verity ioctls
 * xfs: advertise fs-verity being available on filesystem
 * xfs: teach online repair to evaluate fsverity xattrs
 * xfs: don't store trailing zeroes of merkle tree blocks
 * xfs: create separate name hash function for xattrs
 * xfs: use merkle tree offset as attr hash
 * xfs: don't bother storing merkle tree blocks for zeroed data blocks
 * xfs: enable ro-compat fs-verity flag
---
 Documentation/filesystems/fsverity.rst |    8 
 MAINTAINERS                            |    1 
 fs/btrfs/verity.c                      |   13 -
 fs/ext4/verity.c                       |   13 -
 fs/f2fs/verity.c                       |   13 -
 fs/ioctl.c                             |   11 
 fs/iomap/buffered-io.c                 |   91 ++++
 fs/super.c                             |    7 
 fs/verity/enable.c                     |   19 +
 fs/verity/fsverity_private.h           |   42 ++
 fs/verity/init.c                       |    1 
 fs/verity/open.c                       |   41 ++
 fs/verity/read_metadata.c              |   63 +--
 fs/verity/signature.c                  |    2 
 fs/verity/verify.c                     |  233 +++++++---
 fs/xfs/Makefile                        |    2 
 fs/xfs/libxfs/xfs_attr.c               |   49 ++
 fs/xfs/libxfs/xfs_attr.h               |    6 
 fs/xfs/libxfs/xfs_attr_leaf.c          |    4 
 fs/xfs/libxfs/xfs_da_format.h          |   75 +++
 fs/xfs/libxfs/xfs_format.h             |   14 -
 fs/xfs/libxfs/xfs_fs.h                 |    1 
 fs/xfs/libxfs/xfs_log_format.h         |    2 
 fs/xfs/libxfs/xfs_ondisk.h             |    4 
 fs/xfs/libxfs/xfs_parent.c             |  113 +++++
 fs/xfs/libxfs/xfs_parent.h             |   19 +
 fs/xfs/libxfs/xfs_sb.c                 |    4 
 fs/xfs/scrub/attr.c                    |  114 +++++
 fs/xfs/scrub/attr.h                    |    4 
 fs/xfs/scrub/common.c                  |   27 +
 fs/xfs/xfs_attr_item.c                 |    9 
 fs/xfs/xfs_attr_list.c                 |   17 -
 fs/xfs/xfs_file.c                      |   23 +
 fs/xfs/xfs_icache.c                    |   85 ++++
 fs/xfs/xfs_icache.h                    |    8 
 fs/xfs/xfs_inode.c                     |    2 
 fs/xfs/xfs_inode.h                     |   19 +
 fs/xfs/xfs_ioctl.c                     |   21 +
 fs/xfs/xfs_iops.c                      |    4 
 fs/xfs/xfs_mount.c                     |   10 
 fs/xfs/xfs_mount.h                     |    8 
 fs/xfs/xfs_super.c                     |   14 +
 fs/xfs/xfs_trace.h                     |   80 +++
 fs/xfs/xfs_verity.c                    |  741 ++++++++++++++++++++++++++++++++
 fs/xfs/xfs_verity.h                    |   29 +
 fs/xfs/xfs_xattr.c                     |   10 
 include/linux/fs.h                     |    2 
 include/linux/fsverity.h               |  162 +++++++
 include/trace/events/fsverity.h        |  162 +++++++
 include/uapi/linux/fs.h                |    1 
 50 files changed, 2226 insertions(+), 177 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_parent.c
 create mode 100644 fs/xfs/libxfs/xfs_parent.h
 create mode 100644 fs/xfs/xfs_verity.c
 create mode 100644 fs/xfs/xfs_verity.h
 create mode 100644 include/trace/events/fsverity.h



* [PATCHSET v5.3] fs-verity support for XFS
  2024-03-17 16:19 [PATCHBOMB v5.3] fs-verity support for XFS Darrick J. Wong
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
@ 2024-03-17 16:23 ` Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 01/20] xfsprogs: add parent pointer support to attribute code Darrick J. Wong
                     ` (19 more replies)
  2024-03-17 16:23 ` [PATCHSET v5.3] fstests: fs-verity support for XFS Darrick J. Wong
  2024-03-18  1:39 ` [PATCHBOMB v5.3] fs-verity support for XFS Christoph Hellwig
  3 siblings, 20 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:23 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers
  Cc: Mark Tinguely, Darrick J. Wong, Dave Chinner, Allison Henderson,
	fsverity, linux-fsdevel, linux-xfs

Hi all,

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fsverity-xfs

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=fsverity-xfs
---
Commits in this patchset:
 * xfsprogs: add parent pointer support to attribute code
 * xfsprogs: define parent pointer xattr format
 * xfsprogs: Add xfs_verify_pptr
 * fs: add FS_XFLAG_VERITY for verity files
 * xfs: add attribute type for fs-verity
 * xfs: add fs-verity ro-compat flag
 * xfs: add inode on-disk VERITY flag
 * xfs: add fs-verity support
 * xfs: advertise fs-verity being available on filesystem
 * xfs: create separate name hash function for xattrs
 * xfs: use merkle tree offset as attr hash
 * xfs: enable ro-compat fs-verity flag
 * libfrog: add fsverity to xfs_report_geom output
 * xfs_db: introduce attr_modify command
 * xfs_db: make attr_set/remove/modify be able to handle fs-verity attrs
 * man: document attr_modify command
 * xfs_db: dump verity features and metadata
 * xfs_db: dump merkle tree data
 * xfs_repair: junk fsverity xattrs when unnecessary
 * mkfs.xfs: add verity parameter
---
 db/attr.c                |   94 +++++++++++++++++++
 db/attrset.c             |  226 +++++++++++++++++++++++++++++++++++++++++++++-
 db/attrshort.c           |   22 ++++
 db/hash.c                |    4 -
 db/metadump.c            |   26 +++--
 db/sb.c                  |    2 
 db/write.c               |    2 
 db/write.h               |    1 
 include/linux.h          |    4 +
 include/xfs_mount.h      |    2 
 libfrog/fsgeom.c         |    4 +
 libxfs/libxfs_api_defs.h |    2 
 libxfs/xfs_attr.c        |   86 ++++++++++++++++--
 libxfs/xfs_attr.h        |    6 +
 libxfs/xfs_attr_leaf.c   |    4 -
 libxfs/xfs_da_format.h   |   80 ++++++++++++++++
 libxfs/xfs_format.h      |   14 ++-
 libxfs/xfs_fs.h          |    1 
 libxfs/xfs_log_format.h  |    2 
 libxfs/xfs_ondisk.h      |    4 +
 libxfs/xfs_sb.c          |    4 +
 man/man8/mkfs.xfs.8.in   |    4 +
 man/man8/xfs_db.8        |   34 +++++++
 mkfs/xfs_mkfs.c          |   19 +++-
 repair/attr_repair.c     |   52 +++++++++--
 25 files changed, 651 insertions(+), 48 deletions(-)



* [PATCHSET v5.3] fstests: fs-verity support for XFS
  2024-03-17 16:19 [PATCHBOMB v5.3] fs-verity support for XFS Darrick J. Wong
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
  2024-03-17 16:23 ` Darrick J. Wong
@ 2024-03-17 16:23 ` Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 1/3] common/verity: enable fsverity " Darrick J. Wong
                     ` (2 more replies)
  2024-03-18  1:39 ` [PATCHBOMB v5.3] fs-verity support for XFS Christoph Hellwig
  3 siblings, 3 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:23 UTC (permalink / raw)
  To: aalbersh, ebiggers, djwong, zlang
  Cc: Andrey Albershteyn, fsverity, fstests, linux-fsdevel, guan, linux-xfs

Hi all,

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

This has been running on the djcloud for months with no problems.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fsverity

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=fsverity

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fsverity
---
Commits in this patchset:
 * common/verity: enable fsverity for XFS
 * xfs/{021,122}: adapt to fsverity xattrs
 * common/populate: add verity files to populate xfs images
---
 common/populate   |   21 +++++++++++++++++++++
 common/verity     |   29 ++++++++++++++++++++++++++++-
 tests/xfs/021     |    3 +++
 tests/xfs/122.out |    1 +
 4 files changed, 53 insertions(+), 1 deletion(-)



* [PATCH 01/40] fsverity: remove hash page spin lock
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
@ 2024-03-17 16:23   ` Darrick J. Wong
  2024-03-17 16:23   ` [PATCH 02/40] xfs: add parent pointer support to attribute code Darrick J. Wong
                     ` (39 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:23 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh
  Cc: Eric Biggers, linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

The spin lock is not necessary here as it can be replaced with a
memory barrier, which should be better performance-wise.

When Merkle tree block size differs from page size, in
is_hash_block_verified() two things are modified during check - a
bitmap and PG_checked flag of the page.

Each bit in the bitmap represents the verification status of a Merkle
tree block. The PG_checked flag tells whether the page was just
re-instantiated or was already in the pagecache. Both of these states
are shared between verification threads. A page which was
re-instantiated cannot have already verified blocks (bit set in bitmap).

The spin lock was used to allow only one thread to modify both of
these states and keep order of operations. The only requirement here
is that PG_checked is set strictly after the bitmap is updated.
This way other threads which see that PG_checked=1 (page cached)
know that the bitmap is up-to-date. Otherwise, if PG_checked is set
before the bitmap is cleared, other threads can see bit=1 and therefore
will not perform verification of that Merkle tree block.

However, there's still the case when one thread is setting a bit in
verify_data_block() and another thread is clearing it in
is_hash_block_verified(). This can happen if two threads get to the
!PageChecked branch and one of the threads is rescheduled before
resetting the bitmap. This is fine as at worst blocks are
re-verified in each thread.
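
For readers who want to see the ordering requirement in isolation, here is a userspace sketch using C11 atomics. It is an analogy, not the kernel code: acquire/release operations stand in for the smp_rmb()/smp_wmb() pairing, and struct hash_page with its fields is a hypothetical stand-in for the page flag and the bitmap.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define BLOCKS_PER_PAGE 4	/* arbitrary for this sketch */

/* Hypothetical stand-in for one hash page and its slice of the bitmap. */
struct hash_page {
	atomic_bool checked;			/* plays the role of PG_checked */
	atomic_bool verified[BLOCKS_PER_PAGE];	/* plays the role of bitmap bits */
};

/* Return whether block @idx of this page is already known to be verified. */
static bool hash_block_verified(struct hash_page *hp, unsigned int idx)
{
	/* Acquire load: pairs with the release store below. */
	if (atomic_load_explicit(&hp->checked, memory_order_acquire))
		return atomic_load_explicit(&hp->verified[idx],
					    memory_order_relaxed);

	/* New page: its bits are untrustworthy, so clear them first... */
	for (unsigned int i = 0; i < BLOCKS_PER_PAGE; i++)
		atomic_store_explicit(&hp->verified[i], false,
				      memory_order_relaxed);

	/* ...then publish.  Release orders the clears before the flag. */
	atomic_store_explicit(&hp->checked, true, memory_order_release);
	return false;
}

/* Record that block @idx passed verification. */
static void mark_block_verified(struct hash_page *hp, unsigned int idx)
{
	atomic_store_explicit(&hp->verified[idx], true, memory_order_relaxed);
}
```

If two threads both take the clearing path, each simply repeats the same stores, and a concurrent mark_block_verified() can at worst be undone, which costs one redundant verification.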

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---
 fs/verity/fsverity_private.h |    1 -
 fs/verity/open.c             |    1 -
 fs/verity/verify.c           |   48 +++++++++++++++++++++---------------------
 3 files changed, 24 insertions(+), 26 deletions(-)


diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index a6a6b2749241..b3506f56e180 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -69,7 +69,6 @@ struct fsverity_info {
 	u8 file_digest[FS_VERITY_MAX_DIGEST_SIZE];
 	const struct inode *inode;
 	unsigned long *hash_block_verified;
-	spinlock_t hash_page_init_lock;
 };
 
 #define FS_VERITY_MAX_SIGNATURE_SIZE	(FS_VERITY_MAX_DESCRIPTOR_SIZE - \
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 6c31a871b84b..fdeb95eca3af 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -239,7 +239,6 @@ struct fsverity_info *fsverity_create_info(const struct inode *inode,
 			err = -ENOMEM;
 			goto fail;
 		}
-		spin_lock_init(&vi->hash_page_init_lock);
 	}
 
 	return vi;
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 904ccd7e8e16..4fcad0825a12 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -19,7 +19,6 @@ static struct workqueue_struct *fsverity_read_workqueue;
 static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
 				   unsigned long hblock_idx)
 {
-	bool verified;
 	unsigned int blocks_per_page;
 	unsigned int i;
 
@@ -43,12 +42,20 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
 	 * re-instantiated from the backing storage are re-verified.  To do
 	 * this, we use PG_checked again, but now it doesn't really mean
 	 * "checked".  Instead, now it just serves as an indicator for whether
-	 * the hash page is newly instantiated or not.
+	 * the hash page is newly instantiated or not.  If the page is new, as
+	 * indicated by PG_checked=0, we clear the bitmap bits for the page's
+	 * blocks since they are untrustworthy, then set PG_checked=1.
+	 * Otherwise we return the bitmap bit for the requested block.
 	 *
-	 * The first thread that sees PG_checked=0 must clear the corresponding
-	 * bitmap bits, then set PG_checked=1.  This requires a spinlock.  To
-	 * avoid having to take this spinlock in the common case of
-	 * PG_checked=1, we start with an opportunistic lockless read.
+	 * Multiple threads may execute this code concurrently on the same page.
+	 * This is safe because we use memory barriers to ensure that if a
+	 * thread sees PG_checked=1, then it also sees the associated bitmap
+	 * clearing to have occurred.  Also, all writes and their corresponding
+	 * reads are atomic, and all writes are safe to repeat in the event that
+	 * multiple threads get into the PG_checked=0 section.  (Clearing a
+	 * bitmap bit again at worst causes a hash block to be verified
+	 * redundantly.  That event should be very rare, so it's not worth using
+	 * a lock to avoid.  Setting PG_checked again has no effect.)
 	 */
 	if (PageChecked(hpage)) {
 		/*
@@ -58,24 +65,17 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
 		smp_rmb();
 		return test_bit(hblock_idx, vi->hash_block_verified);
 	}
-	spin_lock(&vi->hash_page_init_lock);
-	if (PageChecked(hpage)) {
-		verified = test_bit(hblock_idx, vi->hash_block_verified);
-	} else {
-		blocks_per_page = vi->tree_params.blocks_per_page;
-		hblock_idx = round_down(hblock_idx, blocks_per_page);
-		for (i = 0; i < blocks_per_page; i++)
-			clear_bit(hblock_idx + i, vi->hash_block_verified);
-		/*
-		 * A write memory barrier is needed here to give RELEASE
-		 * semantics to the below SetPageChecked() operation.
-		 */
-		smp_wmb();
-		SetPageChecked(hpage);
-		verified = false;
-	}
-	spin_unlock(&vi->hash_page_init_lock);
-	return verified;
+	blocks_per_page = vi->tree_params.blocks_per_page;
+	hblock_idx = round_down(hblock_idx, blocks_per_page);
+	for (i = 0; i < blocks_per_page; i++)
+		clear_bit(hblock_idx + i, vi->hash_block_verified);
+	/*
+	 * A write memory barrier is needed here to give RELEASE semantics to
+	 * the below SetPageChecked() operation.
+	 */
+	smp_wmb();
+	SetPageChecked(hpage);
+	return false;
 }
 
 /*

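For readers following the lockless scheme described in the comment above, here is a minimal userspace sketch using C11 atomics.  It is not the kernel code: struct hash_page, the verified[] array, BLOCKS_PER_PAGE and NR_BLOCKS are hypothetical stand-ins for the page flag, vi->hash_block_verified and the tree parameters, and acquire/release atomics stand in for smp_rmb()/smp_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BLOCKS_PER_PAGE	4	/* hypothetical: hash blocks per page */
#define NR_BLOCKS	16	/* hypothetical: hash blocks in the tree */

struct hash_page {
	atomic_bool checked;	/* stands in for PG_checked */
};

/* Stands in for the hash_block_verified bitmap; zero-initialized. */
static atomic_bool verified[NR_BLOCKS];

static bool is_block_verified(struct hash_page *hp, size_t idx)
{
	size_t first, i;

	assert(idx < NR_BLOCKS);

	/* Acquire pairs with the release store at the end of this function. */
	if (atomic_load_explicit(&hp->checked, memory_order_acquire))
		return atomic_load_explicit(&verified[idx],
					    memory_order_relaxed);

	/*
	 * Newly instantiated page: the bits for its blocks are stale, so
	 * clear them.  Repeating this from several threads is harmless; at
	 * worst a block gets verified again.
	 */
	first = idx - (idx % BLOCKS_PER_PAGE);
	for (i = 0; i < BLOCKS_PER_PAGE; i++)
		atomic_store_explicit(&verified[first + i], false,
				      memory_order_relaxed);

	/* Release: whoever sees checked == true also sees the clears. */
	atomic_store_explicit(&hp->checked, true, memory_order_release);
	return false;
}

/* Hypothetical helper: record that a hash block passed verification. */
static void mark_block_verified(size_t idx)
{
	assert(idx < NR_BLOCKS);
	atomic_store_explicit(&verified[idx], true, memory_order_relaxed);
}
```

The first call for a fresh page always reports "not verified", even if a stale bit was set; later calls return whatever the bit currently holds.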


* [PATCH 02/40] xfs: add parent pointer support to attribute code
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
  2024-03-17 16:23   ` [PATCH 01/40] fsverity: remove hash page spin lock Darrick J. Wong
@ 2024-03-17 16:23   ` Darrick J. Wong
  2024-03-17 16:24   ` [PATCH 03/40] xfs: define parent pointer ondisk extended attribute format Darrick J. Wong
                     ` (38 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:23 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh
  Cc: Mark Tinguely, Dave Chinner, Allison Henderson, linux-fsdevel,
	fsverity, linux-xfs

From: Allison Henderson <allison.henderson@oracle.com>

Add the new parent attribute type. XFS_ATTR_PARENT is used only for parent pointer
entries; it uses reserved blocks like XFS_ATTR_ROOT.
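
As a rough illustration of how the new flag bit fits in, the sketch below copies the bit values from this patch into a standalone program.  The two helper functions are hypothetical; they only mirror the rsvd computation in xfs_attr_set() and the badflags check in xchk_xattr_rec().

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values as defined in xfs_da_format.h by this patch. */
#define XFS_ATTR_LOCAL		(1u << 0)
#define XFS_ATTR_ROOT		(1u << 1)
#define XFS_ATTR_SECURE		(1u << 2)
#define XFS_ATTR_PARENT		(1u << 3)
#define XFS_ATTR_INCOMPLETE	(1u << 7)
#define XFS_ATTR_NSP_ONDISK_MASK \
	(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)

/* The new bit must not collide with any existing flag. */
static_assert((XFS_ATTR_PARENT & (XFS_ATTR_LOCAL | XFS_ATTR_ROOT |
				  XFS_ATTR_SECURE | XFS_ATTR_INCOMPLETE)) == 0,
	      "XFS_ATTR_PARENT overlaps another attr flag");

/* Hypothetical helper: may this attr dip into the reserve block pool? */
static bool attr_uses_reserved_blocks(unsigned int attr_filter)
{
	return (attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_PARENT)) != 0;
}

/* Hypothetical helper: does an ondisk entry carry only known flags? */
static bool attr_flags_known(unsigned int flags)
{
	unsigned int badflags = ~(XFS_ATTR_LOCAL | XFS_ATTR_ROOT |
				  XFS_ATTR_SECURE | XFS_ATTR_INCOMPLETE |
				  XFS_ATTR_PARENT);

	return (flags & badflags) == 0;
}
```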

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_attr.c       |    3 ++-
 fs/xfs/libxfs/xfs_da_format.h  |    5 ++++-
 fs/xfs/libxfs/xfs_log_format.h |    1 +
 fs/xfs/scrub/attr.c            |    2 +-
 fs/xfs/xfs_trace.h             |    3 ++-
 5 files changed, 10 insertions(+), 4 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 673a4b6d2e8d..ff67a684a452 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -925,7 +925,8 @@ xfs_attr_set(
 	struct xfs_inode	*dp = args->dp;
 	struct xfs_mount	*mp = dp->i_mount;
 	struct xfs_trans_res	tres;
-	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
+	bool			rsvd = (args->attr_filter & (XFS_ATTR_ROOT |
+							     XFS_ATTR_PARENT));
 	int			error, local;
 	int			rmt_blks = 0;
 	unsigned int		total;
diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 060e5c96b70f..5434d4d5b551 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -714,12 +714,15 @@ struct xfs_attr3_leafblock {
 #define	XFS_ATTR_LOCAL_BIT	0	/* attr is stored locally */
 #define	XFS_ATTR_ROOT_BIT	1	/* limit access to trusted attrs */
 #define	XFS_ATTR_SECURE_BIT	2	/* limit access to secure attrs */
+#define	XFS_ATTR_PARENT_BIT	3	/* parent pointer attrs */
 #define	XFS_ATTR_INCOMPLETE_BIT	7	/* attr in middle of create/delete */
 #define XFS_ATTR_LOCAL		(1u << XFS_ATTR_LOCAL_BIT)
 #define XFS_ATTR_ROOT		(1u << XFS_ATTR_ROOT_BIT)
 #define XFS_ATTR_SECURE		(1u << XFS_ATTR_SECURE_BIT)
+#define XFS_ATTR_PARENT		(1u << XFS_ATTR_PARENT_BIT)
 #define XFS_ATTR_INCOMPLETE	(1u << XFS_ATTR_INCOMPLETE_BIT)
-#define XFS_ATTR_NSP_ONDISK_MASK	(XFS_ATTR_ROOT | XFS_ATTR_SECURE)
+#define XFS_ATTR_NSP_ONDISK_MASK \
+			(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)
 
 /*
  * Alignment for namelist and valuelist entries (since they are mixed
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 16872972e1e9..9cbcba4bd363 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -974,6 +974,7 @@ struct xfs_icreate_log {
  */
 #define XFS_ATTRI_FILTER_MASK		(XFS_ATTR_ROOT | \
 					 XFS_ATTR_SECURE | \
+					 XFS_ATTR_PARENT | \
 					 XFS_ATTR_INCOMPLETE)
 
 /*
diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index 83c7feb38714..49f91cc85a65 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -494,7 +494,7 @@ xchk_xattr_rec(
 	/* Retrieve the entry and check it. */
 	hash = be32_to_cpu(ent->hashval);
 	badflags = ~(XFS_ATTR_LOCAL | XFS_ATTR_ROOT | XFS_ATTR_SECURE |
-			XFS_ATTR_INCOMPLETE);
+			XFS_ATTR_INCOMPLETE | XFS_ATTR_PARENT);
 	if ((ent->flags & badflags) != 0)
 		xchk_da_set_corrupt(ds, level);
 	if (ent->flags & XFS_ATTR_LOCAL) {
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 56b07d8ed431..d4f1b2da21e7 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -86,7 +86,8 @@ struct xfs_bmap_intent;
 #define XFS_ATTR_FILTER_FLAGS \
 	{ XFS_ATTR_ROOT,	"ROOT" }, \
 	{ XFS_ATTR_SECURE,	"SECURE" }, \
-	{ XFS_ATTR_INCOMPLETE,	"INCOMPLETE" }
+	{ XFS_ATTR_INCOMPLETE,	"INCOMPLETE" }, \
+	{ XFS_ATTR_PARENT,	"PARENT" }
 
 DECLARE_EVENT_CLASS(xfs_attr_list_class,
 	TP_PROTO(struct xfs_attr_list_context *ctx),



* [PATCH 03/40] xfs: define parent pointer ondisk extended attribute format
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
  2024-03-17 16:23   ` [PATCH 01/40] fsverity: remove hash page spin lock Darrick J. Wong
  2024-03-17 16:23   ` [PATCH 02/40] xfs: add parent pointer support to attribute code Darrick J. Wong
@ 2024-03-17 16:24   ` Darrick J. Wong
  2024-03-17 16:24   ` [PATCH 04/40] xfs: add parent pointer validator functions Darrick J. Wong
                     ` (37 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:24 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh
  Cc: Dave Chinner, Allison Henderson, linux-fsdevel, fsverity, linux-xfs

From: Allison Henderson <allison.henderson@oracle.com>

We need to define the parent pointer attribute format before we start
adding support for it into all the code that needs to use it. The EA
format we will use encodes the following information:

        name={parent inode #, parent inode generation, dirent namehash}
        value={dirent name}

The inode/gen gives all the information we need to reliably identify the
parent without requiring child->parent lock ordering, and allows
userspace to do pathname component level reconstruction without the
kernel ever needing to verify the parent itself as part of ioctl calls.
Storing the dirent name hash in the key reduces hash collisions if a
file is hardlinked multiple times in the same directory.

By using the NVLOOKUP mode in the extended attribute code to match
parent pointers using both the xattr name and value, we can identify the
exact parent pointer EA we need to modify/remove in rename/unlink
operations without searching the entire EA space.

By storing the dirent name, we have enough information to be able to
validate and reconstruct damaged directory trees.  Earlier iterations of
this patchset encoded the directory offset in the parent pointer key,
but this format required repair to keep that in sync across directory
rebuilds, which is unnecessary complexity.
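
To make the key layout concrete, here is a small userspace sketch that packs the 16-byte name record in big-endian order.  It is only an illustration: the byte-array struct, the put/get helpers and parent_rec_init() are hypothetical, and toy_namehash() is a placeholder, not the hash that XFS actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for struct xfs_parent_name_rec (all fields big-endian). */
struct parent_name_rec {
	uint8_t	p_ino[8];	/* parent inode number */
	uint8_t	p_gen[4];	/* parent inode generation */
	uint8_t	p_namehash[4];	/* hash of the dirent name */
};
static_assert(sizeof(struct parent_name_rec) == 16, "unexpected record size");

static void put_be32(uint8_t *p, uint32_t v)
{
	for (int i = 3; i >= 0; i--, v >>= 8)
		p[i] = (uint8_t)v;
}

static void put_be64(uint8_t *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--, v >>= 8)
		p[i] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++)
		v = (v << 8) | p[i];
	return v;
}

static uint64_t get_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Placeholder hash for illustration only; NOT the XFS dirent name hash. */
static uint32_t toy_namehash(const char *name, size_t len)
{
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = h * 31 + (uint8_t)name[i];
	return h;
}

/* Build the xattr name (key); the xattr value is the dirent name itself. */
static void parent_rec_init(struct parent_name_rec *rec, uint64_t ino,
			    uint32_t gen, const char *name, size_t len)
{
	put_be64(rec->p_ino, ino);
	put_be32(rec->p_gen, gen);
	put_be32(rec->p_namehash, toy_namehash(name, len));
}
```

Two hardlinks to the same file in the same directory share p_ino and p_gen, so the name hash is what usually keeps their keys distinct.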

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: replace diroffset with the namehash in the pptr key]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_da_format.h |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 5434d4d5b551..67e8c33c4e82 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -878,4 +878,24 @@ static inline unsigned int xfs_dir2_dirblock_bytes(struct xfs_sb *sbp)
 xfs_failaddr_t xfs_da3_blkinfo_verify(struct xfs_buf *bp,
 				      struct xfs_da3_blkinfo *hdr3);
 
+/*
+ * Parent pointer attribute format definition
+ *
+ * The xattr name encodes the parent inode number, generation and the crc32c
+ * hash of the dirent name.
+ *
+ * The xattr value contains the dirent name.
+ */
+struct xfs_parent_name_rec {
+	__be64	p_ino;
+	__be32	p_gen;
+	__be32	p_namehash;
+};
+
+/*
+ * Maximum size of the dirent name that can be stored in a parent pointer.
+ * This matches the maximum dirent name length.
+ */
+#define XFS_PARENT_DIRENT_NAME_MAX_SIZE		(MAXNAMELEN - 1)
+
 #endif /* __XFS_DA_FORMAT_H__ */



* [PATCH 04/40] xfs: add parent pointer validator functions
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (2 preceding siblings ...)
  2024-03-17 16:24   ` [PATCH 03/40] xfs: define parent pointer ondisk extended attribute format Darrick J. Wong
@ 2024-03-17 16:24   ` Darrick J. Wong
  2024-03-17 16:24   ` [PATCH 05/40] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
                     ` (36 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:24 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh
  Cc: Allison Henderson, linux-fsdevel, fsverity, linux-xfs

From: Allison Henderson <allison.henderson@oracle.com>

Attribute names of parent pointers are not strings, so we need to
modify xfs_attr_namecheck() to verify parent pointer records when the
XFS_ATTR_PARENT flag is set.  At the same time, we need to validate attr
values during log recovery if the xattr is really a parent pointer.
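
The checks can be sketched in standalone C roughly as follows.  This only mirrors the length and flag tests from the patch; the function names and PARENT_REC_SIZE are hypothetical, the mount argument is dropped, and the flag values are copied from xfs_da_format.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XFS_ATTR_ROOT		(1u << 1)
#define XFS_ATTR_SECURE		(1u << 2)
#define XFS_ATTR_PARENT		(1u << 3)
#define XFS_ATTR_INCOMPLETE	(1u << 7)
#define XFS_ATTR_NSP_ONDISK_MASK \
	(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)

#define PARENT_REC_SIZE		16	/* sizeof(struct xfs_parent_name_rec) */
#define PARENT_NAME_MAX		255	/* MAXNAMELEN - 1 */

static_assert(PARENT_REC_SIZE == 8 + 4 + 4, "ino + gen + namehash");

/* Count set bits; stands in for the kernel's hweight32(). */
static unsigned int count_bits(unsigned int v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Is this a plausible parent pointer xattr name (the binary key)? */
static bool parent_namecheck(size_t reclen, unsigned int attr_flags)
{
	if (!(attr_flags & XFS_ATTR_PARENT))
		return false;
	/* Parent pointer updates are logged, so this must never be set. */
	if (attr_flags & XFS_ATTR_INCOMPLETE)
		return false;
	if (reclen != PARENT_REC_SIZE)
		return false;
	/* Only one namespace bit allowed. */
	return count_bits(attr_flags & XFS_ATTR_NSP_ONDISK_MASK) <= 1;
}

/* Is this a plausible parent pointer xattr value (the dirent name)? */
static bool parent_valuecheck(const void *value, size_t valuelen)
{
	if (valuelen == 0 || valuelen > PARENT_NAME_MAX)
		return false;
	return value != NULL;
}
```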

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: move functions to xfs_parent.c, adjust for new disk format]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Makefile               |    1 
 fs/xfs/libxfs/xfs_attr.c      |   10 +++-
 fs/xfs/libxfs/xfs_attr.h      |    3 +
 fs/xfs/libxfs/xfs_da_format.h |    8 +++
 fs/xfs/libxfs/xfs_parent.c    |  113 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_parent.h    |   19 +++++++
 fs/xfs/scrub/attr.c           |    2 -
 fs/xfs/xfs_attr_item.c        |    6 +-
 fs/xfs/xfs_attr_list.c        |   14 +++--
 9 files changed, 165 insertions(+), 11 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_parent.c
 create mode 100644 fs/xfs/libxfs/xfs_parent.h


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 76674ad5833e..f8845e65cac7 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -41,6 +41,7 @@ xfs-y				+= $(addprefix libxfs/, \
 				   xfs_inode_buf.o \
 				   xfs_log_rlimit.o \
 				   xfs_ag_resv.o \
+				   xfs_parent.o \
 				   xfs_rmap.o \
 				   xfs_rmap_btree.o \
 				   xfs_refcount.o \
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index ff67a684a452..f0b625d45aa4 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -26,6 +26,7 @@
 #include "xfs_trace.h"
 #include "xfs_attr_item.h"
 #include "xfs_xattr.h"
+#include "xfs_parent.h"
 
 struct kmem_cache		*xfs_attr_intent_cache;
 
@@ -1515,9 +1516,14 @@ xfs_attr_node_get(
 /* Returns true if the attribute entry name is valid. */
 bool
 xfs_attr_namecheck(
-	const void	*name,
-	size_t		length)
+	struct xfs_mount	*mp,
+	const void		*name,
+	size_t			length,
+	unsigned int		flags)
 {
+	if (flags & XFS_ATTR_PARENT)
+		return xfs_parent_namecheck(mp, name, length, flags);
+
 	/*
 	 * MAXNAMELEN includes the trailing null, but (name/length) leave it
 	 * out, so use >= for the length check.
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 81be9b3e4004..92711c8d2a9f 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -547,7 +547,8 @@ int xfs_attr_get(struct xfs_da_args *args);
 int xfs_attr_set(struct xfs_da_args *args);
 int xfs_attr_set_iter(struct xfs_attr_intent *attr);
 int xfs_attr_remove_iter(struct xfs_attr_intent *attr);
-bool xfs_attr_namecheck(const void *name, size_t length);
+bool xfs_attr_namecheck(struct xfs_mount *mp, const void *name, size_t length,
+		unsigned int flags);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 void xfs_init_attr_trans(struct xfs_da_args *args, struct xfs_trans_res *tres,
 			 unsigned int *total);
diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 67e8c33c4e82..839df0e5401b 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -757,6 +757,14 @@ xfs_attr3_leaf_name(xfs_attr_leafblock_t *leafp, int idx)
 	return &((char *)leafp)[be16_to_cpu(entries[idx].nameidx)];
 }
 
+static inline int
+xfs_attr3_leaf_flags(xfs_attr_leafblock_t *leafp, int idx)
+{
+	struct xfs_attr_leaf_entry *entries = xfs_attr3_leaf_entryp(leafp);
+
+	return entries[idx].flags;
+}
+
 static inline xfs_attr_leaf_name_remote_t *
 xfs_attr3_leaf_name_remote(xfs_attr_leafblock_t *leafp, int idx)
 {
diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
new file mode 100644
index 000000000000..1d45f926c13a
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_parent.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022-2024 Oracle.
+ * All rights reserved.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_da_format.h"
+#include "xfs_log_format.h"
+#include "xfs_shared.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_inode.h"
+#include "xfs_error.h"
+#include "xfs_trace.h"
+#include "xfs_trans.h"
+#include "xfs_da_btree.h"
+#include "xfs_attr.h"
+#include "xfs_dir2.h"
+#include "xfs_dir2_priv.h"
+#include "xfs_attr_sf.h"
+#include "xfs_bmap.h"
+#include "xfs_defer.h"
+#include "xfs_log.h"
+#include "xfs_xattr.h"
+#include "xfs_parent.h"
+#include "xfs_trans_space.h"
+
+/*
+ * Parent pointer attribute handling.
+ *
+ * Because the attribute value is a filename component, it will never be longer
+ * than 255 bytes. This means the attribute will always be a local format
+ * attribute, as xfs_attr_leaf_entsize_local_max() for v5 filesystems will
+ * always be larger than this (max is 75% of block size).
+ *
+ * Creating a new parent attribute will always create a new attribute - there
+ * should never, ever be an existing attribute in the tree for a new inode.
+ * ENOSPC behavior is problematic - creating the inode without the parent
+ * pointer is effectively a corruption, so we allow parent attribute creation
+ * to dip into the reserve block pool to avoid unexpected ENOSPC errors from
+ * occurring.
+ */
+
+/* Return true if parent pointer EA name is valid. */
+bool
+xfs_parent_namecheck(
+	struct xfs_mount			*mp,
+	const struct xfs_parent_name_rec	*rec,
+	size_t					reclen,
+	unsigned int				attr_flags)
+{
+	if (!(attr_flags & XFS_ATTR_PARENT))
+		return false;
+
+	/* pptr updates use logged xattrs, so we should never see this flag */
+	if (attr_flags & XFS_ATTR_INCOMPLETE)
+		return false;
+
+	if (reclen != sizeof(struct xfs_parent_name_rec))
+		return false;
+
+	/* Only one namespace bit allowed. */
+	if (hweight32(attr_flags & XFS_ATTR_NSP_ONDISK_MASK) > 1)
+		return false;
+
+	return true;
+}
+
+/* Return true if parent pointer EA value is valid. */
+bool
+xfs_parent_valuecheck(
+	struct xfs_mount		*mp,
+	const void			*value,
+	size_t				valuelen)
+{
+	if (valuelen == 0 || valuelen > XFS_PARENT_DIRENT_NAME_MAX_SIZE)
+		return false;
+
+	if (value == NULL)
+		return false;
+
+	return true;
+}
+
+/* Return true if the ondisk parent pointer is consistent. */
+bool
+xfs_parent_hashcheck(
+	struct xfs_mount		*mp,
+	const struct xfs_parent_name_rec *rec,
+	const void			*value,
+	size_t				valuelen)
+{
+	struct xfs_name			dname = {
+		.name			= value,
+		.len			= valuelen,
+	};
+	xfs_ino_t			p_ino;
+
+	/* Valid dirent name? */
+	if (!xfs_dir2_namecheck(value, valuelen))
+		return false;
+
+	/* Valid inode number? */
+	p_ino = be64_to_cpu(rec->p_ino);
+	if (!xfs_verify_dir_ino(mp, p_ino))
+		return false;
+
+	/* Namehash matches name? */
+	return be32_to_cpu(rec->p_namehash) == xfs_dir2_hashname(mp, &dname);
+}
diff --git a/fs/xfs/libxfs/xfs_parent.h b/fs/xfs/libxfs/xfs_parent.h
new file mode 100644
index 000000000000..fcfeddb645f6
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_parent.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2022-2024 Oracle.
+ * All Rights Reserved.
+ */
+#ifndef	__XFS_PARENT_H__
+#define	__XFS_PARENT_H__
+
+/* Metadata validators */
+bool xfs_parent_namecheck(struct xfs_mount *mp,
+		const struct xfs_parent_name_rec *rec, size_t reclen,
+		unsigned int attr_flags);
+bool xfs_parent_valuecheck(struct xfs_mount *mp, const void *value,
+		size_t valuelen);
+bool xfs_parent_hashcheck(struct xfs_mount *mp,
+		const struct xfs_parent_name_rec *rec, const void *value,
+		size_t valuelen);
+
+#endif /* __XFS_PARENT_H__ */
diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index 49f91cc85a65..9a1f59f7b5a4 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -195,7 +195,7 @@ xchk_xattr_listent(
 	}
 
 	/* Does this name make sense? */
-	if (!xfs_attr_namecheck(name, namelen)) {
+	if (!xfs_attr_namecheck(sx->sc->mp, name, namelen, flags)) {
 		xchk_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK, args.blkno);
 		goto fail_xref;
 	}
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index 9b4c61e1c22e..703770cf1482 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -591,7 +591,8 @@ xfs_attr_recover_work(
 	 */
 	attrp = &attrip->attri_format;
 	if (!xfs_attri_validate(mp, attrp) ||
-	    !xfs_attr_namecheck(nv->name.i_addr, nv->name.i_len))
+	    !xfs_attr_namecheck(mp, nv->name.i_addr, nv->name.i_len,
+				attrp->alfi_attr_filter))
 		return -EFSCORRUPTED;
 
 	attr = xfs_attri_recover_work(mp, dfp, attrp, &ip, nv);
@@ -731,7 +732,8 @@ xlog_recover_attri_commit_pass2(
 		return -EFSCORRUPTED;
 	}
 
-	if (!xfs_attr_namecheck(attr_name, attri_formatp->alfi_name_len)) {
+	if (!xfs_attr_namecheck(mp, attr_name, attri_formatp->alfi_name_len,
+				attri_formatp->alfi_attr_filter)) {
 		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
 				item->ri_buf[1].i_addr, item->ri_buf[1].i_len);
 		return -EFSCORRUPTED;
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index a6819a642cc0..fa74378577c5 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -59,6 +59,7 @@ xfs_attr_shortform_list(
 	struct xfs_attr_sf_sort		*sbuf, *sbp;
 	struct xfs_attr_sf_hdr		*sf = dp->i_af.if_data;
 	struct xfs_attr_sf_entry	*sfe;
+	struct xfs_mount		*mp = dp->i_mount;
 	int				sbsize, nsbuf, count, i;
 	int				error = 0;
 
@@ -82,8 +83,9 @@ xfs_attr_shortform_list(
 	     (dp->i_af.if_bytes + sf->count * 16) < context->bufsize)) {
 		for (i = 0, sfe = xfs_attr_sf_firstentry(sf); i < sf->count; i++) {
 			if (XFS_IS_CORRUPT(context->dp->i_mount,
-					   !xfs_attr_namecheck(sfe->nameval,
-							       sfe->namelen))) {
+					   !xfs_attr_namecheck(mp, sfe->nameval,
+							       sfe->namelen,
+							       sfe->flags))) {
 				xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
 				return -EFSCORRUPTED;
 			}
@@ -177,8 +179,9 @@ xfs_attr_shortform_list(
 			cursor->offset = 0;
 		}
 		if (XFS_IS_CORRUPT(context->dp->i_mount,
-				   !xfs_attr_namecheck(sbp->name,
-						       sbp->namelen))) {
+				   !xfs_attr_namecheck(mp, sbp->name,
+						       sbp->namelen,
+						       sbp->flags))) {
 			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
 			error = -EFSCORRUPTED;
 			goto out;
@@ -474,7 +477,8 @@ xfs_attr3_leaf_list_int(
 		}
 
 		if (XFS_IS_CORRUPT(context->dp->i_mount,
-				   !xfs_attr_namecheck(name, namelen))) {
+				   !xfs_attr_namecheck(mp, name, namelen,
+						       entry->flags))) {
 			xfs_dirattr_mark_sick(context->dp, XFS_ATTR_FORK);
 			return -EFSCORRUPTED;
 		}



* [PATCH 05/40] fs: add FS_XFLAG_VERITY for verity files
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (3 preceding siblings ...)
  2024-03-17 16:24   ` [PATCH 04/40] xfs: add parent pointer validator functions Darrick J. Wong
@ 2024-03-17 16:24   ` Darrick J. Wong
  2024-03-17 16:24   ` [PATCH 06/40] fsverity: pass tree_blocksize to end_enable_verity() Darrick J. Wong
                     ` (35 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:24 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Add the extended file attribute flag FS_XFLAG_VERITY, which is reported
for inodes with fs-verity enabled.
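
The read-only nature of the flag comes down to one XOR test in fileattr_set_prepare().  A standalone sketch of that test, with a hypothetical helper name and the flag value copied from this patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef FS_XFLAG_VERITY
#define FS_XFLAG_VERITY	0x00020000	/* value from this patch */
#endif

static_assert((FS_XFLAG_VERITY & (FS_XFLAG_VERITY - 1)) == 0,
	      "FS_XFLAG_VERITY must be a single bit");

/*
 * Hypothetical helper: reject any attempt to flip the verity bit.  XOR of
 * the old and new flag words has the bit set only if the two differ there.
 */
static int check_verity_unchanged(uint32_t old_xflags, uint32_t new_xflags)
{
	if ((new_xflags ^ old_xflags) & FS_XFLAG_VERITY)
		return -EINVAL;
	return 0;
}
```

Callers that read the flags, modify other bits and write them back keep working, since the verity bit they pass back matches the current state.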

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
[djwong: fix broken verity flag checks]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 Documentation/filesystems/fsverity.rst |    8 ++++++++
 fs/ioctl.c                             |   11 +++++++++++
 include/uapi/linux/fs.h                |    1 +
 3 files changed, 20 insertions(+)


diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 13e4b18e5dbb..887cdaf162a9 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -326,6 +326,14 @@ the file has fs-verity enabled.  This can perform better than
 FS_IOC_GETFLAGS and FS_IOC_MEASURE_VERITY because it doesn't require
 opening the file, and opening verity files can be expensive.
 
+FS_IOC_FSGETXATTR
+-----------------
+
+Since Linux v6.9, the FS_IOC_FSGETXATTR ioctl sets FS_XFLAG_VERITY (0x00020000)
+in the returned flags when the file has verity enabled. Note that this attribute
+cannot be set with FS_IOC_FSSETXATTR as enabling verity requires input
+parameters. See FS_IOC_ENABLE_VERITY.
+
 .. _accessing_verity_files:
 
 Accessing verity files
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 76cf22ac97d7..fa30aae3903b 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -481,6 +481,8 @@ void fileattr_fill_xflags(struct fileattr *fa, u32 xflags)
 		fa->flags |= FS_DAX_FL;
 	if (fa->fsx_xflags & FS_XFLAG_PROJINHERIT)
 		fa->flags |= FS_PROJINHERIT_FL;
+	if (fa->fsx_xflags & FS_XFLAG_VERITY)
+		fa->flags |= FS_VERITY_FL;
 }
 EXPORT_SYMBOL(fileattr_fill_xflags);
 
@@ -511,6 +513,8 @@ void fileattr_fill_flags(struct fileattr *fa, u32 flags)
 		fa->fsx_xflags |= FS_XFLAG_DAX;
 	if (fa->flags & FS_PROJINHERIT_FL)
 		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
+	if (fa->flags & FS_VERITY_FL)
+		fa->fsx_xflags |= FS_XFLAG_VERITY;
 }
 EXPORT_SYMBOL(fileattr_fill_flags);
 
@@ -641,6 +645,13 @@ static int fileattr_set_prepare(struct inode *inode,
 	    !(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
 		return -EINVAL;
 
+	/*
+	 * Verity cannot be changed through FS_IOC_FSSETXATTR/FS_IOC_SETFLAGS.
+	 * See FS_IOC_ENABLE_VERITY.
+	 */
+	if ((fa->fsx_xflags ^ old_ma->fsx_xflags) & FS_XFLAG_VERITY)
+		return -EINVAL;
+
 	/* Extent size hints of zero turn off the flags. */
 	if (fa->fsx_extsize == 0)
 		fa->fsx_xflags &= ~(FS_XFLAG_EXTSIZE | FS_XFLAG_EXTSZINHERIT);
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 48ad69f7722e..b1d0e1169bc3 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -140,6 +140,7 @@ struct fsxattr {
 #define FS_XFLAG_FILESTREAM	0x00004000	/* use filestream allocator */
 #define FS_XFLAG_DAX		0x00008000	/* use DAX for IO */
 #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
+#define FS_XFLAG_VERITY		0x00020000	/* fs-verity enabled */
 #define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this	*/
 
 /* the read-only stuff doesn't really belong here, but any other place is



* [PATCH 06/40] fsverity: pass tree_blocksize to end_enable_verity()
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (4 preceding siblings ...)
  2024-03-17 16:24   ` [PATCH 05/40] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
@ 2024-03-17 16:24   ` Darrick J. Wong
  2024-03-17 16:25   ` [PATCH 07/40] fsverity: support block-based Merkle tree caching Darrick J. Wong
                     ` (34 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:24 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

XFS will need to know tree_blocksize to remove the tree in case of an
error. The size is needed to calculate offsets of particular Merkle
tree blocks.
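
The offset arithmetic in question is simple; a sketch is below.  How XFS actually walks and removes the blocks is not shown in this patch, so the walker, its callback and all function names here are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of Merkle tree blocks needed to cover tree_size bytes. */
static uint64_t merkle_nr_blocks(uint64_t tree_size, unsigned int block_size)
{
	assert(block_size != 0);
	return (tree_size + block_size - 1) / block_size;
}

/* Byte offset of the index'th block within the Merkle tree. */
static uint64_t merkle_block_offset(uint64_t index, unsigned int block_size)
{
	return index * block_size;
}

/* Hypothetical walker: visit every block offset, e.g. to remove blocks. */
static void merkle_walk_blocks(uint64_t tree_size, unsigned int block_size,
			       void (*fn)(uint64_t offset, void *priv),
			       void *priv)
{
	uint64_t nr = merkle_nr_blocks(tree_size, block_size);

	for (uint64_t i = 0; i < nr; i++)
		fn(merkle_block_offset(i, block_size), priv);
}

/* Example callback: count the blocks visited. */
static void count_block(uint64_t offset, void *priv)
{
	(void)offset;
	(*(unsigned int *)priv)++;
}
```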

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: I put ebiggers' suggested changes in a separate patch]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/btrfs/verity.c        |    4 +++-
 fs/ext4/verity.c         |    3 ++-
 fs/f2fs/verity.c         |    3 ++-
 fs/verity/enable.c       |    6 ++++--
 include/linux/fsverity.h |    4 +++-
 5 files changed, 14 insertions(+), 6 deletions(-)


diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index 66e2270b0dae..966630523502 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -621,6 +621,7 @@ static int btrfs_begin_enable_verity(struct file *filp)
  * @desc:              verity descriptor to write out (NULL in error conditions)
  * @desc_size:         size of the verity descriptor (variable with signatures)
  * @merkle_tree_size:  size of the merkle tree in bytes
+ * @tree_blocksize:    the Merkle tree block size
  *
  * If desc is null, then VFS is signaling an error occurred during verity
  * enable, and we should try to rollback. Otherwise, attempt to finish verity.
@@ -628,7 +629,8 @@ static int btrfs_begin_enable_verity(struct file *filp)
  * Returns 0 on success, negative error code on error.
  */
 static int btrfs_end_enable_verity(struct file *filp, const void *desc,
-				   size_t desc_size, u64 merkle_tree_size)
+				   size_t desc_size, u64 merkle_tree_size,
+				   unsigned int tree_blocksize)
 {
 	struct btrfs_inode *inode = BTRFS_I(file_inode(filp));
 	int ret = 0;
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index 2f37e1ea3955..da2095a81349 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -189,7 +189,8 @@ static int ext4_write_verity_descriptor(struct inode *inode, const void *desc,
 }
 
 static int ext4_end_enable_verity(struct file *filp, const void *desc,
-				  size_t desc_size, u64 merkle_tree_size)
+				  size_t desc_size, u64 merkle_tree_size,
+				  unsigned int tree_blocksize)
 {
 	struct inode *inode = file_inode(filp);
 	const int credits = 2; /* superblock and inode for ext4_orphan_del() */
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 4fc95f353a7a..b4461b9f47a3 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -144,7 +144,8 @@ static int f2fs_begin_enable_verity(struct file *filp)
 }
 
 static int f2fs_end_enable_verity(struct file *filp, const void *desc,
-				  size_t desc_size, u64 merkle_tree_size)
+				  size_t desc_size, u64 merkle_tree_size,
+				  unsigned int tree_blocksize)
 {
 	struct inode *inode = file_inode(filp);
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index c284f46d1b53..04e060880b79 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -274,7 +274,8 @@ static int enable_verity(struct file *filp,
 	 * Serialized with ->begin_enable_verity() by the inode lock.
 	 */
 	inode_lock(inode);
-	err = vops->end_enable_verity(filp, desc, desc_size, params.tree_size);
+	err = vops->end_enable_verity(filp, desc, desc_size, params.tree_size,
+				      params.block_size);
 	inode_unlock(inode);
 	if (err) {
 		fsverity_err(inode, "%ps() failed with err %d",
@@ -300,7 +301,8 @@ static int enable_verity(struct file *filp,
 
 rollback:
 	inode_lock(inode);
-	(void)vops->end_enable_verity(filp, NULL, 0, params.tree_size);
+	(void)vops->end_enable_verity(filp, NULL, 0, params.tree_size,
+				      params.block_size);
 	inode_unlock(inode);
 	goto out;
 }
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 1eb7eae580be..ac58b19f23d3 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -51,6 +51,7 @@ struct fsverity_operations {
 	 * @desc: the verity descriptor to write, or NULL on failure
 	 * @desc_size: size of verity descriptor, or 0 on failure
 	 * @merkle_tree_size: total bytes the Merkle tree took up
+	 * @tree_blocksize: the Merkle tree block size
 	 *
 	 * If desc == NULL, then enabling verity failed and the filesystem only
 	 * must do any necessary cleanups.  Else, it must also store the given
@@ -65,7 +66,8 @@ struct fsverity_operations {
 	 * Return: 0 on success, -errno on failure
 	 */
 	int (*end_enable_verity)(struct file *filp, const void *desc,
-				 size_t desc_size, u64 merkle_tree_size);
+				 size_t desc_size, u64 merkle_tree_size,
+				 unsigned int tree_blocksize);
 
 	/**
 	 * Get the verity descriptor of the given inode.



* [PATCH 07/40] fsverity: support block-based Merkle tree caching
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (5 preceding siblings ...)
  2024-03-17 16:24   ` [PATCH 06/40] fsverity: pass tree_blocksize to end_enable_verity() Darrick J. Wong
@ 2024-03-17 16:25   ` Darrick J. Wong
  2024-03-17 16:25   ` [PATCH 08/40] fsverity: add per-sb workqueue for post read processing Darrick J. Wong
                     ` (33 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:25 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

In the current implementation, fs-verity expects the filesystem to
provide pages filled with Merkle tree blocks.  When fs-verity is done
processing the blocks, it drops its reference to the page.  This
doesn't fit well with the way XFS manages its memory.

To allow XFS to integrate fs-verity, this patch teaches the fs-verity
verification code to accept Merkle tree blocks instead of page
references.  This way ext4, f2fs, and btrfs can still pass page
references, and XFS can pass references to Merkle tree blocks stored
in its own buffer infrastructure.

Another addition is an invalidation function, which tells fs-verity to
mark part of the Merkle tree as not verified.  A filesystem calls it
to invalidate a block that was evicted from memory.

Depending on the Merkle tree block size, fs-verity uses either a
bitmap or the PG_checked flag to track the "verified" status of
blocks.  With Merkle tree block caching (XFS) there is no page to flag
as verified, so fs-verity always uses the bitmap for filesystems that
use block caching.

Finally, this patch lets the filesystem do additional processing on
verified Merkle tree blocks via fsverity_drop_block() instead of just
dropping a reference.  XFS will use this for internal buffer cache
manipulation in later patches.  btrfs, ext4, and f2fs just drop the
reference.
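
The bitmap bookkeeping can be pictured with a toy model like the one below.  Everything in it is hypothetical (the struct, the function names, the 64-block limit); it only shows how a block offset maps to a bit and how invalidation forces re-verification after a block is evicted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one "verified" bit per Merkle tree block, at most 64 blocks. */
struct tree_state {
	unsigned int	log_blocksize;	/* log2 of the tree block size */
	uint64_t	verified;	/* bit n set: block n was verified */
};

static unsigned int block_index(const struct tree_state *ts, uint64_t offset)
{
	uint64_t idx = offset >> ts->log_blocksize;

	assert(idx < 64);
	return (unsigned int)idx;
}

/* Record that the block at this tree offset passed verification. */
static void mark_verified(struct tree_state *ts, uint64_t offset)
{
	ts->verified |= UINT64_C(1) << block_index(ts, offset);
}

static bool is_verified(const struct tree_state *ts, uint64_t offset)
{
	return (ts->verified >> block_index(ts, offset)) & 1;
}

/* Called when the filesystem evicts a cached block from memory. */
static void invalidate_block(struct tree_state *ts, uint64_t offset)
{
	ts->verified &= ~(UINT64_C(1) << block_index(ts, offset));
}
```

A block that is read back in after eviction starts out unverified, so its hash is checked again before it is trusted.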

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
[djwong: fix uninit err variable]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/verity/fsverity_private.h |    8 +++
 fs/verity/open.c             |    8 ++-
 fs/verity/read_metadata.c    |   64 ++++++++++++++--------
 fs/verity/verify.c           |  125 ++++++++++++++++++++++++++++++++----------
 include/linux/fsverity.h     |   65 ++++++++++++++++++++++
 5 files changed, 217 insertions(+), 53 deletions(-)


diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index b3506f56e180..dad33e6ff0d6 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -154,4 +154,12 @@ static inline void fsverity_init_signature(void)
 
 void __init fsverity_init_workqueue(void);
 
+/*
+ * Drop 'block' obtained with ->read_merkle_tree_block(). Calls out back to
+ * filesystem if ->drop_block() is set, otherwise, drop the reference in the
+ * block->context.
+ */
+void fsverity_drop_block(struct inode *inode,
+			 struct fsverity_blockbuf *block);
+
 #endif /* _FSVERITY_PRIVATE_H */
diff --git a/fs/verity/open.c b/fs/verity/open.c
index fdeb95eca3af..6e6922b4b014 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -213,7 +213,13 @@ struct fsverity_info *fsverity_create_info(const struct inode *inode,
 	if (err)
 		goto fail;
 
-	if (vi->tree_params.block_size != PAGE_SIZE) {
+	/*
+	 * If fs passes Merkle tree blocks to fs-verity (e.g. XFS), then
+	 * fs-verity should use hash_block_verified bitmap as there's no page
+	 * to mark it with PG_checked.
+	 */
+	if (vi->tree_params.block_size != PAGE_SIZE ||
+			inode->i_sb->s_vop->read_merkle_tree_block) {
 		/*
 		 * When the Merkle tree block size and page size differ, we use
 		 * a bitmap to keep track of which hash blocks have been
diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
index f58432772d9e..5da40b5a81af 100644
--- a/fs/verity/read_metadata.c
+++ b/fs/verity/read_metadata.c
@@ -18,50 +18,68 @@ static int fsverity_read_merkle_tree(struct inode *inode,
 {
 	const struct fsverity_operations *vops = inode->i_sb->s_vop;
 	u64 end_offset;
-	unsigned int offs_in_page;
+	unsigned int offs_in_block;
 	pgoff_t index, last_index;
 	int retval = 0;
 	int err = 0;
+	const unsigned int block_size = vi->tree_params.block_size;
+	const u8 log_blocksize = vi->tree_params.log_blocksize;
 
 	end_offset = min(offset + length, vi->tree_params.tree_size);
 	if (offset >= end_offset)
 		return 0;
-	offs_in_page = offset_in_page(offset);
-	last_index = (end_offset - 1) >> PAGE_SHIFT;
+	offs_in_block = offset & (block_size - 1);
+	last_index = (end_offset - 1) >> log_blocksize;
 
 	/*
-	 * Iterate through each Merkle tree page in the requested range and copy
-	 * the requested portion to userspace.  Note that the Merkle tree block
-	 * size isn't important here, as we are returning a byte stream; i.e.,
-	 * we can just work with pages even if the tree block size != PAGE_SIZE.
+	 * Iterate through each Merkle tree block in the requested range and
+	 * copy the requested portion to userspace. Note that we are returning
+	 * a byte stream.
 	 */
-	for (index = offset >> PAGE_SHIFT; index <= last_index; index++) {
+	for (index = offset >> log_blocksize; index <= last_index; index++) {
 		unsigned long num_ra_pages =
 			min_t(unsigned long, last_index - index + 1,
 			      inode->i_sb->s_bdi->io_pages);
 		unsigned int bytes_to_copy = min_t(u64, end_offset - offset,
-						   PAGE_SIZE - offs_in_page);
-		struct page *page;
-		const void *virt;
+						   block_size - offs_in_block);
+		struct fsverity_blockbuf block = {
+			.size = block_size,
+		};
 
-		page = vops->read_merkle_tree_page(inode, index, num_ra_pages);
-		if (IS_ERR(page)) {
-			err = PTR_ERR(page);
+		if (!vops->read_merkle_tree_block) {
+			unsigned int blocks_per_page =
+				vi->tree_params.blocks_per_page;
+			unsigned long page_idx =
+				round_down(index, blocks_per_page);
+			struct page *page = vops->read_merkle_tree_page(inode,
+					page_idx, num_ra_pages);
+
+			if (IS_ERR(page)) {
+				err = PTR_ERR(page);
+			} else {
+				block.kaddr = kmap_local_page(page) +
+					((index - page_idx) << log_blocksize);
+				block.context = page;
+			}
+		} else {
+			err = vops->read_merkle_tree_block(inode,
+					index << log_blocksize,
+					&block, log_blocksize, num_ra_pages);
+		}
+
+		if (err) {
 			fsverity_err(inode,
-				     "Error %d reading Merkle tree page %lu",
-				     err, index);
+				     "Error %d reading Merkle tree block %lu",
+				     err, index << log_blocksize);
 			break;
 		}
 
-		virt = kmap_local_page(page);
-		if (copy_to_user(buf, virt + offs_in_page, bytes_to_copy)) {
-			kunmap_local(virt);
-			put_page(page);
+		if (copy_to_user(buf, block.kaddr + offs_in_block, bytes_to_copy)) {
+			fsverity_drop_block(inode, &block);
 			err = -EFAULT;
 			break;
 		}
-		kunmap_local(virt);
-		put_page(page);
+		fsverity_drop_block(inode, &block);
 
 		retval += bytes_to_copy;
 		buf += bytes_to_copy;
@@ -72,7 +90,7 @@ static int fsverity_read_merkle_tree(struct inode *inode,
 			break;
 		}
 		cond_resched();
-		offs_in_page = 0;
+		offs_in_block = 0;
 	}
 	return retval ? retval : err;
 }
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 4fcad0825a12..4ebdf9d2d7b6 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -13,14 +13,17 @@
 static struct workqueue_struct *fsverity_read_workqueue;
 
 /*
- * Returns true if the hash block with index @hblock_idx in the tree, located in
- * @hpage, has already been verified.
+ * Returns true if the hash block with index @hblock_idx in the tree has
+ * already been verified.
  */
-static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
+static bool is_hash_block_verified(struct inode *inode,
+				   struct fsverity_blockbuf *block,
 				   unsigned long hblock_idx)
 {
 	unsigned int blocks_per_page;
 	unsigned int i;
+	struct fsverity_info *vi = inode->i_verity_info;
+	struct page *hpage = (struct page *)block->context;
 
 	/*
 	 * When the Merkle tree block size and page size are the same, then the
@@ -34,6 +37,12 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
 	if (!vi->hash_block_verified)
 		return PageChecked(hpage);
 
+	/*
+	 * Filesystems which use block based caching (e.g. XFS) always use
+	 * bitmap.
+	 */
+	if (inode->i_sb->s_vop->read_merkle_tree_block)
+		return test_bit(hblock_idx, vi->hash_block_verified);
 	/*
 	 * When the Merkle tree block size and page size differ, we use a bitmap
 	 * to indicate whether each hash block has been verified.
@@ -95,15 +104,15 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 	const struct merkle_tree_params *params = &vi->tree_params;
 	const unsigned int hsize = params->digest_size;
 	int level;
+	int err = 0;
+	int num_ra_pages;
 	u8 _want_hash[FS_VERITY_MAX_DIGEST_SIZE];
 	const u8 *want_hash;
 	u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
 	/* The hash blocks that are traversed, indexed by level */
 	struct {
-		/* Page containing the hash block */
-		struct page *page;
-		/* Mapped address of the hash block (will be within @page) */
-		const void *addr;
+		/* Buffer containing the hash block */
+		struct fsverity_blockbuf block;
 		/* Index of the hash block in the tree overall */
 		unsigned long index;
 		/* Byte offset of the wanted hash relative to @addr */
@@ -144,10 +153,11 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		unsigned long next_hidx;
 		unsigned long hblock_idx;
 		pgoff_t hpage_idx;
+		u64 hblock_pos;
 		unsigned int hblock_offset_in_page;
 		unsigned int hoffset;
 		struct page *hpage;
-		const void *haddr;
+		struct fsverity_blockbuf *block = &hblocks[level].block;
 
 		/*
 		 * The index of the block in the current level; also the index
@@ -165,29 +175,49 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		hblock_offset_in_page =
 			(hblock_idx << params->log_blocksize) & ~PAGE_MASK;
 
+		/* Offset of the Merkle tree block into the tree */
+		hblock_pos = hblock_idx << params->log_blocksize;
+
 		/* Byte offset of the hash within the block */
 		hoffset = (hidx << params->log_digestsize) &
 			  (params->block_size - 1);
 
-		hpage = inode->i_sb->s_vop->read_merkle_tree_page(inode,
-				hpage_idx, level == 0 ? min(max_ra_pages,
-					params->tree_pages - hpage_idx) : 0);
-		if (IS_ERR(hpage)) {
+		num_ra_pages = level == 0 ?
+			min(max_ra_pages, params->tree_pages - hpage_idx) : 0;
+
+		if (inode->i_sb->s_vop->read_merkle_tree_block) {
+			err = inode->i_sb->s_vop->read_merkle_tree_block(
+				inode, hblock_pos, block, params->log_blocksize,
+				num_ra_pages);
+		} else {
+			unsigned int blocks_per_page =
+				vi->tree_params.blocks_per_page;
+			hblock_idx = round_down(hblock_idx, blocks_per_page);
+			hpage = inode->i_sb->s_vop->read_merkle_tree_page(
+				inode, hpage_idx, (num_ra_pages << PAGE_SHIFT));
+
+			if (IS_ERR(hpage)) {
+				err = PTR_ERR(hpage);
+			} else {
+				block->kaddr = kmap_local_page(hpage) +
+					hblock_offset_in_page;
+				block->context = hpage;
+			}
+		}
+
+		if (err) {
 			fsverity_err(inode,
-				     "Error %ld reading Merkle tree page %lu",
-				     PTR_ERR(hpage), hpage_idx);
+				     "Error %d reading Merkle tree block %lu",
+				     err, hblock_idx);
 			goto error;
 		}
-		haddr = kmap_local_page(hpage) + hblock_offset_in_page;
-		if (is_hash_block_verified(vi, hpage, hblock_idx)) {
-			memcpy(_want_hash, haddr + hoffset, hsize);
+
+		if (is_hash_block_verified(inode, block, hblock_idx)) {
+			memcpy(_want_hash, block->kaddr + hoffset, hsize);
 			want_hash = _want_hash;
-			kunmap_local(haddr);
-			put_page(hpage);
+			fsverity_drop_block(inode, block);
 			goto descend;
 		}
-		hblocks[level].page = hpage;
-		hblocks[level].addr = haddr;
 		hblocks[level].index = hblock_idx;
 		hblocks[level].hoffset = hoffset;
 		hidx = next_hidx;
@@ -197,10 +227,11 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 descend:
 	/* Descend the tree verifying hash blocks. */
 	for (; level > 0; level--) {
-		struct page *hpage = hblocks[level - 1].page;
-		const void *haddr = hblocks[level - 1].addr;
+		struct fsverity_blockbuf *block = &hblocks[level - 1].block;
+		const void *haddr = block->kaddr;
 		unsigned long hblock_idx = hblocks[level - 1].index;
 		unsigned int hoffset = hblocks[level - 1].hoffset;
+		struct page *hpage = (struct page *)block->context;
 
 		if (fsverity_hash_block(params, inode, haddr, real_hash) != 0)
 			goto error;
@@ -217,8 +248,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 			SetPageChecked(hpage);
 		memcpy(_want_hash, haddr + hoffset, hsize);
 		want_hash = _want_hash;
-		kunmap_local(haddr);
-		put_page(hpage);
+		fsverity_drop_block(inode, block);
 	}
 
 	/* Finally, verify the data block. */
@@ -235,10 +265,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		     params->hash_alg->name, hsize, want_hash,
 		     params->hash_alg->name, hsize, real_hash);
 error:
-	for (; level > 0; level--) {
-		kunmap_local(hblocks[level - 1].addr);
-		put_page(hblocks[level - 1].page);
-	}
+	for (; level > 0; level--)
+		fsverity_drop_block(inode, &hblocks[level - 1].block);
 	return false;
 }
 
@@ -362,3 +390,42 @@ void __init fsverity_init_workqueue(void)
 	if (!fsverity_read_workqueue)
 		panic("failed to allocate fsverity_read_queue");
 }
+
+/**
+ * fsverity_invalidate_block() - invalidate Merkle tree block
+ * @inode: inode to which this Merkle tree block belongs
+ * @block: block to be invalidated
+ *
+ * This function invalidates/clears "verified" state of Merkle tree block
+ * in the fs-verity bitmap. The block needs to have ->offset set.
+ */
+void fsverity_invalidate_block(struct inode *inode,
+		struct fsverity_blockbuf *block)
+{
+	struct fsverity_info *vi = inode->i_verity_info;
+	const unsigned int log_blocksize = vi->tree_params.log_blocksize;
+
+	if (block->offset > vi->tree_params.tree_size) {
+		fsverity_err(inode,
+"Trying to invalidate beyond Merkle tree (tree %lld, offset %lld)",
+			     vi->tree_params.tree_size, block->offset);
+		return;
+	}
+
+	clear_bit(block->offset >> log_blocksize, vi->hash_block_verified);
+}
+EXPORT_SYMBOL_GPL(fsverity_invalidate_block);
+
+void fsverity_drop_block(struct inode *inode,
+		struct fsverity_blockbuf *block)
+{
+	if (inode->i_sb->s_vop->drop_block)
+		inode->i_sb->s_vop->drop_block(block);
+	else {
+		struct page *page = (struct page *)block->context;
+
+		kunmap_local(block->kaddr);
+		put_page(page);
+	}
+	block->kaddr = NULL;
+}
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index ac58b19f23d3..0973b521ac5a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -26,6 +26,33 @@
 /* Arbitrary limit to bound the kmalloc() size.  Can be changed. */
 #define FS_VERITY_MAX_DESCRIPTOR_SIZE	16384
 
+/**
+ * struct fsverity_blockbuf - Merkle Tree block buffer
+ * @kaddr: virtual address of the block's data
+ * @offset: block's offset into Merkle tree
+ * @size: the Merkle tree block size
+ * @context: filesystem private context
+ *
+ * Buffer containing a single Merkle tree block. These buffers are passed
+ *  - to the filesystem, when fs-verity is building the Merkle tree,
+ *  - from the filesystem, when fs-verity is reading the Merkle tree from disk.
+ * The filesystem sets kaddr together with size to point to the memory that
+ * contains the Merkle tree block. fs-verity does the same when the Merkle tree
+ * needs to be written to disk.
+ *
+ * While reading the tree, fs-verity calls ->read_merkle_tree_block followed by
+ * ->drop_block to let the filesystem know that the memory can be freed.
+ *
+ * The context is optional. The filesystem can use this field to pass state
+ * from ->read_merkle_tree_block through to ->drop_block.
+ */
+struct fsverity_blockbuf {
+	void *kaddr;
+	u64 offset;
+	unsigned int size;
+	void *context;
+};
+
 /* Verity operations for filesystems */
 struct fsverity_operations {
 
@@ -107,6 +134,32 @@ struct fsverity_operations {
 					      pgoff_t index,
 					      unsigned long num_ra_pages);
 
+	/**
+	 * Read a Merkle tree block of the given inode.
+	 * @inode: the inode
+	 * @pos: byte offset of the block within the Merkle tree
+	 * @block: block buffer for filesystem to point it to the block
+	 * @log_blocksize: log2 of the size of the expected block
+	 * @ra_bytes: The number of bytes that should be
+	 *		prefetched starting at @pos if the page at @pos
+	 *		isn't already cached.  Implementations may ignore this
+	 *		argument; it's only a performance optimization.
+	 *
+	 * This can be called at any time on an open verity file.  It may be
+	 * called by multiple processes concurrently.
+	 *
+	 * If the block was evicted from memory, the filesystem has to call
+	 * fsverity_invalidate_block() to let fs-verity know that the block's
+	 * verification state is no longer valid.
+	 *
+	 * Return: 0 on success, -errno on failure
+	 */
+	int (*read_merkle_tree_block)(struct inode *inode,
+				      u64 pos,
+				      struct fsverity_blockbuf *block,
+				      unsigned int log_blocksize,
+				      u64 ra_bytes);
+
 	/**
 	 * Write a Merkle tree block to the given inode.
 	 *
@@ -122,6 +175,16 @@ struct fsverity_operations {
 	 */
 	int (*write_merkle_tree_block)(struct inode *inode, const void *buf,
 				       u64 pos, unsigned int size);
+
+	/**
+	 * Release the reference to a Merkle tree block
+	 *
+	 * @block: the block to release
+	 *
+	 * This is called when fs-verity is done with a block obtained with
+	 * ->read_merkle_tree_block().
+	 */
+	void (*drop_block)(struct fsverity_blockbuf *block);
 };
 
 #ifdef CONFIG_FS_VERITY
@@ -175,6 +238,8 @@ int fsverity_ioctl_read_metadata(struct file *filp, const void __user *uarg);
 bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset);
 void fsverity_verify_bio(struct bio *bio);
 void fsverity_enqueue_verify_work(struct work_struct *work);
+void fsverity_invalidate_block(struct inode *inode,
+		struct fsverity_blockbuf *block);
 
 #else /* !CONFIG_FS_VERITY */
 



* [PATCH 08/40] fsverity: add per-sb workqueue for post read processing
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (6 preceding siblings ...)
  2024-03-17 16:25   ` [PATCH 07/40] fsverity: support block-based Merkle tree caching Darrick J. Wong
@ 2024-03-17 16:25   ` Darrick J. Wong
  2024-03-17 16:25   ` [PATCH 09/40] fsverity: add tracepoints Darrick J. Wong
                     ` (32 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:25 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

For XFS, fsverity's global workqueue is not really suitable due to:

1. High priority workqueues are used within XFS to ensure that data
   IO completion cannot stall processing of journal IO completions.
   Hence using a WQ_HIGHPRI workqueue directly in the user data IO
   path is a potential filesystem livelock/deadlock vector.

2. The fsverity workqueue is global - it creates a cross-filesystem
   contention point.

This patch adds a per-filesystem, per-cpu workqueue for fsverity
work. This allows iomap to queue verification work in the read path
on BIO completion.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/super.c               |    7 +++++++
 include/linux/fs.h       |    2 ++
 include/linux/fsverity.h |   22 ++++++++++++++++++++++
 3 files changed, 31 insertions(+)


diff --git a/fs/super.c b/fs/super.c
index d35e85295489..338d86864200 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -642,6 +642,13 @@ void generic_shutdown_super(struct super_block *sb)
 			sb->s_dio_done_wq = NULL;
 		}
 
+#ifdef CONFIG_FS_VERITY
+		if (sb->s_read_done_wq) {
+			destroy_workqueue(sb->s_read_done_wq);
+			sb->s_read_done_wq = NULL;
+		}
+#endif
+
 		if (sop->put_super)
 			sop->put_super(sb);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..9db24a825d94 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1221,6 +1221,8 @@ struct super_block {
 #endif
 #ifdef CONFIG_FS_VERITY
 	const struct fsverity_operations *s_vop;
+	/* Completion queue for post read verification */
+	struct workqueue_struct *s_read_done_wq;
 #endif
 #if IS_ENABLED(CONFIG_UNICODE)
 	struct unicode_map *s_encoding;
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 0973b521ac5a..45b7c613148a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -241,6 +241,22 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
 void fsverity_invalidate_block(struct inode *inode,
 		struct fsverity_blockbuf *block);
 
+static inline int fsverity_set_ops(struct super_block *sb,
+				   const struct fsverity_operations *ops)
+{
+	sb->s_vop = ops;
+
+	/* Create per-sb workqueue for post read bio verification */
+	struct workqueue_struct *wq = alloc_workqueue(
+		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);
+	if (!wq)
+		return -ENOMEM;
+
+	sb->s_read_done_wq = wq;
+
+	return 0;
+}
+
 #else /* !CONFIG_FS_VERITY */
 
 static inline struct fsverity_info *fsverity_get_info(const struct inode *inode)
@@ -318,6 +334,12 @@ static inline void fsverity_enqueue_verify_work(struct work_struct *work)
 	WARN_ON_ONCE(1);
 }
 
+static inline int fsverity_set_ops(struct super_block *sb,
+				   const struct fsverity_operations *ops)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif	/* !CONFIG_FS_VERITY */
 
 static inline bool fsverity_verify_folio(struct folio *folio)



* [PATCH 09/40] fsverity: add tracepoints
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (7 preceding siblings ...)
  2024-03-17 16:25   ` [PATCH 08/40] fsverity: add per-sb workqueue for post read processing Darrick J. Wong
@ 2024-03-17 16:25   ` Darrick J. Wong
  2024-03-17 16:26   ` [PATCH 10/40] fsverity: fix "support block-based Merkle tree caching" Darrick J. Wong
                     ` (31 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:25 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

fs-verity previously had debug printks, but they were removed. This
patch adds tracepoints in the same places where the printks were
used (with a few additional ones).

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix formatting]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 MAINTAINERS                     |    1 
 fs/verity/enable.c              |    3 +
 fs/verity/fsverity_private.h    |    2 
 fs/verity/init.c                |    1 
 fs/verity/signature.c           |    2 
 fs/verity/verify.c              |    7 ++
 include/trace/events/fsverity.h |  181 +++++++++++++++++++++++++++++++++++++++
 7 files changed, 197 insertions(+)
 create mode 100644 include/trace/events/fsverity.h


diff --git a/MAINTAINERS b/MAINTAINERS
index 73d898383e51..f735d3e68514 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8740,6 +8740,7 @@ T:	git https://git.kernel.org/pub/scm/fs/fsverity/linux.git
 F:	Documentation/filesystems/fsverity.rst
 F:	fs/verity/
 F:	include/linux/fsverity.h
+F:	include/trace/events/fsverity.h
 F:	include/uapi/linux/fsverity.h
 
 FT260 FTDI USB-HID TO I2C BRIDGE DRIVER
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 04e060880b79..945eba0092ab 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -227,6 +227,8 @@ static int enable_verity(struct file *filp,
 	if (err)
 		goto out;
 
+	trace_fsverity_enable(inode, desc, &params);
+
 	/*
 	 * Start enabling verity on this file, serialized by the inode lock.
 	 * Fail if verity is already enabled or is already being enabled.
@@ -255,6 +257,7 @@ static int enable_verity(struct file *filp,
 		fsverity_err(inode, "Error %d building Merkle tree", err);
 		goto rollback;
 	}
+	trace_fsverity_tree_done(inode, desc, &params);
 
 	/*
 	 * Create the fsverity_info.  Don't bother trying to save work by
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index dad33e6ff0d6..fd8f5a8d1f6a 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -162,4 +162,6 @@ void __init fsverity_init_workqueue(void);
 void fsverity_drop_block(struct inode *inode,
 			 struct fsverity_blockbuf *block);
 
+#include <trace/events/fsverity.h>
+
 #endif /* _FSVERITY_PRIVATE_H */
diff --git a/fs/verity/init.c b/fs/verity/init.c
index cb2c9aac61ed..3769d2dc9e3b 100644
--- a/fs/verity/init.c
+++ b/fs/verity/init.c
@@ -5,6 +5,7 @@
  * Copyright 2019 Google LLC
  */
 
+#define CREATE_TRACE_POINTS
 #include "fsverity_private.h"
 
 #include <linux/ratelimit.h>
diff --git a/fs/verity/signature.c b/fs/verity/signature.c
index 90c07573dd77..c1f08bb32ed1 100644
--- a/fs/verity/signature.c
+++ b/fs/verity/signature.c
@@ -53,6 +53,8 @@ int fsverity_verify_signature(const struct fsverity_info *vi,
 	struct fsverity_formatted_digest *d;
 	int err;
 
+	trace_fsverity_verify_signature(inode, signature, sig_size);
+
 	if (sig_size == 0) {
 		if (fsverity_require_signatures) {
 			fsverity_err(inode,
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 4ebdf9d2d7b6..aa1763e8b723 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -118,6 +118,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		/* Byte offset of the wanted hash relative to @addr */
 		unsigned int hoffset;
 	} hblocks[FS_VERITY_MAX_LEVELS];
+	trace_fsverity_verify_block(inode, data_pos);
 	/*
 	 * The index of the previous level's block within that level; also the
 	 * index of that block's hash within the current level.
@@ -215,6 +216,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		if (is_hash_block_verified(inode, block, hblock_idx)) {
 			memcpy(_want_hash, block->kaddr + hoffset, hsize);
 			want_hash = _want_hash;
+			trace_fsverity_merkle_tree_block_verified(inode,
+					block, FSVERITY_TRACE_DIR_ASCEND);
 			fsverity_drop_block(inode, block);
 			goto descend;
 		}
@@ -248,6 +251,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 			SetPageChecked(hpage);
 		memcpy(_want_hash, haddr + hoffset, hsize);
 		want_hash = _want_hash;
+		trace_fsverity_merkle_tree_block_verified(inode, block,
+				FSVERITY_TRACE_DIR_DESCEND);
 		fsverity_drop_block(inode, block);
 	}
 
@@ -405,6 +410,8 @@ void fsverity_invalidate_block(struct inode *inode,
 	struct fsverity_info *vi = inode->i_verity_info;
 	const unsigned int log_blocksize = vi->tree_params.log_blocksize;
 
+	trace_fsverity_invalidate_block(inode, block);
+
 	if (block->offset > vi->tree_params.tree_size) {
 		fsverity_err(inode,
 "Trying to invalidate beyond Merkle tree (tree %lld, offset %lld)",
diff --git a/include/trace/events/fsverity.h b/include/trace/events/fsverity.h
new file mode 100644
index 000000000000..763890e47358
--- /dev/null
+++ b/include/trace/events/fsverity.h
@@ -0,0 +1,181 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM fsverity
+
+#if !defined(_TRACE_FSVERITY_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_FSVERITY_H
+
+#include <linux/tracepoint.h>
+
+struct fsverity_descriptor;
+struct merkle_tree_params;
+struct fsverity_info;
+
+#define FSVERITY_TRACE_DIR_ASCEND	(1ul << 0)
+#define FSVERITY_TRACE_DIR_DESCEND	(1ul << 1)
+#define FSVERITY_HASH_SHOWN_LEN		20
+
+TRACE_EVENT(fsverity_enable,
+	TP_PROTO(struct inode *inode, struct fsverity_descriptor *desc,
+		struct merkle_tree_params *params),
+	TP_ARGS(inode, desc, params),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__field(u64, data_size)
+		__field(unsigned int, block_size)
+		__field(unsigned int, num_levels)
+		__field(u64, tree_size)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		__entry->data_size = desc->data_size;
+		__entry->block_size = params->block_size;
+		__entry->num_levels = params->num_levels;
+		__entry->tree_size = params->tree_size;
+	),
+	TP_printk("ino %lu data size %llu tree size %llu block size %u levels %u",
+		(unsigned long) __entry->ino,
+		__entry->data_size,
+		__entry->tree_size,
+		__entry->block_size,
+		__entry->num_levels)
+);
+
+TRACE_EVENT(fsverity_tree_done,
+	TP_PROTO(struct inode *inode, struct fsverity_descriptor *desc,
+		struct merkle_tree_params *params),
+	TP_ARGS(inode, desc, params),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__field(unsigned int, levels)
+		__field(unsigned int, tree_blocks)
+		__field(u64, tree_size)
+		__array(u8, tree_hash, 64)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		__entry->levels = params->num_levels;
+		__entry->tree_blocks =
+			params->tree_size >> params->log_blocksize;
+		__entry->tree_size = params->tree_size;
+		memcpy(__entry->tree_hash, desc->root_hash, 64);
+	),
+	TP_printk("ino %lu levels %d tree_blocks %d tree_size %lld root_hash %s",
+		(unsigned long) __entry->ino,
+		__entry->levels,
+		__entry->tree_blocks,
+		__entry->tree_size,
+		__print_hex(__entry->tree_hash, 64))
+);
+
+TRACE_EVENT(fsverity_verify_block,
+	TP_PROTO(struct inode *inode, u64 offset),
+	TP_ARGS(inode, offset),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__field(u64, offset)
+		__field(unsigned int, block_size)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		__entry->offset = offset;
+		__entry->block_size =
+			inode->i_verity_info->tree_params.block_size;
+	),
+	TP_printk("ino %lu data offset %lld data block size %u",
+		(unsigned long) __entry->ino,
+		__entry->offset,
+		__entry->block_size)
+);
+
+TRACE_EVENT(fsverity_merkle_tree_block_verified,
+	TP_PROTO(struct inode *inode,
+		 struct fsverity_blockbuf *block,
+		 u8 direction),
+	TP_ARGS(inode, block, direction),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__field(u64, offset)
+		__field(u8, direction)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		__entry->offset = block->offset;
+		__entry->direction = direction;
+	),
+	TP_printk("ino %lu block offset %llu %s",
+		(unsigned long) __entry->ino,
+		__entry->offset,
+		__entry->direction == FSVERITY_TRACE_DIR_ASCEND ? "ascend" : "descend")
+);
+
+TRACE_EVENT(fsverity_invalidate_block,
+	TP_PROTO(struct inode *inode, struct fsverity_blockbuf *block),
+	TP_ARGS(inode, block),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__field(u64, offset)
+		__field(unsigned int, block_size)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		__entry->offset = block->offset;
+		__entry->block_size = block->size;
+	),
+	TP_printk("ino %lu block position %llu block size %u",
+		(unsigned long) __entry->ino,
+		__entry->offset,
+		__entry->block_size)
+);
+
+TRACE_EVENT(fsverity_read_merkle_tree_block,
+	TP_PROTO(struct inode *inode, u64 offset, unsigned int log_blocksize),
+	TP_ARGS(inode, offset, log_blocksize),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__field(u64, offset)
+		__field(u64, index)
+		__field(unsigned int, block_size)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		__entry->offset = offset;
+		__entry->index = offset >> log_blocksize;
+		__entry->block_size = 1 << log_blocksize;
+	),
+	TP_printk("ino %lu tree offset %llu block index %llu block size %u",
+		(unsigned long) __entry->ino,
+		__entry->offset,
+		__entry->index,
+		__entry->block_size)
+);
+
+TRACE_EVENT(fsverity_verify_signature,
+	TP_PROTO(const struct inode *inode, const u8 *signature, size_t sig_size),
+	TP_ARGS(inode, signature, sig_size),
+	TP_STRUCT__entry(
+		__field(ino_t, ino)
+		__dynamic_array(u8, signature, sig_size)
+		__field(size_t, sig_size)
+		__field(size_t, sig_size_show)
+	),
+	TP_fast_assign(
+		__entry->ino = inode->i_ino;
+		memcpy(__get_dynamic_array(signature), signature, sig_size);
+		__entry->sig_size = sig_size;
+		__entry->sig_size_show = (sig_size > FSVERITY_HASH_SHOWN_LEN ?
+			FSVERITY_HASH_SHOWN_LEN : sig_size);
+	),
+	TP_printk("ino %lu sig_size %zu %s%s%s",
+		(unsigned long) __entry->ino,
+		__entry->sig_size,
+		(__entry->sig_size ? "sig " : ""),
+		__print_hex(__get_dynamic_array(signature),
+			__entry->sig_size_show),
+		(__entry->sig_size ? "..." : ""))
+);
+
+#endif /* _TRACE_FSVERITY_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>



* [PATCH 10/40] fsverity: fix "support block-based Merkle tree caching"
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (8 preceding siblings ...)
  2024-03-17 16:25   ` [PATCH 09/40] fsverity: add tracepoints Darrick J. Wong
@ 2024-03-17 16:26   ` Darrick J. Wong
  2024-03-17 16:26   ` [PATCH 11/40] fsverity: send the level of the merkle tree block to ->read_merkle_tree_block Darrick J. Wong
                     ` (30 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:26 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Various fixes recommended by the maintainer.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/verity/fsverity_private.h |   36 ++++++++++-
 fs/verity/open.c             |    9 +--
 fs/verity/read_metadata.c    |   63 ++++++-------------
 fs/verity/verify.c           |  141 ++++++++++++++++++++++++------------------
 include/linux/fsverity.h     |   24 +++++--
 5 files changed, 151 insertions(+), 122 deletions(-)


diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index fd8f5a8d1f6a..0a4381acb394 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -154,13 +154,41 @@ static inline void fsverity_init_signature(void)
 
 void __init fsverity_init_workqueue(void);
 
+static inline bool fsverity_caches_blocks(const struct inode *inode)
+{
+	const struct fsverity_operations *vops = inode->i_sb->s_vop;
+
+	WARN_ON_ONCE(vops->read_merkle_tree_block &&
+		     !vops->drop_merkle_tree_block);
+
+	return vops->read_merkle_tree_block != NULL;
+}
+
+static inline bool fsverity_uses_bitmap(const struct fsverity_info *vi,
+					const struct inode *inode)
+{
+	/*
+	 * If fs uses block-based Merkle tree caching, then fs-verity must use
+	 * hash_block_verified bitmap as there's no page to mark it with
+	 * PG_checked.
+	 */
+	if (vi->tree_params.block_size != PAGE_SIZE)
+		return true;
+	return fsverity_caches_blocks(inode);
+}
+
+int fsverity_read_merkle_tree_block(struct inode *inode,
+				    const struct merkle_tree_params *params,
+				    u64 pos, unsigned long ra_bytes,
+				    struct fsverity_blockbuf *block);
+
 /*
  * Drop 'block' obtained with ->read_merkle_tree_block(). Calls out back to
- * filesystem if ->drop_block() is set, otherwise, drop the reference in the
- * block->context.
+ * filesystem if ->drop_merkle_tree_block() is set, otherwise, drop the
+ * reference in the block->context.
  */
-void fsverity_drop_block(struct inode *inode,
-			 struct fsverity_blockbuf *block);
+void fsverity_drop_merkle_tree_block(struct inode *inode,
+				     struct fsverity_blockbuf *block);
 
 #include <trace/events/fsverity.h>
 
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 6e6922b4b014..9603b3a404f7 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -214,12 +214,11 @@ struct fsverity_info *fsverity_create_info(const struct inode *inode,
 		goto fail;
 
 	/*
-	 * If fs passes Merkle tree blocks to fs-verity (e.g. XFS), then
-	 * fs-verity should use hash_block_verified bitmap as there's no page
-	 * to mark it with PG_checked.
+	 * If fs uses block-based Merkle tree caching, then fs-verity must use
+	 * hash_block_verified bitmap as there's no page to mark it with
+	 * PG_checked.
 	 */
-	if (vi->tree_params.block_size != PAGE_SIZE ||
-			inode->i_sb->s_vop->read_merkle_tree_block) {
+	if (fsverity_uses_bitmap(vi, inode)) {
 		/*
 		 * When the Merkle tree block size and page size differ, we use
 		 * a bitmap to keep track of which hash blocks have been
diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
index 5da40b5a81af..94fffa060f82 100644
--- a/fs/verity/read_metadata.c
+++ b/fs/verity/read_metadata.c
@@ -14,76 +14,53 @@
 
 static int fsverity_read_merkle_tree(struct inode *inode,
 				     const struct fsverity_info *vi,
-				     void __user *buf, u64 offset, int length)
+				     void __user *buf, u64 pos, int length)
 {
-	const struct fsverity_operations *vops = inode->i_sb->s_vop;
-	u64 end_offset;
-	unsigned int offs_in_block;
-	pgoff_t index, last_index;
+	const u64 end_pos = min(pos + length, vi->tree_params.tree_size);
+	const struct merkle_tree_params *params = &vi->tree_params;
+	unsigned int offs_in_block = pos & (params->block_size - 1);
 	int retval = 0;
 	int err = 0;
-	const unsigned int block_size = vi->tree_params.block_size;
-	const u8 log_blocksize = vi->tree_params.log_blocksize;
 
-	end_offset = min(offset + length, vi->tree_params.tree_size);
-	if (offset >= end_offset)
+	if (pos >= end_pos)
 		return 0;
-	offs_in_block = offset & (block_size - 1);
-	last_index = (end_offset - 1) >> log_blocksize;
 
 	/*
 	 * Iterate through each Merkle tree block in the requested range and
 	 * copy the requested portion to userspace. Note that we are returning
 	 * a byte stream.
 	 */
-	for (index = offset >> log_blocksize; index <= last_index; index++) {
-		unsigned long num_ra_pages =
-			min_t(unsigned long, last_index - index + 1,
-			      inode->i_sb->s_bdi->io_pages);
-		unsigned int bytes_to_copy = min_t(u64, end_offset - offset,
-						   block_size - offs_in_block);
+	while (pos < end_pos) {
+		unsigned long ra_bytes;
+		unsigned int bytes_to_copy;
 		struct fsverity_blockbuf block = {
-			.size = block_size,
+			.size = params->block_size,
 		};
 
-		if (!vops->read_merkle_tree_block) {
-			unsigned int blocks_per_page =
-				vi->tree_params.blocks_per_page;
-			unsigned long page_idx =
-				round_down(index, blocks_per_page);
-			struct page *page = vops->read_merkle_tree_page(inode,
-					page_idx, num_ra_pages);
-
-			if (IS_ERR(page)) {
-				err = PTR_ERR(page);
-			} else {
-				block.kaddr = kmap_local_page(page) +
-					((index - page_idx) << log_blocksize);
-				block.context = page;
-			}
-		} else {
-			err = vops->read_merkle_tree_block(inode,
-					index << log_blocksize,
-					&block, log_blocksize, num_ra_pages);
-		}
+		ra_bytes = min_t(unsigned long, end_pos - pos + 1,
+				 inode->i_sb->s_bdi->io_pages << PAGE_SHIFT);
+		bytes_to_copy = min_t(u64, end_pos - pos,
+				      params->block_size - offs_in_block);
 
+		err = fsverity_read_merkle_tree_block(inode, &vi->tree_params,
+				pos - offs_in_block, ra_bytes, &block);
 		if (err) {
 			fsverity_err(inode,
-				     "Error %d reading Merkle tree block %lu",
-				     err, index << log_blocksize);
+				     "Error %d reading Merkle tree block %llu",
+				     err, pos);
 			break;
 		}
 
 		if (copy_to_user(buf, block.kaddr + offs_in_block, bytes_to_copy)) {
-			fsverity_drop_block(inode, &block);
+			fsverity_drop_merkle_tree_block(inode, &block);
 			err = -EFAULT;
 			break;
 		}
-		fsverity_drop_block(inode, &block);
+		fsverity_drop_merkle_tree_block(inode, &block);
 
 		retval += bytes_to_copy;
 		buf += bytes_to_copy;
-		offset += bytes_to_copy;
+		pos += bytes_to_copy;
 
 		if (fatal_signal_pending(current))  {
 			err = -EINTR;
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index aa1763e8b723..6c4c73eeccea 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -23,7 +23,18 @@ static bool is_hash_block_verified(struct inode *inode,
 	unsigned int blocks_per_page;
 	unsigned int i;
 	struct fsverity_info *vi = inode->i_verity_info;
-	struct page *hpage = (struct page *)block->context;
+	struct page *hpage;
+
+	/*
+	 * If the filesystem uses block-based caching, then
+	 * ->hash_block_verified is always used and the filesystem pushes
+	 * invalidations to it as needed.
+	 */
+	if (fsverity_caches_blocks(inode))
+		return test_bit(hblock_idx, vi->hash_block_verified);
+
+	/* Otherwise, the filesystem uses page-based caching. */
+	hpage = (struct page *)block->context;
 
 	/*
 	 * When the Merkle tree block size and page size are the same, then the
@@ -34,15 +45,9 @@ static bool is_hash_block_verified(struct inode *inode,
 	 * get evicted and re-instantiated from the backing storage, as new
 	 * pages always start out with PG_checked cleared.
 	 */
-	if (!vi->hash_block_verified)
+	if (!fsverity_uses_bitmap(vi, inode))
 		return PageChecked(hpage);
 
-	/*
-	 * Filesystems which use block based caching (e.g. XFS) always use
-	 * bitmap.
-	 */
-	if (inode->i_sb->s_vop->read_merkle_tree_block)
-		return test_bit(hblock_idx, vi->hash_block_verified);
 	/*
 	 * When the Merkle tree block size and page size differ, we use a bitmap
 	 * to indicate whether each hash block has been verified.
@@ -99,13 +104,13 @@ static bool is_hash_block_verified(struct inode *inode,
  */
 static bool
 verify_data_block(struct inode *inode, struct fsverity_info *vi,
-		  const void *data, u64 data_pos, unsigned long max_ra_pages)
+		  const void *data, u64 data_pos, unsigned long max_ra_bytes)
 {
 	const struct merkle_tree_params *params = &vi->tree_params;
 	const unsigned int hsize = params->digest_size;
 	int level;
 	int err = 0;
-	int num_ra_pages;
+	unsigned long ra_bytes;
 	u8 _want_hash[FS_VERITY_MAX_DIGEST_SIZE];
 	const u8 *want_hash;
 	u8 real_hash[FS_VERITY_MAX_DIGEST_SIZE];
@@ -153,11 +158,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 	for (level = 0; level < params->num_levels; level++) {
 		unsigned long next_hidx;
 		unsigned long hblock_idx;
-		pgoff_t hpage_idx;
 		u64 hblock_pos;
-		unsigned int hblock_offset_in_page;
 		unsigned int hoffset;
-		struct page *hpage;
 		struct fsverity_blockbuf *block = &hblocks[level].block;
 
 		/*
@@ -169,47 +171,25 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		/* Index of the hash block in the tree overall */
 		hblock_idx = params->level_start[level] + next_hidx;
 
-		/* Index of the hash page in the tree overall */
-		hpage_idx = hblock_idx >> params->log_blocks_per_page;
-
-		/* Byte offset of the hash block within the page */
-		hblock_offset_in_page =
-			(hblock_idx << params->log_blocksize) & ~PAGE_MASK;
-
-		/* Offset of the Merkle tree block into the tree */
+		/* Byte offset of the hash block in the tree overall */
 		hblock_pos = hblock_idx << params->log_blocksize;
 
 		/* Byte offset of the hash within the block */
 		hoffset = (hidx << params->log_digestsize) &
 			  (params->block_size - 1);
 
-		num_ra_pages = level == 0 ?
-			min(max_ra_pages, params->tree_pages - hpage_idx) : 0;
-
-		if (inode->i_sb->s_vop->read_merkle_tree_block) {
-			err = inode->i_sb->s_vop->read_merkle_tree_block(
-				inode, hblock_pos, block, params->log_blocksize,
-				num_ra_pages);
-		} else {
-			unsigned int blocks_per_page =
-				vi->tree_params.blocks_per_page;
-			hblock_idx = round_down(hblock_idx, blocks_per_page);
-			hpage = inode->i_sb->s_vop->read_merkle_tree_page(
-				inode, hpage_idx, (num_ra_pages << PAGE_SHIFT));
-
-			if (IS_ERR(hpage)) {
-				err = PTR_ERR(hpage);
-			} else {
-				block->kaddr = kmap_local_page(hpage) +
-					hblock_offset_in_page;
-				block->context = hpage;
-			}
-		}
+		if (level == 0)
+			ra_bytes = min(max_ra_bytes,
+				       params->tree_size - hblock_pos);
+		else
+			ra_bytes = 0;
 
+		err = fsverity_read_merkle_tree_block(inode, params, hblock_pos,
+				ra_bytes, block);
 		if (err) {
 			fsverity_err(inode,
-				     "Error %d reading Merkle tree block %lu",
-				     err, hblock_idx);
+				     "Error %d reading Merkle tree block %llu",
+				     err, hblock_pos);
 			goto error;
 		}
 
@@ -218,7 +198,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 			want_hash = _want_hash;
 			trace_fsverity_merkle_tree_block_verified(inode,
 					block, FSVERITY_TRACE_DIR_ASCEND);
-			fsverity_drop_block(inode, block);
+			fsverity_drop_merkle_tree_block(inode, block);
 			goto descend;
 		}
 		hblocks[level].index = hblock_idx;
@@ -234,7 +214,6 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		const void *haddr = block->kaddr;
 		unsigned long hblock_idx = hblocks[level - 1].index;
 		unsigned int hoffset = hblocks[level - 1].hoffset;
-		struct page *hpage = (struct page *)block->context;
 
 		if (fsverity_hash_block(params, inode, haddr, real_hash) != 0)
 			goto error;
@@ -245,15 +224,15 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		 * idempotent, as the same hash block might be verified by
 		 * multiple threads concurrently.
 		 */
-		if (vi->hash_block_verified)
+		if (fsverity_uses_bitmap(vi, inode))
 			set_bit(hblock_idx, vi->hash_block_verified);
 		else
-			SetPageChecked(hpage);
+			SetPageChecked((struct page *)block->context);
 		memcpy(_want_hash, haddr + hoffset, hsize);
 		want_hash = _want_hash;
 		trace_fsverity_merkle_tree_block_verified(inode, block,
 				FSVERITY_TRACE_DIR_DESCEND);
-		fsverity_drop_block(inode, block);
+		fsverity_drop_merkle_tree_block(inode, block);
 	}
 
 	/* Finally, verify the data block. */
@@ -271,13 +250,13 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		     params->hash_alg->name, hsize, real_hash);
 error:
 	for (; level > 0; level--)
-		fsverity_drop_block(inode, &hblocks[level - 1].block);
+		fsverity_drop_merkle_tree_block(inode, &hblocks[level - 1].block);
 	return false;
 }
 
 static bool
 verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
-		   unsigned long max_ra_pages)
+		   unsigned long max_ra_bytes)
 {
 	struct inode *inode = data_folio->mapping->host;
 	struct fsverity_info *vi = inode->i_verity_info;
@@ -295,7 +274,7 @@ verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
 
 		data = kmap_local_folio(data_folio, offset);
 		valid = verify_data_block(inode, vi, data, pos + offset,
-					  max_ra_pages);
+					  max_ra_bytes);
 		kunmap_local(data);
 		if (!valid)
 			return false;
@@ -358,7 +337,7 @@ void fsverity_verify_bio(struct bio *bio)
 
 	bio_for_each_folio_all(fi, bio) {
 		if (!verify_data_blocks(fi.folio, fi.length, fi.offset,
-					max_ra_pages)) {
+					max_ra_pages << PAGE_SHIFT)) {
 			bio->bi_status = BLK_STS_IOERR;
 			break;
 		}
@@ -412,7 +391,7 @@ void fsverity_invalidate_block(struct inode *inode,
 
 	trace_fsverity_invalidate_block(inode, block);
 
-	if (block->offset > vi->tree_params.tree_size) {
+	if (block->offset >= vi->tree_params.tree_size) {
 		fsverity_err(inode,
 "Trying to invalidate beyond Merkle tree (tree %lld, offset %lld)",
 			     vi->tree_params.tree_size, block->offset);
@@ -423,16 +402,54 @@ void fsverity_invalidate_block(struct inode *inode,
 }
 EXPORT_SYMBOL_GPL(fsverity_invalidate_block);
 
-void fsverity_drop_block(struct inode *inode,
-		struct fsverity_blockbuf *block)
+/**
+ * fsverity_read_merkle_tree_block() - read Merkle tree block
+ * @inode: inode to which this Merkle tree block belongs
+ * @params: merkle tree parameters
+ * @pos: byte position within merkle tree
+ * @ra_bytes: try to read ahead this many bytes
+ * @block: block to be loaded
+ *
+ * This function loads data from a merkle tree.
+ */
+int fsverity_read_merkle_tree_block(struct inode *inode,
+				    const struct merkle_tree_params *params,
+				    u64 pos, unsigned long ra_bytes,
+				    struct fsverity_blockbuf *block)
 {
-	if (inode->i_sb->s_vop->drop_block)
-		inode->i_sb->s_vop->drop_block(block);
-	else {
-		struct page *page = (struct page *)block->context;
+	const struct fsverity_operations *vops = inode->i_sb->s_vop;
+	unsigned long page_idx;
+	struct page *page;
+	unsigned long index;
+	unsigned int offset_in_page;
 
+	if (fsverity_caches_blocks(inode))
+		return vops->read_merkle_tree_block(inode, pos, ra_bytes,
+				params->log_blocksize, block);
+
+	index = pos >> params->log_blocksize;
+	page_idx = round_down(index, params->blocks_per_page);
+	offset_in_page = pos & ~PAGE_MASK;
+
+	page = vops->read_merkle_tree_page(inode, page_idx,
+			ra_bytes >> PAGE_SHIFT);
+	if (IS_ERR(page))
+		return PTR_ERR(page);
+
+	block->kaddr = kmap_local_page(page) + offset_in_page;
+	block->context = page;
+	return 0;
+}
+
+void fsverity_drop_merkle_tree_block(struct inode *inode,
+				     struct fsverity_blockbuf *block)
+{
+	if (fsverity_caches_blocks(inode)) {
+		inode->i_sb->s_vop->drop_merkle_tree_block(block);
+	} else {
 		kunmap_local(block->kaddr);
-		put_page(page);
+		put_page((struct page *)block->context);
 	}
 	block->kaddr = NULL;
+	block->context = NULL;
 }
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 45b7c613148a..0af2cd1860e4 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -41,10 +41,10 @@
  * written down to disk.
  *
  * While reading the tree, fs-verity calls ->read_merkle_tree_block followed by
- * ->drop_block to let filesystem know that memory can be freed.
+ * ->drop_merkle_tree_block to let filesystem know that memory can be freed.
  *
  * The context is optional. This field can be used by filesystem to passthrough
- * state from ->read_merkle_tree_block to ->drop_block.
+ * state from ->read_merkle_tree_block to ->drop_merkle_tree_block.
  */
 struct fsverity_blockbuf {
 	void *kaddr;
@@ -128,6 +128,9 @@ struct fsverity_operations {
 	 *
 	 * Note that this must retrieve a *page*, not necessarily a *block*.
 	 *
+	 * If this function is implemented, do not implement
+	 * ->read_merkle_tree_block or ->drop_merkle_tree_block.
+	 *
 	 * Return: the page on success, ERR_PTR() on failure
 	 */
 	struct page *(*read_merkle_tree_page)(struct inode *inode,
@@ -138,12 +141,12 @@ struct fsverity_operations {
 	 * Read a Merkle tree block of the given inode.
 	 * @inode: the inode
 	 * @pos: byte offset of the block within the Merkle tree
-	 * @block: block buffer for filesystem to point it to the block
-	 * @log_blocksize: log2 of the size of the expected block
 	 * @ra_bytes: The number of bytes that should be
 	 *		prefetched starting at @pos if the page at @pos
 	 *		isn't already cached.  Implementations may ignore this
 	 *		argument; it's only a performance optimization.
+	 * @log_blocksize: log2 of the size of the expected block
+	 * @block: block buffer for filesystem to point it to the block
 	 *
 	 * This can be called at any time on an open verity file.  It may be
 	 * called by multiple processes concurrently.
@@ -152,13 +155,15 @@ struct fsverity_operations {
 	 * fsverity_invalidate_block() to let fsverity know that block's
 	 * verification state is not valid anymore.
 	 *
+	 * If this function is implemented, ->drop_merkle_tree_block must also
+	 * be implemented.
+	 *
 	 * Return: 0 on success, -errno on failure
 	 */
 	int (*read_merkle_tree_block)(struct inode *inode,
-				      u64 pos,
-				      struct fsverity_blockbuf *block,
+				      u64 pos, unsigned long ra_bytes,
 				      unsigned int log_blocksize,
-				      u64 ra_bytes);
+				      struct fsverity_blockbuf *block);
 
 	/**
 	 * Write a Merkle tree block to the given inode.
@@ -183,8 +188,11 @@ struct fsverity_operations {
 	 *
 	 * This is called when fs-verity is done with a block obtained with
 	 * ->read_merkle_tree_block().
+	 *
+	 * If this function is implemented, ->read_merkle_tree_block must also
+	 * be implemented.
 	 */
-	void (*drop_block)(struct fsverity_blockbuf *block);
+	void (*drop_merkle_tree_block)(struct fsverity_blockbuf *block);
 };
 
 #ifdef CONFIG_FS_VERITY



* [PATCH 11/40] fsverity: send the level of the merkle tree block to ->read_merkle_tree_block
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (9 preceding siblings ...)
  2024-03-17 16:26   ` [PATCH 10/40] fsverity: fix "support block-based Merkle tree caching" Darrick J. Wong
@ 2024-03-17 16:26   ` Darrick J. Wong
  2024-03-17 16:26   ` [PATCH 12/40] fsverity: pass the new tree size and block size to ->begin_enable_verity Darrick J. Wong
                     ` (29 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:26 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

When fsverity needs to pull in a merkle tree block for file data
verification, it knows the level of the block within the tree.  For XFS,
we will cache the blocks in memory ourselves, and it is advantageous to
make higher level nodes more resistant to memory reclamation.

Therefore, we need to pass the anticipated level to the
->read_merkle_tree_block functions to enable this kind of caching.
Establish level == -1 to mean streaming read (e.g. downloading the
merkle tree).

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
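As an illustration of why the level is useful to a block cache, here is a
minimal userspace sketch of one possible retention policy.  Everything in
it is hypothetical: struct readmerkle_model only mirrors the two level
fields of struct fsverity_readmerkle, and model_retention_weight() is not
something this series defines.  The only facts taken from the patch are
that level 0 means the leaves and level -1 means a streaming read.

```c
#include <assert.h>

/* Hypothetical mirror of the level fields in struct fsverity_readmerkle. */
struct readmerkle_model {
	int level;	/* 0 = leaves, -1 = streaming read */
	int num_levels;	/* total number of levels in the tree */
};

#define MODEL_STREAMING_READ	(-1)

/*
 * Hypothetical policy: return a retention weight for a cached block.
 * 0 means "reclaim first"; larger values mean "keep longer".
 */
static unsigned int model_retention_weight(const struct readmerkle_model *req)
{
	/* Streaming reads touch each block once, so do not favor them. */
	if (req->level == MODEL_STREAMING_READ)
		return 0;

	assert(req->level >= 0 && req->level < req->num_levels);

	/* Leaves get weight 1; the topmost level gets num_levels. */
	return (unsigned int)req->level + 1;
}
```

A real implementation would map such a weight onto whatever reclaim
mechanism its cache uses; the sketch only shows that the level carries
enough information to rank blocks.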
 fs/verity/fsverity_private.h |    2 +-
 fs/verity/read_metadata.c    |    2 +-
 fs/verity/verify.c           |   25 +++++++++++++++++++------
 include/linux/fsverity.h     |   32 ++++++++++++++++++++++----------
 4 files changed, 43 insertions(+), 18 deletions(-)


diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index 0a4381acb394..b01343113e8b 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -179,7 +179,7 @@ static inline bool fsverity_uses_bitmap(const struct fsverity_info *vi,
 
 int fsverity_read_merkle_tree_block(struct inode *inode,
 				    const struct merkle_tree_params *params,
-				    u64 pos, unsigned long ra_bytes,
+				    int level, u64 pos, unsigned long ra_bytes,
 				    struct fsverity_blockbuf *block);
 
 /*
diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
index 94fffa060f82..87cc6f289663 100644
--- a/fs/verity/read_metadata.c
+++ b/fs/verity/read_metadata.c
@@ -43,7 +43,7 @@ static int fsverity_read_merkle_tree(struct inode *inode,
 				      params->block_size - offs_in_block);
 
 		err = fsverity_read_merkle_tree_block(inode, &vi->tree_params,
-				pos - offs_in_block, ra_bytes, &block);
+				-1, pos - offs_in_block, ra_bytes, &block);
 		if (err) {
 			fsverity_err(inode,
 				     "Error %d reading Merkle tree block %llu",
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 6c4c73eeccea..cd84182f5e43 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -184,8 +184,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		else
 			ra_bytes = 0;
 
-		err = fsverity_read_merkle_tree_block(inode, params, hblock_pos,
-				ra_bytes, block);
+		err = fsverity_read_merkle_tree_block(inode, params, level,
+				hblock_pos, ra_bytes, block);
 		if (err) {
 			fsverity_err(inode,
 				     "Error %d reading Merkle tree block %llu",
@@ -406,6 +406,8 @@ EXPORT_SYMBOL_GPL(fsverity_invalidate_block);
  * fsverity_read_merkle_tree_block() - read Merkle tree block
 * @inode: inode to which this Merkle tree block belongs
 * @params: merkle tree parameters
+ * @level: expected level of the block; level 0 are the leaves, -1 means a
+ * streaming read
 * @pos: byte position within merkle tree
 * @ra_bytes: try to read ahead this many bytes
  * @block: block to be loaded
@@ -414,7 +416,7 @@ EXPORT_SYMBOL_GPL(fsverity_invalidate_block);
  */
 int fsverity_read_merkle_tree_block(struct inode *inode,
 				    const struct merkle_tree_params *params,
-				    u64 pos, unsigned long ra_bytes,
+				    int level, u64 pos, unsigned long ra_bytes,
 				    struct fsverity_blockbuf *block)
 {
 	const struct fsverity_operations *vops = inode->i_sb->s_vop;
@@ -423,9 +425,20 @@ int fsverity_read_merkle_tree_block(struct inode *inode,
 	unsigned long index;
 	unsigned int offset_in_page;
 
-	if (fsverity_caches_blocks(inode))
-		return vops->read_merkle_tree_block(inode, pos, ra_bytes,
-				params->log_blocksize, block);
+	block->offset = pos;
+	block->size = params->block_size;
+
+	if (fsverity_caches_blocks(inode)) {
+		struct fsverity_readmerkle req = {
+			.inode = inode,
+			.level = level,
+			.num_levels = params->num_levels,
+			.log_blocksize = params->log_blocksize,
+			.ra_bytes = ra_bytes,
+		};
+
+		return vops->read_merkle_tree_block(&req, block);
+	}
 
 	index = pos >> params->log_blocksize;
 	page_idx = round_down(index, params->blocks_per_page);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 0af2cd1860e4..d12a95623614 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -53,6 +53,26 @@ struct fsverity_blockbuf {
 	void *context;
 };
 
+/**
+ * struct fsverity_readmerkle - Request to read a Merkle Tree block buffer
+ * @inode: the inode to read
+ * @level: expected level of the block; level 0 are the leaves, -1 means a
+ * streaming read
+ * @num_levels: number of levels in the tree total
+ * @log_blocksize: log2 of the size of the expected block
+ * @ra_bytes: The number of bytes that should be prefetched starting at
+ *		@block->offset if the block there isn't already cached.
+ *		Implementations may ignore this argument; it's only a
+ *		performance optimization.
+ */
+struct fsverity_readmerkle {
+	struct inode *inode;
+	unsigned long ra_bytes;
+	int level;
+	int num_levels;
+	u8 log_blocksize;
+};
+
 /* Verity operations for filesystems */
 struct fsverity_operations {
 
@@ -139,13 +159,7 @@ struct fsverity_operations {
 
 	/**
 	 * Read a Merkle tree block of the given inode.
-	 * @inode: the inode
-	 * @pos: byte offset of the block within the Merkle tree
-	 * @ra_bytes: The number of bytes that should be
-	 *		prefetched starting at @pos if the page at @pos
-	 *		isn't already cached.  Implementations may ignore this
-	 *		argument; it's only a performance optimization.
-	 * @log_blocksize: log2 of the size of the expected block
+	 * @req: read request; see struct fsverity_readmerkle
 	 * @block: block buffer for filesystem to point it to the block
 	 *
 	 * This can be called at any time on an open verity file.  It may be
@@ -160,9 +174,7 @@ struct fsverity_operations {
 	 *
 	 * Return: 0 on success, -errno on failure
 	 */
-	int (*read_merkle_tree_block)(struct inode *inode,
-				      u64 pos, unsigned long ra_bytes,
-				      unsigned int log_blocksize,
+	int (*read_merkle_tree_block)(const struct fsverity_readmerkle *req,
 				      struct fsverity_blockbuf *block);
 
 	/**



* [PATCH 12/40] fsverity: pass the new tree size and block size to ->begin_enable_verity
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (10 preceding siblings ...)
  2024-03-17 16:26   ` [PATCH 11/40] fsverity: send the level of the merkle tree block to ->read_merkle_tree_block Darrick J. Wong
@ 2024-03-17 16:26   ` Darrick J. Wong
  2024-03-17 16:26   ` [PATCH 13/40] fsverity: expose merkle tree geometry to callers Darrick J. Wong
                     ` (28 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:26 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

When starting up the process of enabling fsverity on a file, pass the
new size of the merkle tree and the merkle tree block size to the fs
implementation.  XFS will want this information later to try to clean
out a failed previous enablement attempt.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
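To make the two new arguments concrete, here is a small userspace sketch of
the arithmetic a filesystem could do with them, for example to bound how
many tree blocks a previous failed attempt might have left behind.  This is
an assumption for illustration only: model_tree_blocks() is a made-up
helper, and the patch does not say how XFS will actually use the values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many Merkle tree blocks a tree of
 * merkle_tree_size bytes occupies when each block is tree_blocksize bytes.
 */
static uint64_t model_tree_blocks(uint64_t merkle_tree_size,
				  unsigned int tree_blocksize)
{
	assert(tree_blocksize != 0);

	/* Round up so that a partial final block is still counted. */
	return merkle_tree_size / tree_blocksize +
	       (merkle_tree_size % tree_blocksize != 0);
}
```

The division-plus-remainder form avoids the overflow that adding
tree_blocksize - 1 to a very large size could cause.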
 fs/btrfs/verity.c        |    3 ++-
 fs/ext4/verity.c         |    3 ++-
 fs/f2fs/verity.c         |    3 ++-
 fs/verity/enable.c       |    3 ++-
 include/linux/fsverity.h |    5 ++++-
 5 files changed, 12 insertions(+), 5 deletions(-)


diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index 966630523502..c52f32bd43c7 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -579,7 +579,8 @@ static int finish_verity(struct btrfs_inode *inode, const void *desc,
  *
  * Returns 0 on success, negative error code on failure.
  */
-static int btrfs_begin_enable_verity(struct file *filp)
+static int btrfs_begin_enable_verity(struct file *filp, u64 merkle_tree_size,
+				     unsigned int tree_blocksize)
 {
 	struct btrfs_inode *inode = BTRFS_I(file_inode(filp));
 	struct btrfs_root *root = inode->root;
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index da2095a81349..a8ae8c912cb5 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -99,7 +99,8 @@ static int pagecache_write(struct inode *inode, const void *buf, size_t count,
 	return 0;
 }
 
-static int ext4_begin_enable_verity(struct file *filp)
+static int ext4_begin_enable_verity(struct file *filp, u64 merkle_tree_size,
+				    unsigned int tree_blocksize)
 {
 	struct inode *inode = file_inode(filp);
 	const int credits = 2; /* superblock and inode for ext4_orphan_add() */
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index b4461b9f47a3..f6ad6523ce95 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -115,7 +115,8 @@ struct fsverity_descriptor_location {
 	__le64 pos;
 };
 
-static int f2fs_begin_enable_verity(struct file *filp)
+static int f2fs_begin_enable_verity(struct file *filp, u64 merkle_tree_size,
+				    unsigned int tree_blocksize)
 {
 	struct inode *inode = file_inode(filp);
 	int err;
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 945eba0092ab..496a361c0a81 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -237,7 +237,8 @@ static int enable_verity(struct file *filp,
 	if (IS_VERITY(inode))
 		err = -EEXIST;
 	else
-		err = vops->begin_enable_verity(filp);
+		err = vops->begin_enable_verity(filp, params.tree_size,
+				      params.block_size);
 	inode_unlock(inode);
 	if (err)
 		goto out;
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index d12a95623614..c5f3564f2cb8 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -80,6 +80,8 @@ struct fsverity_operations {
 	 * Begin enabling verity on the given file.
 	 *
 	 * @filp: a readonly file descriptor for the file
+	 * @merkle_tree_size: total bytes the new Merkle tree will take up
+	 * @tree_blocksize: the new Merkle tree block size
 	 *
 	 * The filesystem must do any needed filesystem-specific preparations
 	 * for enabling verity, e.g. evicting inline data.  It also must return
@@ -89,7 +91,8 @@ struct fsverity_operations {
 	 *
 	 * Return: 0 on success, -errno on failure
 	 */
-	int (*begin_enable_verity)(struct file *filp);
+	int (*begin_enable_verity)(struct file *filp, u64 merkle_tree_size,
+				   unsigned int tree_blocksize);
 
 	/**
 	 * End enabling verity on the given file.



* [PATCH 13/40] fsverity: expose merkle tree geometry to callers
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (11 preceding siblings ...)
  2024-03-17 16:26   ` [PATCH 12/40] fsverity: pass the new tree size and block size to ->begin_enable_verity Darrick J. Wong
@ 2024-03-17 16:26   ` Darrick J. Wong
  2024-03-17 16:27   ` [PATCH 14/40] fsverity: rely on cached block callers to retain verified state Darrick J. Wong
                     ` (27 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:26 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a function that will return selected information about the
geometry of the merkle tree.  Online fsck for XFS will need this piece
to perform basic checks of the merkle tree.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
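As a rough idea of the "basic checks" a caller could build on the returned
geometry, here is a userspace sketch.  model_block_in_tree() is
hypothetical and not part of this series; it assumes the block size is a
power of two and treats any offset at or past tree_size as out of range,
matching the bounds check that patch 10 applies in
fsverity_invalidate_block().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does a block starting at "offset" lie inside a
 * Merkle tree with the given geometry?
 */
static bool model_block_in_tree(unsigned int block_size, uint64_t tree_size,
				uint64_t offset)
{
	/* The block size must be a nonzero power of two. */
	if (block_size == 0 || (block_size & (block_size - 1)) != 0)
		return false;

	/* Blocks start on block_size boundaries. */
	if (offset & (block_size - 1))
		return false;

	/* An offset equal to tree_size is already past the end. */
	return offset < tree_size;
}
```

A caller would obtain block_size and tree_size from
fsverity_merkle_tree_geometry() and then apply checks like this to each
block it finds.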
 fs/verity/open.c         |   26 ++++++++++++++++++++++++++
 include/linux/fsverity.h |    3 +++
 2 files changed, 29 insertions(+)


diff --git a/fs/verity/open.c b/fs/verity/open.c
index 9603b3a404f7..7a86407732c4 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -412,6 +412,32 @@ void __fsverity_cleanup_inode(struct inode *inode)
 }
 EXPORT_SYMBOL_GPL(__fsverity_cleanup_inode);
 
+/**
+ * fsverity_merkle_tree_geometry() - return Merkle tree geometry
+ * @inode: the verity inode whose Merkle tree geometry is being queried
+ * @block_size: size of a merkle tree block, in bytes
+ * @tree_size: size of the merkle tree, in bytes
+ */
+int fsverity_merkle_tree_geometry(struct inode *inode, unsigned int *block_size,
+				  u64 *tree_size)
+{
+	struct fsverity_info *vi;
+	int error;
+
+	if (!IS_VERITY(inode))
+		return -EOPNOTSUPP;
+
+	error = ensure_verity_info(inode);
+	if (error)
+		return error;
+
+	vi = fsverity_get_info(inode);
+	*block_size = vi->tree_params.block_size;
+	*tree_size = vi->tree_params.tree_size;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(fsverity_merkle_tree_geometry);
+
 void __init fsverity_init_info_cache(void)
 {
 	fsverity_info_cachep = KMEM_CACHE_USERCOPY(
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index c5f3564f2cb8..17bc0729119c 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -240,6 +240,9 @@ int __fsverity_file_open(struct inode *inode, struct file *filp);
 int __fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
 void __fsverity_cleanup_inode(struct inode *inode);
 
+int fsverity_merkle_tree_geometry(struct inode *inode, unsigned int *block_size,
+				  u64 *tree_size);
+
 /**
  * fsverity_cleanup_inode() - free the inode's verity info, if present
  * @inode: an inode being evicted



* [PATCH 14/40] fsverity: rely on cached block callers to retain verified state
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (12 preceding siblings ...)
  2024-03-17 16:26   ` [PATCH 13/40] fsverity: expose merkle tree geometry to callers Darrick J. Wong
@ 2024-03-17 16:27   ` Darrick J. Wong
  2024-03-17 16:27   ` [PATCH 15/40] fsverity: box up the write_merkle_tree_block parameters too Darrick J. Wong
                     ` (26 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:27 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Using a single contiguous bitmap to record merkle tree block
verification state is unnecessary when we can retain that state in the
merkle tree block cache.  Worse, it doesn't scale well to large verity
files and stresses the memory allocator.

Therefore, add a state bit to fsverity_blockbuf and let the
implementation retain the validated state.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
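Reviewer note, not part of the commit: the contract described in the commit
message can be sketched in user space as below. This is a minimal model
under stated assumptions, not the XFS implementation. Every name here
(struct blockbuf, struct cache_slot, the toy_* helpers) is hypothetical and
only mirrors the roles of fsverity_blockbuf, ->read_merkle_tree_block and
->drop_merkle_tree_block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct fsverity_blockbuf. */
struct blockbuf {
	uint64_t offset;
	bool verified;
	void *context;	/* points at the cache slot backing this buffer */
};

/* Hypothetical per-block cache entry owned by the "filesystem". */
struct cache_slot {
	bool present;
	bool verified;
};

#define NR_SLOTS 8
static struct cache_slot cache[NR_SLOTS];

/* Models ->read_merkle_tree_block: a cache miss must report !verified. */
static void toy_read_block(struct blockbuf *block, uint64_t index)
{
	struct cache_slot *slot;

	assert(index < NR_SLOTS);
	slot = &cache[index];
	if (!slot->present) {
		slot->present = true;
		slot->verified = false;	/* fresh from "disk": not trusted */
	}
	block->offset = index;
	block->verified = slot->verified;
	block->context = slot;
}

/* Models ->drop_merkle_tree_block: remember a successful verification. */
static void toy_drop_block(struct blockbuf *block)
{
	struct cache_slot *slot = block->context;

	if (block->verified)
		slot->verified = true;
	block->context = NULL;
}

/* Models eviction: the verified state goes away with the cached block. */
static void toy_evict_block(uint64_t index)
{
	assert(index < NR_SLOTS);
	cache[index].present = false;
	cache[index].verified = false;
}
```

The point of the design is that the verified bit lives and dies with the
cached block, so no separate invalidation call is needed when the block is
evicted.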
 fs/verity/fsverity_private.h    |    7 ++++---
 fs/verity/verify.c              |   39 +++++++--------------------------------
 include/linux/fsverity.h        |   13 ++++++++-----
 include/trace/events/fsverity.h |   19 -------------------
 4 files changed, 19 insertions(+), 59 deletions(-)


diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index b01343113e8b..de8798f141d4 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -167,14 +167,15 @@ static inline bool fsverity_caches_blocks(const struct inode *inode)
 static inline bool fsverity_uses_bitmap(const struct fsverity_info *vi,
 					const struct inode *inode)
 {
+	if (fsverity_caches_blocks(inode))
+		return false;
+
 	/*
 	 * If fs uses block-based Merkle tree caching, then fs-verity must use
 	 * hash_block_verified bitmap as there's no page to mark it with
 	 * PG_checked.
 	 */
-	if (vi->tree_params.block_size != PAGE_SIZE)
-		return true;
-	return fsverity_caches_blocks(inode);
+	return vi->tree_params.block_size != PAGE_SIZE;
 }
 
 int fsverity_read_merkle_tree_block(struct inode *inode,
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index cd84182f5e43..a61d1c99c485 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -26,12 +26,11 @@ static bool is_hash_block_verified(struct inode *inode,
 	struct page *hpage;
 
 	/*
-	 * If the filesystem uses block-based caching, then
-	 * ->hash_block_verified is always used and the filesystem pushes
-	 * invalidations to it as needed.
+	 * If the filesystem uses block-based caching, then rely on the
+	 * implementation to retain verified status.
 	 */
 	if (fsverity_caches_blocks(inode))
-		return test_bit(hblock_idx, vi->hash_block_verified);
+		return block->verified;
 
 	/* Otherwise, the filesystem uses page-based caching. */
 	hpage = (struct page *)block->context;
@@ -224,7 +223,9 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		 * idempotent, as the same hash block might be verified by
 		 * multiple threads concurrently.
 		 */
-		if (fsverity_uses_bitmap(vi, inode))
+		if (fsverity_caches_blocks(inode))
+			block->verified = true;
+		else if (fsverity_uses_bitmap(vi, inode))
 			set_bit(hblock_idx, vi->hash_block_verified);
 		else
 			SetPageChecked((struct page *)block->context);
@@ -375,33 +376,6 @@ void __init fsverity_init_workqueue(void)
 		panic("failed to allocate fsverity_read_queue");
 }
 
-/**
- * fsverity_invalidate_block() - invalidate Merkle tree block
- * @inode: inode to which this Merkle tree blocks belong
- * @block: block to be invalidated
- *
- * This function invalidates/clears "verified" state of Merkle tree block
- * in the fs-verity bitmap. The block needs to have ->offset set.
- */
-void fsverity_invalidate_block(struct inode *inode,
-		struct fsverity_blockbuf *block)
-{
-	struct fsverity_info *vi = inode->i_verity_info;
-	const unsigned int log_blocksize = vi->tree_params.log_blocksize;
-
-	trace_fsverity_invalidate_block(inode, block);
-
-	if (block->offset >= vi->tree_params.tree_size) {
-		fsverity_err(inode,
-"Trying to invalidate beyond Merkle tree (tree %lld, offset %lld)",
-			     vi->tree_params.tree_size, block->offset);
-		return;
-	}
-
-	clear_bit(block->offset >> log_blocksize, vi->hash_block_verified);
-}
-EXPORT_SYMBOL_GPL(fsverity_invalidate_block);
-
 /**
  * fsverity_read_merkle_tree_block() - read Merkle tree block
  * @inode: inode to which this Merkle tree blocks belong
@@ -436,6 +410,7 @@ int fsverity_read_merkle_tree_block(struct inode *inode,
 			.log_blocksize = params->log_blocksize,
 			.ra_bytes = ra_bytes,
 		};
+		block->verified = false;
 
 		return vops->read_merkle_tree_block(&req, block);
 	}
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 17bc0729119c..026e4f72290e 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -32,6 +32,7 @@
  * @offset: block's offset into Merkle tree
  * @size: the Merkle tree block size
  * @context: filesystem private context
+ * @verified: has this buffer been validated?
  *
  * Buffer containing single Merkle Tree block. These buffers are passed
  *  - to filesystem, when fs-verity is building merkel tree,
@@ -49,6 +50,7 @@
 struct fsverity_blockbuf {
 	void *kaddr;
 	u64 offset;
+	unsigned int verified:1;
 	unsigned int size;
 	void *context;
 };
@@ -168,9 +170,9 @@ struct fsverity_operations {
 	 * This can be called at any time on an open verity file.  It may be
 	 * called by multiple processes concurrently.
 	 *
-	 * In case that block was evicted from the memory filesystem has to use
-	 * fsverity_invalidate_block() to let fsverity know that block's
-	 * verification state is not valid anymore.
+	 * Implementations may cache the @block->verified state in
+	 * ->drop_merkle_tree_block.  They must clear the @block->verified
+	 * flag for a cache miss.
 	 *
 	 * If this function is implemented, ->drop_merkle_tree_block must also
 	 * be implemented.
@@ -204,6 +206,9 @@ struct fsverity_operations {
 	 * This is called when fs-verity is done with a block obtained with
 	 * ->read_merkle_tree_block().
 	 *
+	 * Implementations should cache a @block->verified==1 state to avoid
+	 * unnecessary revalidations during later accesses.
+	 *
 	 * If this function is implemented, ->read_merkle_tree_block must also
 	 * be implemented.
 	 */
@@ -264,8 +269,6 @@ int fsverity_ioctl_read_metadata(struct file *filp, const void __user *uarg);
 bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset);
 void fsverity_verify_bio(struct bio *bio);
 void fsverity_enqueue_verify_work(struct work_struct *work);
-void fsverity_invalidate_block(struct inode *inode,
-		struct fsverity_blockbuf *block);
 
 static inline int fsverity_set_ops(struct super_block *sb,
 				   const struct fsverity_operations *ops)
diff --git a/include/trace/events/fsverity.h b/include/trace/events/fsverity.h
index 763890e47358..1a6ee2a2c3ce 100644
--- a/include/trace/events/fsverity.h
+++ b/include/trace/events/fsverity.h
@@ -109,25 +109,6 @@ TRACE_EVENT(fsverity_merkle_tree_block_verified,
 		__entry->direction == 0 ? "ascend" : "descend")
 );
 
-TRACE_EVENT(fsverity_invalidate_block,
-	TP_PROTO(struct inode *inode, struct fsverity_blockbuf *block),
-	TP_ARGS(inode, block),
-	TP_STRUCT__entry(
-		__field(ino_t, ino)
-		__field(u64, offset)
-		__field(unsigned int, block_size)
-	),
-	TP_fast_assign(
-		__entry->ino = inode->i_ino;
-		__entry->offset = block->offset;
-		__entry->block_size = block->size;
-	),
-	TP_printk("ino %lu block position %llu block size %u",
-		(unsigned long) __entry->ino,
-		__entry->offset,
-		__entry->block_size)
-);
-
 TRACE_EVENT(fsverity_read_merkle_tree_block,
 	TP_PROTO(struct inode *inode, u64 offset, unsigned int log_blocksize),
 	TP_ARGS(inode, offset, log_blocksize),


^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 15/40] fsverity: box up the write_merkle_tree_block parameters too
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (13 preceding siblings ...)
  2024-03-17 16:27   ` [PATCH 14/40] fsverity: rely on cached block callers to retain verified state Darrick J. Wong
@ 2024-03-17 16:27   ` Darrick J. Wong
  2024-03-17 16:27   ` [PATCH 16/40] fsverity: pass the zero-hash value to the implementation Darrick J. Wong
                     ` (25 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:27 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Box up the tree write request parameters into a structure so that we can
add more in the next few patches.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
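Reviewer note, not part of the commit: a small user-space sketch of why
boxing the parameters helps. All toy_* names are hypothetical. The idea is
that new request fields can be added to the structure later without
changing the callback prototype in every filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inode; metadata_pos models where verity metadata starts. */
struct toy_inode {
	uint64_t metadata_pos;
};

/* Hypothetical analogue of struct fsverity_writemerkle. */
struct toy_writemerkle {
	struct toy_inode *inode;
	/* later changes can add fields here; prototypes stay the same */
};

typedef int (*toy_write_fn)(const struct toy_writemerkle *req,
			    const void *buf, uint64_t pos, unsigned int size);

/* Records where the last block landed, for demonstration only. */
static uint64_t last_write_pos;

/* Models a filesystem's ->write_merkle_tree_block implementation. */
static int toy_fs_write_block(const struct toy_writemerkle *req,
			      const void *buf, uint64_t pos, unsigned int size)
{
	(void)buf;
	(void)size;
	assert(req != NULL && req->inode != NULL);
	last_write_pos = pos + req->inode->metadata_pos;
	return 0;
}

/* Models the caller: build the request once, then invoke the callback. */
static int toy_write_merkle_tree_block(struct toy_inode *inode,
				       toy_write_fn write, const void *buf,
				       unsigned long index,
				       unsigned int log_blocksize)
{
	struct toy_writemerkle req = {
		.inode = inode,
	};
	uint64_t pos = (uint64_t)index << log_blocksize;

	return write(&req, buf, pos, 1U << log_blocksize);
}
```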
 fs/btrfs/verity.c        |    6 ++++--
 fs/ext4/verity.c         |    7 +++++--
 fs/f2fs/verity.c         |    7 +++++--
 fs/verity/enable.c       |    5 ++++-
 include/linux/fsverity.h |   21 ++++++++++++++++++---
 5 files changed, 36 insertions(+), 10 deletions(-)


diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index c52f32bd43c7..70794c608581 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -791,9 +791,11 @@ static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
  *
  * Returns 0 on success or negative error code on failure
  */
-static int btrfs_write_merkle_tree_block(struct inode *inode, const void *buf,
-					 u64 pos, unsigned int size)
+static int btrfs_write_merkle_tree_block(const struct fsverity_writemerkle *req,
+					 const void *buf, u64 pos,
+					 unsigned int size)
 {
+	struct inode *inode = req->inode;
 	loff_t merkle_pos = merkle_file_pos(inode);
 
 	if (merkle_pos < 0)
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index a8ae8c912cb5..27eb2d51cce2 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -382,9 +382,12 @@ static struct page *ext4_read_merkle_tree_page(struct inode *inode,
 	return folio_file_page(folio, index);
 }
 
-static int ext4_write_merkle_tree_block(struct inode *inode, const void *buf,
-					u64 pos, unsigned int size)
+static int ext4_write_merkle_tree_block(const struct fsverity_writemerkle *req,
+					const void *buf, u64 pos,
+					unsigned int size)
 {
+	struct inode *inode = req->inode;
+
 	pos += ext4_verity_metadata_pos(inode);
 
 	return pagecache_write(inode, buf, size, pos);
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index f6ad6523ce95..923d7a09b2f4 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -277,9 +277,12 @@ static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
 	return page;
 }
 
-static int f2fs_write_merkle_tree_block(struct inode *inode, const void *buf,
-					u64 pos, unsigned int size)
+static int f2fs_write_merkle_tree_block(const struct fsverity_writemerkle *req,
+					const void *buf, u64 pos,
+					unsigned int size)
 {
+	struct inode *inode = req->inode;
+
 	pos += f2fs_verity_metadata_pos(inode);
 
 	return pagecache_write(inode, buf, size, pos);
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 496a361c0a81..8dcfefc848ee 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -50,10 +50,13 @@ static int write_merkle_tree_block(struct inode *inode, const u8 *buf,
 				   unsigned long index,
 				   const struct merkle_tree_params *params)
 {
+	struct fsverity_writemerkle req = {
+		.inode = inode,
+	};
 	u64 pos = (u64)index << params->log_blocksize;
 	int err;
 
-	err = inode->i_sb->s_vop->write_merkle_tree_block(inode, buf, pos,
+	err = inode->i_sb->s_vop->write_merkle_tree_block(&req, buf, pos,
 							  params->block_size);
 	if (err)
 		fsverity_err(inode, "Error %d writing Merkle tree block %lu",
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 026e4f72290e..0dded1fcf2b1 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -75,6 +75,20 @@ struct fsverity_readmerkle {
 	u8 log_blocksize;
 };
 
+/**
+ * struct fsverity_writemerkle - Request to write a Merkle Tree block buffer
+ * @inode: the inode to write
+ * @level: level of the block; level 0 are the leaves
+ * @num_levels: number of levels in the tree total
+ * @log_blocksize: log2 of the size of the block
+ */
+struct fsverity_writemerkle {
+	struct inode *inode;
+	int level;
+	int num_levels;
+	u8 log_blocksize;
+};
+
 /* Verity operations for filesystems */
 struct fsverity_operations {
 
@@ -185,7 +199,7 @@ struct fsverity_operations {
 	/**
 	 * Write a Merkle tree block to the given inode.
 	 *
-	 * @inode: the inode for which the Merkle tree is being built
+	 * @req: write request; see struct fsverity_writemerkle
 	 * @buf: the Merkle tree block to write
 	 * @pos: the position of the block in the Merkle tree (in bytes)
 	 * @size: the Merkle tree block size (in bytes)
@@ -195,8 +209,9 @@ struct fsverity_operations {
 	 *
 	 * Return: 0 on success, -errno on failure
 	 */
-	int (*write_merkle_tree_block)(struct inode *inode, const void *buf,
-				       u64 pos, unsigned int size);
+	int (*write_merkle_tree_block)(const struct fsverity_writemerkle *req,
+				       const void *buf, u64 pos,
+				       unsigned int size);
 
 	/**
 	 * Release the reference to a Merkle tree block


^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 16/40] fsverity: pass the zero-hash value to the implementation
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (14 preceding siblings ...)
  2024-03-17 16:27   ` [PATCH 15/40] fsverity: box up the write_merkle_tree_block parameters too Darrick J. Wong
@ 2024-03-17 16:27   ` Darrick J. Wong
  2024-03-18 16:38     ` Eric Biggers
  2024-03-17 16:27   ` [PATCH 17/40] fsverity: report validation errors back to the filesystem Darrick J. Wong
                     ` (24 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:27 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Compute the hash of a data block full of zeros, and then supply this to
the merkle tree read and write methods.  A subsequent xfs patch will use
this to reduce the size of the merkle tree when dealing with sparse gold
master disk images and the like.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
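Reviewer note, not part of the commit: a user-space sketch of how an
implementation might use the zero digest to recognize a leaf-level tree
block that only describes zeroed data. The names are hypothetical, and
FNV-1a is only a stand-in for the real fs-verity hash algorithm; whether the
later xfs patch detects such blocks this way is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_DIGEST_SIZE 8

/* FNV-1a, a stand-in only; fs-verity uses a cryptographic hash. */
static void toy_hash(const uint8_t *data, size_t len, uint8_t *digest)
{
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 0x100000001b3ULL;
	}
	memcpy(digest, &h, TOY_DIGEST_SIZE);
}

/* True if every digest slot in the tree block equals zero_digest. */
static bool toy_block_is_all_zero_digests(const uint8_t *tree_block,
					  size_t block_size,
					  const uint8_t *zero_digest,
					  size_t digest_size)
{
	size_t off;

	assert(digest_size > 0 && block_size % digest_size == 0);
	for (off = 0; off < block_size; off += digest_size)
		if (memcmp(tree_block + off, zero_digest, digest_size) != 0)
			return false;
	return true;
}
```

A block for which this check returns true carries no information beyond
"all covered data blocks are zero", which is what makes it a candidate for
being skipped on write and synthesized on read.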
 fs/verity/enable.c           |    2 ++
 fs/verity/fsverity_private.h |    2 ++
 fs/verity/open.c             |    7 +++++++
 fs/verity/verify.c           |    2 ++
 include/linux/fsverity.h     |    8 ++++++++
 5 files changed, 21 insertions(+)


diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 8dcfefc848ee..06b769dd1bdf 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -52,6 +52,8 @@ static int write_merkle_tree_block(struct inode *inode, const u8 *buf,
 {
 	struct fsverity_writemerkle req = {
 		.inode = inode,
+		.zero_digest = params->zero_digest,
+		.digest_size = params->digest_size,
 	};
 	u64 pos = (u64)index << params->log_blocksize;
 	int err;
diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
index de8798f141d4..195a92f203bb 100644
--- a/fs/verity/fsverity_private.h
+++ b/fs/verity/fsverity_private.h
@@ -47,6 +47,8 @@ struct merkle_tree_params {
 	u64 tree_size;			/* Merkle tree size in bytes */
 	unsigned long tree_pages;	/* Merkle tree size in pages */
 
+	u8 zero_digest[FS_VERITY_MAX_DIGEST_SIZE]; /* hash of zeroed data block */
+
 	/*
 	 * Starting block index for each tree level, ordered from leaf level (0)
 	 * to root level ('num_levels - 1')
diff --git a/fs/verity/open.c b/fs/verity/open.c
index 7a86407732c4..433a70eeca55 100644
--- a/fs/verity/open.c
+++ b/fs/verity/open.c
@@ -144,6 +144,13 @@ int fsverity_init_merkle_tree_params(struct merkle_tree_params *params,
 		goto out_err;
 	}
 
+	err = fsverity_hash_buffer(params->hash_alg, page_address(ZERO_PAGE(0)),
+				   i_blocksize(inode), params->zero_digest);
+	if (err) {
+		fsverity_err(inode, "Error %d computing zero digest", err);
+		goto out_err;
+	}
+
 	params->tree_size = offset << log_blocksize;
 	params->tree_pages = PAGE_ALIGN(params->tree_size) >> PAGE_SHIFT;
 	return 0;
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index a61d1c99c485..494225f60608 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -409,6 +409,8 @@ int fsverity_read_merkle_tree_block(struct inode *inode,
 			.num_levels = params->num_levels,
 			.log_blocksize = params->log_blocksize,
 			.ra_bytes = ra_bytes,
+			.zero_digest = params->zero_digest,
+			.digest_size = params->digest_size,
 		};
 		block->verified = false;
 
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 0dded1fcf2b1..da23f1e30151 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -66,6 +66,8 @@ struct fsverity_blockbuf {
  *		if the page at @block->offset isn't already cached.
  *		Implementations may ignore this argument; it's only a
  *		performance optimization.
+ * @zero_digest: the hash for a data block of zeroes
+ * @digest_size: size of zero_digest
  */
 struct fsverity_readmerkle {
 	struct inode *inode;
@@ -73,6 +75,8 @@ struct fsverity_readmerkle {
 	int level;
 	int num_levels;
 	u8 log_blocksize;
+	const u8 *zero_digest;
+	unsigned int digest_size;
 };
 
 /**
@@ -81,12 +85,16 @@ struct fsverity_readmerkle {
  * @level: level of the block; level 0 are the leaves
  * @num_levels: number of levels in the tree total
  * @log_blocksize: log2 of the size of the block
+ * @zero_digest: the hash for a data block of zeroes
+ * @digest_size: size of zero_digest
  */
 struct fsverity_writemerkle {
 	struct inode *inode;
 	int level;
 	int num_levels;
 	u8 log_blocksize;
+	const u8 *zero_digest;
+	unsigned int digest_size;
 };
 
 /* Verity operations for filesystems */


^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 17/40] fsverity: report validation errors back to the filesystem
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (15 preceding siblings ...)
  2024-03-17 16:27   ` [PATCH 16/40] fsverity: pass the zero-hash value to the implementation Darrick J. Wong
@ 2024-03-17 16:27   ` Darrick J. Wong
  2024-03-17 16:28   ` [PATCH 18/40] iomap: integrate fs-verity verification into iomap's read path Darrick J. Wong
                     ` (23 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:27 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Provide a new function call so that validation errors can be reported
back to the filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
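Reviewer note, not part of the commit: the optional-hook pattern used here,
sketched in user space. All toy_* names are hypothetical. The filesystem may
leave the callback NULL, so the caller checks before invoking it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_inode;

/* Hypothetical analogue of the fail_validation member of the ops table. */
struct toy_verity_ops {
	/* Optional: NULL if the filesystem does not track failures. */
	void (*fail_validation)(struct toy_inode *inode, int64_t pos,
				size_t len);
};

struct toy_inode {
	const struct toy_verity_ops *vops;
	int64_t bad_pos;
	size_t bad_len;
	unsigned int nr_failures;
};

/* Models a filesystem hook that records the failed range. */
static void toy_fs_fail_validation(struct toy_inode *inode, int64_t pos,
				   size_t len)
{
	inode->bad_pos = pos;
	inode->bad_len = len;
	inode->nr_failures++;
}

/* Models the helper: call the hook only if one was provided. */
static void toy_fail_validation(struct toy_inode *inode, int64_t pos,
				size_t len)
{
	assert(inode != NULL && inode->vops != NULL);
	if (inode->vops->fail_validation)
		inode->vops->fail_validation(inode, pos, len);
}
```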
 fs/verity/verify.c       |   14 +++++++++++++-
 include/linux/fsverity.h |   11 +++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)


diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 494225f60608..0782e94bc818 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -255,6 +255,15 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 	return false;
 }
 
+static void fsverity_fail_validation(struct inode *inode, loff_t pos,
+				     size_t len)
+{
+	const struct fsverity_operations *vops = inode->i_sb->s_vop;
+
+	if (vops->fail_validation)
+		vops->fail_validation(inode, pos, len);
+}
+
 static bool
 verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
 		   unsigned long max_ra_bytes)
@@ -277,8 +286,11 @@ verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
 		valid = verify_data_block(inode, vi, data, pos + offset,
 					  max_ra_bytes);
 		kunmap_local(data);
-		if (!valid)
+		if (!valid) {
+			fsverity_fail_validation(inode, pos + offset,
+						 block_size);
 			return false;
+		}
 		offset += block_size;
 		len -= block_size;
 	} while (len);
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index da23f1e30151..57df509295f4 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -236,6 +236,17 @@ struct fsverity_operations {
 	 * be implemented.
 	 */
 	void (*drop_merkle_tree_block)(struct fsverity_blockbuf *block);
+
+	/**
+	 * Notify the filesystem that file data validation failed
+	 *
+	 * @inode: the inode being validated
+	 * @pos: the file position of the invalid data
+	 * @len: the length of the invalid data
+	 *
+	 * This is called when fs-verity cannot validate the file contents.
+	 */
+	void (*fail_validation)(struct inode *inode, loff_t pos, size_t len);
 };
 
 #ifdef CONFIG_FS_VERITY


^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 18/40] iomap: integrate fs-verity verification into iomap's read path
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (16 preceding siblings ...)
  2024-03-17 16:27   ` [PATCH 17/40] fsverity: report validation errors back to the filesystem Darrick J. Wong
@ 2024-03-17 16:28   ` Darrick J. Wong
  2024-03-17 16:28   ` [PATCH 19/40] xfs: add attribute type for fs-verity Darrick J. Wong
                     ` (22 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:28 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh
  Cc: Christoph Hellwig, linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

This patch adds fs-verity verification into iomap's read path. After a
BIO's I/O operation completes, the data are verified against
fs-verity's Merkle tree. Verification work is done in a separate
workqueue.

If FS_VERITY is enabled, the work item that defers verification
(struct iomap_fsverity_bio) is allocated side by side with the BIO.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix doc warning]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
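Reviewer note, not part of the commit: the "side by side" layout relies on
embedding the bio inside a wrapper structure and recovering the wrapper from
the bio pointer in the completion handler. The user-space sketch below shows
only that pointer arithmetic; the toy_* types and the toy_container_of macro
are hypothetical simplifications of the kernel's container_of() and of
struct iomap_fsverity_bio.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): step back from a member to its container. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_work {
	int queued;
};

struct toy_bio {
	void (*end_io)(struct toy_bio *bio);
};

/* Hypothetical analogue of struct iomap_fsverity_bio. */
struct toy_fsverity_bio {
	struct toy_work work;
	struct toy_bio bio;	/* embedded so the wrapper can be recovered */
};

/* Models the completion handler: find the wrapper, then "queue" the work. */
static void toy_read_end_io(struct toy_bio *bio)
{
	struct toy_fsverity_bio *fbio;

	assert(bio != NULL);
	fbio = toy_container_of(bio, struct toy_fsverity_bio, bio);
	fbio->work.queued = 1;
}
```

In the patch, the room in front of the bio comes from passing
offsetof(struct iomap_fsverity_bio, bio) to bioset_init(), so each bio
allocated from that bioset already sits inside its wrapper.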
 fs/iomap/buffered-io.c |   91 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 82 insertions(+), 9 deletions(-)


diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 093c4515b22a..c708a93d6a02 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -6,6 +6,7 @@
 #include <linux/module.h>
 #include <linux/compiler.h>
 #include <linux/fs.h>
+#include <linux/fsverity.h>
 #include <linux/iomap.h>
 #include <linux/pagemap.h>
 #include <linux/uio.h>
@@ -330,6 +331,56 @@ static inline bool iomap_block_needs_zeroing(const struct iomap_iter *iter,
 		pos >= i_size_read(iter->inode);
 }
 
+#ifdef CONFIG_FS_VERITY
+struct iomap_fsverity_bio {
+	struct work_struct	work;
+	struct bio		bio;
+};
+static struct bio_set iomap_fsverity_bioset;
+
+static void
+iomap_read_fsverify_end_io_work(struct work_struct *work)
+{
+	struct iomap_fsverity_bio *fbio =
+		container_of(work, struct iomap_fsverity_bio, work);
+
+	fsverity_verify_bio(&fbio->bio);
+	iomap_read_end_io(&fbio->bio);
+}
+
+static void
+iomap_read_fsverity_end_io(struct bio *bio)
+{
+	struct iomap_fsverity_bio *fbio =
+		container_of(bio, struct iomap_fsverity_bio, bio);
+
+	INIT_WORK(&fbio->work, iomap_read_fsverify_end_io_work);
+	queue_work(bio->bi_private, &fbio->work);
+}
+#endif /* CONFIG_FS_VERITY */
+
+static struct bio *iomap_read_bio_alloc(struct inode *inode,
+		struct block_device *bdev, int nr_vecs, gfp_t gfp)
+{
+	struct bio *bio;
+
+#ifdef CONFIG_FS_VERITY
+	if (fsverity_active(inode)) {
+		bio = bio_alloc_bioset(bdev, nr_vecs, REQ_OP_READ, gfp,
+					&iomap_fsverity_bioset);
+		if (bio) {
+			bio->bi_private = inode->i_sb->s_read_done_wq;
+			bio->bi_end_io = iomap_read_fsverity_end_io;
+		}
+		return bio;
+	}
+#endif
+	bio = bio_alloc(bdev, nr_vecs, REQ_OP_READ, gfp);
+	if (bio)
+		bio->bi_end_io = iomap_read_end_io;
+	return bio;
+}
+
 static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		struct iomap_readpage_ctx *ctx, loff_t offset)
 {
@@ -353,6 +404,12 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 
 	if (iomap_block_needs_zeroing(iter, pos)) {
 		folio_zero_range(folio, poff, plen);
+		if (fsverity_active(iter->inode) &&
+		    !fsverity_verify_blocks(folio, plen, poff)) {
+			folio_set_error(folio);
+			goto done;
+		}
+
 		iomap_set_range_uptodate(folio, poff, plen);
 		goto done;
 	}
@@ -370,28 +427,29 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	    !bio_add_folio(ctx->bio, folio, plen, poff)) {
 		gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
 		gfp_t orig_gfp = gfp;
-		unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
 
 		if (ctx->bio)
 			submit_bio(ctx->bio);
 
 		if (ctx->rac) /* same as readahead_gfp_mask */
 			gfp |= __GFP_NORETRY | __GFP_NOWARN;
-		ctx->bio = bio_alloc(iomap->bdev, bio_max_segs(nr_vecs),
-				     REQ_OP_READ, gfp);
+
+		ctx->bio = iomap_read_bio_alloc(iter->inode, iomap->bdev,
+				bio_max_segs(DIV_ROUND_UP(length, PAGE_SIZE)),
+				gfp);
+
 		/*
 		 * If the bio_alloc fails, try it again for a single page to
 		 * avoid having to deal with partial page reads.  This emulates
 		 * what do_mpage_read_folio does.
 		 */
 		if (!ctx->bio) {
-			ctx->bio = bio_alloc(iomap->bdev, 1, REQ_OP_READ,
-					     orig_gfp);
+			ctx->bio = iomap_read_bio_alloc(iter->inode,
+					iomap->bdev, 1, orig_gfp);
 		}
 		if (ctx->rac)
 			ctx->bio->bi_opf |= REQ_RAHEAD;
 		ctx->bio->bi_iter.bi_sector = sector;
-		ctx->bio->bi_end_io = iomap_read_end_io;
 		bio_add_folio_nofail(ctx->bio, folio, plen, poff);
 	}
 
@@ -1996,10 +2054,25 @@ iomap_writepages(struct address_space *mapping, struct writeback_control *wbc,
 }
 EXPORT_SYMBOL_GPL(iomap_writepages);
 
+#define IOMAP_POOL_SIZE		(4 * (PAGE_SIZE / SECTOR_SIZE))
+
 static int __init iomap_init(void)
 {
-	return bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE),
-			   offsetof(struct iomap_ioend, io_inline_bio),
-			   BIOSET_NEED_BVECS);
+	int error;
+
+	error = bioset_init(&iomap_ioend_bioset, IOMAP_POOL_SIZE,
+			    offsetof(struct iomap_ioend, io_inline_bio),
+			    BIOSET_NEED_BVECS);
+#ifdef CONFIG_FS_VERITY
+	if (error)
+		return error;
+
+	error = bioset_init(&iomap_fsverity_bioset, IOMAP_POOL_SIZE,
+			    offsetof(struct iomap_fsverity_bio, bio),
+			    BIOSET_NEED_BVECS);
+	if (error)
+		bioset_exit(&iomap_ioend_bioset);
+#endif
+	return error;
 }
 fs_initcall(iomap_init);


^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 19/40] xfs: add attribute type for fs-verity
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (17 preceding siblings ...)
  2024-03-17 16:28   ` [PATCH 18/40] iomap: integrate fs-verity verification into iomap's read path Darrick J. Wong
@ 2024-03-17 16:28   ` Darrick J. Wong
  2024-03-17 16:28   ` [PATCH 20/40] xfs: add fs-verity ro-compat flag Darrick J. Wong
                     ` (21 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:28 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

The Merkle tree blocks and descriptor are stored in the extended
attributes of the inode. Add a new attribute type for fs-verity
metadata. Add XFS_ATTR_INTERNAL_MASK to skip parent pointer and
fs-verity attributes, as those are only for internal use. While we're
at it, add a few comments in relevant places noting that internally
visible attributes are not supposed to be handled via the interface
defined in xfs_xattr.c.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
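Reviewer note, not part of the commit: a user-space sketch of how the
internal mask hides attributes from listing. The bit positions match the
ones in this patch, but the TOY_* names and helper functions are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions copied from the patch; names are illustrative only. */
#define TOY_ATTR_ROOT		(1u << 1)
#define TOY_ATTR_SECURE		(1u << 2)
#define TOY_ATTR_PARENT		(1u << 3)
#define TOY_ATTR_VERITY		(1u << 4)

/* Attributes in these namespaces are never shown to userspace. */
#define TOY_ATTR_INTERNAL_MASK	(TOY_ATTR_PARENT | TOY_ATTR_VERITY)

/* Models the early return added to xfs_xattr_put_listent(). */
static bool toy_attr_is_listable(unsigned int flags)
{
	return !(flags & TOY_ATTR_INTERNAL_MASK);
}

/* Count how many attributes a listing would report. */
static size_t toy_count_listable(const unsigned int *flags, size_t n)
{
	size_t i, count = 0;

	assert(flags != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (toy_attr_is_listable(flags[i]))
			count++;
	return count;
}
```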
 fs/xfs/libxfs/xfs_da_format.h  |   10 +++++++++-
 fs/xfs/libxfs/xfs_log_format.h |    1 +
 fs/xfs/xfs_ioctl.c             |    5 +++++
 fs/xfs/xfs_trace.h             |    3 ++-
 fs/xfs/xfs_xattr.c             |   10 ++++++++++
 5 files changed, 27 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 839df0e5401b..28d4ac6fa156 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -715,14 +715,22 @@ struct xfs_attr3_leafblock {
 #define	XFS_ATTR_ROOT_BIT	1	/* limit access to trusted attrs */
 #define	XFS_ATTR_SECURE_BIT	2	/* limit access to secure attrs */
 #define	XFS_ATTR_PARENT_BIT	3	/* parent pointer attrs */
+#define	XFS_ATTR_VERITY_BIT	4	/* verity merkle tree and descriptor */
 #define	XFS_ATTR_INCOMPLETE_BIT	7	/* attr in middle of create/delete */
 #define XFS_ATTR_LOCAL		(1u << XFS_ATTR_LOCAL_BIT)
 #define XFS_ATTR_ROOT		(1u << XFS_ATTR_ROOT_BIT)
 #define XFS_ATTR_SECURE		(1u << XFS_ATTR_SECURE_BIT)
 #define XFS_ATTR_PARENT		(1u << XFS_ATTR_PARENT_BIT)
+#define XFS_ATTR_VERITY		(1u << XFS_ATTR_VERITY_BIT)
 #define XFS_ATTR_INCOMPLETE	(1u << XFS_ATTR_INCOMPLETE_BIT)
 #define XFS_ATTR_NSP_ONDISK_MASK \
-			(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)
+			(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT | \
+			 XFS_ATTR_VERITY)
+
+/*
+ * Internal attributes not exposed to the user
+ */
+#define XFS_ATTR_INTERNAL_MASK (XFS_ATTR_PARENT | XFS_ATTR_VERITY)
 
 /*
  * Alignment for namelist and valuelist entries (since they are mixed
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 9cbcba4bd363..407fadfb5c06 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -975,6 +975,7 @@ struct xfs_icreate_log {
 #define XFS_ATTRI_FILTER_MASK		(XFS_ATTR_ROOT | \
 					 XFS_ATTR_SECURE | \
 					 XFS_ATTR_PARENT | \
+					 XFS_ATTR_VERITY | \
 					 XFS_ATTR_INCOMPLETE)
 
 /*
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index d0e2cec6210d..ab61d7d552fb 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -352,6 +352,11 @@ static unsigned int
 xfs_attr_filter(
 	u32			ioc_flags)
 {
+	/*
+	 * Only externally visible attributes should be specified here.
+	 * Internally used attributes (such as parent pointers or fs-verity)
+	 * should not be exposed to userspace.
+	 */
 	if (ioc_flags & XFS_IOC_ATTR_ROOT)
 		return XFS_ATTR_ROOT;
 	if (ioc_flags & XFS_IOC_ATTR_SECURE)
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index d4f1b2da21e7..9d4ae05abfc8 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -87,7 +87,8 @@ struct xfs_bmap_intent;
 	{ XFS_ATTR_ROOT,	"ROOT" }, \
 	{ XFS_ATTR_SECURE,	"SECURE" }, \
 	{ XFS_ATTR_INCOMPLETE,	"INCOMPLETE" }, \
-	{ XFS_ATTR_PARENT,	"PARENT" }
+	{ XFS_ATTR_PARENT,	"PARENT" }, \
+	{ XFS_ATTR_VERITY,	"VERITY" }
 
 DECLARE_EVENT_CLASS(xfs_attr_list_class,
 	TP_PROTO(struct xfs_attr_list_context *ctx),
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 364104e1b38a..e4c88dde4e44 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -20,6 +20,13 @@
 
 #include <linux/posix_acl_xattr.h>
 
+/*
+ * This file defines the interface for externally visible extended
+ * attributes, such as those in the user, system, or security namespaces.
+ * This interface should not be used for internal attributes (see
+ * xfs_attr.c for those).
+ */
+
 /*
  * Get permission to use log-assisted atomic exchange of file extents.
  *
@@ -244,6 +251,9 @@ xfs_xattr_put_listent(
 
 	ASSERT(context->count >= 0);
 
+	if (flags & XFS_ATTR_INTERNAL_MASK)
+		return;
+
 	if (flags & XFS_ATTR_ROOT) {
 #ifdef CONFIG_XFS_POSIX_ACL
 		if (namelen == SGI_ACL_FILE_SIZE &&


^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 20/40] xfs: add fs-verity ro-compat flag
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (18 preceding siblings ...)
  2024-03-17 16:28   ` [PATCH 19/40] xfs: add attribute type for fs-verity Darrick J. Wong
@ 2024-03-17 16:28   ` Darrick J. Wong
  2024-03-17 16:28   ` [PATCH 21/40] xfs: add inode on-disk VERITY flag Darrick J. Wong
                     ` (20 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:28 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

To mark inodes with fs-verity enabled, the new XFS_DIFLAG2_VERITY flag
will be added in a later patch. This requires a ro-compat flag to let
older kernels know that a filesystem with fs-verity cannot be modified.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
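Reviewer note, not part of the commit: a user-space sketch of the ro-compat
rule the commit message relies on. A kernel that does not recognize a
ro-compat bit may still mount the filesystem read-only but must refuse a
read-write mount. The bit values match this patch; the TOY_* names and
toy_check_ro_compat() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from xfs_format.h; names are illustrative only. */
#define TOY_RO_COMPAT_FINOBT	(1u << 0)
#define TOY_RO_COMPAT_RMAPBT	(1u << 1)
#define TOY_RO_COMPAT_REFLINK	(1u << 2)
#define TOY_RO_COMPAT_INOBTCNT	(1u << 3)
#define TOY_RO_COMPAT_VERITY	(1u << 4)

/* What a hypothetical older kernel, without fs-verity, understands. */
#define TOY_RO_COMPAT_OLD_KERNEL \
	(TOY_RO_COMPAT_FINOBT | TOY_RO_COMPAT_RMAPBT | \
	 TOY_RO_COMPAT_REFLINK | TOY_RO_COMPAT_INOBTCNT)

enum toy_mount_mode {
	TOY_MOUNT_RO,
	TOY_MOUNT_RW,
};

/* Returns 0 if the mount may proceed, -1 if it must be refused. */
static int toy_check_ro_compat(uint32_t sb_features_ro_compat,
			       uint32_t known, enum toy_mount_mode mode)
{
	uint32_t unknown = sb_features_ro_compat & ~known;

	assert(mode == TOY_MOUNT_RO || mode == TOY_MOUNT_RW);
	if (unknown && mode == TOY_MOUNT_RW)
		return -1;
	return 0;
}
```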
 fs/xfs/libxfs/xfs_format.h |    1 +
 fs/xfs/libxfs/xfs_sb.c     |    2 ++
 fs/xfs/xfs_mount.h         |    2 ++
 3 files changed, 5 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 2b2f9050fbfb..93d280eb8451 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -353,6 +353,7 @@ xfs_sb_has_compat_feature(
 #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)		/* reverse map btree */
 #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)		/* reflinked files */
 #define XFS_SB_FEAT_RO_COMPAT_INOBTCNT (1 << 3)		/* inobt block counts */
+#define XFS_SB_FEAT_RO_COMPAT_VERITY   (1 << 4)		/* fs-verity */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
 		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
 		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index d991eec05436..a845cbe3f539 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -163,6 +163,8 @@ xfs_sb_version_to_features(
 		features |= XFS_FEAT_REFLINK;
 	if (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT)
 		features |= XFS_FEAT_INOBTCNT;
+	if (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_VERITY)
+		features |= XFS_FEAT_VERITY;
 	if (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_FTYPE)
 		features |= XFS_FEAT_FTYPE;
 	if (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_SPINODES)
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index e880aa48de68..f198d7c82552 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -292,6 +292,7 @@ typedef struct xfs_mount {
 #define XFS_FEAT_BIGTIME	(1ULL << 24)	/* large timestamps */
 #define XFS_FEAT_NEEDSREPAIR	(1ULL << 25)	/* needs xfs_repair */
 #define XFS_FEAT_NREXT64	(1ULL << 26)	/* large extent counters */
+#define XFS_FEAT_VERITY		(1ULL << 27)	/* fs-verity */
 
 /* Mount features */
 #define XFS_FEAT_NOATTR2	(1ULL << 48)	/* disable attr2 creation */
@@ -355,6 +356,7 @@ __XFS_HAS_FEAT(inobtcounts, INOBTCNT)
 __XFS_HAS_FEAT(bigtime, BIGTIME)
 __XFS_HAS_FEAT(needsrepair, NEEDSREPAIR)
 __XFS_HAS_FEAT(large_extent_counts, NREXT64)
+__XFS_HAS_FEAT(verity, VERITY)
 
 /*
  * Mount features



* [PATCH 21/40] xfs: add inode on-disk VERITY flag
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (19 preceding siblings ...)
  2024-03-17 16:28   ` [PATCH 20/40] xfs: add fs-verity ro-compat flag Darrick J. Wong
@ 2024-03-17 16:28   ` Darrick J. Wong
  2024-03-17 16:29   ` [PATCH 22/40] xfs: initialize fs-verity on file open and cleanup on inode destruction Darrick J. Wong
                     ` (19 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:28 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Add a flag to mark inodes which have fs-verity enabled on them (i.e. the
descriptor exists and the tree is built).

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_format.h |    4 +++-
 fs/xfs/xfs_inode.c         |    2 ++
 fs/xfs/xfs_iops.c          |    2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)
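
The two translation steps touched by this patch can be sketched in user
space as follows. Only XFS_DIFLAG2_VERITY_BIT comes from the patch; the
SKETCH_* constants are placeholder values (the real FS_XFLAG_VERITY and
S_VERITY are defined in kernel headers), and the sketch_* functions are
hypothetical stand-ins for the xfs_ip2xflags() and xfs_diflags_to_iflags()
hunks.

```c
#include <stdint.h>

/* From this patch: bit 5 of the on-disk flags2 field. */
#define XFS_DIFLAG2_VERITY_BIT	5
#define XFS_DIFLAG2_VERITY	(1 << XFS_DIFLAG2_VERITY_BIT)

/* Placeholder values for illustration only. */
#define SKETCH_FS_XFLAG_VERITY	(1u << 0)
#define SKETCH_S_VERITY		(1u << 1)

/* Step 1: on-disk inode flag -> ioctl-visible xflag. */
uint32_t
sketch_diflags2_to_xflags(uint64_t diflags2)
{
	uint32_t	xflags = 0;

	if (diflags2 & XFS_DIFLAG2_VERITY)
		xflags |= SKETCH_FS_XFLAG_VERITY;
	return xflags;
}

/* Step 2: xflag -> VFS inode flag. */
uint32_t
sketch_xflags_to_iflags(uint32_t xflags)
{
	uint32_t	iflags = 0;

	if (xflags & SKETCH_FS_XFLAG_VERITY)
		iflags |= SKETCH_S_VERITY;
	return iflags;
}
```

Chaining the two functions shows that an inode with the on-disk bit set ends
up with the VFS flag set, and an inode without it does not.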


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 93d280eb8451..3ce2902101bc 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -1085,16 +1085,18 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
 #define XFS_DIFLAG2_COWEXTSIZE_BIT   2  /* copy on write extent size hint */
 #define XFS_DIFLAG2_BIGTIME_BIT	3	/* big timestamps */
 #define XFS_DIFLAG2_NREXT64_BIT 4	/* large extent counters */
+#define XFS_DIFLAG2_VERITY_BIT	5	/* inode sealed by fsverity */
 
 #define XFS_DIFLAG2_DAX		(1 << XFS_DIFLAG2_DAX_BIT)
 #define XFS_DIFLAG2_REFLINK     (1 << XFS_DIFLAG2_REFLINK_BIT)
 #define XFS_DIFLAG2_COWEXTSIZE  (1 << XFS_DIFLAG2_COWEXTSIZE_BIT)
 #define XFS_DIFLAG2_BIGTIME	(1 << XFS_DIFLAG2_BIGTIME_BIT)
 #define XFS_DIFLAG2_NREXT64	(1 << XFS_DIFLAG2_NREXT64_BIT)
+#define XFS_DIFLAG2_VERITY	(1 << XFS_DIFLAG2_VERITY_BIT)
 
 #define XFS_DIFLAG2_ANY \
 	(XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE | \
-	 XFS_DIFLAG2_BIGTIME | XFS_DIFLAG2_NREXT64)
+	 XFS_DIFLAG2_BIGTIME | XFS_DIFLAG2_NREXT64 | XFS_DIFLAG2_VERITY)
 
 static inline bool xfs_dinode_has_bigtime(const struct xfs_dinode *dip)
 {
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index ea48774f6b76..59446e9e1719 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -607,6 +607,8 @@ xfs_ip2xflags(
 			flags |= FS_XFLAG_DAX;
 		if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
 			flags |= FS_XFLAG_COWEXTSIZE;
+		if (ip->i_diflags2 & XFS_DIFLAG2_VERITY)
+			flags |= FS_XFLAG_VERITY;
 	}
 
 	if (xfs_inode_has_attr_fork(ip))
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 66f8c47642e8..0e5cdb82b231 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1241,6 +1241,8 @@ xfs_diflags_to_iflags(
 		flags |= S_NOATIME;
 	if (init && xfs_inode_should_enable_dax(ip))
 		flags |= S_DAX;
+	if (xflags & FS_XFLAG_VERITY)
+		flags |= S_VERITY;
 
 	/*
 	 * S_DAX can only be set during inode initialization and is never set by



* [PATCH 22/40] xfs: initialize fs-verity on file open and cleanup on inode destruction
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (20 preceding siblings ...)
  2024-03-17 16:28   ` [PATCH 21/40] xfs: add inode on-disk VERITY flag Darrick J. Wong
@ 2024-03-17 16:29   ` Darrick J. Wong
  2024-03-17 16:29   ` [PATCH 23/40] xfs: don't allow to enable DAX on fs-verity sealed inode Darrick J. Wong
                     ` (18 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:29 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

fs-verity will read and attach metadata (not the tree itself) from disk
for those inodes which already have fs-verity enabled.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c  |    8 ++++++++
 fs/xfs/xfs_super.c |    2 ++
 2 files changed, 10 insertions(+)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 632653e00906..74dba917be93 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -31,6 +31,7 @@
 #include <linux/mman.h>
 #include <linux/fadvise.h>
 #include <linux/mount.h>
+#include <linux/fsverity.h>
 
 static const struct vm_operations_struct xfs_file_vm_ops;
 
@@ -1228,10 +1229,17 @@ xfs_file_open(
 	struct inode	*inode,
 	struct file	*file)
 {
+	int		error;
+
 	if (xfs_is_shutdown(XFS_M(inode->i_sb)))
 		return -EIO;
 	file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC |
 			FMODE_DIO_PARALLEL_WRITE | FMODE_CAN_ODIRECT;
+
+	error = fsverity_file_open(inode, file);
+	if (error)
+		return error;
+
 	return generic_file_open(inode, file);
 }
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 6828c48b15e9..a09739beb8f3 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -49,6 +49,7 @@
 #include <linux/magic.h>
 #include <linux/fs_context.h>
 #include <linux/fs_parser.h>
+#include <linux/fsverity.h>
 
 static const struct super_operations xfs_super_operations;
 
@@ -664,6 +665,7 @@ xfs_fs_destroy_inode(
 	ASSERT(!rwsem_is_locked(&inode->i_rwsem));
 	XFS_STATS_INC(ip->i_mount, vn_rele);
 	XFS_STATS_INC(ip->i_mount, vn_remove);
+	fsverity_cleanup_inode(inode);
 	xfs_inode_mark_reclaimable(ip);
 }
 



* [PATCH 23/40] xfs: don't allow to enable DAX on fs-verity sealed inode
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (21 preceding siblings ...)
  2024-03-17 16:29   ` [PATCH 22/40] xfs: initialize fs-verity on file open and cleanup on inode destruction Darrick J. Wong
@ 2024-03-17 16:29   ` Darrick J. Wong
  2024-03-17 16:29   ` [PATCH 24/40] xfs: disable direct read path for fs-verity files Darrick J. Wong
                     ` (17 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:29 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

fs-verity doesn't support DAX. Forbid the filesystem from enabling DAX on
inodes which already have fs-verity enabled. The opposite case is checked
when fs-verity is enabled: it won't be enabled if DAX is.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix typo in subject]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_iops.c |    2 ++
 1 file changed, 2 insertions(+)


diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 0e5cdb82b231..6f97d777f702 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1213,6 +1213,8 @@ xfs_inode_should_enable_dax(
 		return false;
 	if (!xfs_inode_supports_dax(ip))
 		return false;
+	if (ip->i_diflags2 & XFS_DIFLAG2_VERITY)
+		return false;
 	if (xfs_has_dax_always(ip->i_mount))
 		return true;
 	if (ip->i_diflags2 & XFS_DIFLAG2_DAX)



* [PATCH 24/40] xfs: disable direct read path for fs-verity files
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (22 preceding siblings ...)
  2024-03-17 16:29   ` [PATCH 23/40] xfs: don't allow to enable DAX on fs-verity sealed inode Darrick J. Wong
@ 2024-03-17 16:29   ` Darrick J. Wong
  2024-03-18 19:48     ` Andrey Albershteyn
  2024-03-17 16:29   ` [PATCH 25/40] xfs: widen flags argument to the xfs_iflags_* helpers Darrick J. Wong
                     ` (16 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:29 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

The direct I/O path is not supported on verity files. Attempts to use the
direct I/O path on such files should fall back to the buffered I/O path.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: fix braces]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
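
The dispatch order that xfs_file_read_iter() ends up with can be sketched in
user space like this. Everything prefixed with sketch_/SKETCH_ is a
hypothetical stand-in (the real code works on struct kiocb, IS_DAX() and
fsverity_active()); the point is only to show that a direct read on a verity
file takes the buffered path and has its direct flag cleared.

```c
#include <stdbool.h>

/* Placeholder for IOCB_DIRECT; the real value lives in kernel headers. */
#define SKETCH_IOCB_DIRECT	(1u << 0)

/* Hypothetical stand-in for struct kiocb. */
struct sketch_iocb {
	unsigned int	ki_flags;
};

enum sketch_read_path { READ_DAX, READ_DIRECT, READ_BUFFERED };

/* Pick the read path in the same order as the patched function. */
enum sketch_read_path
sketch_choose_read_path(struct sketch_iocb *iocb, bool is_dax,
			bool verity_active)
{
	if (is_dax)
		return READ_DAX;
	if ((iocb->ki_flags & SKETCH_IOCB_DIRECT) && !verity_active)
		return READ_DIRECT;

	/*
	 * Buffered path: a verity file may arrive here with the direct
	 * flag still set, so clear it before doing the buffered read.
	 */
	iocb->ki_flags &= ~SKETCH_IOCB_DIRECT;
	return READ_BUFFERED;
}
```

Clearing the flag unconditionally in the buffered branch is harmless for
ordinary buffered reads, which never had it set.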


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 74dba917be93..0ce51a020115 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -281,7 +281,8 @@ xfs_file_dax_read(
 	struct kiocb		*iocb,
 	struct iov_iter		*to)
 {
-	struct xfs_inode	*ip = XFS_I(iocb->ki_filp->f_mapping->host);
+	struct inode		*inode = iocb->ki_filp->f_mapping->host;
+	struct xfs_inode	*ip = XFS_I(inode);
 	ssize_t			ret = 0;
 
 	trace_xfs_file_dax_read(iocb, to);
@@ -334,10 +335,18 @@ xfs_file_read_iter(
 
 	if (IS_DAX(inode))
 		ret = xfs_file_dax_read(iocb, to);
-	else if (iocb->ki_flags & IOCB_DIRECT)
+	else if (iocb->ki_flags & IOCB_DIRECT && !fsverity_active(inode))
 		ret = xfs_file_dio_read(iocb, to);
-	else
+	else {
+		/*
+		 * If fs-verity is enabled, direct reads also fall back to
+		 * the buffered read path.  In that case IOCB_DIRECT is
+		 * still set and needs to be cleared (see
+		 * generic_file_read_iter()).
+		 */
+		iocb->ki_flags &= ~IOCB_DIRECT;
 		ret = xfs_file_buffered_read(iocb, to);
+	}
 
 	if (ret > 0)
 		XFS_STATS_ADD(mp, xs_read_bytes, ret);



* [PATCH 25/40] xfs: widen flags argument to the xfs_iflags_* helpers
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (23 preceding siblings ...)
  2024-03-17 16:29   ` [PATCH 24/40] xfs: disable direct read path for fs-verity files Darrick J. Wong
@ 2024-03-17 16:29   ` Darrick J. Wong
  2024-03-17 16:30   ` [PATCH 26/40] xfs: add fs-verity support Darrick J. Wong
                     ` (15 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:29 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

xfs_inode.i_flags is an unsigned long, so make these helpers take that
as the flags argument instead of unsigned short.  This is needed for the
next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_inode.h |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
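
To illustrate why the narrower type is a problem, here is a small user-space
sketch. It assumes a 16-bit unsigned short, which holds on the usual Linux
targets. The next patch defines a flag at bit 16; passing such a value
through an unsigned short parameter silently drops it. The function and
macro names are hypothetical.

```c
/* Same bit position as the flag introduced by the next patch. */
#define SKETCH_FLAG_BIT16	(1U << 16)

/* Old helper shape: the flags argument is truncated to 16 bits. */
unsigned long
sketch_test_narrow(unsigned long i_flags, unsigned short flags)
{
	return i_flags & flags;
}

/* New helper shape: the flags argument matches the width of i_flags. */
unsigned long
sketch_test_wide(unsigned long i_flags, unsigned long flags)
{
	return i_flags & flags;
}
```

With i_flags and flags both equal to SKETCH_FLAG_BIT16, the narrow helper
returns 0 because the argument was converted to 0 on the way in, while the
wide helper returns the flag as expected.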


diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index ab46ffb3ac19..3ea3a6f26ceb 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -207,13 +207,13 @@ xfs_new_eof(struct xfs_inode *ip, xfs_fsize_t new_size)
  * i_flags helper functions
  */
 static inline void
-__xfs_iflags_set(xfs_inode_t *ip, unsigned short flags)
+__xfs_iflags_set(xfs_inode_t *ip, unsigned long flags)
 {
 	ip->i_flags |= flags;
 }
 
 static inline void
-xfs_iflags_set(xfs_inode_t *ip, unsigned short flags)
+xfs_iflags_set(xfs_inode_t *ip, unsigned long flags)
 {
 	spin_lock(&ip->i_flags_lock);
 	__xfs_iflags_set(ip, flags);
@@ -221,7 +221,7 @@ xfs_iflags_set(xfs_inode_t *ip, unsigned short flags)
 }
 
 static inline void
-xfs_iflags_clear(xfs_inode_t *ip, unsigned short flags)
+xfs_iflags_clear(xfs_inode_t *ip, unsigned long flags)
 {
 	spin_lock(&ip->i_flags_lock);
 	ip->i_flags &= ~flags;
@@ -229,13 +229,13 @@ xfs_iflags_clear(xfs_inode_t *ip, unsigned short flags)
 }
 
 static inline int
-__xfs_iflags_test(xfs_inode_t *ip, unsigned short flags)
+__xfs_iflags_test(xfs_inode_t *ip, unsigned long flags)
 {
 	return (ip->i_flags & flags);
 }
 
 static inline int
-xfs_iflags_test(xfs_inode_t *ip, unsigned short flags)
+xfs_iflags_test(xfs_inode_t *ip, unsigned long flags)
 {
 	int ret;
 	spin_lock(&ip->i_flags_lock);
@@ -245,7 +245,7 @@ xfs_iflags_test(xfs_inode_t *ip, unsigned short flags)
 }
 
 static inline int
-xfs_iflags_test_and_clear(xfs_inode_t *ip, unsigned short flags)
+xfs_iflags_test_and_clear(xfs_inode_t *ip, unsigned long flags)
 {
 	int ret;
 
@@ -258,7 +258,7 @@ xfs_iflags_test_and_clear(xfs_inode_t *ip, unsigned short flags)
 }
 
 static inline int
-xfs_iflags_test_and_set(xfs_inode_t *ip, unsigned short flags)
+xfs_iflags_test_and_set(xfs_inode_t *ip, unsigned long flags)
 {
 	int ret;
 



* [PATCH 26/40] xfs: add fs-verity support
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (24 preceding siblings ...)
  2024-03-17 16:29   ` [PATCH 25/40] xfs: widen flags argument to the xfs_iflags_* helpers Darrick J. Wong
@ 2024-03-17 16:30   ` Darrick J. Wong
  2024-03-18  1:43     ` Christoph Hellwig
  2024-03-17 16:30   ` [PATCH 27/40] xfs: create a per-mount shrinker for verity inodes merkle tree blocks Darrick J. Wong
                     ` (14 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:30 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Add integration with fs-verity. XFS stores fs-verity metadata in
extended file attributes. The metadata consists of the verity
descriptor and the Merkle tree blocks.

The descriptor is stored under the "vdesc" extended attribute. The
Merkle tree blocks are stored under binary keys, which are byte offsets
into the Merkle tree.

When fs-verity is enabled on an inode, the XFS_VERITY_CONSTRUCTION
flag is set, meaning that the Merkle tree is being built. The
initialization ends with storing the verity descriptor and setting
the inode on-disk flag (XFS_DIFLAG2_VERITY).

Verification on read is done in the read path of iomap.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: replace caching implementation with an xarray, other cleanups]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Makefile               |    1 
 fs/xfs/libxfs/xfs_attr.c      |   13 +
 fs/xfs/libxfs/xfs_da_format.h |   32 +++
 fs/xfs/libxfs/xfs_ondisk.h    |    4 
 fs/xfs/xfs_icache.c           |    4 
 fs/xfs/xfs_inode.h            |    5 
 fs/xfs/xfs_super.c            |   12 +
 fs/xfs/xfs_trace.h            |   32 +++
 fs/xfs/xfs_verity.c           |  468 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_verity.h           |   20 ++
 10 files changed, 591 insertions(+)
 create mode 100644 fs/xfs/xfs_verity.c
 create mode 100644 fs/xfs/xfs_verity.h
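
As a rough user-space illustration of the attribute name format described
above: each Merkle tree block is named by its byte offset into the tree,
stored as a 64-bit big-endian value, so the name is always 8 bytes long and
independent of host endianness. The kernel code uses cpu_to_be64() and
be64_to_cpu(); the sketch_* names and the explicit byte loop below are
assumptions made so that the example builds outside the kernel.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct xfs_verity_merkle_key (8 bytes). */
struct sketch_merkle_key {
	uint8_t		be_off[8];
};

/* Encode a byte offset into the tree as a big-endian attr name. */
void
sketch_merkle_key_to_disk(struct sketch_merkle_key *key, uint64_t offset)
{
	int		i;

	for (i = 0; i < 8; i++)
		key->be_off[i] = (uint8_t)(offset >> (56 - 8 * i));
}

/* Decode a big-endian attr name back into a byte offset. */
uint64_t
sketch_merkle_key_from_disk(const struct sketch_merkle_key *key)
{
	uint64_t	offset = 0;
	int		i;

	for (i = 0; i < 8; i++)
		offset = (offset << 8) | key->be_off[i];
	return offset;
}
```

For example, offset 0x1000 encodes to the bytes 00 00 00 00 00 00 10 00, and
decoding those bytes gives 0x1000 again.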


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index f8845e65cac7..8396a633b541 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -130,6 +130,7 @@ xfs-$(CONFIG_XFS_POSIX_ACL)	+= xfs_acl.o
 xfs-$(CONFIG_SYSCTL)		+= xfs_sysctl.o
 xfs-$(CONFIG_COMPAT)		+= xfs_ioctl32.o
 xfs-$(CONFIG_EXPORTFS_BLOCK_OPS)	+= xfs_pnfs.o
+xfs-$(CONFIG_FS_VERITY)		+= xfs_verity.o
 
 # notify failure
 ifeq ($(CONFIG_MEMORY_FAILURE),y)
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index f0b625d45aa4..b7aa1bc12fd1 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -27,6 +27,7 @@
 #include "xfs_attr_item.h"
 #include "xfs_xattr.h"
 #include "xfs_parent.h"
+#include "xfs_verity.h"
 
 struct kmem_cache		*xfs_attr_intent_cache;
 
@@ -1524,6 +1525,18 @@ xfs_attr_namecheck(
 	if (flags & XFS_ATTR_PARENT)
 		return xfs_parent_namecheck(mp, name, length, flags);
 
+	if (flags & XFS_ATTR_VERITY) {
+		/* Merkle tree pages are stored under u64 indexes */
+		if (length == sizeof(struct xfs_verity_merkle_key))
+			return true;
+
+		/* Verity descriptor blocks are held in a named attribute. */
+		if (length == XFS_VERITY_DESCRIPTOR_NAME_LEN)
+			return true;
+
+		return false;
+	}
+
 	/*
 	 * MAXNAMELEN includes the trailing null, but (name/length) leave it
 	 * out, so use >= for the length check.
diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 28d4ac6fa156..e4aa7c9a0ccb 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -914,4 +914,36 @@ struct xfs_parent_name_rec {
  */
 #define XFS_PARENT_DIRENT_NAME_MAX_SIZE		(MAXNAMELEN - 1)
 
+/*
+ * fs-verity attribute name format
+ *
+ * Merkle tree blocks are stored under extended attributes of the inode. The
+ * names of the attributes are byte offsets into the merkle tree.
+ */
+struct xfs_verity_merkle_key {
+	__be64	vi_merkleoff;
+};
+
+static inline void
+xfs_verity_merkle_key_to_disk(
+	struct xfs_verity_merkle_key	*key,
+	uint64_t			offset)
+{
+	key->vi_merkleoff = cpu_to_be64(offset);
+}
+
+static inline uint64_t
+xfs_verity_merkle_key_from_disk(
+	const void			*attr_name)
+{
+	const struct xfs_verity_merkle_key *key = attr_name;
+
+	return be64_to_cpu(key->vi_merkleoff);
+}
+
+
+/* ondisk xattr name used for the fsverity descriptor */
+#define XFS_VERITY_DESCRIPTOR_NAME	"vdesc"
+#define XFS_VERITY_DESCRIPTOR_NAME_LEN	(sizeof(XFS_VERITY_DESCRIPTOR_NAME) - 1)
+
 #endif /* __XFS_DA_FORMAT_H__ */
diff --git a/fs/xfs/libxfs/xfs_ondisk.h b/fs/xfs/libxfs/xfs_ondisk.h
index 81885a6a028e..16f4ef2fbeaf 100644
--- a/fs/xfs/libxfs/xfs_ondisk.h
+++ b/fs/xfs/libxfs/xfs_ondisk.h
@@ -194,6 +194,10 @@ xfs_check_ondisk_structs(void)
 	XFS_CHECK_VALUE(XFS_DQ_BIGTIME_EXPIRY_MIN << XFS_DQ_BIGTIME_SHIFT, 4);
 	XFS_CHECK_VALUE(XFS_DQ_BIGTIME_EXPIRY_MAX << XFS_DQ_BIGTIME_SHIFT,
 			16299260424LL);
+
+	/* fs-verity xattrs */
+	XFS_CHECK_STRUCT_SIZE(struct xfs_verity_merkle_key,	8);
+	XFS_CHECK_VALUE(sizeof(XFS_VERITY_DESCRIPTOR_NAME),	6);
 }
 
 #endif /* __XFS_ONDISK_H */
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index e64265bc0b33..fef77938c718 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -25,6 +25,7 @@
 #include "xfs_ag.h"
 #include "xfs_log_priv.h"
 #include "xfs_health.h"
+#include "xfs_verity.h"
 
 #include <linux/iversion.h>
 
@@ -115,6 +116,7 @@ xfs_inode_alloc(
 	spin_lock_init(&ip->i_ioend_lock);
 	ip->i_next_unlinked = NULLAGINO;
 	ip->i_prev_unlinked = 0;
+	xfs_verity_cache_init(ip);
 
 	return ip;
 }
@@ -126,6 +128,8 @@ xfs_inode_free_callback(
 	struct inode		*inode = container_of(head, struct inode, i_rcu);
 	struct xfs_inode	*ip = XFS_I(inode);
 
+	xfs_verity_cache_destroy(ip);
+
 	switch (VFS_I(ip)->i_mode & S_IFMT) {
 	case S_IFREG:
 	case S_IFDIR:
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 3ea3a6f26ceb..cb2e43e5cd43 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -92,6 +92,9 @@ typedef struct xfs_inode {
 	spinlock_t		i_ioend_lock;
 	struct work_struct	i_ioend_work;
 	struct list_head	i_ioend_list;
+#ifdef CONFIG_FS_VERITY
+	struct xarray		i_merkle_blocks;
+#endif
 } xfs_inode_t;
 
 static inline bool xfs_inode_on_unlinked_list(const struct xfs_inode *ip)
@@ -361,6 +364,8 @@ static inline bool xfs_inode_has_large_extent_counts(struct xfs_inode *ip)
  */
 #define XFS_IREMAPPING		(1U << 15)
 
+#define XFS_VERITY_CONSTRUCTION	(1U << 16) /* merkle tree construction */
+
 /* All inode state flags related to inode reclaim. */
 #define XFS_ALL_IRECLAIM_FLAGS	(XFS_IRECLAIMABLE | \
 				 XFS_IRECLAIM | \
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index a09739beb8f3..1f96dff5731e 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -30,6 +30,7 @@
 #include "xfs_filestream.h"
 #include "xfs_quota.h"
 #include "xfs_sysfs.h"
+#include "xfs_verity.h"
 #include "xfs_ondisk.h"
 #include "xfs_rmap_item.h"
 #include "xfs_refcount_item.h"
@@ -666,6 +667,8 @@ xfs_fs_destroy_inode(
 	XFS_STATS_INC(ip->i_mount, vn_rele);
 	XFS_STATS_INC(ip->i_mount, vn_remove);
 	fsverity_cleanup_inode(inode);
+	if (IS_VERITY(inode))
+		xfs_verity_cache_drop(ip);
 	xfs_inode_mark_reclaimable(ip);
 }
 
@@ -1521,6 +1524,11 @@ xfs_fs_fill_super(
 	sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP | QTYPE_MASK_PRJ;
 #endif
 	sb->s_op = &xfs_super_operations;
+#ifdef CONFIG_FS_VERITY
+	error = fsverity_set_ops(sb, &xfs_verity_ops);
+	if (error)
+		return error;
+#endif
 
 	/*
 	 * Delay mount work if the debug hook is set. This is debug
@@ -1730,6 +1738,10 @@ xfs_fs_fill_super(
 		goto out_filestream_unmount;
 	}
 
+	if (xfs_has_verity(mp))
+		xfs_alert(mp,
+	"EXPERIMENTAL fs-verity feature in use. Use at your own risk!");
+
 	error = xfs_mountfs(mp);
 	if (error)
 		goto out_filestream_unmount;
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 9d4ae05abfc8..23abec742c3b 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -4767,6 +4767,38 @@ DEFINE_XFBTREE_FREESP_EVENT(xfbtree_alloc_block);
 DEFINE_XFBTREE_FREESP_EVENT(xfbtree_free_block);
 #endif /* CONFIG_XFS_BTREE_IN_MEM */
 
+#ifdef CONFIG_FS_VERITY
+DECLARE_EVENT_CLASS(xfs_verity_cache_class,
+	TP_PROTO(struct xfs_inode *ip, unsigned long key, unsigned long caller_ip),
+	TP_ARGS(ip, key, caller_ip),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ino)
+		__field(unsigned long, key)
+		__field(void *, caller_ip)
+	),
+	TP_fast_assign(
+		__entry->dev = ip->i_mount->m_super->s_dev;
+		__entry->ino = ip->i_ino;
+		__entry->key = key;
+		__entry->caller_ip = (void *)caller_ip;
+	),
+	TP_printk("dev %d:%d ino 0x%llx key 0x%lx caller %pS",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino,
+		  __entry->key,
+		  __entry->caller_ip)
+)
+
+#define DEFINE_XFS_VERITY_CACHE_EVENT(name) \
+DEFINE_EVENT(xfs_verity_cache_class, name, \
+	TP_PROTO(struct xfs_inode *ip, unsigned long key, unsigned long caller_ip), \
+	TP_ARGS(ip, key, caller_ip))
+DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_load);
+DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_store);
+DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_drop);
+#endif /* CONFIG_FS_VERITY */
+
 #endif /* _TRACE_XFS_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
new file mode 100644
index 000000000000..69b54e70e312
--- /dev/null
+++ b/fs/xfs/xfs_verity.c
@@ -0,0 +1,468 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 Red Hat, Inc.
+ */
+#include "xfs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
+#include "xfs_log_format.h"
+#include "xfs_attr.h"
+#include "xfs_verity.h"
+#include "xfs_bmap_util.h"
+#include "xfs_log_format.h"
+#include "xfs_trans.h"
+#include "xfs_attr_leaf.h"
+#include "xfs_trace.h"
+#include <linux/fsverity.h>
+
+/*
+ * Merkle Tree Block Cache
+ * =======================
+ *
+ * fsverity requires that the filesystem implement caching of ondisk merkle
+ * tree blocks.  XFS stores merkle tree blocks in the extended attribute data,
+ * which makes it important to keep copies in memory for as long as possible.
+ * This is performed by allocating the data blob structure defined below,
+ * passing the data portion of the blob to xfs_attr_get, and later adding the
+ * data blob to an xarray embedded in the xfs_inode structure.
+ *
+ * The xarray structure indexes merkle tree blocks by the offset given to us by
+ * fsverity, which drastically reduces lookups.  First, it eliminates the need
+ * to walk the xattr structure to find the remote block containing the merkle
+ * tree block.  Second, it avoids the lookup in the incore extent btree that
+ * access to each block in the xattr structure requires.
+ */
+struct xfs_merkle_blob {
+	/* refcount of this item; the cache holds its own ref */
+	refcount_t		refcount;
+
+	unsigned long		flags;
+
+	/* Pointer to the merkle tree block, which is power-of-2 sized */
+	void			*data;
+};
+
+#define XFS_MERKLE_BLOB_VERIFIED_BIT	(0) /* fsverity validated this */
+
+/*
+ * Allocate a merkle tree blob object to prepare for reading a merkle tree
+ * object from disk.
+ */
+static inline struct xfs_merkle_blob *
+xfs_merkle_blob_alloc(
+	unsigned int		blocksize)
+{
+	struct xfs_merkle_blob	*mk;
+
+	mk = kmalloc(sizeof(struct xfs_merkle_blob), GFP_KERNEL);
+	if (!mk)
+		return NULL;
+
+	mk->data = kvzalloc(blocksize, GFP_KERNEL);
+	if (!mk->data) {
+		kfree(mk);
+		return NULL;
+	}
+
+	/* Caller owns this refcount. */
+	refcount_set(&mk->refcount, 1);
+	mk->flags = 0;
+	return mk;
+}
+
+/* Free a merkle tree blob. */
+static inline void
+xfs_merkle_blob_rele(
+	struct xfs_merkle_blob	*mk)
+{
+	if (refcount_dec_and_test(&mk->refcount)) {
+		kvfree(mk->data);
+		kfree(mk);
+	}
+}
+
+/* Initialize the merkle tree block cache */
+void
+xfs_verity_cache_init(
+	struct xfs_inode	*ip)
+{
+	xa_init(&ip->i_merkle_blocks);
+}
+
+/*
+ * Drop all the merkle tree blocks out of the cache.  Caller must ensure that
+ * there are no active references to cache items.
+ */
+void
+xfs_verity_cache_drop(
+	struct xfs_inode	*ip)
+{
+	XA_STATE(xas, &ip->i_merkle_blocks, 0);
+	struct xfs_merkle_blob	*mk;
+	unsigned long		flags;
+
+	xas_lock_irqsave(&xas, flags);
+	xas_for_each(&xas, mk, ULONG_MAX) {
+		ASSERT(refcount_read(&mk->refcount) == 1);
+
+		trace_xfs_verity_cache_drop(ip, xas.xa_index, _RET_IP_);
+
+		xas_store(&xas, NULL);
+		xfs_merkle_blob_rele(mk);
+	}
+	xas_unlock_irqrestore(&xas, flags);
+}
+
+/* Destroy the merkle tree block cache */
+void
+xfs_verity_cache_destroy(
+	struct xfs_inode	*ip)
+{
+	ASSERT(xa_empty(&ip->i_merkle_blocks));
+
+	/*
+	 * xa_destroy calls xas_lock from rcu freeing softirq context, so
+	 * we must use xa*_lock_irqsave.
+	 */
+	xa_destroy(&ip->i_merkle_blocks);
+}
+
+/* Return a cached merkle tree block, or NULL. */
+static struct xfs_merkle_blob *
+xfs_verity_cache_load(
+	struct xfs_inode	*ip,
+	unsigned long		key)
+{
+	XA_STATE(xas, &ip->i_merkle_blocks, key);
+	struct xfs_merkle_blob	*mk;
+
+	/* Look up the cached item and try to get an active ref. */
+	rcu_read_lock();
+	do {
+		mk = xas_load(&xas);
+		if (xa_is_zero(mk))
+			mk = NULL;
+	} while (xas_retry(&xas, mk) ||
+		 (mk && !refcount_inc_not_zero(&mk->refcount)));
+	rcu_read_unlock();
+
+	if (!mk)
+		return NULL;
+
+	trace_xfs_verity_cache_load(ip, key, _RET_IP_);
+	return mk;
+}
+
+/*
+ * Try to store a merkle tree block in the cache with the given key.
+ *
+ * If the merkle tree block is not already in the cache, the given block @mk
+ * will be added to the cache and returned.  The caller retains its active
+ * reference to @mk.
+ *
+ * If there was already a merkle block in the cache, it will be returned to
+ * the caller with an active reference.  @mk will be untouched.
+ */
+static struct xfs_merkle_blob *
+xfs_verity_cache_store(
+	struct xfs_inode	*ip,
+	unsigned long		key,
+	struct xfs_merkle_blob	*mk)
+{
+	struct xfs_merkle_blob	*old;
+	unsigned long		flags;
+
+	trace_xfs_verity_cache_store(ip, key, _RET_IP_);
+
+	/*
+	 * Either replace a NULL entry with mk, or take an active ref to
+	 * whatever's currently there.
+	 */
+	xa_lock_irqsave(&ip->i_merkle_blocks, flags);
+	do {
+		old = __xa_cmpxchg(&ip->i_merkle_blocks, key, NULL, mk,
+				GFP_KERNEL);
+	} while (old && !refcount_inc_not_zero(&old->refcount));
+	xa_unlock_irqrestore(&ip->i_merkle_blocks, flags);
+
+	if (old == NULL) {
+		/*
+		 * There was no previous value.  @mk is now live in the cache.
+		 * Bump the active refcount to transfer ownership to the cache
+		 * and return @mk to the caller.
+		 */
+		refcount_inc(&mk->refcount);
+		return mk;
+	}
+
+	/*
+	 * We obtained an active reference to a previous value in the cache.
+	 * Return it to the caller.
+	 */
+	return old;
+}
+
+static int
+xfs_verity_get_descriptor(
+	struct inode		*inode,
+	void			*buf,
+	size_t			buf_size)
+{
+	struct xfs_inode	*ip = XFS_I(inode);
+	int			error = 0;
+	struct xfs_da_args	args = {
+		.dp		= ip,
+		.attr_filter	= XFS_ATTR_VERITY,
+		.name		= (const uint8_t *)XFS_VERITY_DESCRIPTOR_NAME,
+		.namelen	= XFS_VERITY_DESCRIPTOR_NAME_LEN,
+		.value		= buf,
+		.valuelen	= buf_size,
+	};
+
+	/*
+	 * The fact that (returned attribute size) == (provided buf_size) is
+	 * checked by xfs_attr_copy_value() (returns -ERANGE)
+	 */
+	error = xfs_attr_get(&args);
+	if (error)
+		return error;
+
+	return args.valuelen;
+}
+
+static int
+xfs_verity_begin_enable(
+	struct file		*filp,
+	u64			merkle_tree_size,
+	unsigned int		tree_blocksize)
+{
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+	int			error = 0;
+
+	xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL);
+
+	if (IS_DAX(inode))
+		return -EINVAL;
+
+	if (xfs_iflags_test_and_set(ip, XFS_VERITY_CONSTRUCTION))
+		return -EBUSY;
+
+	return error;
+}
+
+static int
+xfs_drop_merkle_tree(
+	struct xfs_inode		*ip,
+	u64				merkle_tree_size,
+	unsigned int			tree_blocksize)
+{
+	struct xfs_verity_merkle_key	name;
+	int				error = 0;
+	u64				offset = 0;
+	struct xfs_da_args		args = {
+		.dp			= ip,
+		.whichfork		= XFS_ATTR_FORK,
+		.attr_filter		= XFS_ATTR_VERITY,
+		.op_flags		= XFS_DA_OP_REMOVE,
+		.name			= (const uint8_t *)&name,
+		.namelen		= sizeof(struct xfs_verity_merkle_key),
+		/* NULL value makes xfs_attr_set remove the attr */
+		.value			= NULL,
+	};
+
+	if (!merkle_tree_size)
+		return 0;
+
+	for (offset = 0; offset < merkle_tree_size; offset += tree_blocksize) {
+		xfs_verity_merkle_key_to_disk(&name, offset);
+		error = xfs_attr_set(&args);
+		if (error)
+			return error;
+	}
+
+	args.name = (const uint8_t *)XFS_VERITY_DESCRIPTOR_NAME;
+	args.namelen = XFS_VERITY_DESCRIPTOR_NAME_LEN;
+	error = xfs_attr_set(&args);
+
+	return error;
+}
+
+static int
+xfs_verity_end_enable(
+	struct file		*filp,
+	const void		*desc,
+	size_t			desc_size,
+	u64			merkle_tree_size,
+	unsigned int		tree_blocksize)
+{
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_trans	*tp;
+	struct xfs_da_args	args = {
+		.dp		= ip,
+		.whichfork	= XFS_ATTR_FORK,
+		.attr_filter	= XFS_ATTR_VERITY,
+		.name		= (const uint8_t *)XFS_VERITY_DESCRIPTOR_NAME,
+		.namelen	= XFS_VERITY_DESCRIPTOR_NAME_LEN,
+		.value		= (void *)desc,
+		.valuelen	= desc_size,
+	};
+	int			error = 0;
+
+	xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL);
+
+	/* fs-verity failed, just cleanup */
+	if (desc == NULL)
+		goto out;
+
+	error = xfs_attr_set(&args);
+	if (error)
+		goto out;
+
+	/* Set fsverity inode flag */
+	error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_ichange,
+			0, 0, false, &tp);
+	if (error)
+		goto out;
+
+	/*
+	 * Ensure that we've persisted the verity information before we enable
+	 * it on the inode and tell the caller we have sealed the inode.
+	 */
+	ip->i_diflags2 |= XFS_DIFLAG2_VERITY;
+
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+	xfs_trans_set_sync(tp);
+
+	error = xfs_trans_commit(tp);
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+
+	if (!error)
+		inode->i_flags |= S_VERITY;
+
+out:
+	if (error)
+		WARN_ON_ONCE(xfs_drop_merkle_tree(ip, merkle_tree_size,
+						  tree_blocksize));
+
+	xfs_iflags_clear(ip, XFS_VERITY_CONSTRUCTION);
+	return error;
+}
+
+static int
+xfs_verity_read_merkle(
+	const struct fsverity_readmerkle *req,
+	struct fsverity_blockbuf	*block)
+{
+	struct xfs_inode		*ip = XFS_I(req->inode);
+	struct xfs_verity_merkle_key	name;
+	struct xfs_da_args		args = {
+		.dp			= ip,
+		.attr_filter		= XFS_ATTR_VERITY,
+		.name			= (const uint8_t *)&name,
+		.namelen		= sizeof(struct xfs_verity_merkle_key),
+		.valuelen		= block->size,
+	};
+	struct xfs_merkle_blob		*mk, *new_mk;
+	unsigned long			key = block->offset >> req->log_blocksize;
+	int				error;
+
+	ASSERT(block->offset >> req->log_blocksize <= ULONG_MAX);
+
+	xfs_verity_merkle_key_to_disk(&name, block->offset);
+
+	/* Is the block already cached? */
+	mk = xfs_verity_cache_load(ip, key);
+	if (mk)
+		goto out_hit;
+
+	new_mk = xfs_merkle_blob_alloc(block->size);
+	if (!new_mk)
+		return -ENOMEM;
+	args.value = new_mk->data;
+
+	/* Read the block in from disk and try to store it in the cache. */
+	xfs_verity_merkle_key_to_disk(&name, block->offset);
+
+	error = xfs_attr_get(&args);
+	if (error)
+		goto out_new_mk;
+
+	if (!args.valuelen) {
+		error = -ENODATA;
+		goto out_new_mk;
+	}
+
+	mk = xfs_verity_cache_store(ip, key, new_mk);
+	if (mk != new_mk) {
+		/*
+		 * We raced with another thread to populate the cache and lost.
+		 * Free the new cache blob and continue with the existing one.
+		 */
+		xfs_merkle_blob_rele(new_mk);
+	}
+
+out_hit:
+	block->kaddr   = (void *)mk->data;
+	block->context = mk;
+	block->verified = test_bit(XFS_MERKLE_BLOB_VERIFIED_BIT, &mk->flags);
+
+	return 0;
+
+out_new_mk:
+	xfs_merkle_blob_rele(new_mk);
+	return error;
+}
+
+static int
+xfs_verity_write_merkle(
+	const struct fsverity_writemerkle *req,
+	const void			*buf,
+	u64				pos,
+	unsigned int			size)
+{
+	struct inode			*inode = req->inode;
+	struct xfs_inode		*ip = XFS_I(inode);
+	struct xfs_verity_merkle_key	name;
+	struct xfs_da_args		args = {
+		.dp			= ip,
+		.whichfork		= XFS_ATTR_FORK,
+		.attr_filter		= XFS_ATTR_VERITY,
+		.name			= (const uint8_t *)&name,
+		.namelen		= sizeof(struct xfs_verity_merkle_key),
+		.value			= (void *)buf,
+		.valuelen		= size,
+	};
+
+	xfs_verity_merkle_key_to_disk(&name, pos);
+	return xfs_attr_set(&args);
+}
+
+static void
+xfs_verity_drop_merkle(
+	struct fsverity_blockbuf	*block)
+{
+	struct xfs_merkle_blob		*mk = block->context;
+
+	if (block->verified)
+		set_bit(XFS_MERKLE_BLOB_VERIFIED_BIT, &mk->flags);
+	xfs_merkle_blob_rele(mk);
+	block->kaddr = NULL;
+	block->context = NULL;
+}
+
+const struct fsverity_operations xfs_verity_ops = {
+	.begin_enable_verity		= xfs_verity_begin_enable,
+	.end_enable_verity		= xfs_verity_end_enable,
+	.get_verity_descriptor		= xfs_verity_get_descriptor,
+	.read_merkle_tree_block		= xfs_verity_read_merkle,
+	.write_merkle_tree_block	= xfs_verity_write_merkle,
+	.drop_merkle_tree_block		= xfs_verity_drop_merkle,
+};
diff --git a/fs/xfs/xfs_verity.h b/fs/xfs/xfs_verity.h
new file mode 100644
index 000000000000..31d51482f7f7
--- /dev/null
+++ b/fs/xfs/xfs_verity.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Red Hat, Inc.
+ */
+#ifndef __XFS_VERITY_H__
+#define __XFS_VERITY_H__
+
+#ifdef CONFIG_FS_VERITY
+void xfs_verity_cache_init(struct xfs_inode *ip);
+void xfs_verity_cache_drop(struct xfs_inode *ip);
+void xfs_verity_cache_destroy(struct xfs_inode *ip);
+
+extern const struct fsverity_operations xfs_verity_ops;
+#else
+# define xfs_verity_cache_init(ip)		((void)0)
+# define xfs_verity_cache_drop(ip)		((void)0)
+# define xfs_verity_cache_destroy(ip)		((void)0)
+#endif	/* CONFIG_FS_VERITY */
+
+#endif	/* __XFS_VERITY_H__ */



* [PATCH 27/40] xfs: create a per-mount shrinker for verity inodes merkle tree blocks
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (25 preceding siblings ...)
  2024-03-17 16:30   ` [PATCH 26/40] xfs: add fs-verity support Darrick J. Wong
@ 2024-03-17 16:30   ` Darrick J. Wong
  2024-03-17 16:30   ` [PATCH 28/40] xfs: create an icache tag for files with cached " Darrick J. Wong
                     ` (13 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:30 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a shrinker for an entire filesystem that will walk the inodes
looking for inodes that are caching merkle tree blocks, and invoke
shrink functions on that cache.  The actual details of shrinking merkle
tree caches are left for subsequent patches.
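
For readers unfamiliar with the shrinker contract, here is a minimal
userspace sketch of the "count" half described above.  Everything prefixed
with demo_ or DEMO_ is hypothetical and exists only for illustration;
DEMO_SHRINK_EMPTY is assumed to mirror the kernel's SHRINK_EMPTY sentinel,
and the plain int64_t field stands in for the per-mount percpu counter.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's SHRINK_EMPTY, i.e. (~0UL - 1). */
#define DEMO_SHRINK_EMPTY	(ULONG_MAX - 1)

/* Hypothetical stand-in for the per-mount state added by this patch. */
struct demo_mount {
	bool		has_verity;	/* feature enabled on this fs? */
	int64_t		cached_blocks;	/* models the summed percpu counter */
};

/*
 * Report how many cached merkle tree blocks might be reclaimable.
 * Negative sums are treated as zero and large sums are clamped so that
 * the result always fits in an unsigned long.
 */
static unsigned long demo_shrinker_count(const struct demo_mount *mp)
{
	assert(mp != NULL);

	if (!mp->has_verity)
		return DEMO_SHRINK_EMPTY;
	if (mp->cached_blocks <= 0)
		return 0;
	if ((uint64_t)mp->cached_blocks > ULONG_MAX)
		return ULONG_MAX;
	return (unsigned long)mp->cached_blocks;
}
```

The clamp is done on unsigned values here so that the upper bound keeps
its meaning regardless of the width of unsigned long.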

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_mount.c  |   10 ++++++-
 fs/xfs/xfs_mount.h  |    6 ++++
 fs/xfs/xfs_trace.h  |   20 +++++++++++++
 fs/xfs/xfs_verity.c |   77 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_verity.h |    5 +++
 5 files changed, 117 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 7328034d42ed..4b5b74809cff 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -34,6 +34,7 @@
 #include "xfs_health.h"
 #include "xfs_trace.h"
 #include "xfs_ag.h"
+#include "xfs_verity.h"
 #include "scrub/stats.h"
 
 static DEFINE_MUTEX(xfs_uuid_table_mutex);
@@ -813,6 +814,10 @@ xfs_mountfs(
 	if (error)
 		goto out_fail_wait;
 
+	error = xfs_verity_register_shrinker(mp);
+	if (error)
+		goto out_inodegc_shrinker;
+
 	/*
 	 * Log's mount-time initialization. The first part of recovery can place
 	 * some items on the AIL, to be handled when recovery is finished or
@@ -823,7 +828,7 @@ xfs_mountfs(
 			      XFS_FSB_TO_BB(mp, sbp->sb_logblocks));
 	if (error) {
 		xfs_warn(mp, "log mount failed");
-		goto out_inodegc_shrinker;
+		goto out_verity_shrinker;
 	}
 
 	/* Enable background inode inactivation workers. */
@@ -1018,6 +1023,8 @@ xfs_mountfs(
 	xfs_unmount_flush_inodes(mp);
  out_log_dealloc:
 	xfs_log_mount_cancel(mp);
+ out_verity_shrinker:
+	xfs_verity_unregister_shrinker(mp);
  out_inodegc_shrinker:
 	shrinker_free(mp->m_inodegc_shrinker);
  out_fail_wait:
@@ -1100,6 +1107,7 @@ xfs_unmountfs(
 #if defined(DEBUG)
 	xfs_errortag_clearall(mp);
 #endif
+	xfs_verity_unregister_shrinker(mp);
 	shrinker_free(mp->m_inodegc_shrinker);
 	xfs_free_perag(mp);
 
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index f198d7c82552..855517583ce6 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -255,6 +255,12 @@ typedef struct xfs_mount {
 
 	/* Hook to feed dirent updates to an active online repair. */
 	struct xfs_hooks	m_dir_update_hooks;
+
+#ifdef CONFIG_FS_VERITY
+	/* shrinker and cached blocks count for merkle trees */
+	struct shrinker		*m_verity_shrinker;
+	struct percpu_counter	m_verity_blocks;
+#endif
 } xfs_mount_t;
 
 #define M_IGEO(mp)		(&(mp)->m_ino_geo)
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 23abec742c3b..fa05122a7c4d 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -4797,6 +4797,26 @@ DEFINE_EVENT(xfs_verity_cache_class, name, \
 DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_load);
 DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_store);
 DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_drop);
+
+TRACE_EVENT(xfs_verity_shrinker_count,
+	TP_PROTO(struct xfs_mount *mp, unsigned long long count,
+		 unsigned long caller_ip),
+	TP_ARGS(mp, count, caller_ip),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned long long, count)
+		__field(void *, caller_ip)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->count = count;
+		__entry->caller_ip = (void *)caller_ip;
+	),
+	TP_printk("dev %d:%d count %llu caller %pS",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->count,
+		  __entry->caller_ip)
+)
 #endif /* CONFIG_XFS_VERITY */
 
 #endif /* _TRACE_XFS_H */
diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index 69b54e70e312..46aa5002e4e1 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -18,6 +18,7 @@
 #include "xfs_trans.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_trace.h"
+#include "xfs_icache.h"
 #include <linux/fsverity.h>
 
 /*
@@ -207,6 +208,82 @@ xfs_verity_cache_store(
 	return old;
 }
 
+/* Count the merkle tree blocks that we might be able to reclaim. */
+static unsigned long
+xfs_verity_shrinker_count(
+	struct shrinker		*shrink,
+	struct shrink_control	*sc)
+{
+	struct xfs_mount	*mp = shrink->private_data;
+	s64			count;
+
+	if (!xfs_has_verity(mp))
+		return SHRINK_EMPTY;
+
+	count = percpu_counter_sum_positive(&mp->m_verity_blocks);
+
+	trace_xfs_verity_shrinker_count(mp, count, _RET_IP_);
+	return min_t(u64, ULONG_MAX, count);
+}
+
+/* Actually try to reclaim merkle tree blocks. */
+static unsigned long
+xfs_verity_shrinker_scan(
+	struct shrinker		*shrink,
+	struct shrink_control	*sc)
+{
+	struct xfs_mount	*mp = shrink->private_data;
+
+	if (!xfs_has_verity(mp))
+		return SHRINK_STOP;
+
+	return 0;
+}
+
+/* Register a shrinker so we can release cached merkle tree blocks. */
+int
+xfs_verity_register_shrinker(
+	struct xfs_mount	*mp)
+{
+	int			error;
+
+	if (!xfs_has_verity(mp))
+		return 0;
+
+	error = percpu_counter_init(&mp->m_verity_blocks, 0, GFP_KERNEL);
+	if (error)
+		return error;
+
+	mp->m_verity_shrinker = shrinker_alloc(0, "xfs-verity:%s",
+			mp->m_super->s_id);
+	if (!mp->m_verity_shrinker) {
+		percpu_counter_destroy(&mp->m_verity_blocks);
+		return -ENOMEM;
+	}
+
+	mp->m_verity_shrinker->count_objects = xfs_verity_shrinker_count;
+	mp->m_verity_shrinker->scan_objects = xfs_verity_shrinker_scan;
+	mp->m_verity_shrinker->seeks = 0;
+	mp->m_verity_shrinker->private_data = mp;
+
+	shrinker_register(mp->m_verity_shrinker);
+
+	return 0;
+}
+
+/* Unregister the merkle tree block shrinker. */
+void
+xfs_verity_unregister_shrinker(struct xfs_mount *mp)
+{
+	if (!xfs_has_verity(mp))
+		return;
+
+	ASSERT(percpu_counter_sum(&mp->m_verity_blocks) == 0);
+
+	shrinker_free(mp->m_verity_shrinker);
+	percpu_counter_destroy(&mp->m_verity_blocks);
+}
+
 static int
 xfs_verity_get_descriptor(
 	struct inode		*inode,
diff --git a/fs/xfs/xfs_verity.h b/fs/xfs/xfs_verity.h
index 31d51482f7f7..0ec0a61bee65 100644
--- a/fs/xfs/xfs_verity.h
+++ b/fs/xfs/xfs_verity.h
@@ -10,11 +10,16 @@ void xfs_verity_cache_init(struct xfs_inode *ip);
 void xfs_verity_cache_drop(struct xfs_inode *ip);
 void xfs_verity_cache_destroy(struct xfs_inode *ip);
 
+int xfs_verity_register_shrinker(struct xfs_mount *mp);
+void xfs_verity_unregister_shrinker(struct xfs_mount *mp);
+
 extern const struct fsverity_operations xfs_verity_ops;
 #else
 # define xfs_verity_cache_init(ip)		((void)0)
 # define xfs_verity_cache_drop(ip)		((void)0)
 # define xfs_verity_cache_destroy(ip)		((void)0)
+# define xfs_verity_register_shrinker(mp)	(0)
+# define xfs_verity_unregister_shrinker(mp)	((void)0)
 #endif	/* CONFIG_FS_VERITY */
 
 #endif	/* __XFS_VERITY_H__ */



* [PATCH 28/40] xfs: create an icache tag for files with cached merkle tree blocks
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (26 preceding siblings ...)
  2024-03-17 16:30   ` [PATCH 27/40] xfs: create a per-mount shrinker for verity inodes merkle tree blocks Darrick J. Wong
@ 2024-03-17 16:30   ` Darrick J. Wong
  2024-03-17 16:30   ` [PATCH 29/40] xfs: shrink verity blob cache Darrick J. Wong
                     ` (12 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:30 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a radix tree tag for the inode cache so that merkle tree block
shrinkers can find verity inodes quickly.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_icache.h |    8 +++++
 fs/xfs/xfs_trace.h  |   23 ++++++++++++++
 fs/xfs/xfs_verity.c |   30 ++++++++++++++++++-
 fs/xfs/xfs_verity.h |    4 +++
 5 files changed, 145 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index fef77938c718..ad02af0da843 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -35,6 +35,8 @@
 #define XFS_ICI_RECLAIM_TAG	0
 /* Inode has speculative preallocations (posteof or cow) to clean. */
 #define XFS_ICI_BLOCKGC_TAG	1
+/* Inode has incore merkle tree blocks */
+#define XFS_ICI_VERITY_TAG	2
 
 /*
  * The goal for walking incore inodes.  These can correspond with incore inode
@@ -44,6 +46,7 @@ enum xfs_icwalk_goal {
 	/* Goals directly associated with tagged inodes. */
 	XFS_ICWALK_BLOCKGC	= XFS_ICI_BLOCKGC_TAG,
 	XFS_ICWALK_RECLAIM	= XFS_ICI_RECLAIM_TAG,
+	XFS_ICWALK_VERITY	= XFS_ICI_VERITY_TAG,
 };
 
 static int xfs_icwalk(struct xfs_mount *mp,
@@ -1606,6 +1609,7 @@ xfs_icwalk_igrab(
 {
 	switch (goal) {
 	case XFS_ICWALK_BLOCKGC:
+	case XFS_ICWALK_VERITY:
 		return xfs_blockgc_igrab(ip);
 	case XFS_ICWALK_RECLAIM:
 		return xfs_reclaim_igrab(ip, icw);
@@ -1634,6 +1638,9 @@ xfs_icwalk_process_inode(
 	case XFS_ICWALK_RECLAIM:
 		xfs_reclaim_inode(ip, pag);
 		break;
+	case XFS_ICWALK_VERITY:
+		error = xfs_verity_scan_inode(ip, icw);
+		break;
 	}
 	return error;
 }
@@ -1750,6 +1757,80 @@ xfs_icwalk_ag(
 	return last_error;
 }
 
+#ifdef CONFIG_FS_VERITY
+/* Mark this inode as having cached merkle tree blocks */
+void
+xfs_inode_set_verity_tag(
+	struct xfs_inode	*ip)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_perag	*pag;
+
+	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+	if (!pag)
+		return;
+
+	spin_lock(&pag->pag_ici_lock);
+	xfs_perag_set_inode_tag(pag, XFS_INO_TO_AGINO(mp, ip->i_ino),
+			XFS_ICI_VERITY_TAG);
+	spin_unlock(&pag->pag_ici_lock);
+	xfs_perag_put(pag);
+}
+
+/* Mark this inode as not having cached merkle tree blocks */
+void
+xfs_inode_clear_verity_tag(
+	struct xfs_inode	*ip)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_perag	*pag;
+
+	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+	if (!pag)
+		return;
+
+	spin_lock(&pag->pag_ici_lock);
+	xfs_perag_clear_inode_tag(pag, XFS_INO_TO_AGINO(mp, ip->i_ino),
+			XFS_ICI_VERITY_TAG);
+	spin_unlock(&pag->pag_ici_lock);
+	xfs_perag_put(pag);
+}
+
+/* Walk all the verity inodes in the filesystem. */
+int
+xfs_icwalk_verity(
+	struct xfs_mount	*mp,
+	struct xfs_icwalk	*icw)
+{
+	struct xfs_perag	*pag;
+	xfs_agnumber_t		agno = 0;
+	int			error = 0;
+
+	for_each_perag_tag(mp, agno, pag, XFS_ICWALK_VERITY) {
+		error = xfs_icwalk_ag(pag, XFS_ICWALK_VERITY, icw);
+		if (error)
+			break;
+
+		if ((icw->icw_flags & XFS_ICWALK_FLAG_SCAN_LIMIT) &&
+		    icw->icw_scan_limit <= 0) {
+			xfs_perag_rele(pag);
+			break;
+		}
+	}
+
+	return error;
+}
+
+/* Stop a verity incore walk scan. */
+void
+xfs_icwalk_verity_stop(
+	struct xfs_icwalk	*icw)
+{
+	icw->icw_flags |= XFS_ICWALK_FLAG_SCAN_LIMIT;
+	icw->icw_scan_limit = -1;
+}
+#endif /* CONFIG_FS_VERITY */
+
 /* Walk all incore inodes to achieve a given goal. */
 static int
 xfs_icwalk(
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 905944dafbe5..621ce0078e08 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -81,4 +81,12 @@ void xfs_inodegc_stop(struct xfs_mount *mp);
 void xfs_inodegc_start(struct xfs_mount *mp);
 int xfs_inodegc_register_shrinker(struct xfs_mount *mp);
 
+#ifdef CONFIG_FS_VERITY
+int xfs_icwalk_verity(struct xfs_mount *mp, struct xfs_icwalk *icw);
+void xfs_icwalk_verity_stop(struct xfs_icwalk *icw);
+
+void xfs_inode_set_verity_tag(struct xfs_inode *ip);
+void xfs_inode_clear_verity_tag(struct xfs_inode *ip);
+#endif /* CONFIG_FS_VERITY */
+
 #endif
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index fa05122a7c4d..91a73399114e 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -4817,6 +4817,29 @@ TRACE_EVENT(xfs_verity_shrinker_count,
 		  __entry->count,
 		  __entry->caller_ip)
 )
+
+TRACE_EVENT(xfs_verity_shrinker_scan,
+	TP_PROTO(struct xfs_mount *mp, unsigned long scanned,
+		 unsigned long freed, unsigned long caller_ip),
+	TP_ARGS(mp, scanned, freed, caller_ip),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned long, scanned)
+		__field(unsigned long, freed)
+		__field(void *, caller_ip)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->scanned = scanned;
+		__entry->freed = freed;
+		__entry->caller_ip = (void *)caller_ip;
+	),
+	TP_printk("dev %d:%d scanned %lu freed %lu caller %pS",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->scanned,
+		  __entry->freed,
+		  __entry->caller_ip)
+)
 #endif /* CONFIG_XFS_VERITY */
 
 #endif /* _TRACE_XFS_H */
diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index 46aa5002e4e1..8d1888353515 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -226,18 +226,46 @@ xfs_verity_shrinker_count(
 	return min_t(u64, ULONG_MAX, count);
 }
 
+struct xfs_verity_scan {
+	struct xfs_icwalk	icw;
+	struct shrink_control	*sc;
+
+	unsigned long		scanned;
+	unsigned long		freed;
+};
+
+/* Scan an inode as part of a verity scan. */
+int
+xfs_verity_scan_inode(
+	struct xfs_inode	*ip,
+	struct xfs_icwalk	*icw)
+{
+	xfs_irele(ip);
+	return 0;
+}
+
 /* Actually try to reclaim merkle tree blocks. */
 static unsigned long
 xfs_verity_shrinker_scan(
 	struct shrinker		*shrink,
 	struct shrink_control	*sc)
 {
+	struct xfs_verity_scan	vs = {
+		.sc		= sc,
+	};
 	struct xfs_mount	*mp = shrink->private_data;
+	int			error;
 
 	if (!xfs_has_verity(mp))
 		return SHRINK_STOP;
 
-	return 0;
+	error = xfs_icwalk_verity(mp, &vs.icw);
+	if (error)
+		xfs_alert(mp, "%s: verity scan failed, error %d", __func__,
+				error);
+
+	trace_xfs_verity_shrinker_scan(mp, vs.scanned, vs.freed, _RET_IP_);
+	return vs.freed;
 }
 
 /* Register a shrinker so we can release cached merkle tree blocks. */
diff --git a/fs/xfs/xfs_verity.h b/fs/xfs/xfs_verity.h
index 0ec0a61bee65..e1980fc1f149 100644
--- a/fs/xfs/xfs_verity.h
+++ b/fs/xfs/xfs_verity.h
@@ -13,6 +13,9 @@ void xfs_verity_cache_destroy(struct xfs_inode *ip);
 int xfs_verity_register_shrinker(struct xfs_mount *mp);
 void xfs_verity_unregister_shrinker(struct xfs_mount *mp);
 
+struct xfs_icwalk;
+int xfs_verity_scan_inode(struct xfs_inode *ip, struct xfs_icwalk *icw);
+
 extern const struct fsverity_operations xfs_verity_ops;
 #else
 # define xfs_verity_cache_init(ip)		((void)0)
@@ -20,6 +23,7 @@ extern const struct fsverity_operations xfs_verity_ops;
 # define xfs_verity_cache_destroy(ip)		((void)0)
 # define xfs_verity_register_shrinker(mp)	(0)
 # define xfs_verity_unregister_shrinker(mp)	((void)0)
+# define xfs_verity_scan_inode(ip, icw)		(0)
 #endif	/* CONFIG_FS_VERITY */
 
 #endif	/* __XFS_VERITY_H__ */



* [PATCH 29/40] xfs: shrink verity blob cache
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (27 preceding siblings ...)
  2024-03-17 16:30   ` [PATCH 28/40] xfs: create an icache tag for files with cached " Darrick J. Wong
@ 2024-03-17 16:30   ` Darrick J. Wong
  2024-03-17 16:31   ` [PATCH 30/40] xfs: clean up stale fsverity metadata before starting Darrick J. Wong
                     ` (11 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:30 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add some shrinkers so that reclaim can free cached merkle tree blocks
when memory is tight.  We add a shrinkref variable to bias reclaim
against freeing the upper levels of the merkle tree in the hope of
maintaining read performance.
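
As a rough illustration of the second-chance idea, here is a userspace
sketch under simplifying assumptions: a flat array instead of an xarray,
no locking, and plain integers instead of refcount_t/atomic_t.  All names
are hypothetical.  It also assumes that larger level numbers are closer to
the root of the merkle tree, so a read at level N grants N + 1 chances.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one cached merkle tree block. */
struct demo_blob {
	bool		cached;		/* still present in the cache? */
	unsigned int	active_refs;	/* readers using it right now */
	unsigned int	shrinkref;	/* scans to survive before eviction */
};

/* A read at tree level @level refreshes the block's second chances. */
static void demo_blob_touch(struct demo_blob *b, int level)
{
	if (level >= 0)
		b->shrinkref = (unsigned int)level + 1;
}

/*
 * One reclaim pass over @nr blobs with a budget of @nr_to_scan.
 * Returns how many blobs were evicted.
 */
static size_t demo_cache_reclaim(struct demo_blob *blobs, size_t nr,
				 size_t nr_to_scan)
{
	size_t freed = 0;

	assert(blobs != NULL || nr == 0);

	for (size_t i = 0; i < nr && nr_to_scan > 0; i++) {
		struct demo_blob *b = &blobs[i];

		if (!b->cached)
			continue;
		nr_to_scan--;		/* scanned, even if we keep it */

		if (b->active_refs > 0)	/* in use, never evict */
			continue;
		if (b->shrinkref > 0) {	/* spend one second chance */
			b->shrinkref--;
			continue;
		}

		b->cached = false;
		freed++;
	}
	return freed;
}
```

With this model, a leaf-level block touched once survives one pass and is
evicted on the next, while a block two levels higher survives three.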

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_trace.h  |    1 +
 fs/xfs/xfs_verity.c |   87 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+)


diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 91a73399114e..37ea6822cca3 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -4797,6 +4797,7 @@ DEFINE_EVENT(xfs_verity_cache_class, name, \
 DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_load);
 DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_store);
 DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_drop);
+DEFINE_XFS_VERITY_CACHE_EVENT(xfs_verity_cache_reclaim);
 
 TRACE_EVENT(xfs_verity_shrinker_count,
 	TP_PROTO(struct xfs_mount *mp, unsigned long long count,
diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index 8d1888353515..c19fa47d1f76 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -42,6 +42,9 @@ struct xfs_merkle_blob {
 	/* refcount of this item; the cache holds its own ref */
 	refcount_t		refcount;
 
+	/* number of times the shrinker should ignore this item */
+	atomic_t		shrinkref;
+
 	unsigned long		flags;
 
 	/* Pointer to the merkle tree block, which is power-of-2 sized */
@@ -72,6 +75,7 @@ xfs_merkle_blob_alloc(
 
 	/* Caller owns this refcount. */
 	refcount_set(&mk->refcount, 1);
+	atomic_set(&mk->shrinkref, 0);
 	mk->flags = 0;
 	return mk;
 }
@@ -104,8 +108,10 @@ xfs_verity_cache_drop(
 	struct xfs_inode	*ip)
 {
 	XA_STATE(xas, &ip->i_merkle_blocks, 0);
+	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_merkle_blob	*mk;
 	unsigned long		flags;
+	s64			freed = 0;
 
 	xas_lock_irqsave(&xas, flags);
 	xas_for_each(&xas, mk, ULONG_MAX) {
@@ -113,10 +119,13 @@ xfs_verity_cache_drop(
 
 		trace_xfs_verity_cache_drop(ip, xas.xa_index, _RET_IP_);
 
+		freed++;
 		xas_store(&xas, NULL);
 		xfs_merkle_blob_rele(mk);
 	}
+	percpu_counter_sub(&mp->m_verity_blocks, freed);
 	xas_unlock_irqrestore(&xas, flags);
+	xfs_inode_clear_verity_tag(ip);
 }
 
 /* Destroy the merkle tree block cache */
@@ -175,6 +184,7 @@ xfs_verity_cache_store(
 	unsigned long		key,
 	struct xfs_merkle_blob	*mk)
 {
+	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_merkle_blob	*old;
 	unsigned long		flags;
 
@@ -189,6 +199,8 @@ xfs_verity_cache_store(
 		old = __xa_cmpxchg(&ip->i_merkle_blocks, key, NULL, mk,
 				GFP_KERNEL);
 	} while (old && !refcount_inc_not_zero(&old->refcount));
+	if (!old)
+		percpu_counter_add(&mp->m_verity_blocks, 1);
 	xa_unlock_irqrestore(&ip->i_merkle_blocks, flags);
 
 	if (old == NULL) {
@@ -234,12 +246,73 @@ struct xfs_verity_scan {
 	unsigned long		freed;
 };
 
+/* Reclaim inactive merkle tree blocks that have run out of second chances. */
+static void
+xfs_verity_cache_reclaim(
+	struct xfs_inode	*ip,
+	struct xfs_verity_scan	*vs)
+{
+	XA_STATE(xas, &ip->i_merkle_blocks, 0);
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_merkle_blob	*mk;
+	unsigned long		flags;
+	s64			freed = 0;
+
+	xas_lock_irqsave(&xas, flags);
+	xas_for_each(&xas, mk, ULONG_MAX) {
+		/*
+		 * Tell the shrinker that we scanned this merkle tree block,
+		 * even if we don't remove it.
+		 */
+		vs->scanned++;
+		if (vs->sc->nr_to_scan-- == 0)
+			break;
+
+		/* Retain if there are active references */
+		if (refcount_read(&mk->refcount) > 1)
+			continue;
+
+		/* Ignore if the item still has lru refcount */
+		if (atomic_add_unless(&mk->shrinkref, -1, 0))
+			continue;
+
+		trace_xfs_verity_cache_reclaim(ip, xas.xa_index, _RET_IP_);
+
+		freed++;
+		xas_store(&xas, NULL);
+		xfs_merkle_blob_rele(mk);
+	}
+	percpu_counter_sub(&mp->m_verity_blocks, freed);
+	xas_unlock_irqrestore(&xas, flags);
+
+	/*
+	 * Try to clear the verity tree tag if we reclaimed all the cached
+	 * blocks.  On the flag setting side, we should have IOLOCK_SHARED.
+	 */
+	xfs_ilock(ip, XFS_IOLOCK_EXCL);
+	if (xa_empty(&ip->i_merkle_blocks))
+		xfs_inode_clear_verity_tag(ip);
+	xfs_iunlock(ip, XFS_IOLOCK_EXCL);
+
+	vs->freed += freed;
+}
+
 /* Scan an inode as part of a verity scan. */
 int
 xfs_verity_scan_inode(
 	struct xfs_inode	*ip,
 	struct xfs_icwalk	*icw)
 {
+	struct xfs_verity_scan	*vs;
+
+	vs = container_of(icw, struct xfs_verity_scan, icw);
+
+	if (vs->sc->nr_to_scan > 0)
+		xfs_verity_cache_reclaim(ip, vs);
+
+	if (vs->sc->nr_to_scan == 0)
+		xfs_icwalk_verity_stop(icw);
+
 	xfs_irele(ip);
 	return 0;
 }
@@ -512,6 +585,13 @@ xfs_verity_read_merkle(
 		 * Free the new cache blob and continue with the existing one.
 		 */
 		xfs_merkle_blob_rele(new_mk);
+	} else {
+		/*
+		 * We added this merkle tree block to the cache; tag the inode
+		 * so that reclaim will scan this inode.  The caller holds
+		 * IOLOCK_SHARED, so this will not race with the shrinker.
+		 */
+		xfs_inode_set_verity_tag(ip);
 	}
 
 out_hit:
@@ -519,6 +599,13 @@ xfs_verity_read_merkle(
 	block->context = mk;
 	block->verified = test_bit(XFS_MERKLE_BLOB_VERIFIED_BIT, &mk->flags);
 
+	/*
+	 * Prioritize keeping the root-adjacent levels cached if this isn't a
+	 * streaming read.
+	 */
+	if (req->level >= 0)
+		atomic_set(&mk->shrinkref, req->level + 1);
+
 	return 0;
 
 out_new_mk:



* [PATCH 30/40] xfs: clean up stale fsverity metadata before starting
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (28 preceding siblings ...)
  2024-03-17 16:30   ` [PATCH 29/40] xfs: shrink verity blob cache Darrick J. Wong
@ 2024-03-17 16:31   ` Darrick J. Wong
  2024-03-18 17:50     ` Andrey Albershteyn
  2024-03-17 16:31   ` [PATCH 31/40] xfs: better reporting and error handling in xfs_drop_merkle_tree Darrick J. Wong
                     ` (10 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:31 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Before we let fsverity begin writing merkle tree blocks to the file,
let's perform a minor effort to clean up any stale metadata from a
previous attempt to enable fsverity.  This can only happen if the system
crashes /and/ the file shrinks, which is unlikely.  But we could do a
better job of cleaning up anyway.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_verity.c |   42 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index c19fa47d1f76..db43e017f10e 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -413,6 +413,44 @@ xfs_verity_get_descriptor(
 	return args.valuelen;
 }
 
+/*
+ * Clear out old fsverity metadata before we start building a new one.  This
+ * could happen if, say, we crashed while building fsverity data.
+ */
+static int
+xfs_verity_drop_old_metadata(
+	struct xfs_inode		*ip,
+	u64				new_tree_size,
+	unsigned int			tree_blocksize)
+{
+	struct xfs_verity_merkle_key	name;
+	struct xfs_da_args		args = {
+		.dp			= ip,
+		.whichfork		= XFS_ATTR_FORK,
+		.attr_filter		= XFS_ATTR_VERITY,
+		.op_flags		= XFS_DA_OP_REMOVE,
+		.name			= (const uint8_t *)&name,
+		.namelen		= sizeof(struct xfs_verity_merkle_key),
+		/* A NULL value makes xfs_attr_set remove the attr */
+		.value			= NULL,
+	};
+	u64				offset;
+	int				error = 0;
+
+	/*
+	 * Delete merkle tree blocks in increasing blkno order until we don't
+	 * find any more.  That ought to be good enough for avoiding dead
+	 * bloat without excessive runtime.
+	 */
+	for (offset = new_tree_size; !error; offset += tree_blocksize) {
+		xfs_verity_merkle_key_to_disk(&name, offset);
+		error = xfs_attr_set(&args);
+	}
+	if (error == -ENOATTR)
+		return 0;
+	return error;
+}
+
 static int
 xfs_verity_begin_enable(
 	struct file		*filp,
@@ -421,7 +459,6 @@ xfs_verity_begin_enable(
 {
 	struct inode		*inode = file_inode(filp);
 	struct xfs_inode	*ip = XFS_I(inode);
-	int			error = 0;
 
 	xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL);
 
@@ -431,7 +468,8 @@ xfs_verity_begin_enable(
 	if (xfs_iflags_test_and_set(ip, XFS_VERITY_CONSTRUCTION))
 		return -EBUSY;
 
-	return error;
+	return xfs_verity_drop_old_metadata(ip, merkle_tree_size,
+			tree_blocksize);
 }
 
 static int



* [PATCH 31/40] xfs: better reporting and error handling in xfs_drop_merkle_tree
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (29 preceding siblings ...)
  2024-03-17 16:31   ` [PATCH 30/40] xfs: clean up stale fsverity metadata before starting Darrick J. Wong
@ 2024-03-17 16:31   ` Darrick J. Wong
  2024-03-18 17:51     ` Andrey Albershteyn
  2024-03-17 16:31   ` [PATCH 32/40] xfs: make scrub aware of verity dinode flag Darrick J. Wong
                     ` (9 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:31 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

xfs_drop_merkle_tree is responsible for removing the fsverity metadata
after a failed attempt to enable fsverity for a file.  However, if the
enablement process fails before the verity descriptor is written to the
file, the cleanup function will trip the WARN_ON.  The error code in
that case is ENOATTR, which isn't worth logging about.

Fix that return code handling, fix the tree block removal loop not to
return early with ENOATTR, and improve the logging so that we actually
capture what kind of error occurred.
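
The intended return code handling can be sketched in userspace as follows.
This is only a model: the three-state attr and the error constants are
hypothetical.  Missing attrs are skipped silently, while any other failure
stops the cleanup and is reported to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes, for illustration only. */
enum { DEMO_EIO = 5, DEMO_ENOATTR = 61 };

/* Toy state of a single attr. */
enum demo_state { DEMO_ABSENT, DEMO_PRESENT, DEMO_BROKEN };

/* Try to remove one attr. */
static int demo_remove(enum demo_state *s)
{
	switch (*s) {
	case DEMO_PRESENT:
		*s = DEMO_ABSENT;
		return 0;
	case DEMO_BROKEN:
		return -DEMO_EIO;	/* a real failure */
	default:
		return -DEMO_ENOATTR;	/* nothing there to remove */
	}
}

/* Remove every tree block and then the descriptor, tolerating gaps. */
static int demo_drop_incomplete_tree(enum demo_state *blocks, size_t nr,
				     enum demo_state *desc)
{
	int error;

	assert(blocks != NULL && desc != NULL);

	for (size_t i = 0; i < nr; i++) {
		error = demo_remove(&blocks[i]);
		if (error == -DEMO_ENOATTR)
			error = 0;	/* never written, keep going */
		if (error)
			return error;
	}

	error = demo_remove(desc);
	return error == -DEMO_ENOATTR ? 0 : error;
}
```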

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_verity.c |   25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)


diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index db43e017f10e..32891ae42c47 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -472,15 +472,14 @@ xfs_verity_begin_enable(
 			tree_blocksize);
 }
 
+/* Try to remove all the fsverity metadata after a failed enablement. */
 static int
-xfs_drop_merkle_tree(
+xfs_verity_drop_incomplete_tree(
 	struct xfs_inode		*ip,
 	u64				merkle_tree_size,
 	unsigned int			tree_blocksize)
 {
 	struct xfs_verity_merkle_key	name;
-	int				error = 0;
-	u64				offset = 0;
 	struct xfs_da_args		args = {
 		.dp			= ip,
 		.whichfork		= XFS_ATTR_FORK,
@@ -491,6 +490,8 @@ xfs_drop_merkle_tree(
 		/* A NULL value makes xfs_attr_set remove the attr */
 		.value			= NULL,
 	};
+	u64				offset;
+	int				error;
 
 	if (!merkle_tree_size)
 		return 0;
@@ -498,6 +499,8 @@ xfs_drop_merkle_tree(
 	for (offset = 0; offset < merkle_tree_size; offset += tree_blocksize) {
 		xfs_verity_merkle_key_to_disk(&name, offset);
 		error = xfs_attr_set(&args);
+		if (error == -ENOATTR)
+			error = 0;
 		if (error)
 			return error;
 	}
@@ -505,7 +508,8 @@ xfs_drop_merkle_tree(
 	args.name = (const uint8_t *)XFS_VERITY_DESCRIPTOR_NAME;
 	args.namelen = XFS_VERITY_DESCRIPTOR_NAME_LEN;
 	error = xfs_attr_set(&args);
-
+	if (error == -ENOATTR)
+		return 0;
 	return error;
 }
 
@@ -564,9 +568,16 @@ xfs_verity_end_enable(
 		inode->i_flags |= S_VERITY;
 
 out:
-	if (error)
-		WARN_ON_ONCE(xfs_drop_merkle_tree(ip, merkle_tree_size,
-						  tree_blocksize));
+	if (error) {
+		int	error2;
+
+		error2 = xfs_verity_drop_incomplete_tree(ip, merkle_tree_size,
+				tree_blocksize);
+		if (error2)
+			xfs_alert(ip->i_mount,
+ "ino 0x%llx failed to clean up new fsverity metadata, err %d",
+					ip->i_ino, error2);
+	}
 
 	xfs_iflags_clear(ip, XFS_VERITY_CONSTRUCTION);
 	return error;



* [PATCH 32/40] xfs: make scrub aware of verity dinode flag
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (30 preceding siblings ...)
  2024-03-17 16:31   ` [PATCH 31/40] xfs: better reporting and error handling in xfs_drop_merkle_tree Darrick J. Wong
@ 2024-03-17 16:31   ` Darrick J. Wong
  2024-03-17 16:32   ` [PATCH 33/40] xfs: add fs-verity ioctls Darrick J. Wong
                     ` (8 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:31 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

fs-verity adds a new inode flag, which causes scrub to fail because
scrub does not yet know about it.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/attr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index 9a1f59f7b5a4..ae4227cb55ec 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -494,7 +494,7 @@ xchk_xattr_rec(
 	/* Retrieve the entry and check it. */
 	hash = be32_to_cpu(ent->hashval);
 	badflags = ~(XFS_ATTR_LOCAL | XFS_ATTR_ROOT | XFS_ATTR_SECURE |
-			XFS_ATTR_INCOMPLETE | XFS_ATTR_PARENT);
+			XFS_ATTR_INCOMPLETE | XFS_ATTR_PARENT | XFS_ATTR_VERITY);
 	if ((ent->flags & badflags) != 0)
 		xchk_da_set_corrupt(ds, level);
 	if (ent->flags & XFS_ATTR_LOCAL) {



* [PATCH 33/40] xfs: add fs-verity ioctls
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (31 preceding siblings ...)
  2024-03-17 16:31   ` [PATCH 32/40] xfs: make scrub aware of verity dinode flag Darrick J. Wong
@ 2024-03-17 16:32   ` Darrick J. Wong
  2024-03-17 16:32   ` [PATCH 34/40] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
                     ` (7 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:32 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Add the fs-verity ioctls to enable verity, dump metadata (descriptor and
Merkle tree pages), and obtain a file's digest.
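
For illustration, userspace might exercise the measure ioctl roughly as
follows.  This is a minimal sketch, assuming the UAPI definitions in
<linux/fsverity.h>; the helper name measure_verity() and its calling
convention are hypothetical and not part of this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fsverity.h>

/*
 * Hypothetical helper: copy the fs-verity digest of an open file into out.
 * Returns the digest size on success or a negative errno on failure.
 */
static int
measure_verity(
	int			fd,
	unsigned char		*out,
	size_t			outlen)
{
	struct fsverity_digest	*d;
	int			ret;

	/* digest_size is a 16-bit field, so clamp the buffer size. */
	if (outlen > UINT16_MAX)
		outlen = UINT16_MAX;

	d = calloc(1, sizeof(*d) + outlen);
	if (!d)
		return -ENOMEM;

	/* On input, digest_size tells the kernel how much room we have. */
	d->digest_size = outlen;
	if (ioctl(fd, FS_IOC_MEASURE_VERITY, d) < 0) {
		ret = -errno;
	} else {
		memcpy(out, d->digest, d->digest_size);
		ret = d->digest_size;
	}

	free(d);
	return ret;
}
```

With this patch applied, a caller on an XFS filesystem without the
verity feature should expect -EOPNOTSUPP back from such a helper.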

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: remove unnecessary casting]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_ioctl.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)


diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index ab61d7d552fb..4b11898728cc 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -43,6 +43,7 @@
 #include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/fileattr.h>
+#include <linux/fsverity.h>
 
 /*
  * xfs_find_handle maps from userspace xfs_fsop_handlereq structure to
@@ -2174,6 +2175,21 @@ xfs_file_ioctl(
 		return error;
 	}
 
+	case FS_IOC_ENABLE_VERITY:
+		if (!xfs_has_verity(mp))
+			return -EOPNOTSUPP;
+		return fsverity_ioctl_enable(filp, arg);
+
+	case FS_IOC_MEASURE_VERITY:
+		if (!xfs_has_verity(mp))
+			return -EOPNOTSUPP;
+		return fsverity_ioctl_measure(filp, arg);
+
+	case FS_IOC_READ_VERITY_METADATA:
+		if (!xfs_has_verity(mp))
+			return -EOPNOTSUPP;
+		return fsverity_ioctl_read_metadata(filp, arg);
+
 	default:
 		return -ENOTTY;
 	}



* [PATCH 34/40] xfs: advertise fs-verity being available on filesystem
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (32 preceding siblings ...)
  2024-03-17 16:32   ` [PATCH 33/40] xfs: add fs-verity ioctls Darrick J. Wong
@ 2024-03-17 16:32   ` Darrick J. Wong
  2024-03-17 16:32   ` [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs Darrick J. Wong
                     ` (6 subsequent siblings)
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:32 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Advertise that this filesystem supports fsverity.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_fs.h |    1 +
 fs/xfs/libxfs/xfs_sb.c |    2 ++
 2 files changed, 3 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index ca1b17d01437..2f372088004f 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -239,6 +239,7 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_BIGTIME	(1 << 21) /* 64-bit nsec timestamps */
 #define XFS_FSOP_GEOM_FLAGS_INOBTCNT	(1 << 22) /* inobt btree counter */
 #define XFS_FSOP_GEOM_FLAGS_NREXT64	(1 << 23) /* large extent counters */
+#define XFS_FSOP_GEOM_FLAGS_VERITY	(1 << 24) /* fs-verity */
 
 /*
  * Minimum and maximum sizes need for growth checks.
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index a845cbe3f539..f5038d0d94fe 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -1260,6 +1260,8 @@ xfs_fs_geometry(
 	}
 	if (xfs_has_large_extent_counts(mp))
 		geo->flags |= XFS_FSOP_GEOM_FLAGS_NREXT64;
+	if (xfs_has_verity(mp))
+		geo->flags |= XFS_FSOP_GEOM_FLAGS_VERITY;
 	geo->rtsectsize = sbp->sb_blocksize;
 	geo->dirblocksize = xfs_dir2_dirblock_bytes(sbp);
 



* [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (33 preceding siblings ...)
  2024-03-17 16:32   ` [PATCH 34/40] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
@ 2024-03-17 16:32   ` Darrick J. Wong
  2024-03-18 17:34     ` Andrey Albershteyn
  2024-03-17 16:32   ` [PATCH 36/40] xfs: don't store trailing zeroes of merkle tree blocks Darrick J. Wong
                     ` (5 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:32 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Teach online repair to check for unused fsverity metadata and purge it
on reconstruction.
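
The per-xattr decision made by the checker can be sketched in
standalone C as below.  This is an illustration only: the enum, the
struct, and the function names are hypothetical, and it assumes the
merkle key is a single big-endian 64-bit offset and the descriptor
xattr is named "vdesc".

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical outcome codes for this sketch. */
enum verdict { XATTR_OK, XATTR_PREEN, XATTR_CORRUPT };

#define MERKLE_KEY_LEN	8	/* assumed: key is one 64-bit offset */
#define DESC_NAME	"vdesc"
#define DESC_NAME_LEN	(sizeof(DESC_NAME) - 1)

/* Hypothetical summary of what the checker knows about the file. */
struct verity_geom {
	bool		fs_has_verity;
	bool		file_is_verity;
	unsigned int	merkle_blocksize;
	uint64_t	merkle_tree_size;
};

/* Assumption: the key stores the tree offset in big-endian order. */
static uint64_t
key_offset(const unsigned char *name)
{
	uint64_t	off = 0;
	int		i;

	for (i = 0; i < MERKLE_KEY_LEN; i++)
		off = (off << 8) | name[i];
	return off;
}

static enum verdict
classify_verity_xattr(
	const struct verity_geom	*g,
	const unsigned char		*name,
	unsigned int			namelen,
	unsigned int			valuelen)
{
	/* Non-verity filesystems should never have verity xattrs. */
	if (!g->fs_has_verity)
		return XATTR_CORRUPT;

	/* Leftovers from an earlier attempt to enable verity. */
	if (!g->file_is_verity)
		return XATTR_PREEN;

	if (namelen == DESC_NAME_LEN)
		return memcmp(name, DESC_NAME, namelen) ?
				XATTR_CORRUPT : XATTR_OK;
	if (namelen != MERKLE_KEY_LEN)
		return XATTR_CORRUPT;

	/* Oversized blocks are not allowed. */
	if (valuelen > g->merkle_blocksize)
		return XATTR_CORRUPT;

	/* Blocks past the end of the tree are stale but harmless. */
	if (key_offset(name) >= g->merkle_tree_size)
		return XATTR_PREEN;
	return XATTR_OK;
}
```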

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/attr.c   |  102 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/attr.h   |    4 ++
 fs/xfs/scrub/common.c |   27 +++++++++++++
 3 files changed, 133 insertions(+)


diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index ae4227cb55ec..c69dee281984 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -21,6 +21,8 @@
 #include "scrub/dabtree.h"
 #include "scrub/attr.h"
 
+#include <linux/fsverity.h>
+
 /* Free the buffers linked from the xattr buffer. */
 static void
 xchk_xattr_buf_cleanup(
@@ -135,6 +137,91 @@ xchk_setup_xattr(
 	return xchk_setup_inode_contents(sc, 0);
 }
 
+#ifdef CONFIG_FS_VERITY
+/* Extract merkle tree geometry from incore information. */
+static int
+xchk_xattr_extract_verity(
+	struct xfs_scrub		*sc)
+{
+	struct xchk_xattr_buf		*ab = sc->buf;
+
+	/* setup should have allocated the buffer */
+	if (!ab) {
+		ASSERT(0);
+		return -EFSCORRUPTED;
+	}
+
+	return fsverity_merkle_tree_geometry(VFS_I(sc->ip),
+			&ab->merkle_blocksize, &ab->merkle_tree_size);
+}
+
+/* Check the merkle tree xattrs. */
+STATIC void
+xchk_xattr_verity(
+	struct xfs_scrub		*sc,
+	xfs_dablk_t			blkno,
+	const unsigned char		*name,
+	unsigned int			namelen,
+	unsigned int			valuelen)
+{
+	struct xchk_xattr_buf		*ab = sc->buf;
+
+	/* Non-verity filesystems should never have verity xattrs. */
+	if (!xfs_has_verity(sc->mp)) {
+		xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
+		return;
+	}
+
+	/*
+	 * Any verity metadata on a non-verity file are leftovers from a
+	 * previous attempt to enable verity.
+	 */
+	if (!IS_VERITY(VFS_I(sc->ip))) {
+		xchk_ino_set_preen(sc, sc->ip->i_ino);
+		return;
+	}
+
+	switch (namelen) {
+	case sizeof(struct xfs_verity_merkle_key):
+		/* Oversized blocks are not allowed */
+		if (valuelen > ab->merkle_blocksize) {
+			xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
+			return;
+		}
+		break;
+	case XFS_VERITY_DESCRIPTOR_NAME_LEN:
+		/* Has to match the descriptor xattr name */
+		if (memcmp(name, XFS_VERITY_DESCRIPTOR_NAME, namelen)) {
+			xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
+		}
+		return;
+	default:
+		xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
+		return;
+	}
+
+	/*
+	 * Merkle tree blocks beyond the end of the tree are leftovers from
+	 * a previous failed attempt to enable verity.
+	 */
+	if (xfs_verity_merkle_key_from_disk(name) >= ab->merkle_tree_size)
+		xchk_ino_set_preen(sc, sc->ip->i_ino);
+}
+#else
+# define xchk_xattr_extract_verity(sc)	(0)
+
+static void
+xchk_xattr_verity(
+	struct xfs_scrub	*sc,
+	xfs_dablk_t		blkno,
+	const unsigned char	*name,
+	unsigned int		namelen, unsigned int valuelen)
+{
+	/* Should never see verity xattrs when verity is not enabled. */
+	xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
+}
+#endif /* CONFIG_FS_VERITY */
+
 /* Extended Attributes */
 
 struct xchk_xattr {
@@ -194,6 +281,15 @@ xchk_xattr_listent(
 		goto fail_xref;
 	}
 
+	/* Check verity xattr geometry */
+	if (flags & XFS_ATTR_VERITY) {
+		xchk_xattr_verity(sx->sc, args.blkno, name, namelen, valuelen);
+		if (sx->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) {
+			context->seen_enough = 1;
+			return;
+		}
+	}
+
 	/* Does this name make sense? */
 	if (!xfs_attr_namecheck(sx->sc->mp, name, namelen, flags)) {
 		xchk_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK, args.blkno);
@@ -611,6 +707,12 @@ xchk_xattr(
 	if (error)
 		return error;
 
+	if (IS_VERITY(VFS_I(sc->ip))) {
+		error = xchk_xattr_extract_verity(sc);
+		if (error)
+			return error;
+	}
+
 	/* Check the physical structure of the xattr. */
 	if (sc->ip->i_af.if_format == XFS_DINODE_FMT_LOCAL)
 		error = xchk_xattr_check_sf(sc);
diff --git a/fs/xfs/scrub/attr.h b/fs/xfs/scrub/attr.h
index 48fd9402c432..37849ffb0375 100644
--- a/fs/xfs/scrub/attr.h
+++ b/fs/xfs/scrub/attr.h
@@ -19,6 +19,10 @@ struct xchk_xattr_buf {
 	/* Memory buffer used to extract xattr values. */
 	void			*value;
 	size_t			value_sz;
+
+	/* Geometry of the merkle tree attached to this verity file. */
+	u64			merkle_tree_size;
+	unsigned int		merkle_blocksize;
 };
 
 #endif	/* __XFS_SCRUB_ATTR_H__ */
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index abff79a77c72..dd2ed1f833c5 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -37,6 +37,8 @@
 #include "scrub/repair.h"
 #include "scrub/health.h"
 
+#include <linux/fsverity.h>
+
 /* Common code for the metadata scrubbers. */
 
 /*
@@ -1073,6 +1075,25 @@ xchk_irele(
 	xfs_irele(ip);
 }
 
+#ifdef CONFIG_FS_VERITY
+/*
+ * Make sure the fsverity information is attached, so we don't have to do that
+ * later after taking locks.
+ */
+static inline int
+xchk_setup_fsverity(
+	struct xfs_scrub	*sc)
+{
+	unsigned int		dontcare;
+	u64			alsodontcare;
+
+	return fsverity_merkle_tree_geometry(VFS_I(sc->ip),
+			&dontcare, &alsodontcare);
+}
+#else
+# define xchk_setup_fsverity(sc)	(0)
+#endif
+
 /*
  * Set us up to scrub metadata mapped by a file's fork.  Callers must not use
  * this to operate on user-accessible regular file data because the MMAPLOCK is
@@ -1092,6 +1113,12 @@ xchk_setup_inode_contents(
 	/* Lock the inode so the VFS cannot touch this file. */
 	xchk_ilock(sc, XFS_IOLOCK_EXCL);
 
+	if (IS_VERITY(VFS_I(sc->ip))) {
+		error = xchk_setup_fsverity(sc);
+		if (error)
+			goto out;
+	}
+
 	error = xchk_trans_alloc(sc, resblks);
 	if (error)
 		goto out;



* [PATCH 36/40] xfs: don't store trailing zeroes of merkle tree blocks
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (34 preceding siblings ...)
  2024-03-17 16:32   ` [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs Darrick J. Wong
@ 2024-03-17 16:32   ` Darrick J. Wong
  2024-03-18 17:52     ` Andrey Albershteyn
  2024-03-17 16:33   ` [PATCH 37/40] xfs: create separate name hash function for xattrs Darrick J. Wong
                     ` (4 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:32 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

As a minor optimization, don't store the trailing zeroes of merkle tree
blocks; this reduces space consumption and copying overhead.  It really
only affects the rightmost blocks at each level of the tree.
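
The trimming step can be sketched in standalone C as below.  The names
trimmed_len() and expand_block() are hypothetical; the expansion helper
shows one way a reader could zero-fill the dropped tail, which is an
assumption about the read side rather than something this patch shows.

```c
#include <stddef.h>
#include <string.h>

/* Return how many leading bytes remain once trailing zeroes are dropped. */
static size_t
trimmed_len(
	const void		*buf,
	size_t			size)
{
	const unsigned char	*p = buf;

	while (size > 0 && p[size - 1] == 0)
		size--;
	return size;
}

/*
 * Assumed read-side counterpart: rebuild a full block from the stored
 * prefix by zero-filling everything past stored_len.
 */
static void
expand_block(
	unsigned char		*dst,
	size_t			blocksize,
	const unsigned char	*stored,
	size_t			stored_len)
{
	if (stored_len > blocksize)
		stored_len = blocksize;
	memset(dst, 0, blocksize);
	memcpy(dst, stored, stored_len);
}
```

Note that a block consisting entirely of zeroes trims down to a length
of zero in this sketch.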

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_verity.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index 32891ae42c47..abd95bc1ba6e 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -622,11 +622,6 @@ xfs_verity_read_merkle(
 	if (error)
 		goto out_new_mk;
 
-	if (!args.valuelen) {
-		error = -ENODATA;
-		goto out_new_mk;
-	}
-
 	mk = xfs_verity_cache_store(ip, key, new_mk);
 	if (mk != new_mk) {
 		/*
@@ -681,6 +676,12 @@ xfs_verity_write_merkle(
 		.value			= (void *)buf,
 		.valuelen		= size,
 	};
+	const char			*p = buf + size - 1;
+
+	/* Don't store trailing zeroes. */
+	while (p >= (const char *)buf && *p == 0)
+		p--;
+	args.valuelen = p - (const char *)buf + 1;
 
 	xfs_verity_merkle_key_to_disk(&name, pos);
 	return xfs_attr_set(&args);



* [PATCH 37/40] xfs: create separate name hash function for xattrs
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (35 preceding siblings ...)
  2024-03-17 16:32   ` [PATCH 36/40] xfs: don't store trailing zeroes of merkle tree blocks Darrick J. Wong
@ 2024-03-17 16:33   ` Darrick J. Wong
  2024-03-18 17:53     ` Andrey Albershteyn
  2024-03-17 16:33   ` [PATCH 38/40] xfs: use merkle tree offset as attr hash Darrick J. Wong
                     ` (3 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:33 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a new hashing function for extended attribute names.  The next
patch needs this so it can modify the hash strategy for verity xattrs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_attr.c      |   16 ++++++++++++++--
 fs/xfs/libxfs/xfs_attr.h      |    3 +++
 fs/xfs/libxfs/xfs_attr_leaf.c |    4 ++--
 fs/xfs/scrub/attr.c           |    8 +++++---
 fs/xfs/xfs_attr_item.c        |    3 ++-
 fs/xfs/xfs_attr_list.c        |    3 ++-
 6 files changed, 28 insertions(+), 9 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b7aa1bc12fd1..b1fa45197eac 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -238,6 +238,16 @@ xfs_attr_get_ilocked(
 	return xfs_attr_node_get(args);
 }
 
+/* Compute hash for an extended attribute name. */
+xfs_dahash_t
+xfs_attr_hashname(
+	unsigned int		attr_flags,
+	const uint8_t		*name,
+	unsigned int		namelen)
+{
+	return xfs_da_hashname(name, namelen);
+}
+
 /*
  * Retrieve an extended attribute by name, and its value if requested.
  *
@@ -268,7 +278,8 @@ xfs_attr_get(
 
 	args->geo = args->dp->i_mount->m_attr_geo;
 	args->whichfork = XFS_ATTR_FORK;
-	args->hashval = xfs_da_hashname(args->name, args->namelen);
+	args->hashval = xfs_attr_hashname(args->attr_filter, args->name,
+					  args->namelen);
 
 	/* Entirely possible to look up a name which doesn't exist */
 	args->op_flags = XFS_DA_OP_OKNOENT;
@@ -942,7 +953,8 @@ xfs_attr_set(
 
 	args->geo = mp->m_attr_geo;
 	args->whichfork = XFS_ATTR_FORK;
-	args->hashval = xfs_da_hashname(args->name, args->namelen);
+	args->hashval = xfs_attr_hashname(args->attr_filter, args->name,
+					  args->namelen);
 
 	/*
 	 * We have no control over the attribute names that userspace passes us
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 92711c8d2a9f..19db6c1cc71f 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -619,4 +619,7 @@ extern struct kmem_cache *xfs_attr_intent_cache;
 int __init xfs_attr_intent_init_cache(void);
 void xfs_attr_intent_destroy_cache(void);
 
+xfs_dahash_t xfs_attr_hashname(unsigned int attr_flags,
+		const uint8_t *name_string, unsigned int name_length);
+
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index ac904cc1a97b..fcece25fd13e 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -911,8 +911,8 @@ xfs_attr_shortform_to_leaf(
 		nargs.namelen = sfe->namelen;
 		nargs.value = &sfe->nameval[nargs.namelen];
 		nargs.valuelen = sfe->valuelen;
-		nargs.hashval = xfs_da_hashname(sfe->nameval,
-						sfe->namelen);
+		nargs.hashval = xfs_attr_hashname(sfe->flags, sfe->nameval,
+						  sfe->namelen);
 		nargs.attr_filter = sfe->flags & XFS_ATTR_NSP_ONDISK_MASK;
 		error = xfs_attr3_leaf_lookup_int(bp, &nargs); /* set a->index */
 		ASSERT(error == -ENOATTR);
diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
index c69dee281984..e7d50589f72d 100644
--- a/fs/xfs/scrub/attr.c
+++ b/fs/xfs/scrub/attr.c
@@ -253,7 +253,6 @@ xchk_xattr_listent(
 		.dp			= context->dp,
 		.name			= name,
 		.namelen		= namelen,
-		.hashval		= xfs_da_hashname(name, namelen),
 		.trans			= context->tp,
 		.valuelen		= valuelen,
 	};
@@ -263,6 +262,7 @@ xchk_xattr_listent(
 
 	sx = container_of(context, struct xchk_xattr, context);
 	ab = sx->sc->buf;
+	args.hashval = xfs_attr_hashname(flags, name, namelen);
 
 	if (xchk_should_terminate(sx->sc, &error)) {
 		context->seen_enough = error;
@@ -600,7 +600,8 @@ xchk_xattr_rec(
 			xchk_da_set_corrupt(ds, level);
 			goto out;
 		}
-		calc_hash = xfs_da_hashname(lentry->nameval, lentry->namelen);
+		calc_hash = xfs_attr_hashname(ent->flags, lentry->nameval,
+				lentry->namelen);
 	} else {
 		rentry = (struct xfs_attr_leaf_name_remote *)
 				(((char *)bp->b_addr) + nameidx);
@@ -608,7 +609,8 @@ xchk_xattr_rec(
 			xchk_da_set_corrupt(ds, level);
 			goto out;
 		}
-		calc_hash = xfs_da_hashname(rentry->name, rentry->namelen);
+		calc_hash = xfs_attr_hashname(ent->flags, rentry->name,
+				rentry->namelen);
 	}
 	if (calc_hash != hash)
 		xchk_da_set_corrupt(ds, level);
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index 703770cf1482..4d8264f0a537 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -536,7 +536,8 @@ xfs_attri_recover_work(
 	args->whichfork = XFS_ATTR_FORK;
 	args->name = nv->name.i_addr;
 	args->namelen = nv->name.i_len;
-	args->hashval = xfs_da_hashname(args->name, args->namelen);
+	args->hashval = xfs_attr_hashname(attrp->alfi_attr_filter, args->name,
+					  args->namelen);
 	args->attr_filter = attrp->alfi_attr_filter & XFS_ATTRI_FILTER_MASK;
 	args->op_flags = XFS_DA_OP_RECOVERY | XFS_DA_OP_OKNOENT |
 			 XFS_DA_OP_LOGGED;
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index fa74378577c5..96169474d023 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -135,7 +135,8 @@ xfs_attr_shortform_list(
 		}
 
 		sbp->entno = i;
-		sbp->hash = xfs_da_hashname(sfe->nameval, sfe->namelen);
+		sbp->hash = xfs_attr_hashname(sfe->flags, sfe->nameval,
+					      sfe->namelen);
 		sbp->name = sfe->nameval;
 		sbp->namelen = sfe->namelen;
 		/* These are bytes, and both on-disk, don't endian-flip */



* [PATCH 38/40] xfs: use merkle tree offset as attr hash
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (36 preceding siblings ...)
  2024-03-17 16:33   ` [PATCH 37/40] xfs: create separate name hash function for xattrs Darrick J. Wong
@ 2024-03-17 16:33   ` Darrick J. Wong
  2024-03-18 17:55     ` Andrey Albershteyn
  2024-03-17 16:33   ` [PATCH 39/40] xfs: don't bother storing merkle tree blocks for zeroed data blocks Darrick J. Wong
                     ` (2 subsequent siblings)
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:33 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

I was exploring the fsverity metadata with xfs_db after creating a 220MB
verity file, and I noticed the following in the debugger output:

entries[0-75] = [hashval,nameidx,incomplete,root,secure,local,parent,verity]
0:[0,4076,0,0,0,0,0,1]
1:[0,1472,0,0,0,1,0,1]
2:[0x800,4056,0,0,0,0,0,1]
3:[0x800,4036,0,0,0,0,0,1]
...
72:[0x12000,2716,0,0,0,0,0,1]
73:[0x12000,2696,0,0,0,0,0,1]
74:[0x12800,2676,0,0,0,0,0,1]
75:[0x12800,2656,0,0,0,0,0,1]
...
nvlist[0].merkle_off = 0x18000
nvlist[1].merkle_off = 0
nvlist[2].merkle_off = 0x19000
nvlist[3].merkle_off = 0x1000
...
nvlist[71].merkle_off = 0x5b000
nvlist[72].merkle_off = 0x44000
nvlist[73].merkle_off = 0x5c000
nvlist[74].merkle_off = 0x45000
nvlist[75].merkle_off = 0x5d000

Within just this attr leaf block, there are 76 attr entries, but only 38
distinct hash values.  There are 415 merkle tree blocks for this file,
but we already have hash collisions.  The standard da hash function
performs poorly here because it is mostly shifting and rolling zeroes
around.
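
The collisions in the listing above can be reproduced with a small
standalone sketch.  It is modeled on the four-bytes-at-a-time loop of
xfs_da_hashname() and assumes the on-disk key is the merkle offset
stored as a big-endian u64; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by s bits (0 < s < 32). */
static uint32_t
rol32(
	uint32_t		w,
	unsigned int		s)
{
	return (w << s) | (w >> (32 - s));
}

/*
 * Sketch of the da hash main loop.  Only name lengths that are a
 * multiple of 4 are handled, which is enough for an 8-byte merkle key.
 */
static uint32_t
da_hash_sketch(
	const uint8_t		*name,
	size_t			namelen)
{
	uint32_t		hash = 0;

	for (; namelen >= 4; namelen -= 4, name += 4)
		hash = ((uint32_t)name[0] << 21) ^
		       ((uint32_t)name[1] << 14) ^
		       ((uint32_t)name[2] << 7) ^
		       (uint32_t)name[3] ^
		       rol32(hash, 7 * 4);
	return hash;
}

/* Assumption: the key is the merkle offset as a big-endian u64. */
static uint32_t
da_hash_of_merkle_off(
	uint64_t		off)
{
	uint8_t			key[8];
	int			i;

	for (i = 7; i >= 0; i--) {
		key[i] = off & 0xff;
		off >>= 8;
	}
	return da_hash_sketch(key, sizeof(key));
}
```

Under those assumptions, offsets 0 and 0x18000 both hash to 0, and
offsets 0x1000 and 0x19000 both hash to 0x800, matching entries 0-3 of
the xfs_db output.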

However, we don't even have to do that much work -- the merkle tree
block keys are themselves u64 values.  Shift out the low bits, which are
always zero, truncate the result to 32 bits (the size of xfs_dahash_t),
and use that for the hash.  We won't have any collisions between merkle
tree blocks until that tree grows to 2^32 blocks of the minimum (1k)
size.  On a 4k block filesystem, we won't hit that unless the file
contains more than 2^49 bytes, assuming sha256.
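
A standalone sketch of the offset-based hash follows.  The constant
mirrors XFS_VERITY_MIN_MERKLE_BLOCKLOG from this patch; the function
name merkle_off_hash() is hypothetical.

```c
#include <stdint.h>

/* Mirrors XFS_VERITY_MIN_MERKLE_BLOCKLOG (10) from this patch. */
#define MIN_MERKLE_BLOCKLOG	10

/* Drop the always-zero low bits, then truncate to 32 bits. */
static uint32_t
merkle_off_hash(
	uint64_t		off)
{
	return (uint32_t)(off >> MIN_MERKLE_BLOCKLOG);
}
```

The 2^49 figure can be checked by hand: the hash first wraps at tree
offset 2^42, which is 2^30 blocks of 4k.  With 32-byte sha256 digests,
each 4k tree block holds 128 digests and so covers 128 * 4096 = 2^19
bytes of file data, giving roughly 2^30 * 2^19 = 2^49 bytes if the
interior levels of the tree are ignored.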

As a side effect, the keys for merkle tree blocks get written out in
roughly sequential order, though I didn't observe any change in
performance.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_attr.c      |    7 +++++++
 fs/xfs/libxfs/xfs_da_format.h |    2 ++
 2 files changed, 9 insertions(+)


diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b1fa45197eac..7c0f006f972a 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -245,6 +245,13 @@ xfs_attr_hashname(
 	const uint8_t		*name,
 	unsigned int		namelen)
 {
+	if ((attr_flags & XFS_ATTR_VERITY) &&
+	    namelen == sizeof(struct xfs_verity_merkle_key)) {
+		uint64_t	off = xfs_verity_merkle_key_from_disk(name);
+
+		return off >> XFS_VERITY_MIN_MERKLE_BLOCKLOG;
+	}
+
 	return xfs_da_hashname(name, namelen);
 }
 
diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index e4aa7c9a0ccb..58887a1c65fe 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -946,4 +946,6 @@ xfs_verity_merkle_key_from_disk(
 #define XFS_VERITY_DESCRIPTOR_NAME	"vdesc"
 #define XFS_VERITY_DESCRIPTOR_NAME_LEN	(sizeof(XFS_VERITY_DESCRIPTOR_NAME) - 1)
 
+#define XFS_VERITY_MIN_MERKLE_BLOCKLOG	(10)
+
 #endif /* __XFS_DA_FORMAT_H__ */



* [PATCH 39/40] xfs: don't bother storing merkle tree blocks for zeroed data blocks
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (37 preceding siblings ...)
  2024-03-17 16:33   ` [PATCH 38/40] xfs: use merkle tree offset as attr hash Darrick J. Wong
@ 2024-03-17 16:33   ` Darrick J. Wong
  2024-03-18 17:56     ` Andrey Albershteyn
  2024-03-17 16:33   ` [PATCH 40/40] xfs: enable ro-compat fs-verity flag Darrick J. Wong
  2024-03-18 16:35   ` [PATCHSET v5.3] fs-verity support for XFS Eric Biggers
  40 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:33 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Now that fsverity tells our merkle tree I/O functions what the hash of a
data block full of zeroes looks like, we can use this information to
avoid writing out merkle tree blocks for sparse regions of the file.
For verified gold master images this can save quite a bit of overhead.
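
The write-side test and the read-side synthesis can be sketched in
standalone C as below.  Both function names are hypothetical, and the
zero digest is passed in by the caller, much as the request structure
supplies it in the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if buf consists of nothing but copies of the zero-block digest. */
static bool
block_is_all_zero_digests(
	const unsigned char	*buf,
	size_t			size,
	const unsigned char	*zero_digest,
	size_t			digest_size)
{
	size_t			i;

	if (!digest_size || size % digest_size)
		return false;

	for (i = 0; i < size; i += digest_size)
		if (memcmp(buf + i, zero_digest, digest_size))
			return false;
	return true;
}

/* Rebuild an elided block by repeating the zero-block digest. */
static void
synthesize_zero_digest_block(
	unsigned char		*buf,
	size_t			size,
	const unsigned char	*zero_digest,
	size_t			digest_size)
{
	size_t			i;

	if (!digest_size)
		return;

	for (i = 0; i + digest_size <= size; i += digest_size)
		memcpy(buf + i, zero_digest, digest_size);
}
```

A writer would skip storing any block for which the first function
returns true, and a reader that finds no stored block would call the
second one instead of failing.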

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_verity.c |   37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
index abd95bc1ba6e..ba96e7049f61 100644
--- a/fs/xfs/xfs_verity.c
+++ b/fs/xfs/xfs_verity.c
@@ -619,6 +619,20 @@ xfs_verity_read_merkle(
 	xfs_verity_merkle_key_to_disk(&name, block->offset);
 
 	error = xfs_attr_get(&args);
+	if (error == -ENOATTR) {
+		u8		*p;
+		unsigned int	i;
+
+		/*
+		 * No attribute found.  Synthesize a buffer full of the zero
+		 * digests on the assumption that we elided them at write time.
+		 */
+		for (i = 0, p = new_mk->data;
+		     i < block->size;
+		     i += req->digest_size, p += req->digest_size)
+			memcpy(p, req->zero_digest, req->digest_size);
+		error = 0;
+	}
 	if (error)
 		goto out_new_mk;
 
@@ -676,12 +690,29 @@ xfs_verity_write_merkle(
 		.value			= (void *)buf,
 		.valuelen		= size,
 	};
-	const char			*p = buf + size - 1;
+	const char			*p;
+	unsigned int			i;
 
-	/* Don't store trailing zeroes. */
+	/*
+	 * If this is a block full of hashes of zeroed blocks, don't bother
+	 * storing the block.  We can synthesize them later.
+	 */
+	for (i = 0, p = buf;
+	     i < size;
+	     i += req->digest_size, p += req->digest_size)
+		if (memcmp(p, req->zero_digest, req->digest_size))
+			break;
+	if (i == size)
+		return 0;
+
+	/*
+	 * Don't store trailing zeroes.  Store at least one byte so that the
+	 * block cannot be mistaken for an elided one.
+	 */
+	p = buf + size - 1;
 	while (p >= (const char *)buf && *p == 0)
 		p--;
-	args.valuelen = p - (const char *)buf + 1;
+	args.valuelen = max(1, p - (const char *)buf + 1);
 
 	xfs_verity_merkle_key_to_disk(&name, pos);
 	return xfs_attr_set(&args);



* [PATCH 40/40] xfs: enable ro-compat fs-verity flag
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (38 preceding siblings ...)
  2024-03-17 16:33   ` [PATCH 39/40] xfs: don't bother storing merkle tree blocks for zeroed data blocks Darrick J. Wong
@ 2024-03-17 16:33   ` Darrick J. Wong
  2024-03-18 16:35   ` [PATCHSET v5.3] fs-verity support for XFS Eric Biggers
  40 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:33 UTC (permalink / raw)
  To: djwong, ebiggers, aalbersh; +Cc: linux-fsdevel, fsverity, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Finalize fs-verity integration in XFS by making the kernel aware of
fs-verity via the ro-compat flag.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: add spaces]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_format.h |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 3ce2902101bc..c3f586d6bf7a 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -355,10 +355,11 @@ xfs_sb_has_compat_feature(
 #define XFS_SB_FEAT_RO_COMPAT_INOBTCNT (1 << 3)		/* inobt block counts */
 #define XFS_SB_FEAT_RO_COMPAT_VERITY   (1 << 4)		/* fs-verity */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
-		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
-		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
-		 XFS_SB_FEAT_RO_COMPAT_REFLINK| \
-		 XFS_SB_FEAT_RO_COMPAT_INOBTCNT)
+		(XFS_SB_FEAT_RO_COMPAT_FINOBT   | \
+		 XFS_SB_FEAT_RO_COMPAT_RMAPBT   | \
+		 XFS_SB_FEAT_RO_COMPAT_REFLINK  | \
+		 XFS_SB_FEAT_RO_COMPAT_INOBTCNT | \
+		 XFS_SB_FEAT_RO_COMPAT_VERITY)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(



* [PATCH 01/20] xfsprogs: add parent pointer support to attribute code
  2024-03-17 16:23 ` Darrick J. Wong
@ 2024-03-17 16:34   ` Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 02/20] xfsprogs: define parent pointer xattr format Darrick J. Wong
                     ` (18 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:34 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers
  Cc: Mark Tinguely, Dave Chinner, Allison Henderson, fsverity,
	linux-fsdevel, linux-xfs

From: Allison Henderson <allison.henderson@oracle.com>

Source kernel commit: 403a9dc2804baec57eb03a9c4ae14ba811f091e5

Add the new parent attribute type. XFS_ATTR_PARENT is used only for
parent pointer entries; it uses reserved blocks like XFS_ATTR_ROOT.

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_attr.c       |    4 +++-
 libxfs/xfs_da_format.h  |    5 ++++-
 libxfs/xfs_log_format.h |    1 +
 3 files changed, 8 insertions(+), 2 deletions(-)


diff --git a/libxfs/xfs_attr.c b/libxfs/xfs_attr.c
index 630065f1..4818eabb 100644
--- a/libxfs/xfs_attr.c
+++ b/libxfs/xfs_attr.c
@@ -922,11 +922,13 @@ xfs_attr_set(
 	struct xfs_inode	*dp = args->dp;
 	struct xfs_mount	*mp = dp->i_mount;
 	struct xfs_trans_res	tres;
-	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
+	bool			rsvd;
 	int			error, local;
 	int			rmt_blks = 0;
 	unsigned int		total;
 
+	rsvd = (args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_PARENT)) != 0;
+
 	if (xfs_is_shutdown(dp->i_mount))
 		return -EIO;
 
diff --git a/libxfs/xfs_da_format.h b/libxfs/xfs_da_format.h
index 060e5c96..5434d4d5 100644
--- a/libxfs/xfs_da_format.h
+++ b/libxfs/xfs_da_format.h
@@ -714,12 +714,15 @@ struct xfs_attr3_leafblock {
 #define	XFS_ATTR_LOCAL_BIT	0	/* attr is stored locally */
 #define	XFS_ATTR_ROOT_BIT	1	/* limit access to trusted attrs */
 #define	XFS_ATTR_SECURE_BIT	2	/* limit access to secure attrs */
+#define	XFS_ATTR_PARENT_BIT	3	/* parent pointer attrs */
 #define	XFS_ATTR_INCOMPLETE_BIT	7	/* attr in middle of create/delete */
 #define XFS_ATTR_LOCAL		(1u << XFS_ATTR_LOCAL_BIT)
 #define XFS_ATTR_ROOT		(1u << XFS_ATTR_ROOT_BIT)
 #define XFS_ATTR_SECURE		(1u << XFS_ATTR_SECURE_BIT)
+#define XFS_ATTR_PARENT		(1u << XFS_ATTR_PARENT_BIT)
 #define XFS_ATTR_INCOMPLETE	(1u << XFS_ATTR_INCOMPLETE_BIT)
-#define XFS_ATTR_NSP_ONDISK_MASK	(XFS_ATTR_ROOT | XFS_ATTR_SECURE)
+#define XFS_ATTR_NSP_ONDISK_MASK \
+			(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)
 
 /*
  * Alignment for namelist and valuelist entries (since they are mixed
diff --git a/libxfs/xfs_log_format.h b/libxfs/xfs_log_format.h
index 16872972..9cbcba4b 100644
--- a/libxfs/xfs_log_format.h
+++ b/libxfs/xfs_log_format.h
@@ -974,6 +974,7 @@ struct xfs_icreate_log {
  */
 #define XFS_ATTRI_FILTER_MASK		(XFS_ATTR_ROOT | \
 					 XFS_ATTR_SECURE | \
+					 XFS_ATTR_PARENT | \
 					 XFS_ATTR_INCOMPLETE)
 
 /*



* [PATCH 02/20] xfsprogs: define parent pointer xattr format
  2024-03-17 16:23 ` Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 01/20] xfsprogs: add parent pointer support to attribute code Darrick J. Wong
@ 2024-03-17 16:34   ` Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 03/20] xfsprogs: Add xfs_verify_pptr Darrick J. Wong
                     ` (17 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:34 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers
  Cc: Dave Chinner, Allison Henderson, fsverity, linux-fsdevel, linux-xfs

From: Allison Henderson <allison.henderson@oracle.com>

Source kernel commit: 655e7fb23dc155b37a2eeadf2c854def053980bf

We need to define the parent pointer attribute format before we start
adding support for it into all the code that needs to use it. The EA
format we will use encodes the following information:

name={parent inode #, parent inode generation, dirent offset}
value={dirent filename}

The inode/gen gives all the information we need to reliably identify the
parent without requiring child->parent lock ordering, and allows
userspace to do pathname component level reconstruction without the
kernel ever needing to verify the parent itself as part of ioctl calls.

By using the dirent offset in the EA name, we have a method of knowing
the exact parent pointer EA we need to modify/remove in rename/unlink
without an unbound EA name search.

By keeping the dirent name in the value, we have enough information to
be able to validate and reconstruct damaged directory trees. While the
diroffset of a filename alone is not unique enough to identify the
child, the {diroffset,filename,child_inode} tuple is sufficient. That
is, if the diroffset gets reused and points to a different filename, we
can detect that from the contents of the EA. If a link of the same name
is created, then we can check whether it points at the same inode as the
parent EA we currently have.
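
As a rough illustration of the format described above, the following
userspace sketch packs the EA name as {parent ino, parent gen, dirent
offset} in big-endian order and compares an EA value against a dirent
name. All demo_* names and DEMO_PPTR_NAME_LEN are hypothetical and are
not part of libxfs; the field widths are assumed from struct
xfs_parent_name_rec in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constant: 8 (ino) + 4 (gen) + 4 (diroffset) bytes. */
#define DEMO_PPTR_NAME_LEN	16

/* Store v at dst in big-endian byte order, nbytes wide. */
static void demo_put_be(uint8_t *dst, uint64_t v, int nbytes)
{
	for (int i = nbytes - 1; i >= 0; i--) {
		dst[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Build the EA name {parent ino, parent gen, dirent offset}. */
static void demo_pptr_encode_name(uint8_t name[DEMO_PPTR_NAME_LEN],
				  uint64_t p_ino, uint32_t p_gen,
				  uint32_t p_diroffset)
{
	assert(name != NULL);
	demo_put_be(name, p_ino, 8);
	demo_put_be(name + 8, p_gen, 4);
	demo_put_be(name + 12, p_diroffset, 4);
}

/*
 * The EA value is the dirent filename.  If a dirent offset is reused
 * for a different filename, the stored value no longer matches.
 */
static bool demo_pptr_value_matches(const char *ea_value, size_t ea_len,
				    const char *dirent_name, size_t dirent_len)
{
	return ea_len == dirent_len &&
	       memcmp(ea_value, dirent_name, ea_len) == 0;
}
```

Because the name is fixed-size binary data, a lookup for rename or
unlink can build the exact 16-byte key up front instead of scanning
every EA on the child.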

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_da_format.h |   25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)


diff --git a/libxfs/xfs_da_format.h b/libxfs/xfs_da_format.h
index 5434d4d5..fa0f46db 100644
--- a/libxfs/xfs_da_format.h
+++ b/libxfs/xfs_da_format.h
@@ -878,4 +878,29 @@ static inline unsigned int xfs_dir2_dirblock_bytes(struct xfs_sb *sbp)
 xfs_failaddr_t xfs_da3_blkinfo_verify(struct xfs_buf *bp,
 				      struct xfs_da3_blkinfo *hdr3);
 
+/*
+ * Parent pointer attribute format definition
+ *
+ * EA name encodes the parent inode number, generation and the offset of
+ * the dirent that points to the child inode. The EA value contains the
+ * same name as the dirent in the parent directory.
+ */
+struct xfs_parent_name_rec {
+	__be64  p_ino;
+	__be32  p_gen;
+	__be32  p_diroffset;
+};
+
+/*
+ * incore version of the above, also contains name pointers so callers
+ * can pass/obtain all the parent pointer information in a single structure
+ */
+struct xfs_parent_name_irec {
+	xfs_ino_t		p_ino;
+	uint32_t		p_gen;
+	xfs_dir2_dataptr_t	p_diroffset;
+	const char		*p_name;
+	uint8_t			p_namelen;
+};
+
 #endif /* __XFS_DA_FORMAT_H__ */



* [PATCH 03/20] xfsprogs: Add xfs_verify_pptr
  2024-03-17 16:23 ` Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 01/20] xfsprogs: add parent pointer support to attribute code Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 02/20] xfsprogs: define parent pointer xattr format Darrick J. Wong
@ 2024-03-17 16:34   ` Darrick J. Wong
  2024-03-17 16:34   ` [PATCH 04/20] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
                     ` (16 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:34 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers
  Cc: Allison Henderson, fsverity, linux-fsdevel, linux-xfs

From: Allison Henderson <allison.henderson@oracle.com>

Source kernel commit: 27e62618672464a8c011ee180878c711a6faed73

Attribute names of parent pointers are not strings.  So we need to modify
attr_namecheck to verify parent pointer records when the XFS_ATTR_PARENT flag is
set.
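
To see why the string check cannot be reused, consider this simplified
userspace model. The demo_* names are hypothetical, and the parent
pointer branch only checks the record size, whereas the real
xfs_verify_pptr in this patch also validates the inode number and the
dirent offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ATTR_PARENT	(1u << 3)	/* same bit as XFS_ATTR_PARENT_BIT */
#define DEMO_PPTR_NAME_LEN	16		/* size of the packed name record */

/* String names must not contain NUL bytes. */
static bool demo_str_namecheck(const void *name, size_t length)
{
	return memchr(name, 0, length) == NULL;
}

/* Choose the check by flag; binary names are checked by size only here. */
static bool demo_namecheck(const void *name, size_t length, unsigned int flags)
{
	assert(name != NULL);
	if (flags & DEMO_ATTR_PARENT)
		return length == DEMO_PPTR_NAME_LEN;
	return demo_str_namecheck(name, length);
}
```

A packed record for a small inode number is mostly zero bytes, so the
string check would reject nearly every valid parent pointer name.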

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_attr.c      |   47 ++++++++++++++++++++++++++++++++++++++++++++---
 libxfs/xfs_attr.h      |    3 ++-
 libxfs/xfs_da_format.h |    8 ++++++++
 repair/attr_repair.c   |   19 ++++++++++++-------
 4 files changed, 66 insertions(+), 11 deletions(-)


diff --git a/libxfs/xfs_attr.c b/libxfs/xfs_attr.c
index 4818eabb..a9241d18 100644
--- a/libxfs/xfs_attr.c
+++ b/libxfs/xfs_attr.c
@@ -1510,9 +1510,33 @@ xfs_attr_node_get(
 	return error;
 }
 
-/* Returns true if the attribute entry name is valid. */
-bool
-xfs_attr_namecheck(
+/*
+ * Verify parent pointer attribute is valid.
+ * Return true on success or false on failure
+ */
+STATIC bool
+xfs_verify_pptr(
+	struct xfs_mount			*mp,
+	const struct xfs_parent_name_rec	*rec)
+{
+	xfs_ino_t				p_ino;
+	xfs_dir2_dataptr_t			p_diroffset;
+
+	p_ino = be64_to_cpu(rec->p_ino);
+	p_diroffset = be32_to_cpu(rec->p_diroffset);
+
+	if (!xfs_verify_ino(mp, p_ino))
+		return false;
+
+	if (p_diroffset > XFS_DIR2_MAX_DATAPTR)
+		return false;
+
+	return true;
+}
+
+/* Returns true if the string attribute entry name is valid. */
+static bool
+xfs_str_attr_namecheck(
 	const void	*name,
 	size_t		length)
 {
@@ -1527,6 +1551,23 @@ xfs_attr_namecheck(
 	return !memchr(name, 0, length);
 }
 
+/* Returns true if the attribute entry name is valid. */
+bool
+xfs_attr_namecheck(
+	struct xfs_mount	*mp,
+	const void		*name,
+	size_t			length,
+	int			flags)
+{
+	if (flags & XFS_ATTR_PARENT) {
+		if (length != sizeof(struct xfs_parent_name_rec))
+			return false;
+		return xfs_verify_pptr(mp, (struct xfs_parent_name_rec *)name);
+	}
+
+	return xfs_str_attr_namecheck(name, length);
+}
+
 int __init
 xfs_attr_intent_init_cache(void)
 {
diff --git a/libxfs/xfs_attr.h b/libxfs/xfs_attr.h
index 81be9b3e..af92cc57 100644
--- a/libxfs/xfs_attr.h
+++ b/libxfs/xfs_attr.h
@@ -547,7 +547,8 @@ int xfs_attr_get(struct xfs_da_args *args);
 int xfs_attr_set(struct xfs_da_args *args);
 int xfs_attr_set_iter(struct xfs_attr_intent *attr);
 int xfs_attr_remove_iter(struct xfs_attr_intent *attr);
-bool xfs_attr_namecheck(const void *name, size_t length);
+bool xfs_attr_namecheck(struct xfs_mount *mp, const void *name, size_t length,
+			int flags);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 void xfs_init_attr_trans(struct xfs_da_args *args, struct xfs_trans_res *tres,
 			 unsigned int *total);
diff --git a/libxfs/xfs_da_format.h b/libxfs/xfs_da_format.h
index fa0f46db..e7045b36 100644
--- a/libxfs/xfs_da_format.h
+++ b/libxfs/xfs_da_format.h
@@ -757,6 +757,14 @@ xfs_attr3_leaf_name(xfs_attr_leafblock_t *leafp, int idx)
 	return &((char *)leafp)[be16_to_cpu(entries[idx].nameidx)];
 }
 
+static inline int
+xfs_attr3_leaf_flags(xfs_attr_leafblock_t *leafp, int idx)
+{
+	struct xfs_attr_leaf_entry *entries = xfs_attr3_leaf_entryp(leafp);
+
+	return entries[idx].flags;
+}
+
 static inline xfs_attr_leaf_name_remote_t *
 xfs_attr3_leaf_name_remote(xfs_attr_leafblock_t *leafp, int idx)
 {
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index f117f9ae..25588b3b 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -292,8 +292,9 @@ process_shortform_attr(
 		}
 
 		/* namecheck checks for null chars in attr names. */
-		if (!libxfs_attr_namecheck(currententry->nameval,
-					   currententry->namelen)) {
+		if (!libxfs_attr_namecheck(mp, currententry->nameval,
+					   currententry->namelen,
+					   currententry->flags)) {
 			do_warn(
 	_("entry contains illegal character in shortform attribute name\n"));
 			junkit = 1;
@@ -469,12 +470,14 @@ process_leaf_attr_local(
 	xfs_dablk_t		da_bno,
 	xfs_ino_t		ino)
 {
-	xfs_attr_leaf_name_local_t *local;
+	xfs_attr_leaf_name_local_t	*local;
+	int				flags;
 
 	local = xfs_attr3_leaf_name_local(leaf, i);
+	flags = xfs_attr3_leaf_flags(leaf, i);
 	if (local->namelen == 0 ||
-	    !libxfs_attr_namecheck(local->nameval,
-				   local->namelen)) {
+	    !libxfs_attr_namecheck(mp, local->nameval,
+				   local->namelen, flags)) {
 		do_warn(
 	_("attribute entry %d in attr block %u, inode %" PRIu64 " has bad name (namelen = %d)\n"),
 			i, da_bno, ino, local->namelen);
@@ -525,12 +528,14 @@ process_leaf_attr_remote(
 {
 	xfs_attr_leaf_name_remote_t *remotep;
 	char*			value;
+	int			flags;
 
 	remotep = xfs_attr3_leaf_name_remote(leaf, i);
+	flags = xfs_attr3_leaf_flags(leaf, i);
 
 	if (remotep->namelen == 0 ||
-	    !libxfs_attr_namecheck(remotep->name,
-				   remotep->namelen) ||
+	    !libxfs_attr_namecheck(mp, remotep->name,
+				   remotep->namelen, flags) ||
 	    be32_to_cpu(entry->hashval) !=
 			libxfs_da_hashname((unsigned char *)&remotep->name[0],
 					   remotep->namelen) ||



* [PATCH 04/20] fs: add FS_XFLAG_VERITY for verity files
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (2 preceding siblings ...)
  2024-03-17 16:34   ` [PATCH 03/20] xfsprogs: Add xfs_verify_pptr Darrick J. Wong
@ 2024-03-17 16:34   ` Darrick J. Wong
  2024-03-17 16:35   ` [PATCH 05/20] xfs: add attribute type for fs-verity Darrick J. Wong
                     ` (15 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:34 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Add the extended attribute flag FS_XFLAG_VERITY for inodes with
fs-verity enabled.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
[djwong: fix broken verity flag checks]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 include/linux.h |    4 ++++
 1 file changed, 4 insertions(+)


diff --git a/include/linux.h b/include/linux.h
index 95a0deee..d98d387e 100644
--- a/include/linux.h
+++ b/include/linux.h
@@ -249,6 +249,10 @@ struct fsxattr {
 #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
 #endif
 
+#ifndef FS_XFLAG_VERITY
+#define FS_XFLAG_VERITY		0x00020000	/* fs-verity enabled */
+#endif
+
 /*
  * Reminder: anything added to this file will be compiled into downstream
  * userspace projects!



* [PATCH 05/20] xfs: add attribute type for fs-verity
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (3 preceding siblings ...)
  2024-03-17 16:34   ` [PATCH 04/20] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
@ 2024-03-17 16:35   ` Darrick J. Wong
  2024-03-17 16:35   ` [PATCH 06/20] xfs: add fs-verity ro-compat flag Darrick J. Wong
                     ` (14 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:35 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

The Merkle tree blocks and descriptor are stored in the extended
attributes of the inode. Add a new attribute type for fs-verity
metadata. Add XFS_ATTR_INTERNAL_MASK to skip parent pointer and
fs-verity attributes, as those are only for internal use. While we're
at it, add a few comments in relevant places noting that internally
visible attributes are not supposed to be handled via the interface
defined in xfs_xattr.c.
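
A minimal sketch of how such a mask can be used to hide internal
attributes follows. The demo_* helpers are hypothetical and only mirror
the bit positions defined in this patch; how the kernel listing code
actually applies the mask is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions mirror the XFS_ATTR_*_BIT definitions in this patch. */
#define DEMO_ATTR_ROOT		(1u << 1)
#define DEMO_ATTR_SECURE	(1u << 2)
#define DEMO_ATTR_PARENT	(1u << 3)
#define DEMO_ATTR_VERITY	(1u << 4)
#define DEMO_ATTR_INTERNAL_MASK	(DEMO_ATTR_PARENT | DEMO_ATTR_VERITY)

/* Hypothetical helper: true if an attr with these flags may be shown. */
static bool demo_attr_is_user_visible(unsigned int flags)
{
	return (flags & DEMO_ATTR_INTERNAL_MASK) == 0;
}

/* Count the entries that a user-facing listing would report. */
static size_t demo_count_user_visible(const unsigned int *flags, size_t n)
{
	size_t visible = 0;

	assert(flags != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (demo_attr_is_user_visible(flags[i]))
			visible++;
	return visible;
}
```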

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_da_format.h  |   10 +++++++++-
 libxfs/xfs_log_format.h |    1 +
 2 files changed, 10 insertions(+), 1 deletion(-)


diff --git a/libxfs/xfs_da_format.h b/libxfs/xfs_da_format.h
index e7045b36..3a35ba58 100644
--- a/libxfs/xfs_da_format.h
+++ b/libxfs/xfs_da_format.h
@@ -715,14 +715,22 @@ struct xfs_attr3_leafblock {
 #define	XFS_ATTR_ROOT_BIT	1	/* limit access to trusted attrs */
 #define	XFS_ATTR_SECURE_BIT	2	/* limit access to secure attrs */
 #define	XFS_ATTR_PARENT_BIT	3	/* parent pointer attrs */
+#define	XFS_ATTR_VERITY_BIT	4	/* verity merkle tree and descriptor */
 #define	XFS_ATTR_INCOMPLETE_BIT	7	/* attr in middle of create/delete */
 #define XFS_ATTR_LOCAL		(1u << XFS_ATTR_LOCAL_BIT)
 #define XFS_ATTR_ROOT		(1u << XFS_ATTR_ROOT_BIT)
 #define XFS_ATTR_SECURE		(1u << XFS_ATTR_SECURE_BIT)
 #define XFS_ATTR_PARENT		(1u << XFS_ATTR_PARENT_BIT)
+#define XFS_ATTR_VERITY		(1u << XFS_ATTR_VERITY_BIT)
 #define XFS_ATTR_INCOMPLETE	(1u << XFS_ATTR_INCOMPLETE_BIT)
 #define XFS_ATTR_NSP_ONDISK_MASK \
-			(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)
+			(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT | \
+			 XFS_ATTR_VERITY)
+
+/*
+ * Internal attributes not exposed to the user
+ */
+#define XFS_ATTR_INTERNAL_MASK (XFS_ATTR_PARENT | XFS_ATTR_VERITY)
 
 /*
  * Alignment for namelist and valuelist entries (since they are mixed
diff --git a/libxfs/xfs_log_format.h b/libxfs/xfs_log_format.h
index 9cbcba4b..407fadfb 100644
--- a/libxfs/xfs_log_format.h
+++ b/libxfs/xfs_log_format.h
@@ -975,6 +975,7 @@ struct xfs_icreate_log {
 #define XFS_ATTRI_FILTER_MASK		(XFS_ATTR_ROOT | \
 					 XFS_ATTR_SECURE | \
 					 XFS_ATTR_PARENT | \
+					 XFS_ATTR_VERITY | \
 					 XFS_ATTR_INCOMPLETE)
 
 /*



* [PATCH 06/20] xfs: add fs-verity ro-compat flag
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (4 preceding siblings ...)
  2024-03-17 16:35   ` [PATCH 05/20] xfs: add attribute type for fs-verity Darrick J. Wong
@ 2024-03-17 16:35   ` Darrick J. Wong
  2024-03-17 16:35   ` [PATCH 07/20] xfs: add inode on-disk VERITY flag Darrick J. Wong
                     ` (13 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:35 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

To mark inodes with fs-verity enabled, the new XFS_DIFLAG2_VERITY flag
will be added in a later patch. This requires a ro-compat flag to let
older kernels know that a filesystem with fs-verity cannot be modified.
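
The effect of a ro-compat bit can be sketched as below. This is a
simplified model rather than the libxfs mount path: the demo_* names
are hypothetical, and the rule assumed is that a kernel may still mount
read-only when it sees ro-compat bits it does not understand.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RO_COMPAT_FINOBT	(1u << 0)
#define DEMO_RO_COMPAT_RMAPBT	(1u << 1)
#define DEMO_RO_COMPAT_REFLINK	(1u << 2)
#define DEMO_RO_COMPAT_INOBTCNT	(1u << 3)
#define DEMO_RO_COMPAT_VERITY	(1u << 4)

/* Feature set of a hypothetical kernel that predates fs-verity. */
#define DEMO_OLD_KERNEL_KNOWN \
	(DEMO_RO_COMPAT_FINOBT | DEMO_RO_COMPAT_RMAPBT | \
	 DEMO_RO_COMPAT_REFLINK | DEMO_RO_COMPAT_INOBTCNT)

/* Unknown ro-compat bits are tolerated only for read-only mounts. */
static bool demo_mount_allowed(uint32_t sb_ro_compat, uint32_t known,
			       bool readonly)
{
	uint32_t unknown = sb_ro_compat & ~known;

	return unknown == 0 || readonly;
}
```

With the verity bit set in the superblock, the older feature set above
permits only a read-only mount, so the sealed files cannot be changed
behind fs-verity's back.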

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 include/xfs_mount.h |    2 ++
 libxfs/xfs_format.h |    1 +
 libxfs/xfs_sb.c     |    2 ++
 3 files changed, 5 insertions(+)


diff --git a/include/xfs_mount.h b/include/xfs_mount.h
index 9c492b8f..e88535cd 100644
--- a/include/xfs_mount.h
+++ b/include/xfs_mount.h
@@ -169,6 +169,7 @@ typedef struct xfs_mount {
 #define XFS_FEAT_BIGTIME	(1ULL << 24)	/* large timestamps */
 #define XFS_FEAT_NEEDSREPAIR	(1ULL << 25)	/* needs xfs_repair */
 #define XFS_FEAT_NREXT64	(1ULL << 26)	/* large extent counters */
+#define XFS_FEAT_VERITY		(1ULL << 27)	/* fs-verity */
 
 #define __XFS_HAS_FEAT(name, NAME) \
 static inline bool xfs_has_ ## name (struct xfs_mount *mp) \
@@ -213,6 +214,7 @@ __XFS_HAS_FEAT(inobtcounts, INOBTCNT)
 __XFS_HAS_FEAT(bigtime, BIGTIME)
 __XFS_HAS_FEAT(needsrepair, NEEDSREPAIR)
 __XFS_HAS_FEAT(large_extent_counts, NREXT64)
+__XFS_HAS_FEAT(verity, VERITY)
 
 /* Kernel mount features that we don't support */
 #define __XFS_UNSUPP_FEAT(name) \
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 2b2f9050..93d280eb 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -353,6 +353,7 @@ xfs_sb_has_compat_feature(
 #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)		/* reverse map btree */
 #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)		/* reflinked files */
 #define XFS_SB_FEAT_RO_COMPAT_INOBTCNT (1 << 3)		/* inobt block counts */
+#define XFS_SB_FEAT_RO_COMPAT_VERITY   (1 << 4)		/* fs-verity */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
 		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
 		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
index 00b0a937..d6755181 100644
--- a/libxfs/xfs_sb.c
+++ b/libxfs/xfs_sb.c
@@ -161,6 +161,8 @@ xfs_sb_version_to_features(
 		features |= XFS_FEAT_REFLINK;
 	if (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT)
 		features |= XFS_FEAT_INOBTCNT;
+	if (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_VERITY)
+		features |= XFS_FEAT_VERITY;
 	if (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_FTYPE)
 		features |= XFS_FEAT_FTYPE;
 	if (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_SPINODES)



* [PATCH 07/20] xfs: add inode on-disk VERITY flag
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (5 preceding siblings ...)
  2024-03-17 16:35   ` [PATCH 06/20] xfs: add fs-verity ro-compat flag Darrick J. Wong
@ 2024-03-17 16:35   ` Darrick J. Wong
  2024-03-17 16:35   ` [PATCH 08/20] xfs: add fs-verity support Darrick J. Wong
                     ` (12 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:35 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Source kernel commit: ff6c7e66f70cb7239fcc6a1011f47132091d679e

Add a flag to mark inodes which have fs-verity enabled on them (i.e.
the descriptor exists and the tree is built).

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---
 libxfs/xfs_format.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 93d280eb..3ce29021 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -1085,16 +1085,18 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
 #define XFS_DIFLAG2_COWEXTSIZE_BIT   2  /* copy on write extent size hint */
 #define XFS_DIFLAG2_BIGTIME_BIT	3	/* big timestamps */
 #define XFS_DIFLAG2_NREXT64_BIT 4	/* large extent counters */
+#define XFS_DIFLAG2_VERITY_BIT	5	/* inode sealed by fsverity */
 
 #define XFS_DIFLAG2_DAX		(1 << XFS_DIFLAG2_DAX_BIT)
 #define XFS_DIFLAG2_REFLINK     (1 << XFS_DIFLAG2_REFLINK_BIT)
 #define XFS_DIFLAG2_COWEXTSIZE  (1 << XFS_DIFLAG2_COWEXTSIZE_BIT)
 #define XFS_DIFLAG2_BIGTIME	(1 << XFS_DIFLAG2_BIGTIME_BIT)
 #define XFS_DIFLAG2_NREXT64	(1 << XFS_DIFLAG2_NREXT64_BIT)
+#define XFS_DIFLAG2_VERITY	(1 << XFS_DIFLAG2_VERITY_BIT)
 
 #define XFS_DIFLAG2_ANY \
 	(XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE | \
-	 XFS_DIFLAG2_BIGTIME | XFS_DIFLAG2_NREXT64)
+	 XFS_DIFLAG2_BIGTIME | XFS_DIFLAG2_NREXT64 | XFS_DIFLAG2_VERITY)
 
 static inline bool xfs_dinode_has_bigtime(const struct xfs_dinode *dip)
 {



* [PATCH 08/20] xfs: add fs-verity support
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (6 preceding siblings ...)
  2024-03-17 16:35   ` [PATCH 07/20] xfs: add inode on-disk VERITY flag Darrick J. Wong
@ 2024-03-17 16:35   ` Darrick J. Wong
  2024-03-17 16:36   ` [PATCH 09/20] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
                     ` (11 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:35 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Add integration with fs-verity. XFS stores fs-verity metadata in
the extended file attributes. The metadata consists of the verity
descriptor and Merkle tree blocks.

The descriptor is stored under the "vdesc" extended attribute. The
Merkle tree blocks are stored under binary indexes which are offsets
into the Merkle tree.

When fs-verity is enabled on an inode, the XFS_IVERITY_CONSTRUCTION
flag is set, meaning that the Merkle tree is being built. The
initialization ends with storing the verity descriptor and setting
the inode on-disk flag (XFS_DIFLAG2_VERITY).

Verification on read is done in the read path of iomap.
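
The two kinds of verity xattr names can be modelled in userspace as
follows. This is only a sketch: the demo_* names are hypothetical, the
8-byte big-endian key is assumed from struct xfs_verity_merkle_key in
this patch, and the classification relies solely on name length, as the
name check in this patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MERKLE_KEY_LEN	8	/* size of the be64 offset key */
#define DEMO_VDESC_NAME		"vdesc"
#define DEMO_VDESC_NAME_LEN	(sizeof(DEMO_VDESC_NAME) - 1)

enum demo_verity_kind {
	DEMO_VERITY_BAD,
	DEMO_VERITY_MERKLE,
	DEMO_VERITY_DESC,
};

/* Encode a Merkle tree byte offset as a big-endian 8-byte xattr name. */
static void demo_merkle_key_encode(uint8_t key[DEMO_MERKLE_KEY_LEN],
				   uint64_t off)
{
	assert(key != NULL);
	for (int i = DEMO_MERKLE_KEY_LEN - 1; i >= 0; i--) {
		key[i] = (uint8_t)(off & 0xff);
		off >>= 8;
	}
}

/* Recover the byte offset from an 8-byte xattr name. */
static uint64_t demo_merkle_key_decode(const uint8_t key[DEMO_MERKLE_KEY_LEN])
{
	uint64_t off = 0;

	for (int i = 0; i < DEMO_MERKLE_KEY_LEN; i++)
		off = (off << 8) | key[i];
	return off;
}

/* Classify a verity xattr name by its length alone. */
static enum demo_verity_kind demo_verity_classify(size_t namelen)
{
	if (namelen == DEMO_MERKLE_KEY_LEN)
		return DEMO_VERITY_MERKLE;
	if (namelen == DEMO_VDESC_NAME_LEN)
		return DEMO_VERITY_DESC;
	return DEMO_VERITY_BAD;
}
```

Since "vdesc" is five bytes and a Merkle key is eight, the two name
types cannot be confused with each other.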

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: replace caching implementation with an xarray, other cleanups]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_attr.c      |   12 ++++++++++++
 libxfs/xfs_da_format.h |   32 ++++++++++++++++++++++++++++++++
 libxfs/xfs_ondisk.h    |    4 ++++
 3 files changed, 48 insertions(+)


diff --git a/libxfs/xfs_attr.c b/libxfs/xfs_attr.c
index a9241d18..30cf3688 100644
--- a/libxfs/xfs_attr.c
+++ b/libxfs/xfs_attr.c
@@ -1565,6 +1565,18 @@ xfs_attr_namecheck(
 		return xfs_verify_pptr(mp, (struct xfs_parent_name_rec *)name);
 	}
 
+	if (flags & XFS_ATTR_VERITY) {
+		/* Merkle tree pages are stored under u64 indexes */
+		if (length == sizeof(struct xfs_verity_merkle_key))
+			return true;
+
+		/* Verity descriptor blocks are held in a named attribute. */
+		if (length == XFS_VERITY_DESCRIPTOR_NAME_LEN)
+			return true;
+
+		return false;
+	}
+
 	return xfs_str_attr_namecheck(name, length);
 }
 
diff --git a/libxfs/xfs_da_format.h b/libxfs/xfs_da_format.h
index 3a35ba58..2d2314a5 100644
--- a/libxfs/xfs_da_format.h
+++ b/libxfs/xfs_da_format.h
@@ -919,4 +919,36 @@ struct xfs_parent_name_irec {
 	uint8_t			p_namelen;
 };
 
+/*
+ * fs-verity attribute name format
+ *
+ * Merkle tree blocks are stored under extended attributes of the inode. The
+ * name of the attributes are byte offsets into merkle tree.
+ */
+struct xfs_verity_merkle_key {
+	__be64	vi_merkleoff;
+};
+
+static inline void
+xfs_verity_merkle_key_to_disk(
+	struct xfs_verity_merkle_key	*key,
+	uint64_t			offset)
+{
+	key->vi_merkleoff = cpu_to_be64(offset);
+}
+
+static inline uint64_t
+xfs_verity_merkle_key_from_disk(
+	const void			*attr_name)
+{
+	const struct xfs_verity_merkle_key *key = attr_name;
+
+	return be64_to_cpu(key->vi_merkleoff);
+}
+
+
+/* ondisk xattr name used for the fsverity descriptor */
+#define XFS_VERITY_DESCRIPTOR_NAME	"vdesc"
+#define XFS_VERITY_DESCRIPTOR_NAME_LEN	(sizeof(XFS_VERITY_DESCRIPTOR_NAME) - 1)
+
 #endif /* __XFS_DA_FORMAT_H__ */
diff --git a/libxfs/xfs_ondisk.h b/libxfs/xfs_ondisk.h
index 81885a6a..16f4ef2f 100644
--- a/libxfs/xfs_ondisk.h
+++ b/libxfs/xfs_ondisk.h
@@ -194,6 +194,10 @@ xfs_check_ondisk_structs(void)
 	XFS_CHECK_VALUE(XFS_DQ_BIGTIME_EXPIRY_MIN << XFS_DQ_BIGTIME_SHIFT, 4);
 	XFS_CHECK_VALUE(XFS_DQ_BIGTIME_EXPIRY_MAX << XFS_DQ_BIGTIME_SHIFT,
 			16299260424LL);
+
+	/* fs-verity xattrs */
+	XFS_CHECK_STRUCT_SIZE(struct xfs_verity_merkle_key,	8);
+	XFS_CHECK_VALUE(sizeof(XFS_VERITY_DESCRIPTOR_NAME),	6);
 }
 
 #endif /* __XFS_ONDISK_H */



* [PATCH 09/20] xfs: advertise fs-verity being available on filesystem
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (7 preceding siblings ...)
  2024-03-17 16:35   ` [PATCH 08/20] xfs: add fs-verity support Darrick J. Wong
@ 2024-03-17 16:36   ` Darrick J. Wong
  2024-03-17 16:36   ` [PATCH 10/20] xfs: create separate name hash function for xattrs Darrick J. Wong
                     ` (10 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:36 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Advertise that this filesystem supports fsverity.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_fs.h |    1 +
 libxfs/xfs_sb.c |    2 ++
 2 files changed, 3 insertions(+)


diff --git a/libxfs/xfs_fs.h b/libxfs/xfs_fs.h
index ca1b17d0..2f372088 100644
--- a/libxfs/xfs_fs.h
+++ b/libxfs/xfs_fs.h
@@ -239,6 +239,7 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_BIGTIME	(1 << 21) /* 64-bit nsec timestamps */
 #define XFS_FSOP_GEOM_FLAGS_INOBTCNT	(1 << 22) /* inobt btree counter */
 #define XFS_FSOP_GEOM_FLAGS_NREXT64	(1 << 23) /* large extent counters */
+#define XFS_FSOP_GEOM_FLAGS_VERITY	(1 << 24) /* fs-verity */
 
 /*
  * Minimum and maximum sizes need for growth checks.
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
index d6755181..fc2269a2 100644
--- a/libxfs/xfs_sb.c
+++ b/libxfs/xfs_sb.c
@@ -1258,6 +1258,8 @@ xfs_fs_geometry(
 	}
 	if (xfs_has_large_extent_counts(mp))
 		geo->flags |= XFS_FSOP_GEOM_FLAGS_NREXT64;
+	if (xfs_has_verity(mp))
+		geo->flags |= XFS_FSOP_GEOM_FLAGS_VERITY;
 	geo->rtsectsize = sbp->sb_blocksize;
 	geo->dirblocksize = xfs_dir2_dirblock_bytes(sbp);
 



* [PATCH 10/20] xfs: create separate name hash function for xattrs
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (8 preceding siblings ...)
  2024-03-17 16:36   ` [PATCH 09/20] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
@ 2024-03-17 16:36   ` Darrick J. Wong
  2024-03-17 16:36   ` [PATCH 11/20] xfs: use merkle tree offset as attr hash Darrick J. Wong
                     ` (9 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:36 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a new hashing function for extended attribute names.  The next
patch needs this so it can modify the hash strategy for verity xattrs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 db/hash.c                |    4 ++--
 db/metadump.c            |   26 +++++++++++++++-----------
 libxfs/libxfs_api_defs.h |    1 +
 libxfs/xfs_attr.c        |   16 ++++++++++++++--
 libxfs/xfs_attr.h        |    3 +++
 libxfs/xfs_attr_leaf.c   |    4 ++--
 repair/attr_repair.c     |    9 ++++++---
 7 files changed, 43 insertions(+), 20 deletions(-)


diff --git a/db/hash.c b/db/hash.c
index 05a94f24..df214c16 100644
--- a/db/hash.c
+++ b/db/hash.c
@@ -73,7 +73,7 @@ hash_f(
 		if (use_dir2_hash)
 			hashval = libxfs_dir2_hashname(mp, &xname);
 		else
-			hashval = libxfs_da_hashname(xname.name, xname.len);
+			hashval = libxfs_attr_hashname(0, xname.name, xname.len);
 		dbprintf("0x%x\n", hashval);
 	}
 
@@ -306,7 +306,7 @@ collide_xattrs(
 	unsigned long		i;
 	int			error;
 
-	old_hash = libxfs_da_hashname((uint8_t *)name, namelen);
+	old_hash = libxfs_attr_hashname(0, (uint8_t *)name, namelen);
 
 	if (fd >= 0) {
 		/*
diff --git a/db/metadump.c b/db/metadump.c
index a656ef57..95f58363 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -823,6 +823,7 @@ handle_duplicate_name(xfs_dahash_t hash, size_t name_len, unsigned char *name)
 static inline xfs_dahash_t
 dirattr_hashname(
 	bool		is_dirent,
+	unsigned int	attr_flags,
 	const uint8_t	*name,
 	int		namelen)
 {
@@ -835,12 +836,13 @@ dirattr_hashname(
 		return libxfs_dir2_hashname(mp, &xname);
 	}
 
-	return libxfs_da_hashname(name, namelen);
+	return libxfs_attr_hashname(attr_flags, name, namelen);
 }
 
 static void
 generate_obfuscated_name(
 	xfs_ino_t		ino,
+	unsigned int		attr_flags,
 	int			namelen,
 	unsigned char		*name)
 {
@@ -866,9 +868,9 @@ generate_obfuscated_name(
 
 	/* Obfuscate the name (if possible) */
 
-	hash = dirattr_hashname(ino != 0, name, namelen);
+	hash = dirattr_hashname(ino != 0, attr_flags, name, namelen);
 	obfuscate_name(hash, namelen, name, ino != 0);
-	ASSERT(hash == dirattr_hashname(ino != 0, name, namelen));
+	ASSERT(hash == dirattr_hashname(ino != 0, attr_flags, name, namelen));
 
 	/*
 	 * Make sure the name is not something already seen.  If we
@@ -945,7 +947,7 @@ process_sf_dir(
 		if (metadump.obfuscate)
 			generate_obfuscated_name(
 					 libxfs_dir2_sf_get_ino(mp, sfp, sfep),
-					 namelen, &sfep->name[0]);
+					 0, namelen, &sfep->name[0]);
 
 		sfep = (xfs_dir2_sf_entry_t *)((char *)sfep +
 				libxfs_dir2_sf_entsize(mp, sfp, namelen));
@@ -1071,8 +1073,8 @@ process_sf_attr(
 		}
 
 		if (metadump.obfuscate) {
-			generate_obfuscated_name(0, asfep->namelen,
-						 &asfep->nameval[0]);
+			generate_obfuscated_name(0, asfep->flags,
+					asfep->namelen, &asfep->nameval[0]);
 			memset(&asfep->nameval[asfep->namelen], 'v',
 			       asfep->valuelen);
 		}
@@ -1283,7 +1285,7 @@ process_dir_data_block(
 
 		if (metadump.obfuscate)
 			generate_obfuscated_name(be64_to_cpu(dep->inumber),
-					 dep->namelen, &dep->name[0]);
+					 0, dep->namelen, &dep->name[0]);
 		dir_offset += length;
 		ptr += length;
 		/* Zero the unused space after name, up to the tag */
@@ -1452,8 +1454,9 @@ process_attr_block(
 				break;
 			}
 			if (metadump.obfuscate) {
-				generate_obfuscated_name(0, local->namelen,
-					&local->nameval[0]);
+				generate_obfuscated_name(0, entry->flags,
+						local->namelen,
+						&local->nameval[0]);
 				memset(&local->nameval[local->namelen], 'v',
 					be16_to_cpu(local->valuelen));
 			}
@@ -1475,8 +1478,9 @@ process_attr_block(
 				break;
 			}
 			if (metadump.obfuscate) {
-				generate_obfuscated_name(0, remote->namelen,
-							 &remote->name[0]);
+				generate_obfuscated_name(0, entry->flags,
+						remote->namelen,
+						&remote->name[0]);
 				add_remote_vals(be32_to_cpu(remote->valueblk),
 						be32_to_cpu(remote->valuelen));
 			}
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index 9d2084e2..ccc92a83 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -44,6 +44,7 @@
 #define xfs_attr_set			libxfs_attr_set
 #define xfs_attr_sf_firstentry		libxfs_attr_sf_firstentry
 #define xfs_attr_shortform_verify	libxfs_attr_shortform_verify
+#define xfs_attr_hashname		libxfs_attr_hashname
 
 #define __xfs_bmap_add_free		__libxfs_bmap_add_free
 #define xfs_bmap_validate_extent	libxfs_bmap_validate_extent
diff --git a/libxfs/xfs_attr.c b/libxfs/xfs_attr.c
index 30cf3688..aca65971 100644
--- a/libxfs/xfs_attr.c
+++ b/libxfs/xfs_attr.c
@@ -234,6 +234,16 @@ xfs_attr_get_ilocked(
 	return xfs_attr_node_get(args);
 }
 
+/* Compute hash for an extended attribute name. */
+xfs_dahash_t
+xfs_attr_hashname(
+	unsigned int		attr_flags,
+	const uint8_t		*name,
+	unsigned int		namelen)
+{
+	return xfs_da_hashname(name, namelen);
+}
+
 /*
  * Retrieve an extended attribute by name, and its value if requested.
  *
@@ -264,7 +274,8 @@ xfs_attr_get(
 
 	args->geo = args->dp->i_mount->m_attr_geo;
 	args->whichfork = XFS_ATTR_FORK;
-	args->hashval = xfs_da_hashname(args->name, args->namelen);
+	args->hashval = xfs_attr_hashname(args->attr_filter, args->name,
+					  args->namelen);
 
 	/* Entirely possible to look up a name which doesn't exist */
 	args->op_flags = XFS_DA_OP_OKNOENT;
@@ -938,7 +949,8 @@ xfs_attr_set(
 
 	args->geo = mp->m_attr_geo;
 	args->whichfork = XFS_ATTR_FORK;
-	args->hashval = xfs_da_hashname(args->name, args->namelen);
+	args->hashval = xfs_attr_hashname(args->attr_filter, args->name,
+					  args->namelen);
 
 	/*
 	 * We have no control over the attribute names that userspace passes us
diff --git a/libxfs/xfs_attr.h b/libxfs/xfs_attr.h
index af92cc57..30cf51f3 100644
--- a/libxfs/xfs_attr.h
+++ b/libxfs/xfs_attr.h
@@ -619,4 +619,7 @@ extern struct kmem_cache *xfs_attr_intent_cache;
 int __init xfs_attr_intent_init_cache(void);
 void xfs_attr_intent_destroy_cache(void);
 
+xfs_dahash_t xfs_attr_hashname(unsigned int attr_flags,
+		const uint8_t *name_string, unsigned int name_length);
+
 #endif	/* __XFS_ATTR_H__ */
diff --git a/libxfs/xfs_attr_leaf.c b/libxfs/xfs_attr_leaf.c
index 663347b1..2459a1e7 100644
--- a/libxfs/xfs_attr_leaf.c
+++ b/libxfs/xfs_attr_leaf.c
@@ -909,8 +909,8 @@ xfs_attr_shortform_to_leaf(
 		nargs.namelen = sfe->namelen;
 		nargs.value = &sfe->nameval[nargs.namelen];
 		nargs.valuelen = sfe->valuelen;
-		nargs.hashval = xfs_da_hashname(sfe->nameval,
-						sfe->namelen);
+		nargs.hashval = xfs_attr_hashname(sfe->flags, sfe->nameval,
+						  sfe->namelen);
 		nargs.attr_filter = sfe->flags & XFS_ATTR_NSP_ONDISK_MASK;
 		error = xfs_attr3_leaf_lookup_int(bp, &nargs); /* set a->index */
 		ASSERT(error == -ENOATTR);
diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 25588b3b..9c41cb21 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -492,8 +492,10 @@ process_leaf_attr_local(
 	 * ordering anyway in case both the name value and the
 	 * hashvalue were wrong but matched. Unlikely, however.
 	 */
-	if (be32_to_cpu(entry->hashval) != libxfs_da_hashname(
-				&local->nameval[0], local->namelen) ||
+	if (be32_to_cpu(entry->hashval) !=
+			libxfs_attr_hashname(entry->flags,
+					     &local->nameval[0],
+					     local->namelen) ||
 				be32_to_cpu(entry->hashval) < last_hashval) {
 		do_warn(
 	_("bad hashvalue for attribute entry %d in attr block %u, inode %" PRIu64 "\n"),
@@ -537,7 +539,8 @@ process_leaf_attr_remote(
 	    !libxfs_attr_namecheck(mp, remotep->name,
 				   remotep->namelen, flags) ||
 	    be32_to_cpu(entry->hashval) !=
-			libxfs_da_hashname((unsigned char *)&remotep->name[0],
+			libxfs_attr_hashname(entry->flags,
+					   (unsigned char *)&remotep->name[0],
 					   remotep->namelen) ||
 	    be32_to_cpu(entry->hashval) < last_hashval ||
 	    be32_to_cpu(remotep->valueblk) == 0) {



* [PATCH 11/20] xfs: use merkle tree offset as attr hash
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (9 preceding siblings ...)
  2024-03-17 16:36   ` [PATCH 10/20] xfs: create separate name hash function for xattrs Darrick J. Wong
@ 2024-03-17 16:36   ` Darrick J. Wong
  2024-03-17 16:36   ` [PATCH 12/20] xfs: enable ro-compat fs-verity flag Darrick J. Wong
                     ` (8 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:36 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

I was exploring the fsverity metadata with xfs_db after creating a 220MB
verity file, and I noticed the following in the debugger output:

entries[0-75] = [hashval,nameidx,incomplete,root,secure,local,parent,verity]
0:[0,4076,0,0,0,0,0,1]
1:[0,1472,0,0,0,1,0,1]
2:[0x800,4056,0,0,0,0,0,1]
3:[0x800,4036,0,0,0,0,0,1]
...
72:[0x12000,2716,0,0,0,0,0,1]
73:[0x12000,2696,0,0,0,0,0,1]
74:[0x12800,2676,0,0,0,0,0,1]
75:[0x12800,2656,0,0,0,0,0,1]
...
nvlist[0].merkle_off = 0x18000
nvlist[1].merkle_off = 0
nvlist[2].merkle_off = 0x19000
nvlist[3].merkle_off = 0x1000
...
nvlist[71].merkle_off = 0x5b000
nvlist[72].merkle_off = 0x44000
nvlist[73].merkle_off = 0x5c000
nvlist[74].merkle_off = 0x45000
nvlist[75].merkle_off = 0x5d000

Within just this attr leaf block, there are 76 attr entries, but only 38
distinct hash values.  There are 415 merkle tree blocks for this file,
but we already have hash collisions.  This isn't good performance from
the standard da hash function because we're mostly shifting and rolling
zeroes around.
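
To make the collisions concrete, here is a minimal standalone sketch of
the da name hash applied to merkle tree keys.  This is not the libxfs
code: rol32, da_hashname and da_hash_merkle_off are local stand-ins
written for illustration, and the sketch assumes the key is the tree
byte offset stored as a big-endian u64, which is consistent with the
xfs_db output above.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t w, unsigned int n)
{
	return (w << n) | (w >> (32 - n));
}

/* Illustrative reimplementation of the da name hash (assumption). */
static uint32_t da_hashname(const uint8_t *name, int namelen)
{
	uint32_t hash;

	/* Consume the name four bytes at a time. */
	for (hash = 0; namelen >= 4; namelen -= 4, name += 4)
		hash = ((uint32_t)name[0] << 21) ^ ((uint32_t)name[1] << 14) ^
		       ((uint32_t)name[2] << 7) ^ (uint32_t)name[3] ^
		       rol32(hash, 7 * 4);

	/* Fold in whatever is left over. */
	switch (namelen) {
	case 3:
		return ((uint32_t)name[0] << 14) ^ ((uint32_t)name[1] << 7) ^
		       (uint32_t)name[2] ^ rol32(hash, 7 * 3);
	case 2:
		return ((uint32_t)name[0] << 7) ^ (uint32_t)name[1] ^
		       rol32(hash, 7 * 2);
	case 1:
		return (uint32_t)name[0] ^ rol32(hash, 7 * 1);
	default:
		return hash;
	}
}

/* Hash a merkle tree byte offset encoded as a big-endian u64 key. */
static uint32_t da_hash_merkle_off(uint64_t off)
{
	uint8_t key[8];
	int i;

	for (i = 0; i < 8; i++)
		key[i] = (uint8_t)(off >> (56 - 8 * i));
	return da_hashname(key, sizeof(key));
}
```

With these helpers, offsets 0 and 0x18000 both hash to 0, and offsets
0x1000 and 0x19000 both hash to 0x800, which lines up with entries 0-3
in the dump.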

However, we don't even have to do that much work -- the merkle tree
block keys are themselves u64 values.  Shift that value right by
XFS_VERITY_MIN_MERKLE_BLOCKLOG (10) bits, truncate the result to 32 bits
(the size of xfs_dahash_t), and use that for the hash.  We won't have
any collisions between merkle tree blocks until that tree grows to 2^32
minimum-sized (1k) blocks.  On a 4k block filesystem, we won't hit that
unless the file contains more than 2^49 bytes, assuming sha256.
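
The arithmetic can be checked with a short sketch.  The helper names
below are made up for illustration; only the shift of 10 comes from the
patch.  data_bytes_covered is a rough estimate that counts leaf-level
hashes only and ignores the interior levels of the tree.

```c
#include <stdint.h>

#define MIN_MERKLE_BLOCKLOG	10	/* XFS_VERITY_MIN_MERKLE_BLOCKLOG */

/* Hash for a merkle tree key: scale the byte offset down, then truncate. */
static uint32_t merkle_off_hash(uint64_t off)
{
	return (uint32_t)(off >> MIN_MERKLE_BLOCKLOG);
}

/* Smallest merkle tree byte offset whose hash wraps back to zero. */
static uint64_t first_wrapped_offset(void)
{
	return (uint64_t)1 << (32 + MIN_MERKLE_BLOCKLOG);
}

/*
 * Rough amount of file data described by tree_bytes worth of leaf
 * hashes, given the data block size log and the digest size in bytes.
 */
static uint64_t data_bytes_covered(uint64_t tree_bytes,
				   unsigned int blocklog,
				   unsigned int digest_size)
{
	uint64_t nr_hashes = tree_bytes / digest_size;

	return nr_hashes << blocklog;
}
```

For the offsets in the dump, 0 and 0x18000 now hash to 0 and 0x60.  The
first wraparound happens at a tree offset of 2^42 bytes, and with 4k
blocks and 32-byte sha256 digests that much tree describes about 2^49
bytes of file data.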

As a side effect, the keys for merkle tree blocks get written out in
roughly sequential order, though I didn't observe any change in
performance.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_attr.c      |    7 +++++++
 libxfs/xfs_da_format.h |    2 ++
 2 files changed, 9 insertions(+)


diff --git a/libxfs/xfs_attr.c b/libxfs/xfs_attr.c
index aca65971..971d185b 100644
--- a/libxfs/xfs_attr.c
+++ b/libxfs/xfs_attr.c
@@ -241,6 +241,13 @@ xfs_attr_hashname(
 	const uint8_t		*name,
 	unsigned int		namelen)
 {
+	if ((attr_flags & XFS_ATTR_VERITY) &&
+	    namelen == sizeof(struct xfs_verity_merkle_key)) {
+		uint64_t	off = xfs_verity_merkle_key_from_disk(name);
+
+		return off >> XFS_VERITY_MIN_MERKLE_BLOCKLOG;
+	}
+
 	return xfs_da_hashname(name, namelen);
 }
 
diff --git a/libxfs/xfs_da_format.h b/libxfs/xfs_da_format.h
index 2d2314a5..1d061cc0 100644
--- a/libxfs/xfs_da_format.h
+++ b/libxfs/xfs_da_format.h
@@ -951,4 +951,6 @@ xfs_verity_merkle_key_from_disk(
 #define XFS_VERITY_DESCRIPTOR_NAME	"vdesc"
 #define XFS_VERITY_DESCRIPTOR_NAME_LEN	(sizeof(XFS_VERITY_DESCRIPTOR_NAME) - 1)
 
+#define XFS_VERITY_MIN_MERKLE_BLOCKLOG	(10)
+
 #endif /* __XFS_DA_FORMAT_H__ */



* [PATCH 12/20] xfs: enable ro-compat fs-verity flag
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (10 preceding siblings ...)
  2024-03-17 16:36   ` [PATCH 11/20] xfs: use merkle tree offset as attr hash Darrick J. Wong
@ 2024-03-17 16:36   ` Darrick J. Wong
  2024-03-17 16:37   ` [PATCH 13/20] libfrog: add fsverity to xfs_report_geom output Darrick J. Wong
                     ` (7 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:36 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Finalize fs-verity integration in XFS by making the kernel fs-verity
aware: add the fs-verity ro-compat flag to the set of supported
ro-compat features.
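
As a rough sketch of what this changes, the check for unknown ro-compat
bits boils down to masking the superblock feature word against the
known set.  The macro and function names below are shortened stand-ins,
not the libxfs ones, and the three low bit positions are from my
recollection of xfs_format.h rather than from this diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define RO_COMPAT_FINOBT	(1u << 0)	/* assumed bit position */
#define RO_COMPAT_RMAPBT	(1u << 1)	/* assumed bit position */
#define RO_COMPAT_REFLINK	(1u << 2)	/* assumed bit position */
#define RO_COMPAT_INOBTCNT	(1u << 3)
#define RO_COMPAT_VERITY	(1u << 4)

/* Known feature set before this patch. */
#define RO_COMPAT_ALL_OLD \
		(RO_COMPAT_FINOBT   | \
		 RO_COMPAT_RMAPBT   | \
		 RO_COMPAT_REFLINK  | \
		 RO_COMPAT_INOBTCNT)

/* Known feature set after this patch. */
#define RO_COMPAT_ALL_NEW	(RO_COMPAT_ALL_OLD | RO_COMPAT_VERITY)

/* True if the superblock sets any ro-compat bit outside the known set. */
static bool has_unknown_ro_compat(uint32_t sb_features, uint32_t known)
{
	return (sb_features & ~known) != 0;
}
```

A superblock with the verity bit set counts as having an unknown
feature against the old mask and as fully understood against the new
one.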

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: add spaces]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_format.h |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)


diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 3ce29021..c3f586d6 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -355,10 +355,11 @@ xfs_sb_has_compat_feature(
 #define XFS_SB_FEAT_RO_COMPAT_INOBTCNT (1 << 3)		/* inobt block counts */
 #define XFS_SB_FEAT_RO_COMPAT_VERITY   (1 << 4)		/* fs-verity */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
-		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
-		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
-		 XFS_SB_FEAT_RO_COMPAT_REFLINK| \
-		 XFS_SB_FEAT_RO_COMPAT_INOBTCNT)
+		(XFS_SB_FEAT_RO_COMPAT_FINOBT   | \
+		 XFS_SB_FEAT_RO_COMPAT_RMAPBT   | \
+		 XFS_SB_FEAT_RO_COMPAT_REFLINK  | \
+		 XFS_SB_FEAT_RO_COMPAT_INOBTCNT | \
+		 XFS_SB_FEAT_RO_COMPAT_VERITY)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(



* [PATCH 13/20] libfrog: add fsverity to xfs_report_geom output
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (11 preceding siblings ...)
  2024-03-17 16:36   ` [PATCH 12/20] xfs: enable ro-compat fs-verity flag Darrick J. Wong
@ 2024-03-17 16:37   ` Darrick J. Wong
  2024-03-17 16:37   ` [PATCH 14/20] xfs_db: introduce attr_modify command Darrick J. Wong
                     ` (6 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:37 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Announce the presence of fsverity on a filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libfrog/fsgeom.c |    4 ++++
 1 file changed, 4 insertions(+)


diff --git a/libfrog/fsgeom.c b/libfrog/fsgeom.c
index 3e7f0797..502553bc 100644
--- a/libfrog/fsgeom.c
+++ b/libfrog/fsgeom.c
@@ -31,6 +31,7 @@ xfs_report_geom(
 	int			bigtime_enabled;
 	int			inobtcount;
 	int			nrext64;
+	int			verity;
 
 	isint = geo->logstart > 0;
 	lazycount = geo->flags & XFS_FSOP_GEOM_FLAGS_LAZYSB ? 1 : 0;
@@ -49,12 +50,14 @@ xfs_report_geom(
 	bigtime_enabled = geo->flags & XFS_FSOP_GEOM_FLAGS_BIGTIME ? 1 : 0;
 	inobtcount = geo->flags & XFS_FSOP_GEOM_FLAGS_INOBTCNT ? 1 : 0;
 	nrext64 = geo->flags & XFS_FSOP_GEOM_FLAGS_NREXT64 ? 1 : 0;
+	verity = geo->flags & XFS_FSOP_GEOM_FLAGS_VERITY ? 1 : 0;
 
 	printf(_(
 "meta-data=%-22s isize=%-6d agcount=%u, agsize=%u blks\n"
 "         =%-22s sectsz=%-5u attr=%u, projid32bit=%u\n"
 "         =%-22s crc=%-8u finobt=%u, sparse=%u, rmapbt=%u\n"
 "         =%-22s reflink=%-4u bigtime=%u inobtcount=%u nrext64=%u\n"
+"         =%-22s verity=%u\n"
 "data     =%-22s bsize=%-6u blocks=%llu, imaxpct=%u\n"
 "         =%-22s sunit=%-6u swidth=%u blks\n"
 "naming   =version %-14u bsize=%-6u ascii-ci=%d, ftype=%d\n"
@@ -65,6 +68,7 @@ xfs_report_geom(
 		"", geo->sectsize, attrversion, projid32bit,
 		"", crcs_enabled, finobt_enabled, spinodes, rmapbt_enabled,
 		"", reflink_enabled, bigtime_enabled, inobtcount, nrext64,
+		"", verity,
 		"", geo->blocksize, (unsigned long long)geo->datablocks,
 			geo->imaxpct,
 		"", geo->sunit, geo->swidth,



* [PATCH 14/20] xfs_db: introduce attr_modify command
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (12 preceding siblings ...)
  2024-03-17 16:37   ` [PATCH 13/20] libfrog: add fsverity to xfs_report_geom output Darrick J. Wong
@ 2024-03-17 16:37   ` Darrick J. Wong
  2024-03-17 16:37   ` [PATCH 15/20] xfs_db: make attr_set/remove/modify be able to handle fs-verity attrs Darrick J. Wong
                     ` (5 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:37 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

This command overwrites part or all of the existing value of an
inode's extended attribute.  Unlike the 'write' command, the extended
attribute is addressed by name, and the new value is written over the
old value in place.

The command also allows addressing via binary names (introduced by
parent pointers).  To do this, specify the name length (-m) and give
the name in #hex format.

Example:

	# Modify attribute with name #00000042 by overwriting 8
	# bytes at offset 3 with value #0000000000FF00FF
	attr_modify -o 3 -m 4 -v 8 #42 #FF00FF
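
The overwrite itself amounts to a bounds-checked memcpy into the old
value.  A minimal sketch, assuming a hypothetical helper named
overwrite_value that is not part of xfs_db:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Overwrite newlen bytes of an existing attr value starting at offset.
 * Returns 0 on success, or -1 if the new bytes would run past the end
 * of the old value (the attr is never grown).
 */
static int overwrite_value(uint8_t *value, size_t valuelen, size_t offset,
			   const uint8_t *newval, size_t newlen)
{
	if (offset > valuelen || newlen > valuelen - offset)
		return -1;

	memcpy(value + offset, newval, newlen);
	return 0;
}
```

For the example above, an 8-byte value placed at offset 3 needs the
existing attr value to be at least 11 bytes long; anything shorter is
rejected rather than extended.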

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---
 db/attrset.c |  202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 db/write.c   |    2 -
 db/write.h   |    1 
 3 files changed, 202 insertions(+), 3 deletions(-)


diff --git a/db/attrset.c b/db/attrset.c
index 0d8d70a8..7249294a 100644
--- a/db/attrset.c
+++ b/db/attrset.c
@@ -16,10 +16,12 @@
 #include "field.h"
 #include "inode.h"
 #include "malloc.h"
+#include "write.h"
 #include <sys/xattr.h>
 
 static int		attr_set_f(int argc, char **argv);
 static int		attr_remove_f(int argc, char **argv);
+static int		attr_modify_f(int argc, char **argv);
 static void		attrset_help(void);
 
 static const cmdinfo_t	attr_set_cmd =
@@ -30,6 +32,11 @@ static const cmdinfo_t	attr_remove_cmd =
 	{ "attr_remove", "aremove", attr_remove_f, 1, -1, 0,
 	  N_("[-r|-s|-u] [-n] name"),
 	  N_("remove the named attribute from the current inode"), attrset_help };
+static const cmdinfo_t	attr_modify_cmd =
+	{ "attr_modify", "amodify", attr_modify_f, 1, -1, 0,
+	  N_("[-r|-s|-u] [-o n] [-v n] [-m n] name value"),
+	  N_("modify value of the named attribute of the current inode"),
+		attrset_help };
 
 static void
 attrset_help(void)
@@ -38,8 +45,9 @@ attrset_help(void)
 "\n"
 " The 'attr_set' and 'attr_remove' commands provide interfaces for debugging\n"
 " the extended attribute allocation and removal code.\n"
-" Both commands require an attribute name to be specified, and the attr_set\n"
-" command allows an optional value length (-v) to be provided as well.\n"
+" Both commands together with 'attr_modify' require an attribute name to be\n"
+" specified. The attr_set and attr_modify commands allow an optional value\n"
+" length (-v) to be provided as well.\n"
 " There are 4 namespace flags:\n"
 "  -r -- 'root'\n"
 "  -u -- 'user'		(default)\n"
@@ -48,6 +56,9 @@ attrset_help(void)
 " For attr_set, these options further define the type of set operation:\n"
 "  -C -- 'create'    - create attribute, fail if it already exists\n"
 "  -R -- 'replace'   - replace attribute, fail if it does not exist\n"
+" attr_modify command provides more of the following options:\n"
+"  -m -- 'name length'   - specify length of the name (handy with binary names)\n"
+"  -o -- 'value offset'   - offset new value within old attr's value\n"
 " The backward compatibility mode 'noattr2' can be emulated (-n) also.\n"
 "\n"));
 }
@@ -60,6 +71,7 @@ attrset_init(void)
 
 	add_command(&attr_set_cmd);
 	add_command(&attr_remove_cmd);
+	add_command(&attr_modify_cmd);
 }
 
 static int
@@ -263,3 +275,189 @@ attr_remove_f(
 		libxfs_irele(args.dp);
 	return 0;
 }
+
+static int
+attr_modify_f(
+	int			argc,
+	char			**argv)
+{
+	struct xfs_da_args	args = { };
+	int			c;
+	int			offset = 0;
+	char			*sp;
+	char			*converted;
+	uint8_t			*name;
+	int			namelen = 0;
+	uint8_t			*value;
+	int			valuelen = 0;
+
+	if (cur_typ == NULL) {
+		dbprintf(_("no current type\n"));
+		return 0;
+	}
+
+	if (cur_typ->typnm != TYP_INODE) {
+		dbprintf(_("current type is not inode\n"));
+		return 0;
+	}
+
+	while ((c = getopt(argc, argv, "rusnv:o:m:")) != EOF) {
+		switch (c) {
+		/* namespaces */
+		case 'r':
+			args.attr_filter |= LIBXFS_ATTR_ROOT;
+			args.attr_filter &= ~LIBXFS_ATTR_SECURE;
+			break;
+		case 'u':
+			args.attr_filter &= ~(LIBXFS_ATTR_ROOT |
+					      LIBXFS_ATTR_SECURE);
+			break;
+		case 's':
+			args.attr_filter |= LIBXFS_ATTR_SECURE;
+			args.attr_filter &= ~LIBXFS_ATTR_ROOT;
+			break;
+
+		case 'n':
+			/*
+			 * We never touch attr2 these days; leave this here to
+			 * avoid breaking scripts.
+			 */
+			break;
+
+		case 'o':
+			offset = strtol(optarg, &sp, 0);
+			if (*sp != '\0' || offset < 0 || offset > 64 * 1024) {
+				dbprintf(_("bad attr_modify offset %s\n"),
+						optarg);
+				return 0;
+			}
+			break;
+
+		case 'v':
+			valuelen = strtol(optarg, &sp, 0);
+			if (*sp != '\0' || valuelen < 0 || valuelen > 64 * 1024) {
+				dbprintf(_("bad attr_modify value len %s\n"),
+						optarg);
+				return 0;
+			}
+			break;
+
+		case 'm':
+			namelen = strtol(optarg, &sp, 0);
+			if (*sp != '\0' || namelen < 0 || namelen > MAXNAMELEN) {
+				dbprintf(_("bad attr_modify name len %s\n"),
+						optarg);
+				return 0;
+			}
+			break;
+
+		default:
+			dbprintf(_("bad option for attr_modify command\n"));
+			return 0;
+		}
+	}
+
+	if (optind != argc - 2) {
+		dbprintf(_("wrong number of arguments for attr_modify\n"));
+		return 0;
+	}
+
+	if (namelen >= MAXNAMELEN) {
+		dbprintf(_("name too long\n"));
+		return 0;
+	}
+
+	if (!namelen) {
+		if (argv[optind][0] == '#')
+			namelen = strlen(argv[optind])/2;
+		if (argv[optind][0] == '"')
+			namelen = strlen(argv[optind]) - 2;
+	}
+
+	name = xcalloc(namelen, sizeof(uint8_t));
+	converted = convert_arg(argv[optind], (int)(namelen*8));
+	if (!converted) {
+		dbprintf(_("invalid name\n"));
+		goto out_free_name;
+	}
+
+	memcpy(name, converted, namelen);
+	args.name = (const uint8_t *)name;
+	args.namelen = namelen;
+
+	optind++;
+
+	if (valuelen > 64 * 1024) {
+		dbprintf(_("value too long\n"));
+		goto out_free_name;
+	}
+
+	if (!valuelen) {
+		if (argv[optind][0] == '#')
+			valuelen = strlen(argv[optind])/2;
+		if (argv[optind][0] == '"')
+			valuelen = strlen(argv[optind]) - 2;
+	}
+
+	if ((valuelen + offset) > 64 * 1024) {
+		dbprintf(_("offsetted value too long\n"));
+		goto out_free_name;
+	}
+
+	value = xcalloc(valuelen, sizeof(uint8_t));
+	converted = convert_arg(argv[optind], (int)(valuelen*8));
+	if (!converted) {
+		dbprintf(_("invalid value\n"));
+		goto out_free_value;
+	}
+	memcpy(value, converted, valuelen);
+
+	if (libxfs_iget(mp, NULL, iocur_top->ino, 0, &args.dp)) {
+		dbprintf(_("failed to iget inode %llu\n"),
+			(unsigned long long)iocur_top->ino);
+		goto out;
+	}
+
+	if (libxfs_attr_get(&args)) {
+		dbprintf(_("failed to get attr '%s' from inode %llu\n"),
+			args.name, (unsigned long long)iocur_top->ino);
+		goto out;
+	}
+
+	if (valuelen + offset > args.valuelen) {
+		dbprintf(_("new value too long\n"));
+		goto out;
+	}
+
+	/* As args.valuelen is now set let's get args.value */
+	if (libxfs_attr_get(&args)) {
+		dbprintf(_("failed to get attr '%s' from inode %llu\n"),
+			args.name, (unsigned long long)iocur_top->ino);
+		goto out;
+	}
+
+	/* modify value */
+	memcpy((uint8_t *)args.value + offset, value, valuelen);
+
+	args.attr_flags = XATTR_REPLACE;
+	args.attr_flags &= ~XATTR_CREATE;
+	if (libxfs_attr_set(&args)) {
+		dbprintf(_("failed to set attr '%s' from inode %llu\n"),
+			(unsigned char *)args.name,
+			(unsigned long long)iocur_top->ino);
+		goto out;
+	}
+
+	/* refresh with updated inode contents */
+	set_cur_inode(iocur_top->ino);
+
+out:
+	if (args.dp)
+		libxfs_irele(args.dp);
+	xfree(args.value);
+out_free_value:
+	xfree(value);
+out_free_name:
+	xfree(name);
+	return 0;
+}
diff --git a/db/write.c b/db/write.c
index 96dea705..9295dbc9 100644
--- a/db/write.c
+++ b/db/write.c
@@ -511,7 +511,7 @@ convert_oct(
  * are adjusted in the buffer so that the first input bit is to be be written to
  * the first bit in the output.
  */
-static char *
+char *
 convert_arg(
 	char		*arg,
 	int		bit_length)
diff --git a/db/write.h b/db/write.h
index e24e07d4..4ba04d03 100644
--- a/db/write.h
+++ b/db/write.h
@@ -6,6 +6,7 @@
 
 struct field;
 
+extern char	*convert_arg(char *arg, int bit_length);
 extern void	write_init(void);
 extern void	write_block(const field_t *fields, int argc, char **argv);
 extern void	write_struct(const field_t *fields, int argc, char **argv);



* [PATCH 15/20] xfs_db: make attr_set/remove/modify be able to handle fs-verity attrs
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (13 preceding siblings ...)
  2024-03-17 16:37   ` [PATCH 14/20] xfs_db: introduce attr_modify command Darrick J. Wong
@ 2024-03-17 16:37   ` Darrick J. Wong
  2024-03-17 16:37   ` [PATCH 16/20] man: document attr_modify command Darrick J. Wong
                     ` (4 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:37 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---
 db/attrset.c             |   28 ++++++++++++++++++++++------
 libxfs/libxfs_api_defs.h |    1 +
 2 files changed, 23 insertions(+), 6 deletions(-)


diff --git a/db/attrset.c b/db/attrset.c
index 7249294a..f64f0cd9 100644
--- a/db/attrset.c
+++ b/db/attrset.c
@@ -26,15 +26,15 @@ static void		attrset_help(void);
 
 static const cmdinfo_t	attr_set_cmd =
 	{ "attr_set", "aset", attr_set_f, 1, -1, 0,
-	  N_("[-r|-s|-u] [-n] [-R|-C] [-v n] name"),
+	  N_("[-r|-s|-u|-f] [-n] [-R|-C] [-v n] name"),
 	  N_("set the named attribute on the current inode"), attrset_help };
 static const cmdinfo_t	attr_remove_cmd =
 	{ "attr_remove", "aremove", attr_remove_f, 1, -1, 0,
-	  N_("[-r|-s|-u] [-n] name"),
+	  N_("[-r|-s|-u|-f] [-n] name"),
 	  N_("remove the named attribute from the current inode"), attrset_help };
 static const cmdinfo_t	attr_modify_cmd =
 	{ "attr_modify", "amodify", attr_modify_f, 1, -1, 0,
-	  N_("[-r|-s|-u] [-o n] [-v n] [-m n] name value"),
+	  N_("[-r|-s|-u|-f] [-o n] [-v n] [-m n] name value"),
 	  N_("modify value of the named attribute of the current inode"),
 		attrset_help };
 
@@ -52,6 +52,7 @@ attrset_help(void)
 "  -r -- 'root'\n"
 "  -u -- 'user'		(default)\n"
 "  -s -- 'secure'\n"
+"  -f -- 'fs-verity'\n"
 "\n"
 " For attr_set, these options further define the type of set operation:\n"
 "  -C -- 'create'    - create attribute, fail if it already exists\n"
@@ -92,7 +93,7 @@ attr_set_f(
 		return 0;
 	}
 
-	while ((c = getopt(argc, argv, "rusCRnv:")) != EOF) {
+	while ((c = getopt(argc, argv, "rusfCRnv:")) != EOF) {
 		switch (c) {
 		/* namespaces */
 		case 'r':
@@ -107,6 +108,11 @@ attr_set_f(
 			args.attr_filter |= LIBXFS_ATTR_SECURE;
 			args.attr_filter &= ~LIBXFS_ATTR_ROOT;
 			break;
+		case 'f':
+			args.attr_filter |= LIBXFS_ATTR_VERITY;
+			args.attr_filter &= ~(LIBXFS_ATTR_ROOT |
+					      LIBXFS_ATTR_SECURE);
+			break;
 
 		/* modifiers */
 		case 'C':
@@ -208,7 +214,7 @@ attr_remove_f(
 		return 0;
 	}
 
-	while ((c = getopt(argc, argv, "rusn")) != EOF) {
+	while ((c = getopt(argc, argv, "rusfn")) != EOF) {
 		switch (c) {
 		/* namespaces */
 		case 'r':
@@ -223,6 +229,11 @@ attr_remove_f(
 			args.attr_filter |= LIBXFS_ATTR_SECURE;
 			args.attr_filter &= ~LIBXFS_ATTR_ROOT;
 			break;
+		case 'f':
+			args.attr_filter |= LIBXFS_ATTR_VERITY;
+			args.attr_filter &= ~(LIBXFS_ATTR_ROOT |
+					      LIBXFS_ATTR_SECURE);
+			break;
 
 		case 'n':
 			/*
@@ -301,7 +312,7 @@ attr_modify_f(
 		return 0;
 	}
 
-	while ((c = getopt(argc, argv, "rusnv:o:m:")) != EOF) {
+	while ((c = getopt(argc, argv, "rusfnv:o:m:")) != EOF) {
 		switch (c) {
 		/* namespaces */
 		case 'r':
@@ -316,6 +327,11 @@ attr_modify_f(
 			args.attr_filter |= LIBXFS_ATTR_SECURE;
 			args.attr_filter &= ~LIBXFS_ATTR_ROOT;
 			break;
+		case 'f':
+			args.attr_filter |= LIBXFS_ATTR_VERITY;
+			args.attr_filter &= ~(LIBXFS_ATTR_ROOT |
+					      LIBXFS_ATTR_SECURE);
+			break;
 
 		case 'n':
 			/*
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index ccc92a83..04a5dad5 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -15,6 +15,7 @@
  */
 #define LIBXFS_ATTR_ROOT		XFS_ATTR_ROOT
 #define LIBXFS_ATTR_SECURE		XFS_ATTR_SECURE
+#define LIBXFS_ATTR_VERITY		XFS_ATTR_VERITY
 
 #define xfs_agfl_size			libxfs_agfl_size
 #define xfs_agfl_walk			libxfs_agfl_walk



* [PATCH 16/20] man: document attr_modify command
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (14 preceding siblings ...)
  2024-03-17 16:37   ` [PATCH 15/20] xfs_db: make attr_set/remove/modify be able to handle fs-verity attrs Darrick J. Wong
@ 2024-03-17 16:37   ` Darrick J. Wong
  2024-03-17 16:38   ` [PATCH 17/20] xfs_db: dump verity features and metadata Darrick J. Wong
                     ` (3 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:37 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers
  Cc: Darrick J. Wong, fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@djwong.org>

Add some documentation for the new attr_modify command.  I'm not sure
what all this is supposed to do, but there needs to be /something/ to
satisfy the documentation tests.

Signed-off-by: Darrick J. Wong <djwong@djwong.org>
---
 man/man8/xfs_db.8 |   34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)


diff --git a/man/man8/xfs_db.8 b/man/man8/xfs_db.8
index a7f6d55e..d4651eb4 100644
--- a/man/man8/xfs_db.8
+++ b/man/man8/xfs_db.8
@@ -184,6 +184,40 @@ Displays the length, free block count, per-AG reservation size, and per-AG
 reservation usage for a given AG.
 If no argument is given, display information for all AGs.
 .TP
+.BI "attr_modify [\-r|\-u|\-s|\-f] [\-o n] [\-v n] [\-m n] name value
+Modifies an extended attribute on the current file with the given name.
+
+If the
+.B name
+is a string that can be converted into an integer value, it will be.
+.RS 1.0i
+.TP 0.4i
+.B \-r
+Sets the attribute in the root namespace.
+Only one namespace option can be specified.
+.TP
+.B \-u
+Sets the attribute in the user namespace.
+Only one namespace option can be specified.
+.TP
+.B \-s
+Sets the attribute in the secure namespace.
+Only one namespace option can be specified.
+.TP
+.B \-f
+Sets the attribute in the verity namespace.
+Only one namespace option can be specified.
+.TP
+.B \-m
+Length of the attr name.
+.TP
+.B \-o
+Offset into the attr value to place the new contents.
+.TP
+.B \-v
+Length of the attr value.
+.RE
+.TP
 .BI "attr_remove [\-r|\-u|\-s] [\-n] " name
 Remove the specified extended attribute from the current file.
 .RS 1.0i



* [PATCH 17/20] xfs_db: dump verity features and metadata
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (15 preceding siblings ...)
  2024-03-17 16:37   ` [PATCH 16/20] man: document attr_modify command Darrick J. Wong
@ 2024-03-17 16:38   ` Darrick J. Wong
  2024-03-17 16:38   ` [PATCH 18/20] xfs_db: dump merkle tree data Darrick J. Wong
                     ` (2 subsequent siblings)
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:38 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Teach the debugger how to decode the merkle tree block number in the
attr name, and to display the fact that this is a verity filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 db/sb.c |    2 ++
 1 file changed, 2 insertions(+)


diff --git a/db/sb.c b/db/sb.c
index b48767f4..cd51f748 100644
--- a/db/sb.c
+++ b/db/sb.c
@@ -706,6 +706,8 @@ version_string(
 		strcat(s, ",NEEDSREPAIR");
 	if (xfs_has_large_extent_counts(mp))
 		strcat(s, ",NREXT64");
+	if (xfs_has_verity(mp))
+		strcat(s, ",VERITY");
 	return s;
 }
 



* [PATCH 18/20] xfs_db: dump merkle tree data
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (16 preceding siblings ...)
  2024-03-17 16:38   ` [PATCH 17/20] xfs_db: dump verity features and metadata Darrick J. Wong
@ 2024-03-17 16:38   ` Darrick J. Wong
  2024-03-17 16:38   ` [PATCH 19/20] xfs_repair: junk fsverity xattrs when unnecessary Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 20/20] mkfs.xfs: add verity parameter Darrick J. Wong
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:38 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Teach the debugger to dump the specific fields in the fsverity xattr
blocks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 db/attr.c      |   94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 db/attrshort.c |   22 +++++++++++++
 2 files changed, 115 insertions(+), 1 deletion(-)


diff --git a/db/attr.c b/db/attr.c
index ba722e14..d00bf921 100644
--- a/db/attr.c
+++ b/db/attr.c
@@ -33,6 +33,9 @@ static int	attr_remote_data_count(void *obj, int startoff);
 static int	attr3_remote_hdr_count(void *obj, int startoff);
 static int	attr3_remote_data_count(void *obj, int startoff);
 
+static int	attr_leaf_name_local_merkle_count(void *obj, int startoff);
+static int	attr_leaf_name_remote_merkle_count(void *obj, int startoff);
+
 const field_t	attr_hfld[] = {
 	{ "", FLDT_ATTR, OI(0), C1, 0, TYP_NONE },
 	{ NULL }
@@ -82,6 +85,9 @@ const field_t	attr_leaf_entry_flds[] = {
 	{ "local", FLDT_UINT1,
 	  OI(LEOFF(flags) + bitsz(uint8_t) - XFS_ATTR_LOCAL_BIT - 1), C1, 0,
 	  TYP_NONE },
+	{ "verity", FLDT_UINT1,
+	  OI(LEOFF(flags) + bitsz(uint8_t) - XFS_ATTR_VERITY_BIT - 1), C1, 0,
+	  TYP_NONE },
 	{ "pad2", FLDT_UINT8X, OI(LEOFF(pad2)), C1, FLD_SKIPALL, TYP_NONE },
 	{ NULL }
 };
@@ -108,6 +114,10 @@ const field_t	attr_leaf_map_flds[] = {
 
 #define	LNOFF(f)	bitize(offsetof(xfs_attr_leaf_name_local_t, f))
 #define	LVOFF(f)	bitize(offsetof(xfs_attr_leaf_name_remote_t, f))
+#define	MKLOFF(f)	bitize(offsetof(xfs_attr_leaf_name_local_t, nameval) + \
+			       offsetof(struct xfs_verity_merkle_key, f))
+#define	MKROFF(f)	bitize(offsetof(xfs_attr_leaf_name_remote_t, name) + \
+			       offsetof(struct xfs_verity_merkle_key, f))
 const field_t	attr_leaf_name_flds[] = {
 	{ "valuelen", FLDT_UINT16D, OI(LNOFF(valuelen)),
 	  attr_leaf_name_local_count, FLD_COUNT, TYP_NONE },
@@ -115,6 +125,8 @@ const field_t	attr_leaf_name_flds[] = {
 	  attr_leaf_name_local_count, FLD_COUNT, TYP_NONE },
 	{ "name", FLDT_CHARNS, OI(LNOFF(nameval)),
 	  attr_leaf_name_local_name_count, FLD_COUNT, TYP_NONE },
+	{ "merkle_off", FLDT_UINT64X, OI(MKLOFF(vi_merkleoff)),
+	  attr_leaf_name_local_merkle_count, FLD_COUNT, TYP_NONE },
 	{ "value", FLDT_CHARNS, attr_leaf_name_local_value_offset,
 	  attr_leaf_name_local_value_count, FLD_COUNT|FLD_OFFSET, TYP_NONE },
 	{ "valueblk", FLDT_UINT32X, OI(LVOFF(valueblk)),
@@ -125,6 +137,8 @@ const field_t	attr_leaf_name_flds[] = {
 	  attr_leaf_name_remote_count, FLD_COUNT, TYP_NONE },
 	{ "name", FLDT_CHARNS, OI(LVOFF(name)),
 	  attr_leaf_name_remote_name_count, FLD_COUNT, TYP_NONE },
+	{ "merkle_off", FLDT_UINT64X, OI(MKROFF(vi_merkleoff)),
+	  attr_leaf_name_remote_merkle_count, FLD_COUNT, TYP_NONE },
 	{ NULL }
 };
 
@@ -258,7 +272,19 @@ __attr_leaf_name_local_count(
 	struct xfs_attr_leaf_entry      *e,
 	int				i)
 {
-	return (e->flags & XFS_ATTR_LOCAL) != 0;
+	struct xfs_attr_leaf_name_local	*l;
+
+	if (!(e->flags & XFS_ATTR_LOCAL))
+		return 0;
+
+	if ((e->flags & XFS_ATTR_NSP_ONDISK_MASK) == XFS_ATTR_VERITY) {
+		l = xfs_attr3_leaf_name_local(leaf, i);
+
+		if (l->namelen == sizeof(struct xfs_verity_merkle_key))
+			return 0;
+	}
+
+	return 1;
 }
 
 static int
@@ -270,6 +296,64 @@ attr_leaf_name_local_count(
 				    __attr_leaf_name_local_count);
 }
 
+static int
+__attr_leaf_name_local_merkle_count(
+	struct xfs_attr_leafblock	*leaf,
+	struct xfs_attr_leaf_entry      *e,
+	int				i)
+{
+	struct xfs_attr_leaf_name_local	*l;
+
+	if ((e->flags & XFS_ATTR_NSP_ONDISK_MASK) != XFS_ATTR_VERITY)
+		return 0;
+	if (!(e->flags & XFS_ATTR_LOCAL))
+		return 0;
+
+	l = xfs_attr3_leaf_name_local(leaf, i);
+	if (l->namelen != sizeof(struct xfs_verity_merkle_key))
+		return 0;
+
+	return 1;
+}
+
+static int
+attr_leaf_name_local_merkle_count(
+	void				*obj,
+	int				startoff)
+{
+	return attr_leaf_entry_walk(obj, startoff,
+			__attr_leaf_name_local_merkle_count);
+}
+
+static int
+__attr_leaf_name_remote_merkle_count(
+	struct xfs_attr_leafblock	*leaf,
+	struct xfs_attr_leaf_entry      *e,
+	int				i)
+{
+	struct xfs_attr_leaf_name_remote	*r;
+
+	if ((e->flags & XFS_ATTR_NSP_ONDISK_MASK) != XFS_ATTR_VERITY)
+		return 0;
+	if (e->flags & XFS_ATTR_LOCAL)
+		return 0;
+
+	r = xfs_attr3_leaf_name_remote(leaf, i);
+	if (r->namelen != sizeof(struct xfs_verity_merkle_key))
+		return 0;
+
+	return 1;
+}
+
+static int
+attr_leaf_name_remote_merkle_count(
+	void				*obj,
+	int				startoff)
+{
+	return attr_leaf_entry_walk(obj, startoff,
+			__attr_leaf_name_remote_merkle_count);
+}
+
 static int
 __attr_leaf_name_local_name_count(
 	struct xfs_attr_leafblock	*leaf,
@@ -282,6 +366,10 @@ __attr_leaf_name_local_name_count(
 		return 0;
 
 	l = xfs_attr3_leaf_name_local(leaf, i);
+	if ((e->flags & XFS_ATTR_NSP_ONDISK_MASK) == XFS_ATTR_VERITY &&
+	    l->namelen == sizeof(struct xfs_verity_merkle_key))
+		return 0;
+
 	return l->namelen;
 }
 
@@ -373,6 +461,10 @@ __attr_leaf_name_remote_name_count(
 		return 0;
 
 	r = xfs_attr3_leaf_name_remote(leaf, i);
+	if ((e->flags & XFS_ATTR_NSP_ONDISK_MASK) == XFS_ATTR_VERITY &&
+	    r->namelen == sizeof(struct xfs_verity_merkle_key))
+		return 0;
+
 	return r->namelen;
 }
 
diff --git a/db/attrshort.c b/db/attrshort.c
index 7c386d46..4a850016 100644
--- a/db/attrshort.c
+++ b/db/attrshort.c
@@ -13,6 +13,7 @@
 #include "attrshort.h"
 
 static int	attr_sf_entry_name_count(void *obj, int startoff);
+static int	attr_sf_entry_merkle_count(void *obj, int startoff);
 static int	attr_sf_entry_value_count(void *obj, int startoff);
 static int	attr_sf_entry_value_offset(void *obj, int startoff, int idx);
 static int	attr_shortform_list_count(void *obj, int startoff);
@@ -33,6 +34,8 @@ const field_t	attr_sf_hdr_flds[] = {
 };
 
 #define	EOFF(f)	bitize(offsetof(struct xfs_attr_sf_entry, f))
+#define	MKOFF(f) bitize(offsetof(struct xfs_attr_sf_entry, nameval) + \
+			offsetof(struct xfs_verity_merkle_key, f))
 const field_t	attr_sf_entry_flds[] = {
 	{ "namelen", FLDT_UINT8D, OI(EOFF(namelen)), C1, 0, TYP_NONE },
 	{ "valuelen", FLDT_UINT8D, OI(EOFF(valuelen)), C1, 0, TYP_NONE },
@@ -43,13 +46,32 @@ const field_t	attr_sf_entry_flds[] = {
 	{ "secure", FLDT_UINT1,
 	  OI(EOFF(flags) + bitsz(uint8_t) - XFS_ATTR_SECURE_BIT - 1), C1, 0,
 	  TYP_NONE },
+	{ "verity", FLDT_UINT1,
+	  OI(EOFF(flags) + bitsz(uint8_t) - XFS_ATTR_VERITY_BIT - 1), C1, 0,
+	  TYP_NONE },
 	{ "name", FLDT_CHARNS, OI(EOFF(nameval)), attr_sf_entry_name_count,
 	  FLD_COUNT, TYP_NONE },
+	{ "merkle_off", FLDT_UINT32X, OI(MKOFF(vi_merkleoff)),
+	  attr_sf_entry_merkle_count, FLD_COUNT, TYP_NONE },
 	{ "value", FLDT_CHARNS, attr_sf_entry_value_offset,
 	  attr_sf_entry_value_count, FLD_COUNT|FLD_OFFSET, TYP_NONE },
 	{ NULL }
 };
 
+static int
+attr_sf_entry_merkle_count(
+	void				*obj,
+	int				startoff)
+{
+	struct xfs_attr_sf_entry	*e;
+
+	ASSERT(bitoffs(startoff) == 0);
+	e = (struct xfs_attr_sf_entry *)((char *)obj + byteize(startoff));
+	if ((e->flags & XFS_ATTR_NSP_ONDISK_MASK) == XFS_ATTR_VERITY)
+		return 1;
+	return 0;
+}
+
 static int
 attr_sf_entry_name_count(
 	void				*obj,



* [PATCH 19/20] xfs_repair: junk fsverity xattrs when unnecessary
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (17 preceding siblings ...)
  2024-03-17 16:38   ` [PATCH 18/20] xfs_db: dump merkle tree data Darrick J. Wong
@ 2024-03-17 16:38   ` Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 20/20] mkfs.xfs: add verity parameter Darrick J. Wong
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:38 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Remove any fs-verity extended attributes when the filesystem doesn't
support fs-verity.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 repair/attr_repair.c |   24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)


diff --git a/repair/attr_repair.c b/repair/attr_repair.c
index 9c41cb21..5225950c 100644
--- a/repair/attr_repair.c
+++ b/repair/attr_repair.c
@@ -313,6 +313,13 @@ process_shortform_attr(
 					NULL, currententry->namelen,
 					currententry->valuelen);
 
+		if ((currententry->flags & XFS_ATTR_VERITY) &&
+		    !xfs_has_verity(mp)) {
+			do_warn(
+ _("verity metadata found on filesystem that doesn't support verity\n"));
+			junkit |= 1;
+		}
+
 		remainingspace = remainingspace -
 					xfs_attr_sf_entsize(currententry);
 
@@ -513,6 +520,15 @@ process_leaf_attr_local(
 			return -1;
 		}
 	}
+
+	if ((entry->flags & XFS_ATTR_VERITY) && !xfs_has_verity(mp)) {
+		do_warn(
+ _("verity metadata found in attribute entry %d in attr block %u, inode %"
+   PRIu64 " on filesystem that doesn't support verity\n"),
+				i, da_bno, ino);
+		return -1;
+	}
+
 	return xfs_attr_leaf_entsize_local(local->namelen,
 						be16_to_cpu(local->valuelen));
 }
@@ -549,6 +565,14 @@ process_leaf_attr_remote(
 		return -1;
 	}
 
+	if ((entry->flags & XFS_ATTR_VERITY) && !xfs_has_verity(mp)) {
+		do_warn(
+ _("verity metadata found in attribute entry %d in attr block %u, inode %"
+   PRIu64 " on filesystem that doesn't support verity\n"),
+				i, da_bno, ino);
+		return -1;
+	}
+
 	value = malloc(be32_to_cpu(remotep->valuelen));
 	if (value == NULL) {
 		do_warn(



* [PATCH 20/20] mkfs.xfs: add verity parameter
  2024-03-17 16:23 ` Darrick J. Wong
                     ` (18 preceding siblings ...)
  2024-03-17 16:38   ` [PATCH 19/20] xfs_repair: junk fsverity xattrs when unnecessary Darrick J. Wong
@ 2024-03-17 16:39   ` Darrick J. Wong
  19 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:39 UTC (permalink / raw)
  To: aalbersh, djwong, cem, ebiggers; +Cc: fsverity, linux-fsdevel, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

fs-verity brings on-disk changes (inode flag). Add a parameter to
enable (default disabled) the fs-verity flag in the superblock. This will
make a newly created filesystem read-only for older kernels.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: make this an -i(node) option, edit manpage]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 man/man8/mkfs.xfs.8.in |    4 ++++
 mkfs/xfs_mkfs.c        |   19 +++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)


diff --git a/man/man8/mkfs.xfs.8.in b/man/man8/mkfs.xfs.8.in
index 8060d342..4864b4d4 100644
--- a/man/man8/mkfs.xfs.8.in
+++ b/man/man8/mkfs.xfs.8.in
@@ -670,6 +670,10 @@ If the value is omitted, 1 is assumed.
 This feature will be enabled when possible.
 This feature is only available for filesystems formatted with -m crc=1.
 .TP
+.BI verity[= value]
+This flag activates verity support, which enables sealing of regular file data
+with hashes and cryptographic signatures.
+This feature is only available for filesystems formatted with -m crc=1.
 .RE
 .PP
 .PD 0
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index d6fa48ed..dec5edaf 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -90,6 +90,7 @@ enum {
 	I_PROJID32BIT,
 	I_SPINODES,
 	I_NREXT64,
+	I_VERITY,
 	I_MAX_OPTS,
 };
 
@@ -469,6 +470,7 @@ static struct opt_params iopts = {
 		[I_PROJID32BIT] = "projid32bit",
 		[I_SPINODES] = "sparse",
 		[I_NREXT64] = "nrext64",
+		[I_VERITY] = "verity",
 		[I_MAX_OPTS] = NULL,
 	},
 	.subopt_params = {
@@ -523,7 +525,13 @@ static struct opt_params iopts = {
 		  .minval = 0,
 		  .maxval = 1,
 		  .defaultval = 1,
-		}
+		},
+		{ .index = I_VERITY,
+		  .conflicts = { { NULL, LAST_CONFLICT } },
+		  .minval = 0,
+		  .maxval = 1,
+		  .defaultval = 1,
+		},
 	},
 };
 
@@ -889,6 +897,7 @@ struct sb_feat_args {
 	bool	nodalign;
 	bool	nortalign;
 	bool	nrext64;
+	bool	verity;			/* XFS_SB_FEAT_RO_COMPAT_VERITY */
 };
 
 struct cli_params {
@@ -1024,7 +1033,7 @@ usage( void )
 			    sectsize=num,concurrency=num]\n\
 /* force overwrite */	[-f]\n\
 /* inode size */	[-i perblock=n|size=num,maxpct=n,attr=0|1|2,\n\
-			    projid32bit=0|1,sparse=0|1,nrext64=0|1]\n\
+			    projid32bit=0|1,sparse=0|1,nrext64=0|1,verity=0|1]\n\
 /* no discard */	[-K]\n\
 /* log subvol */	[-l agnum=n,internal,size=num,logdev=xxx,version=n\n\
 			    sunit=value|su=num,sectsize=num,lazy-count=0|1,\n\
@@ -1722,6 +1731,9 @@ inode_opts_parser(
 	case I_NREXT64:
 		cli->sb_feat.nrext64 = getnum(value, opts, subopt);
 		break;
+	case I_VERITY:
+		cli->sb_feat.verity = getnum(value, opts, subopt);
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -3478,6 +3490,8 @@ sb_set_features(
 		sbp->sb_features_ro_compat |= XFS_SB_FEAT_RO_COMPAT_REFLINK;
 	if (fp->inobtcnt)
 		sbp->sb_features_ro_compat |= XFS_SB_FEAT_RO_COMPAT_INOBTCNT;
+	if (fp->verity)
+		sbp->sb_features_ro_compat |= XFS_SB_FEAT_RO_COMPAT_VERITY;
 	if (fp->bigtime)
 		sbp->sb_features_incompat |= XFS_SB_FEAT_INCOMPAT_BIGTIME;
 
@@ -4339,6 +4353,7 @@ main(
 			.nortalign = false,
 			.bigtime = true,
 			.nrext64 = true,
+			.verity = false,
 			/*
 			 * When we decide to enable a new feature by default,
 			 * please remember to update the mkfs conf files.



* [PATCH 1/3] common/verity: enable fsverity for XFS
  2024-03-17 16:23 ` [PATCHSET v5.3] fstests: fs-verity support for XFS Darrick J. Wong
@ 2024-03-17 16:39   ` Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 3/3] common/populate: add verity files to populate xfs images Darrick J. Wong
  2 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:39 UTC (permalink / raw)
  To: aalbersh, ebiggers, djwong, zlang
  Cc: Andrey Albershteyn, fsverity, fstests, linux-fsdevel, guan, linux-xfs

From: Andrey Albershteyn <aalbersh@redhat.com>

XFS supports verity, so it can be enabled for the -g verity test group.

Signed-off-by: Andrey Albershteyn <andrey.albershteyn@gmail.com>
---
 common/verity |   29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)


diff --git a/common/verity b/common/verity
index 03d175ce1b..df4eb5dee7 100644
--- a/common/verity
+++ b/common/verity
@@ -43,7 +43,16 @@ _require_scratch_verity()
 
 	# The filesystem may be aware of fs-verity but have it disabled by
 	# CONFIG_FS_VERITY=n.  Detect support via sysfs.
-	if [ ! -e /sys/fs/$fstyp/features/verity ]; then
+	case $FSTYP in
+	xfs)
+		_scratch_unmount
+		_check_scratch_xfs_features VERITY &>>$seqres.full
+		_scratch_mount
+	;;
+	*)
+		test -e /sys/fs/$fstyp/features/verity
+	esac
+	if [ $? -ne 0 ]; then
 		_notrun "kernel $fstyp isn't configured with verity support"
 	fi
 
@@ -201,6 +210,9 @@ _scratch_mkfs_verity()
 	ext4|f2fs)
 		_scratch_mkfs -O verity
 		;;
+	xfs)
+		_scratch_mkfs -i verity
+		;;
 	btrfs)
 		_scratch_mkfs
 		;;
@@ -407,6 +419,21 @@ _fsv_scratch_corrupt_merkle_tree()
 		done
 		_scratch_mount
 		;;
+	xfs)
+		local ino=$(stat -c '%i' $file)
+		local attr_offset=$(( $offset % $FSV_BLOCK_SIZE ))
+		local attr_index=$(printf "%08d" $(( offset - attr_offset )))
+		_scratch_unmount
+		# Attribute name is 8 bytes long (index of Merkle tree page)
+		_scratch_xfs_db -x -c "inode $ino" \
+			-c "attr_modify -f -m 8 -o $attr_offset $attr_index \"BUG\"" \
+			>>$seqres.full
+		# In case bsize == 4096 and merkle block size == 1024,
+		# modifying the attribute with 'attr_modify' can corrupt quota
+		# accounting. Let's repair it.
+		_scratch_xfs_repair >> $seqres.full 2>&1
+		_scratch_mount
+		;;
 	*)
 		_fail "_fsv_scratch_corrupt_merkle_tree() unimplemented on $FSTYP"
 		;;



* [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs
  2024-03-17 16:23 ` [PATCHSET v5.3] fstests: fs-verity support for XFS Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 1/3] common/verity: enable fsverity " Darrick J. Wong
@ 2024-03-17 16:39   ` Darrick J. Wong
  2024-03-19 14:59     ` Andrey Albershteyn
  2024-03-17 16:39   ` [PATCH 3/3] common/populate: add verity files to populate xfs images Darrick J. Wong
  2 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:39 UTC (permalink / raw)
  To: aalbersh, ebiggers, djwong, zlang
  Cc: fsverity, fstests, linux-fsdevel, guan, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Adjust these tests to accommodate the use of xattrs to store fsverity
metadata.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/xfs/021     |    3 +++
 tests/xfs/122.out |    1 +
 2 files changed, 4 insertions(+)


diff --git a/tests/xfs/021 b/tests/xfs/021
index ef307fc064..dcecf41958 100755
--- a/tests/xfs/021
+++ b/tests/xfs/021
@@ -118,6 +118,7 @@ _scratch_xfs_db -r -c "inode $inum_1" -c "print a.sfattr"  | \
 	perl -ne '
 /\.secure/ && next;
 /\.parent/ && next;
+/\.verity/ && next;
 	print unless /^\d+:\[.*/;'
 
 echo "*** dump attributes (2)"
@@ -128,6 +129,7 @@ _scratch_xfs_db -r -c "inode $inum_2" -c "a a.bmx[0].startblock" -c print  \
 	| perl -ne '
 s/,secure//;
 s/,parent//;
+s/,verity//;
 s/info.hdr/info/;
 /hdr.info.crc/ && next;
 /hdr.info.bno/ && next;
@@ -135,6 +137,7 @@ s/info.hdr/info/;
 /hdr.info.lsn/ && next;
 /hdr.info.owner/ && next;
 /\.parent/ && next;
+/\.verity/ && next;
 s/^(hdr.info.magic =) 0x3bee/\1 0xfbee/;
 s/^(hdr.firstused =) (\d+)/\1 FIRSTUSED/;
 s/^(hdr.freemap\[0-2] = \[base,size]).*/\1 [FREEMAP..]/;
diff --git a/tests/xfs/122.out b/tests/xfs/122.out
index 3a99ce77bb..ff886b4eec 100644
--- a/tests/xfs/122.out
+++ b/tests/xfs/122.out
@@ -141,6 +141,7 @@ sizeof(struct xfs_scrub_vec) = 16
 sizeof(struct xfs_scrub_vec_head) = 32
 sizeof(struct xfs_swap_extent) = 64
 sizeof(struct xfs_unmount_log_format) = 8
+sizeof(struct xfs_verity_merkle_key) = 8
 sizeof(struct xfs_xmd_log_format) = 16
 sizeof(struct xfs_xmi_log_format) = 80
 sizeof(union xfs_rtword_raw) = 4



* [PATCH 3/3] common/populate: add verity files to populate xfs images
  2024-03-17 16:23 ` [PATCHSET v5.3] fstests: fs-verity support for XFS Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 1/3] common/verity: enable fsverity " Darrick J. Wong
  2024-03-17 16:39   ` [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs Darrick J. Wong
@ 2024-03-17 16:39   ` Darrick J. Wong
  2 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-17 16:39 UTC (permalink / raw)
  To: aalbersh, ebiggers, djwong, zlang
  Cc: fsverity, fstests, linux-fsdevel, guan, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If verity is enabled on a filesystem, we should create some sample
verity files.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 common/populate |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)


diff --git a/common/populate b/common/populate
index 35071f4210..3f3ec0480d 100644
--- a/common/populate
+++ b/common/populate
@@ -520,6 +520,27 @@ _scratch_xfs_populate() {
 		done
 	fi
 
+	# verity merkle trees
+	is_verity="$(_xfs_has_feature "$SCRATCH_MNT" verity -v)"
+	if [ $is_verity -gt 0 ]; then
+		echo "+ fsverity"
+
+		# Create a biggish file with all zeroes, because metadump
+		# won't preserve data blocks and we don't want the hashes to
+		# stop working for our sample fs.
+		for ((pos = 0, i = 88; pos < 23456789; pos += 234567, i++)); do
+			$XFS_IO_PROG -f -c "pwrite -S 0 $pos 234567" "$SCRATCH_MNT/verity"
+		done
+
+		fsverity enable "$SCRATCH_MNT/verity"
+
+		# Create a sparse file
+		$XFS_IO_PROG -f -c "pwrite -S 0 0 3" "$SCRATCH_MNT/sparse_verity"
+		truncate -s 23456789 "$SCRATCH_MNT/sparse_verity"
+		$XFS_IO_PROG -f -c "pwrite -S 0 23456789 3" "$SCRATCH_MNT/sparse_verity"
+		fsverity enable "$SCRATCH_MNT/sparse_verity"
+	fi
+
 	# Copy some real files (xfs tests, I guess...)
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5



* Re: [PATCHBOMB v5.3] fs-verity support for XFS
  2024-03-17 16:19 [PATCHBOMB v5.3] fs-verity support for XFS Darrick J. Wong
                   ` (2 preceding siblings ...)
  2024-03-17 16:23 ` [PATCHSET v5.3] fstests: fs-verity support for XFS Darrick J. Wong
@ 2024-03-18  1:39 ` Christoph Hellwig
  2024-03-18  4:30   ` Darrick J. Wong
  3 siblings, 1 reply; 92+ messages in thread
From: Christoph Hellwig @ 2024-03-18  1:39 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, ebiggers, linux-fsdevel, fsverity, linux-xfs

On Sun, Mar 17, 2024 at 09:19:54AM -0700, Darrick J. Wong wrote:
> Note that metadump is kinda broken and xfs_scrub media scans do not yet
> know how to read verity files.  All that is actually fixed in the
> version that's lodged in my development trees, but since Andrey's base
> is the 6.9 for-next branch plus only a few of the parent pointers
> patches, none of that stuff was easy to port to make a short dev branch.

Maybe we'll need to put the verity work back and do a good review cycle
on the parent pointers first?

Can you send out what you currently have?



* Re: [PATCH 26/40] xfs: add fs-verity support
  2024-03-17 16:30   ` [PATCH 26/40] xfs: add fs-verity support Darrick J. Wong
@ 2024-03-18  1:43     ` Christoph Hellwig
  2024-03-18  4:34       ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Christoph Hellwig @ 2024-03-18  1:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, aalbersh, linux-fsdevel, fsverity, linux-xfs

Just skimming over the series from the little I've followed from the
last rounds (sorry, too busy with various projects):

> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -92,6 +92,9 @@ typedef struct xfs_inode {
>  	spinlock_t		i_ioend_lock;
>  	struct work_struct	i_ioend_work;
>  	struct list_head	i_ioend_list;
> +#ifdef CONFIG_FS_VERITY
> +	struct xarray		i_merkle_blocks;
> +#endif

This looks like very much a blocker to me.  Adding a 16 byte field to
struct inode that is used just for a few read-only-ish files on
select few file systems doesn't seem very efficient.  Given that we
very rarely update it and thus concurrency on the write side doesn't
matter much, is there any way we could get away with a fs-wide
lookup data structure and avoid this?


* Re: [PATCHBOMB v5.3] fs-verity support for XFS
  2024-03-18  1:39 ` [PATCHBOMB v5.3] fs-verity support for XFS Christoph Hellwig
@ 2024-03-18  4:30   ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-18  4:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, ebiggers, linux-fsdevel, fsverity, linux-xfs

On Sun, Mar 17, 2024 at 06:39:16PM -0700, Christoph Hellwig wrote:
> On Sun, Mar 17, 2024 at 09:19:54AM -0700, Darrick J. Wong wrote:
> > Note that metadump is kinda broken and xfs_scrub media scans do not yet
> > know how to read verity files.  All that is actually fixed in the
> > version that's lodged in my development trees, but since Andrey's base
> > is the 6.9 for-next branch plus only a few of the parent pointers
> > patches, none of that stuff was easy to port to make a short dev branch.
> 
> Maybe we'll need to put the verity work back and do a good review cycle
> on the parent pointers first?
> 
> Can you send out what you currently have?

Ok, I'll do that tomorrow.

--D


* Re: [PATCH 26/40] xfs: add fs-verity support
  2024-03-18  1:43     ` Christoph Hellwig
@ 2024-03-18  4:34       ` Darrick J. Wong
  2024-03-18  4:39         ` Christoph Hellwig
  0 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-18  4:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: ebiggers, aalbersh, linux-fsdevel, fsverity, linux-xfs

On Sun, Mar 17, 2024 at 06:43:39PM -0700, Christoph Hellwig wrote:
> Just skimming over the series from the little I've followed from the
> last rounds (sorry, too busy with various projects):
> 
> > --- a/fs/xfs/xfs_inode.h
> > +++ b/fs/xfs/xfs_inode.h
> > @@ -92,6 +92,9 @@ typedef struct xfs_inode {
> >  	spinlock_t		i_ioend_lock;
> >  	struct work_struct	i_ioend_work;
> >  	struct list_head	i_ioend_list;
> > +#ifdef CONFIG_FS_VERITY
> > +	struct xarray		i_merkle_blocks;
> > +#endif
> 
> This looks like very much a blocker to me.  Adding a 16 byte field to
> struct inode that is used just for a few read-only-ish files on
> select few file systems doesn't seem very efficient.  Given that we
> very rarely update it and thus concurrency on the write side doesn't
> matter much, is there any way we could get away with a fs-wide
> lookup data structure and avoid this?

Only if you can hand a 128-bit key to an xarray. ;)

But in all seriousness, we could have a per-AG xarray that maps
xfs_agino_t to this xarray of merkle blocks.  That would be nice in that
we don't have to touch xfs_icache.c for the shrinker at all.
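
A rough sketch of that two-level keying, assuming the usual split of an
inode number into an AG number and an AG-relative inode number.  The
names merkle_cache_key, merkle_cache_key_init and agino_log are made up
for illustration and are not part of this series:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration only: split a 64-bit inode number into
 * (agno, agino) so that a per-AG index keyed by the 32-bit agino can
 * point at a per-file cache keyed by merkle tree offset.
 * agino_log stands in for the number of agino bits on a given fs.
 */
struct merkle_cache_key {
	uint32_t	agno;		/* which per-AG index to search */
	uint32_t	agino;		/* key within that per-AG index */
	uint64_t	merkle_off;	/* key within the per-file cache */
};

static struct merkle_cache_key
merkle_cache_key_init(uint64_t ino, unsigned int agino_log, uint64_t merkle_off)
{
	struct merkle_cache_key	key;

	assert(agino_log > 0 && agino_log <= 32);
	key.agno = (uint32_t)(ino >> agino_log);
	key.agino = (uint32_t)(ino & ((1ULL << agino_log) - 1));
	key.merkle_off = merkle_off;
	return key;
}
```

Neither level needs more than a 64-bit key this way, which is what would
let each one stay an xarray.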

--D


* Re: [PATCH 26/40] xfs: add fs-verity support
  2024-03-18  4:34       ` Darrick J. Wong
@ 2024-03-18  4:39         ` Christoph Hellwig
  2024-03-18  4:56           ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Christoph Hellwig @ 2024-03-18  4:39 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, ebiggers, aalbersh, linux-fsdevel, fsverity,
	linux-xfs

On Sun, Mar 17, 2024 at 09:34:36PM -0700, Darrick J. Wong wrote:
> > select few file systems doesn't seem very efficient.  Given that we
> > very rarely update it and thus concurrency on the write side doesn't
> > matter much, is there any way we could get away with a fs-wide
> > lookup data structure and avoid this?
> 
> Only if you can hand a 128-bit key to an xarray. ;)

That's why I said lookup data structure and not xarray.  It would
probably work with an rhashtable.

> But in all seriousness, we could have a per-AG xarray that maps
> xfs_agino_t to this xarray of merkle blocks.  That would be nice in that
> we don't have to touch xfs_icache.c for the shrinker at all.

I have to admit I haven't read the code enough to even know from
what to what it maps.  I'll try to get a bit deeper into the code,
time permitting.



* Re: [PATCH 26/40] xfs: add fs-verity support
  2024-03-18  4:39         ` Christoph Hellwig
@ 2024-03-18  4:56           ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-18  4:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: ebiggers, aalbersh, linux-fsdevel, fsverity, linux-xfs

On Sun, Mar 17, 2024 at 09:39:01PM -0700, Christoph Hellwig wrote:
> On Sun, Mar 17, 2024 at 09:34:36PM -0700, Darrick J. Wong wrote:
> > > select few file systems doesn't seem very efficient.  Given that we
> > > very rarely update it and thus concurrency on the write side doesn't
> > > matter much, is there any way we could get away with a fs-wide
> > > lookup data structure and avoid this?
> > 
> > Only if you can hand a 128-bit key to an xarray. ;)
> 
> That's why I said lookup data structure and not xarray.  It would
> probably work with an rhashtable.

Heh.  Well willy gave me the idea to use an xarray so I'd then know how
to use an xarray. :)

> > But in all seriousness, we could have a per-AG xarray that maps
> > xfs_agino_t to this xarray of merkle blocks.  That would be nice in that
> > we don't have to touch xfs_icache.c for the shrinker at all.
> 
> I have to admit I haven't read the code enough to even know from
> what to what it maps.  I'll try to get a bit deeper into the code,
> time permitting.

fsverity flattens the blocks of the merkle tree into a linear u64
byte-address space.  The accesses are in those same units, which is why
I end up shifting so that the xarray entries for adjacent blocks are
contiguous.  Kind of like what the address_space does.
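
Roughly like this, as a standalone sketch and not the code from the
patch.  It assumes the merkle tree block size is a power of two; the
helper names merkle_block_shift and merkle_offset_to_index are
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of a power-of-two merkle tree block size. */
static unsigned int merkle_block_shift(unsigned int blocksize)
{
	unsigned int shift = 0;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	while ((1u << shift) < blocksize)
		shift++;
	return shift;
}

/*
 * Map a byte offset in the flattened merkle tree to a cache index.
 * Without the shift, adjacent blocks would land blocksize entries
 * apart; with it, block N of the tree is simply index N.
 */
static uint64_t merkle_offset_to_index(uint64_t offset, unsigned int blocksize)
{
	return offset >> merkle_block_shift(blocksize);
}
```

With 4096-byte tree blocks, byte offsets 0, 4096 and 8192 become indices
0, 1 and 2, the same way a page cache index is derived from a file
offset.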

--D


* Re: [PATCHSET v5.3] fs-verity support for XFS
  2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
                     ` (39 preceding siblings ...)
  2024-03-17 16:33   ` [PATCH 40/40] xfs: enable ro-compat fs-verity flag Darrick J. Wong
@ 2024-03-18 16:35   ` Eric Biggers
  2024-03-19 22:07     ` Darrick J. Wong
  40 siblings, 1 reply; 92+ messages in thread
From: Eric Biggers @ 2024-03-18 16:35 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: aalbersh, Mark Tinguely, Allison Henderson, Christoph Hellwig,
	Dave Chinner, linux-fsdevel, fsverity, linux-xfs

On Sun, Mar 17, 2024 at 09:22:52AM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> From Darrick J. Wong:
> 
> This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement
> fsverity for XFS.

Is this ready for me to review, or is my feedback on v5 still being worked on?
From a quick glance, not everything from my feedback has been addressed.

- Eric


* Re: [PATCH 16/40] fsverity: pass the zero-hash value to the implementation
  2024-03-17 16:27   ` [PATCH 16/40] fsverity: pass the zero-hash value to the implementation Darrick J. Wong
@ 2024-03-18 16:38     ` Eric Biggers
  2024-03-18 21:04       ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Eric Biggers @ 2024-03-18 16:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, linux-fsdevel, fsverity, linux-xfs

On Sun, Mar 17, 2024 at 09:27:34AM -0700, Darrick J. Wong wrote:
> diff --git a/fs/verity/open.c b/fs/verity/open.c
> index 7a86407732c4..433a70eeca55 100644
> --- a/fs/verity/open.c
> +++ b/fs/verity/open.c
> @@ -144,6 +144,13 @@ int fsverity_init_merkle_tree_params(struct merkle_tree_params *params,
>  		goto out_err;
>  	}
>  
> +	err = fsverity_hash_buffer(params->hash_alg, page_address(ZERO_PAGE(0)),
> +				   i_blocksize(inode), params->zero_digest);
> +	if (err) {
> +		fsverity_err(inode, "Error %d computing zero digest", err);
> +		goto out_err;
> +	}

This doesn't take the salt into account.  Also it's using the wrong block size
(filesystem block size instead of Merkle tree block size).

How about using fsverity_hash_block()?
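
To illustrate why both inputs matter, here is a toy sketch.  It uses
64-bit FNV-1a purely as a stand-in hash (fsverity does not use it), it
skips the salt padding that the real code applies, and toy_hash_update
and toy_zero_digest are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit FNV-1a update; a stand-in for the real hash algorithm. */
static uint64_t toy_hash_update(uint64_t h, const uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 1099511628211ULL;	/* FNV-1a 64-bit prime */
	}
	return h;
}

/*
 * Digest of one all-zeroes merkle tree block: hash the salt first, then
 * tree_blocksize zero bytes.  Salt padding is omitted for brevity.
 */
static uint64_t toy_zero_digest(const uint8_t *salt, size_t saltlen,
				unsigned int tree_blocksize)
{
	static const uint8_t zero = 0;
	uint64_t h = 14695981039346656037ULL;	/* FNV-1a offset basis */
	unsigned int i;

	assert(tree_blocksize > 0);
	h = toy_hash_update(h, salt, saltlen);
	for (i = 0; i < tree_blocksize; i++)
		h = toy_hash_update(h, &zero, 1);
	return h;
}
```

A digest computed without the salt, or over a different number of zero
bytes, will not match what the tree actually stores for a zeroed block,
so the comparison against the cached zero digest would never hit.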

- Eric


* Re: [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs
  2024-03-17 16:32   ` [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs Darrick J. Wong
@ 2024-03-18 17:34     ` Andrey Albershteyn
  2024-03-19 21:27       ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:34 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:32:31, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Teach online repair to check for unused fsverity metadata and purge it
> on reconstruction.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/scrub/attr.c   |  102 +++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/scrub/attr.h   |    4 ++
>  fs/xfs/scrub/common.c |   27 +++++++++++++
>  3 files changed, 133 insertions(+)
> 
> 
> diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
> index ae4227cb55ec..c69dee281984 100644
> --- a/fs/xfs/scrub/attr.c
> +++ b/fs/xfs/scrub/attr.c
> @@ -21,6 +21,8 @@
>  #include "scrub/dabtree.h"
>  #include "scrub/attr.h"
>  
> +#include <linux/fsverity.h>
> +
>  /* Free the buffers linked from the xattr buffer. */
>  static void
>  xchk_xattr_buf_cleanup(
> @@ -135,6 +137,91 @@ xchk_setup_xattr(
>  	return xchk_setup_inode_contents(sc, 0);
>  }
>  
> +#ifdef CONFIG_FS_VERITY
> +/* Extract merkle tree geometry from incore information. */
> +static int
> +xchk_xattr_extract_verity(
> +	struct xfs_scrub		*sc)
> +{
> +	struct xchk_xattr_buf		*ab = sc->buf;
> +
> +	/* setup should have allocated the buffer */
> +	if (!ab) {
> +		ASSERT(0);
> +		return -EFSCORRUPTED;
> +	}
> +
> +	return fsverity_merkle_tree_geometry(VFS_I(sc->ip),
> +			&ab->merkle_blocksize, &ab->merkle_tree_size);
> +}
> +
> +/* Check the merkle tree xattrs. */
> +STATIC void
> +xchk_xattr_verity(
> +	struct xfs_scrub		*sc,
> +	xfs_dablk_t			blkno,
> +	const unsigned char		*name,
> +	unsigned int			namelen,
> +	unsigned int			valuelen)
> +{
> +	struct xchk_xattr_buf		*ab = sc->buf;
> +
> +	/* Non-verity filesystems should never have verity xattrs. */
> +	if (!xfs_has_verity(sc->mp)) {
> +		xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> +		return;
> +	}
> +
> +	/*
> +	 * Any verity metadata on a non-verity file are leftovers from a
> +	 * previous attempt to enable verity.
> +	 */
> +	if (!IS_VERITY(VFS_I(sc->ip))) {
> +		xchk_ino_set_preen(sc, sc->ip->i_ino);
> +		return;
> +	}
> +
> +	switch (namelen) {
> +	case sizeof(struct xfs_verity_merkle_key):
> +		/* Oversized blocks are not allowed */
> +		if (valuelen > ab->merkle_blocksize) {
> +			xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> +			return;
> +		}
> +		break;
> +	case XFS_VERITY_DESCRIPTOR_NAME_LEN:
> +		/* Has to match the descriptor xattr name */
> +		if (memcmp(name, XFS_VERITY_DESCRIPTOR_NAME, namelen)) {
> +			xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> +		}
> +		return;
> +	default:
> +		xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> +		return;
> +	}
> +
> +	/*
> +	 * Merkle tree blocks beyond the end of the tree are leftovers from
> +	 * a previous failed attempt to enable verity.
> +	 */
> +	if (xfs_verity_merkle_key_from_disk(name) >= ab->merkle_tree_size)
> +		xchk_ino_set_preen(sc, sc->ip->i_ino);
> +}
> +#else
> +# define xchk_xattr_extract_verity(sc)	(0)
> +
> +static void
> +xchk_xattr_verity(
> +	struct xfs_scrub	*sc,
> +	xfs_dablk_t		blkno,
> +	const unsigned char	*name,
> +	unsigned int		namelen)
> +{
> +	/* Should never see verity xattrs when verity is not enabled. */
> +	xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> +}
> +#endif /* CONFIG_FS_VERITY */
> +
>  /* Extended Attributes */
>  
>  struct xchk_xattr {
> @@ -194,6 +281,15 @@ xchk_xattr_listent(
>  		goto fail_xref;
>  	}
>  
> +	/* Check verity xattr geometry */
> +	if (flags & XFS_ATTR_VERITY) {
> +		xchk_xattr_verity(sx->sc, args.blkno, name, namelen, valuelen);
> +		if (sx->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) {
> +			context->seen_enough = 1;
> +			return;
> +		}
> +	}
> +
>  	/* Does this name make sense? */
>  	if (!xfs_attr_namecheck(sx->sc->mp, name, namelen, flags)) {
>  		xchk_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK, args.blkno);

Would it be better to check verity after xfs_attr_namecheck()?
Invalid name seems to be a more basic corruption.

-- 
- Andrey



* Re: [PATCH 30/40] xfs: clean up stale fsverity metadata before starting
  2024-03-17 16:31   ` [PATCH 30/40] xfs: clean up stale fsverity metadata before starting Darrick J. Wong
@ 2024-03-18 17:50     ` Andrey Albershteyn
  0 siblings, 0 replies; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:31:13, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Before we let fsverity begin writing merkle tree blocks to the file,
> let's perform a minor effort to clean up any stale metadata from a
> previous attempt to enable fsverity.  This can only happen if the system
> crashes /and/ the file shrinks, which is unlikely.  But we could do a
> better job of cleaning up anyway.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good to me:
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

> ---
>  fs/xfs/xfs_verity.c |   42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
> index c19fa47d1f76..db43e017f10e 100644
> --- a/fs/xfs/xfs_verity.c
> +++ b/fs/xfs/xfs_verity.c
> @@ -413,6 +413,44 @@ xfs_verity_get_descriptor(
>  	return args.valuelen;
>  }
>  
> +/*
> + * Clear out old fsverity metadata before we start building a new one.  This
> + * could happen if, say, we crashed while building fsverity data.
> + */
> +static int
> +xfs_verity_drop_old_metadata(
> +	struct xfs_inode		*ip,
> +	u64				new_tree_size,
> +	unsigned int			tree_blocksize)
> +{
> +	struct xfs_verity_merkle_key	name;
> +	struct xfs_da_args		args = {
> +		.dp			= ip,
> +		.whichfork		= XFS_ATTR_FORK,
> +		.attr_filter		= XFS_ATTR_VERITY,
> +		.op_flags		= XFS_DA_OP_REMOVE,
> +		.name			= (const uint8_t *)&name,
> +		.namelen		= sizeof(struct xfs_verity_merkle_key),
> +		/* NULL value makes xfs_attr_set remove the attr */
> +		.value			= NULL,
> +	};
> +	u64				offset;
> +	int				error = 0;
> +
> +	/*
> +	 * Delete merkle tree blocks in increasing blkno order until we
> +	 * don't find any more.  That ought to be good enough for avoiding
> +	 * dead bloat without excessive runtime.
> +	 */
> +	for (offset = new_tree_size; !error; offset += tree_blocksize) {
> +		xfs_verity_merkle_key_to_disk(&name, offset);
> +		error = xfs_attr_set(&args);
> +	}
> +	if (error == -ENOATTR)
> +		return 0;
> +	return error;
> +}
> +
>  static int
>  xfs_verity_begin_enable(
>  	struct file		*filp,
> @@ -421,7 +459,6 @@ xfs_verity_begin_enable(
>  {
>  	struct inode		*inode = file_inode(filp);
>  	struct xfs_inode	*ip = XFS_I(inode);
> -	int			error = 0;
>  
>  	xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL);
>  
> @@ -431,7 +468,8 @@ xfs_verity_begin_enable(
>  	if (xfs_iflags_test_and_set(ip, XFS_VERITY_CONSTRUCTION))
>  		return -EBUSY;
>  
> -	return error;
> +	return xfs_verity_drop_old_metadata(ip, merkle_tree_size,
> +			tree_blocksize);
>  }
>  
>  static int
> 

-- 
- Andrey



* Re: [PATCH 31/40] xfs: better reporting and error handling in xfs_drop_merkle_tree
  2024-03-17 16:31   ` [PATCH 31/40] xfs: better reporting and error handling in xfs_drop_merkle_tree Darrick J. Wong
@ 2024-03-18 17:51     ` Andrey Albershteyn
  0 siblings, 0 replies; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:51 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:31:28, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> xfs_drop_merkle_tree is responsible for removing the fsverity metadata
> after a failed attempt to enable fsverity for a file.  However, if the
> enablement process fails before the verity descriptor is written to the
> file, the cleanup function will trip the WARN_ON.  The error code in
> that case is ENOATTR, which isn't worth logging about.
> 
> Fix that return code handling, fix the tree block removal loop not to
> return early with ENOATTR, and improve the logging so that we actually
> capture what kind of error occurred.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good to me:
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

> ---
>  fs/xfs/xfs_verity.c |   25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
> index db43e017f10e..32891ae42c47 100644
> --- a/fs/xfs/xfs_verity.c
> +++ b/fs/xfs/xfs_verity.c
> @@ -472,15 +472,14 @@ xfs_verity_begin_enable(
>  			tree_blocksize);
>  }
>  
> +/* Try to remove all the fsverity metadata after a failed enablement. */
>  static int
> -xfs_drop_merkle_tree(
> +xfs_verity_drop_incomplete_tree(
>  	struct xfs_inode		*ip,
>  	u64				merkle_tree_size,
>  	unsigned int			tree_blocksize)
>  {
>  	struct xfs_verity_merkle_key	name;
> -	int				error = 0;
> -	u64				offset = 0;
>  	struct xfs_da_args		args = {
>  		.dp			= ip,
>  		.whichfork		= XFS_ATTR_FORK,
> @@ -491,6 +490,8 @@ xfs_drop_merkle_tree(
>  		/* NULL value make xfs_attr_set remove the attr */
>  		.value			= NULL,
>  	};
> +	u64				offset;
> +	int				error;
>  
>  	if (!merkle_tree_size)
>  		return 0;
> @@ -498,6 +499,8 @@ xfs_drop_merkle_tree(
>  	for (offset = 0; offset < merkle_tree_size; offset += tree_blocksize) {
>  		xfs_verity_merkle_key_to_disk(&name, offset);
>  		error = xfs_attr_set(&args);
> +		if (error == -ENOATTR)
> +			error = 0;
>  		if (error)
>  			return error;
>  	}
> @@ -505,7 +508,8 @@ xfs_drop_merkle_tree(
>  	args.name = (const uint8_t *)XFS_VERITY_DESCRIPTOR_NAME;
>  	args.namelen = XFS_VERITY_DESCRIPTOR_NAME_LEN;
>  	error = xfs_attr_set(&args);
> -
> +	if (error == -ENOATTR)
> +		return 0;
>  	return error;
>  }
>  
> @@ -564,9 +568,16 @@ xfs_verity_end_enable(
>  		inode->i_flags |= S_VERITY;
>  
>  out:
> -	if (error)
> -		WARN_ON_ONCE(xfs_drop_merkle_tree(ip, merkle_tree_size,
> -						  tree_blocksize));
> +	if (error) {
> +		int	error2;
> +
> +		error2 = xfs_verity_drop_incomplete_tree(ip, merkle_tree_size,
> +				tree_blocksize);
> +		if (error2)
> +			xfs_alert(ip->i_mount,
> + "ino 0x%llx failed to clean up new fsverity metadata, err %d",
> +					ip->i_ino, error2);
> +	}
>  
>  	xfs_iflags_clear(ip, XFS_VERITY_CONSTRUCTION);
>  	return error;
> 

-- 
- Andrey



* Re: [PATCH 36/40] xfs: don't store trailing zeroes of merkle tree blocks
  2024-03-17 16:32   ` [PATCH 36/40] xfs: don't store trailing zeroes of merkle tree blocks Darrick J. Wong
@ 2024-03-18 17:52     ` Andrey Albershteyn
  0 siblings, 0 replies; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:32:47, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> As a minor space optimization, don't store trailing zeroes of merkle
> tree blocks to reduce space consumption and copying overhead.  This
> really only affects the rightmost blocks at each level of the tree.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good to me:
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

> ---
>  fs/xfs/xfs_verity.c |   11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
> index 32891ae42c47..abd95bc1ba6e 100644
> --- a/fs/xfs/xfs_verity.c
> +++ b/fs/xfs/xfs_verity.c
> @@ -622,11 +622,6 @@ xfs_verity_read_merkle(
>  	if (error)
>  		goto out_new_mk;
>  
> -	if (!args.valuelen) {
> -		error = -ENODATA;
> -		goto out_new_mk;
> -	}
> -
>  	mk = xfs_verity_cache_store(ip, key, new_mk);
>  	if (mk != new_mk) {
>  		/*
> @@ -681,6 +676,12 @@ xfs_verity_write_merkle(
>  		.value			= (void *)buf,
>  		.valuelen		= size,
>  	};
> +	const char			*p = buf + size - 1;
> +
> +	/* Don't store trailing zeroes. */
> +	while (p >= (const char *)buf && *p == 0)
> +		p--;
> +	args.valuelen = p - (const char *)buf + 1;
>  
>  	xfs_verity_merkle_key_to_disk(&name, pos);
>  	return xfs_attr_set(&args);
> 

-- 
- Andrey



* Re: [PATCH 37/40] xfs: create separate name hash function for xattrs
  2024-03-17 16:33   ` [PATCH 37/40] xfs: create separate name hash function for xattrs Darrick J. Wong
@ 2024-03-18 17:53     ` Andrey Albershteyn
  0 siblings, 0 replies; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:33:02, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Create a new hashing function for extended attribute names.  The next
> patch needs this so it can modify the hash strategy for verity xattrs.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good to me:
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

> ---
>  fs/xfs/libxfs/xfs_attr.c      |   16 ++++++++++++++--
>  fs/xfs/libxfs/xfs_attr.h      |    3 +++
>  fs/xfs/libxfs/xfs_attr_leaf.c |    4 ++--
>  fs/xfs/scrub/attr.c           |    8 +++++---
>  fs/xfs/xfs_attr_item.c        |    3 ++-
>  fs/xfs/xfs_attr_list.c        |    3 ++-
>  6 files changed, 28 insertions(+), 9 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index b7aa1bc12fd1..b1fa45197eac 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -238,6 +238,16 @@ xfs_attr_get_ilocked(
>  	return xfs_attr_node_get(args);
>  }
>  
> +/* Compute hash for an extended attribute name. */
> +xfs_dahash_t
> +xfs_attr_hashname(
> +	unsigned int		attr_flags,
> +	const uint8_t		*name,
> +	unsigned int		namelen)
> +{
> +	return xfs_da_hashname(name, namelen);
> +}
> +
>  /*
>   * Retrieve an extended attribute by name, and its value if requested.
>   *
> @@ -268,7 +278,8 @@ xfs_attr_get(
>  
>  	args->geo = args->dp->i_mount->m_attr_geo;
>  	args->whichfork = XFS_ATTR_FORK;
> -	args->hashval = xfs_da_hashname(args->name, args->namelen);
> +	args->hashval = xfs_attr_hashname(args->attr_filter, args->name,
> +					  args->namelen);
>  
>  	/* Entirely possible to look up a name which doesn't exist */
>  	args->op_flags = XFS_DA_OP_OKNOENT;
> @@ -942,7 +953,8 @@ xfs_attr_set(
>  
>  	args->geo = mp->m_attr_geo;
>  	args->whichfork = XFS_ATTR_FORK;
> -	args->hashval = xfs_da_hashname(args->name, args->namelen);
> +	args->hashval = xfs_attr_hashname(args->attr_filter, args->name,
> +					  args->namelen);
>  
>  	/*
>  	 * We have no control over the attribute names that userspace passes us
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 92711c8d2a9f..19db6c1cc71f 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -619,4 +619,7 @@ extern struct kmem_cache *xfs_attr_intent_cache;
>  int __init xfs_attr_intent_init_cache(void);
>  void xfs_attr_intent_destroy_cache(void);
>  
> +xfs_dahash_t xfs_attr_hashname(unsigned int attr_flags,
> +		const uint8_t *name_string, unsigned int name_length);
> +
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index ac904cc1a97b..fcece25fd13e 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -911,8 +911,8 @@ xfs_attr_shortform_to_leaf(
>  		nargs.namelen = sfe->namelen;
>  		nargs.value = &sfe->nameval[nargs.namelen];
>  		nargs.valuelen = sfe->valuelen;
> -		nargs.hashval = xfs_da_hashname(sfe->nameval,
> -						sfe->namelen);
> +		nargs.hashval = xfs_attr_hashname(sfe->flags, sfe->nameval,
> +						  sfe->namelen);
>  		nargs.attr_filter = sfe->flags & XFS_ATTR_NSP_ONDISK_MASK;
>  		error = xfs_attr3_leaf_lookup_int(bp, &nargs); /* set a->index */
>  		ASSERT(error == -ENOATTR);
> diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
> index c69dee281984..e7d50589f72d 100644
> --- a/fs/xfs/scrub/attr.c
> +++ b/fs/xfs/scrub/attr.c
> @@ -253,7 +253,6 @@ xchk_xattr_listent(
>  		.dp			= context->dp,
>  		.name			= name,
>  		.namelen		= namelen,
> -		.hashval		= xfs_da_hashname(name, namelen),
>  		.trans			= context->tp,
>  		.valuelen		= valuelen,
>  	};
> @@ -263,6 +262,7 @@ xchk_xattr_listent(
>  
>  	sx = container_of(context, struct xchk_xattr, context);
>  	ab = sx->sc->buf;
> +	args.hashval = xfs_attr_hashname(flags, name, namelen);
>  
>  	if (xchk_should_terminate(sx->sc, &error)) {
>  		context->seen_enough = error;
> @@ -600,7 +600,8 @@ xchk_xattr_rec(
>  			xchk_da_set_corrupt(ds, level);
>  			goto out;
>  		}
> -		calc_hash = xfs_da_hashname(lentry->nameval, lentry->namelen);
> +		calc_hash = xfs_attr_hashname(ent->flags, lentry->nameval,
> +				lentry->namelen);
>  	} else {
>  		rentry = (struct xfs_attr_leaf_name_remote *)
>  				(((char *)bp->b_addr) + nameidx);
> @@ -608,7 +609,8 @@ xchk_xattr_rec(
>  			xchk_da_set_corrupt(ds, level);
>  			goto out;
>  		}
> -		calc_hash = xfs_da_hashname(rentry->name, rentry->namelen);
> +		calc_hash = xfs_attr_hashname(ent->flags, rentry->name,
> +				rentry->namelen);
>  	}
>  	if (calc_hash != hash)
>  		xchk_da_set_corrupt(ds, level);
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> index 703770cf1482..4d8264f0a537 100644
> --- a/fs/xfs/xfs_attr_item.c
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -536,7 +536,8 @@ xfs_attri_recover_work(
>  	args->whichfork = XFS_ATTR_FORK;
>  	args->name = nv->name.i_addr;
>  	args->namelen = nv->name.i_len;
> -	args->hashval = xfs_da_hashname(args->name, args->namelen);
> +	args->hashval = xfs_attr_hashname(attrp->alfi_attr_filter, args->name,
> +					  args->namelen);
>  	args->attr_filter = attrp->alfi_attr_filter & XFS_ATTRI_FILTER_MASK;
>  	args->op_flags = XFS_DA_OP_RECOVERY | XFS_DA_OP_OKNOENT |
>  			 XFS_DA_OP_LOGGED;
> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> index fa74378577c5..96169474d023 100644
> --- a/fs/xfs/xfs_attr_list.c
> +++ b/fs/xfs/xfs_attr_list.c
> @@ -135,7 +135,8 @@ xfs_attr_shortform_list(
>  		}
>  
>  		sbp->entno = i;
> -		sbp->hash = xfs_da_hashname(sfe->nameval, sfe->namelen);
> +		sbp->hash = xfs_attr_hashname(sfe->flags, sfe->nameval,
> +					      sfe->namelen);
>  		sbp->name = sfe->nameval;
>  		sbp->namelen = sfe->namelen;
>  		/* These are bytes, and both on-disk, don't endian-flip */
> 

-- 
- Andrey



* Re: [PATCH 38/40] xfs: use merkle tree offset as attr hash
  2024-03-17 16:33   ` [PATCH 38/40] xfs: use merkle tree offset as attr hash Darrick J. Wong
@ 2024-03-18 17:55     ` Andrey Albershteyn
  0 siblings, 0 replies; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:33:18, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> I was exploring the fsverity metadata with xfs_db after creating a 220MB
> verity file, and I noticed the following in the debugger output:
> 
> entries[0-75] = [hashval,nameidx,incomplete,root,secure,local,parent,verity]
> 0:[0,4076,0,0,0,0,0,1]
> 1:[0,1472,0,0,0,1,0,1]
> 2:[0x800,4056,0,0,0,0,0,1]
> 3:[0x800,4036,0,0,0,0,0,1]
> ...
> 72:[0x12000,2716,0,0,0,0,0,1]
> 73:[0x12000,2696,0,0,0,0,0,1]
> 74:[0x12800,2676,0,0,0,0,0,1]
> 75:[0x12800,2656,0,0,0,0,0,1]
> ...
> nvlist[0].merkle_off = 0x18000
> nvlist[1].merkle_off = 0
> nvlist[2].merkle_off = 0x19000
> nvlist[3].merkle_off = 0x1000
> ...
> nvlist[71].merkle_off = 0x5b000
> nvlist[72].merkle_off = 0x44000
> nvlist[73].merkle_off = 0x5c000
> nvlist[74].merkle_off = 0x45000
> nvlist[75].merkle_off = 0x5d000
> 
> Within just this attr leaf block, there are 76 attr entries, but only 38
> distinct hash values.  There are 415 merkle tree blocks for this file,
> but we already have hash collisions.  This isn't good performance from
> the standard da hash function because we're mostly shifting and rolling
> zeroes around.
> 
> However, we don't even have to do that much work -- the merkle tree
> block keys are themselves u64 values.  Truncate that value to 32 bits
> (the size of xfs_dahash_t) and use that for the hash.  We won't have any
> collisions between merkle tree blocks until that tree grows to 2^32nd
> blocks.  On a 4k block filesystem, we won't hit that unless the file
> contains more than 2^49 bytes, assuming sha256.
> 
> As a side effect, the keys for merkle tree blocks get written out in
> roughly sequential order, though I didn't observe any change in
> performance.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good to me:
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

> ---
>  fs/xfs/libxfs/xfs_attr.c      |    7 +++++++
>  fs/xfs/libxfs/xfs_da_format.h |    2 ++
>  2 files changed, 9 insertions(+)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index b1fa45197eac..7c0f006f972a 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -245,6 +245,13 @@ xfs_attr_hashname(
>  	const uint8_t		*name,
>  	unsigned int		namelen)
>  {
> +	if ((attr_flags & XFS_ATTR_VERITY) &&
> +	    namelen == sizeof(struct xfs_verity_merkle_key)) {
> +		uint64_t	off = xfs_verity_merkle_key_from_disk(name);
> +
> +		return off >> XFS_VERITY_MIN_MERKLE_BLOCKLOG;
> +	}
> +
>  	return xfs_da_hashname(name, namelen);
>  }
>  
> diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
> index e4aa7c9a0ccb..58887a1c65fe 100644
> --- a/fs/xfs/libxfs/xfs_da_format.h
> +++ b/fs/xfs/libxfs/xfs_da_format.h
> @@ -946,4 +946,6 @@ xfs_verity_merkle_key_from_disk(
>  #define XFS_VERITY_DESCRIPTOR_NAME	"vdesc"
>  #define XFS_VERITY_DESCRIPTOR_NAME_LEN	(sizeof(XFS_VERITY_DESCRIPTOR_NAME) - 1)
>  
> +#define XFS_VERITY_MIN_MERKLE_BLOCKLOG	(10)
> +
>  #endif /* __XFS_DA_FORMAT_H__ */
> 

-- 
- Andrey



* Re: [PATCH 39/40] xfs: don't bother storing merkle tree blocks for zeroed data blocks
  2024-03-17 16:33   ` [PATCH 39/40] xfs: don't bother storing merkle tree blocks for zeroed data blocks Darrick J. Wong
@ 2024-03-18 17:56     ` Andrey Albershteyn
  0 siblings, 0 replies; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 17:56 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:33:34, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that fsverity tells our merkle tree io functions about what a hash
> of a data block full of zeroes looks like, we can use this information
> to avoid writing out merkle tree blocks for sparse regions of the file.
> For verified gold master images this can save quite a bit of overhead.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good to me:
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>

> ---
>  fs/xfs/xfs_verity.c |   37 ++++++++++++++++++++++++++++++++++---
>  1 file changed, 34 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
> index abd95bc1ba6e..ba96e7049f61 100644
> --- a/fs/xfs/xfs_verity.c
> +++ b/fs/xfs/xfs_verity.c
> @@ -619,6 +619,20 @@ xfs_verity_read_merkle(
>  	xfs_verity_merkle_key_to_disk(&name, block->offset);
>  
>  	error = xfs_attr_get(&args);
> +	if (error == -ENOATTR) {
> +		u8		*p;
> +		unsigned int	i;
> +
> +		/*
> +		 * No attribute found.  Synthesize a buffer full of the zero
> +		 * digests on the assumption that we elided them at write time.
> +		 */
> +		for (i = 0, p = new_mk->data;
> +		     i < block->size;
> +		     i += req->digest_size, p += req->digest_size)
> +			memcpy(p, req->zero_digest, req->digest_size);
> +		error = 0;
> +	}
>  	if (error)
>  		goto out_new_mk;
>  
> @@ -676,12 +690,29 @@ xfs_verity_write_merkle(
>  		.value			= (void *)buf,
>  		.valuelen		= size,
>  	};
> -	const char			*p = buf + size - 1;
> +	const char			*p;
> +	unsigned int			i;
>  
> -	/* Don't store trailing zeroes. */
> +	/*
> +	 * If this is a block full of hashes of zeroed blocks, don't bother
> +	 * storing the block.  We can synthesize them later.
> +	 */
> +	for (i = 0, p = buf;
> +	     i < size;
> +	     i += req->digest_size, p += req->digest_size)
> +		if (memcmp(p, req->zero_digest, req->digest_size))
> +			break;
> +	if (i == size)
> +		return 0;
> +
> +	/*
> +	 * Don't store trailing zeroes.  Store at least one byte so that the
> +	 * block cannot be mistaken for an elided one.
> +	 */
> +	p = buf + size - 1;
>  	while (p >= (const char *)buf && *p == 0)
>  		p--;
> -	args.valuelen = p - (const char *)buf + 1;
> +	args.valuelen = max(1, p - (const char *)buf + 1);
>  
>  	xfs_verity_merkle_key_to_disk(&name, pos);
>  	return xfs_attr_set(&args);
> 

-- 
- Andrey



* Re: [PATCH 24/40] xfs: disable direct read path for fs-verity files
  2024-03-17 16:29   ` [PATCH 24/40] xfs: disable direct read path for fs-verity files Darrick J. Wong
@ 2024-03-18 19:48     ` Andrey Albershteyn
  2024-03-19 21:17       ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-18 19:48 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On 2024-03-17 09:29:39, Darrick J. Wong wrote:
> From: Andrey Albershteyn <aalbersh@redhat.com>
> 
> The direct path is not supported on verity files. Attempts to use direct
> I/O path on such files should fall back to buffered I/O path.
> 
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> [djwong: fix braces]
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_file.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 74dba917be93..0ce51a020115 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -281,7 +281,8 @@ xfs_file_dax_read(
>  	struct kiocb		*iocb,
>  	struct iov_iter		*to)
>  {
> -	struct xfs_inode	*ip = XFS_I(iocb->ki_filp->f_mapping->host);
> +	struct inode		*inode = iocb->ki_filp->f_mapping->host;
> +	struct xfs_inode	*ip = XFS_I(inode);
>  	ssize_t			ret = 0;
>  
>  	trace_xfs_file_dax_read(iocb, to);
> @@ -334,10 +335,18 @@ xfs_file_read_iter(
>  
>  	if (IS_DAX(inode))
>  		ret = xfs_file_dax_read(iocb, to);
> -	else if (iocb->ki_flags & IOCB_DIRECT)
> +	else if (iocb->ki_flags & IOCB_DIRECT && !fsverity_active(inode))

Brackets missing around (iocb->ki_flags & IOCB_DIRECT)

>  		ret = xfs_file_dio_read(iocb, to);
> -	else
> +	else {
> +		/*
> +		 * In case fs-verity is enabled, we also fallback to the
> +		 * buffered read from the direct read path. Therefore,
> +		 * IOCB_DIRECT is set and need to be cleared (see
> +		 * generic_file_read_iter())
> +		 */
> +		iocb->ki_flags &= ~IOCB_DIRECT;
>  		ret = xfs_file_buffered_read(iocb, to);
> +	}
>  
>  	if (ret > 0)
>  		XFS_STATS_ADD(mp, xs_read_bytes, ret);
> 
> 

-- 
- Andrey



* Re: [PATCH 16/40] fsverity: pass the zero-hash value to the implementation
  2024-03-18 16:38     ` Eric Biggers
@ 2024-03-18 21:04       ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-18 21:04 UTC (permalink / raw)
  To: Eric Biggers; +Cc: aalbersh, linux-fsdevel, fsverity, linux-xfs

On Mon, Mar 18, 2024 at 09:38:47AM -0700, Eric Biggers wrote:
> On Sun, Mar 17, 2024 at 09:27:34AM -0700, Darrick J. Wong wrote:
> > diff --git a/fs/verity/open.c b/fs/verity/open.c
> > index 7a86407732c4..433a70eeca55 100644
> > --- a/fs/verity/open.c
> > +++ b/fs/verity/open.c
> > @@ -144,6 +144,13 @@ int fsverity_init_merkle_tree_params(struct merkle_tree_params *params,
> >  		goto out_err;
> >  	}
> >  
> > +	err = fsverity_hash_buffer(params->hash_alg, page_address(ZERO_PAGE(0)),
> > +				   i_blocksize(inode), params->zero_digest);
> > +	if (err) {
> > +		fsverity_err(inode, "Error %d computing zero digest", err);
> > +		goto out_err;
> > +	}
> 
> This doesn't take the salt into account.  Also it's using the wrong block size
> (filesystem block size instead of Merkle tree block size).
> 
> How about using fsverity_hash_block()?

/me looks at build_merkle_tree again, realizes that it calls
hash_one_block on params->block_size bytes of file data.

IOWs, fsverity_hash_block is indeed the correct function to call here.
Thanks for the correction!

--D

> - Eric


* Re: [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs
  2024-03-17 16:39   ` [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs Darrick J. Wong
@ 2024-03-19 14:59     ` Andrey Albershteyn
  2024-03-19 19:25       ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-19 14:59 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: ebiggers, zlang, fsverity, fstests, linux-fsdevel, guan, linux-xfs

On 2024-03-17 09:39:33, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Adjust these tests to accommodate the use of xattrs to store fsverity
> metadata.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Is this against one of the pptrs branches? It doesn't seem to apply on
for-next.

> ---
>  tests/xfs/021     |    3 +++
>  tests/xfs/122.out |    1 +
>  2 files changed, 4 insertions(+)
> 
> 
> diff --git a/tests/xfs/021 b/tests/xfs/021
> index ef307fc064..dcecf41958 100755
> --- a/tests/xfs/021
> +++ b/tests/xfs/021
> @@ -118,6 +118,7 @@ _scratch_xfs_db -r -c "inode $inum_1" -c "print a.sfattr"  | \
>  	perl -ne '
>  /\.secure/ && next;
>  /\.parent/ && next;
> +/\.verity/ && next;
>  	print unless /^\d+:\[.*/;'
>  
>  echo "*** dump attributes (2)"
> @@ -128,6 +129,7 @@ _scratch_xfs_db -r -c "inode $inum_2" -c "a a.bmx[0].startblock" -c print  \
>  	| perl -ne '
>  s/,secure//;
>  s/,parent//;
> +s/,verity//;
>  s/info.hdr/info/;
>  /hdr.info.crc/ && next;
>  /hdr.info.bno/ && next;
> @@ -135,6 +137,7 @@ s/info.hdr/info/;
>  /hdr.info.lsn/ && next;
>  /hdr.info.owner/ && next;
>  /\.parent/ && next;
> +/\.verity/ && next;
>  s/^(hdr.info.magic =) 0x3bee/\1 0xfbee/;
>  s/^(hdr.firstused =) (\d+)/\1 FIRSTUSED/;
>  s/^(hdr.freemap\[0-2] = \[base,size]).*/\1 [FREEMAP..]/;
> diff --git a/tests/xfs/122.out b/tests/xfs/122.out
> index 3a99ce77bb..ff886b4eec 100644
> --- a/tests/xfs/122.out
> +++ b/tests/xfs/122.out
> @@ -141,6 +141,7 @@ sizeof(struct xfs_scrub_vec) = 16
>  sizeof(struct xfs_scrub_vec_head) = 32
>  sizeof(struct xfs_swap_extent) = 64
>  sizeof(struct xfs_unmount_log_format) = 8
> +sizeof(struct xfs_verity_merkle_key) = 8
>  sizeof(struct xfs_xmd_log_format) = 16
>  sizeof(struct xfs_xmi_log_format) = 80
>  sizeof(union xfs_rtword_raw) = 4
> 

-- 
- Andrey



* Re: [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs
  2024-03-19 14:59     ` Andrey Albershteyn
@ 2024-03-19 19:25       ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-19 19:25 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: ebiggers, zlang, fsverity, fstests, linux-fsdevel, guan, linux-xfs

On Tue, Mar 19, 2024 at 03:59:48PM +0100, Andrey Albershteyn wrote:
> On 2024-03-17 09:39:33, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Adjust these tests to accommodate the use of xattrs to store fsverity
> > metadata.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> 
> Is this against one of the pptrs branches? It doesn't seem to apply on
> for-next.

See
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fsverity

(as mentioned in the cover letter)

--D

> 
> > ---
> >  tests/xfs/021     |    3 +++
> >  tests/xfs/122.out |    1 +
> >  2 files changed, 4 insertions(+)
> > 
> > 
> > diff --git a/tests/xfs/021 b/tests/xfs/021
> > index ef307fc064..dcecf41958 100755
> > --- a/tests/xfs/021
> > +++ b/tests/xfs/021
> > @@ -118,6 +118,7 @@ _scratch_xfs_db -r -c "inode $inum_1" -c "print a.sfattr"  | \
> >  	perl -ne '
> >  /\.secure/ && next;
> >  /\.parent/ && next;
> > +/\.verity/ && next;
> >  	print unless /^\d+:\[.*/;'
> >  
> >  echo "*** dump attributes (2)"
> > @@ -128,6 +129,7 @@ _scratch_xfs_db -r -c "inode $inum_2" -c "a a.bmx[0].startblock" -c print  \
> >  	| perl -ne '
> >  s/,secure//;
> >  s/,parent//;
> > +s/,verity//;
> >  s/info.hdr/info/;
> >  /hdr.info.crc/ && next;
> >  /hdr.info.bno/ && next;
> > @@ -135,6 +137,7 @@ s/info.hdr/info/;
> >  /hdr.info.lsn/ && next;
> >  /hdr.info.owner/ && next;
> >  /\.parent/ && next;
> > +/\.verity/ && next;
> >  s/^(hdr.info.magic =) 0x3bee/\1 0xfbee/;
> >  s/^(hdr.firstused =) (\d+)/\1 FIRSTUSED/;
> >  s/^(hdr.freemap\[0-2] = \[base,size]).*/\1 [FREEMAP..]/;
> > diff --git a/tests/xfs/122.out b/tests/xfs/122.out
> > index 3a99ce77bb..ff886b4eec 100644
> > --- a/tests/xfs/122.out
> > +++ b/tests/xfs/122.out
> > @@ -141,6 +141,7 @@ sizeof(struct xfs_scrub_vec) = 16
> >  sizeof(struct xfs_scrub_vec_head) = 32
> >  sizeof(struct xfs_swap_extent) = 64
> >  sizeof(struct xfs_unmount_log_format) = 8
> > +sizeof(struct xfs_verity_merkle_key) = 8
> >  sizeof(struct xfs_xmd_log_format) = 16
> >  sizeof(struct xfs_xmi_log_format) = 80
> >  sizeof(union xfs_rtword_raw) = 4
> > 
> 
> -- 
> - Andrey
> 
> 


* Re: [PATCH 24/40] xfs: disable direct read path for fs-verity files
  2024-03-18 19:48     ` Andrey Albershteyn
@ 2024-03-19 21:17       ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-19 21:17 UTC (permalink / raw)
  To: Andrey Albershteyn; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On Mon, Mar 18, 2024 at 08:48:47PM +0100, Andrey Albershteyn wrote:
> On 2024-03-17 09:29:39, Darrick J. Wong wrote:
> > From: Andrey Albershteyn <aalbersh@redhat.com>
> > 
> > The direct path is not supported on verity files. Attempts to use direct
> > I/O path on such files should fall back to buffered I/O path.
> > 
> > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > [djwong: fix braces]
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/xfs_file.c |   15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 74dba917be93..0ce51a020115 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -281,7 +281,8 @@ xfs_file_dax_read(
> >  	struct kiocb		*iocb,
> >  	struct iov_iter		*to)
> >  {
> > -	struct xfs_inode	*ip = XFS_I(iocb->ki_filp->f_mapping->host);
> > +	struct inode		*inode = iocb->ki_filp->f_mapping->host;
> > +	struct xfs_inode	*ip = XFS_I(inode);
> >  	ssize_t			ret = 0;
> >  
> >  	trace_xfs_file_dax_read(iocb, to);
> > @@ -334,10 +335,18 @@ xfs_file_read_iter(
> >  
> >  	if (IS_DAX(inode))
> >  		ret = xfs_file_dax_read(iocb, to);
> > -	else if (iocb->ki_flags & IOCB_DIRECT)
> > +	else if (iocb->ki_flags & IOCB_DIRECT && !fsverity_active(inode))
> 
> Brackets missing around (iocb->ki_flags & IOCB_DIRECT)

Oops, will fix that.  Thanks!

--D

> >  		ret = xfs_file_dio_read(iocb, to);
> > -	else
> > +	else {
> > +		/*
> > +		 * In case fs-verity is enabled, we also fallback to the
> > +		 * buffered read from the direct read path. Therefore,
> > +		 * IOCB_DIRECT is set and need to be cleared (see
> > +		 * generic_file_read_iter())
> > +		 */
> > +		iocb->ki_flags &= ~IOCB_DIRECT;
> >  		ret = xfs_file_buffered_read(iocb, to);
> > +	}
> >  
> >  	if (ret > 0)
> >  		XFS_STATS_ADD(mp, xs_read_bytes, ret);
> > 
> > 
> 
> -- 
> - Andrey
> 
> 


* Re: [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs
  2024-03-18 17:34     ` Andrey Albershteyn
@ 2024-03-19 21:27       ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-19 21:27 UTC (permalink / raw)
  To: Andrey Albershteyn; +Cc: ebiggers, linux-fsdevel, fsverity, linux-xfs

On Mon, Mar 18, 2024 at 06:34:04PM +0100, Andrey Albershteyn wrote:
> On 2024-03-17 09:32:31, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Teach online repair to check for unused fsverity metadata and purge it
> > on reconstruction.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/scrub/attr.c   |  102 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/scrub/attr.h   |    4 ++
> >  fs/xfs/scrub/common.c |   27 +++++++++++++
> >  3 files changed, 133 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
> > index ae4227cb55ec..c69dee281984 100644
> > --- a/fs/xfs/scrub/attr.c
> > +++ b/fs/xfs/scrub/attr.c
> > @@ -21,6 +21,8 @@
> >  #include "scrub/dabtree.h"
> >  #include "scrub/attr.h"
> >  
> > +#include <linux/fsverity.h>
> > +
> >  /* Free the buffers linked from the xattr buffer. */
> >  static void
> >  xchk_xattr_buf_cleanup(
> > @@ -135,6 +137,91 @@ xchk_setup_xattr(
> >  	return xchk_setup_inode_contents(sc, 0);
> >  }
> >  
> > +#ifdef CONFIG_FS_VERITY
> > +/* Extract merkle tree geometry from incore information. */
> > +static int
> > +xchk_xattr_extract_verity(
> > +	struct xfs_scrub		*sc)
> > +{
> > +	struct xchk_xattr_buf		*ab = sc->buf;
> > +
> > +	/* setup should have allocated the buffer */
> > +	if (!ab) {
> > +		ASSERT(0);
> > +		return -EFSCORRUPTED;
> > +	}
> > +
> > +	return fsverity_merkle_tree_geometry(VFS_I(sc->ip),
> > +			&ab->merkle_blocksize, &ab->merkle_tree_size);
> > +}
> > +
> > +/* Check the merkle tree xattrs. */
> > +STATIC void
> > +xchk_xattr_verity(
> > +	struct xfs_scrub		*sc,
> > +	xfs_dablk_t			blkno,
> > +	const unsigned char		*name,
> > +	unsigned int			namelen,
> > +	unsigned int			valuelen)
> > +{
> > +	struct xchk_xattr_buf		*ab = sc->buf;
> > +
> > +	/* Non-verity filesystems should never have verity xattrs. */
> > +	if (!xfs_has_verity(sc->mp)) {
> > +		xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * Any verity metadata on a non-verity file are leftovers from a
> > +	 * previous attempt to enable verity.
> > +	 */
> > +	if (!IS_VERITY(VFS_I(sc->ip))) {
> > +		xchk_ino_set_preen(sc, sc->ip->i_ino);
> > +		return;
> > +	}
> > +
> > +	switch (namelen) {
> > +	case sizeof(struct xfs_verity_merkle_key):
> > +		/* Oversized blocks are not allowed */
> > +		if (valuelen > ab->merkle_blocksize) {
> > +			xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> > +			return;
> > +		}
> > +		break;
> > +	case XFS_VERITY_DESCRIPTOR_NAME_LEN:
> > +		/* Has to match the descriptor xattr name */
> > +		if (memcmp(name, XFS_VERITY_DESCRIPTOR_NAME, namelen)) {
> > +			xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> > +		}
> > +		return;
> > +	default:
> > +		xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * Merkle tree blocks beyond the end of the tree are leftovers from
> > +	 * a previous failed attempt to enable verity.
> > +	 */
> > +	if (xfs_verity_merkle_key_from_disk(name) >= ab->merkle_tree_size)
> > +		xchk_ino_set_preen(sc, sc->ip->i_ino);
> > +}
> > +#else
> > +# define xchk_xattr_extract_verity(sc)	(0)
> > +
> > +static void
> > +xchk_xattr_verity(
> > +	struct xfs_scrub	*sc,
> > +	xfs_dablk_t		blkno,
> > +	const unsigned char	*name,
> > +	unsigned int		namelen)
> > +{
> > +	/* Should never see verity xattrs when verity is not enabled. */
> > +	xchk_fblock_set_corrupt(sc, XFS_ATTR_FORK, blkno);
> > +}
> > +#endif /* CONFIG_FS_VERITY */
> > +
> >  /* Extended Attributes */
> >  
> >  struct xchk_xattr {
> > @@ -194,6 +281,15 @@ xchk_xattr_listent(
> >  		goto fail_xref;
> >  	}
> >  
> > +	/* Check verity xattr geometry */
> > +	if (flags & XFS_ATTR_VERITY) {
> > +		xchk_xattr_verity(sx->sc, args.blkno, name, namelen, valuelen);
> > +		if (sx->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) {
> > +			context->seen_enough = 1;
> > +			return;
> > +		}
> > +	}
> > +
> >  	/* Does this name make sense? */
> >  	if (!xfs_attr_namecheck(sx->sc->mp, name, namelen, flags)) {
> >  		xchk_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK, args.blkno);
> 
> Would it be better to check verity after xfs_attr_namecheck()?
> Invalid name seems to be a more basic corruption.

Yeah, that could be changed easily. Done.

--D

> -- 
> - Andrey
> 
> 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCHSET v5.3] fs-verity support for XFS
  2024-03-18 16:35   ` [PATCHSET v5.3] fs-verity support for XFS Eric Biggers
@ 2024-03-19 22:07     ` Darrick J. Wong
  2024-03-19 23:21       ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-19 22:07 UTC (permalink / raw)
  To: Eric Biggers
  Cc: aalbersh, Mark Tinguely, Allison Henderson, Christoph Hellwig,
	Dave Chinner, linux-fsdevel, fsverity, linux-xfs

On Mon, Mar 18, 2024 at 09:35:12AM -0700, Eric Biggers wrote:
> On Sun, Mar 17, 2024 at 09:22:52AM -0700, Darrick J. Wong wrote:
> > Hi all,
> > 
> > From Darrick J. Wong:
> > 
> > This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement
> > fsverity for XFS.
> 
> Is this ready for me to review, or is my feedback on v5 still being
> worked on?

It's still being worked on.  I figured it was time to push my work tree
back to Andrey so everyone could see the results of me attempting to
understand the fsverity patchset by working around in the codebase.

From your perspective, I suspect the most interesting patches will be 5,
6, 7+10+14, 11-13, and 15-17.  For everyone on the XFS side, patches
27-39 are the most interesting since they change the caching strategy
and slim down the ondisk format.

> From a quick glance, not everything from my feedback has been
> addressed.

That's correct.  I cleaned up the mechanics of passing merkle trees
around, but I didn't address the comments about per-sb workqueues,
fsverity tracepoints, or whether or not iomap should allocate biosets.
Roughly, here's what I did in the generic code:

I fixed the FS_XFLAG_VERITY handling so that you can't clear it via
FS_IOC_FSSETXATTR.
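
To illustrate the rule above, here is a minimal userspace sketch of a
"cannot clear once set" check.  Everything in it is an assumption made
for illustration: the EX_XFLAG_VERITY value, the helper name, and the
choice of -EINVAL are hypothetical and are not taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag value, for illustration only. */
#define EX_XFLAG_VERITY	0x00020000u

/*
 * Reject an xflags update that would clear the verity flag on a file
 * that already has it.  Returns 0 if the update is acceptable, or a
 * negative errno (assumed here to be -EINVAL) if it is not.
 */
static int ex_fssetxattr_check(uint32_t old_xflags, uint32_t new_xflags)
{
	if ((old_xflags & EX_XFLAG_VERITY) && !(new_xflags & EX_XFLAG_VERITY))
		return -EINVAL;
	return 0;
}
```

The sketch only covers clearing; whether setting the flag through the
same ioctl is allowed is a separate question that it does not address.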

I also rewrote and augmented the "drop dead merkle tree" functions in
xfs_verity to clean out incomplete trees when ->end_enable tells us we
failed; and to clean out extra blocks in the ->begin_enable just in case
the file shrank since a failed attempt to enable fsverity.

As for online repair, the "fsverity: expose merkle tree geometry to
callers" patch enables the kernel to do some basic online checking that
there aren't excessive merkle tree blocks and that fsverity can read
the descriptor.  In my djwong-wtf tree, xfs_scrub gains the ability to
read the entire file into the pagecache (and hence validate the verity
info) via MADV_POPULATE_READ, and now it has a patch to read the entire
merkle tree/descriptor/signature just to make sure those can actually
be read.

Most of the things you gave feedback about in "fsverity: support
block-based Merkle tree caching" I think I cleaned up in "fsverity: fix
"support block-based Merkle tree caching"" and "fsverity: rely on cached
block callers to retain verified state".  I kept those separate so that
Andrey could see what I did, though they really ought to be merged into
the main support patch.

Note that I greatly expanded the usage of struct fsverity_blockbuf and
changed the verified flag handling so that the invalidation function was
no longer necessary.

--D

> - Eric

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCHSET v5.3] fs-verity support for XFS
  2024-03-19 22:07     ` Darrick J. Wong
@ 2024-03-19 23:21       ` Darrick J. Wong
  2024-03-20 10:16         ` Andrey Albershteyn
  0 siblings, 1 reply; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-19 23:21 UTC (permalink / raw)
  To: aalbersh, Eric Biggers
  Cc: Allison Henderson, Christoph Hellwig, Dave Chinner,
	linux-fsdevel, fsverity, linux-xfs, mark.tinguely

[fix tinguely email addr]

On Tue, Mar 19, 2024 at 03:07:43PM -0700, Darrick J. Wong wrote:
> On Mon, Mar 18, 2024 at 09:35:12AM -0700, Eric Biggers wrote:
> > On Sun, Mar 17, 2024 at 09:22:52AM -0700, Darrick J. Wong wrote:
> > > Hi all,
> > > 
> > > From Darrick J. Wong:
> > > 
> > > This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement
> > > fsverity for XFS.
> > 
> > Is this ready for me to review, or is my feedback on v5 still being
> > worked on?
> 
> It's still being worked on.  I figured it was time to push my work tree
> back to Andrey so everyone could see the results of me attempting to
> understand the fsverity patchset by working around in the codebase.
> 
> From your perspective, I suspect the most interesting patches will be 5,
> 6, 7+10+14, 11-13, and 15-17.  For everyone on the XFS side, patches
> 27-39 are the most interesting since they change the caching strategy
> and slim down the ondisk format.
> 
> > From a quick glance, not everything from my feedback has been
> > addressed.
> 
> That's correct.  I cleaned up the mechanics of passing merkle trees
> around, but I didn't address the comments about per-sb workqueues,
> fsverity tracepoints, or whether or not iomap should allocate biosets.

That perhaps wasn't quite clear enough -- I'm curious to see what Andrey
has to say about that part (patches 8, 9, 18) of the patchset.

--D

> Roughly, here's what I did in the generic code:
> 
> I fixed the FS_XFLAG_VERITY handling so that you can't clear it via
> FS_IOC_FSSETXATTR.
> 
> I also rewrote and augmented the "drop dead merkle tree" functions in
> xfs_verity to clean out incomplete trees when ->end_enable tells us we
> failed; and to clean out extra blocks in the ->begin_enable just in case
> the file shrank since a failed attempt to enable fsverity.
> 
> As for online repair, the "fsverity: expose merkle tree geometry to
> callers" patch enables the kernel to do some basic online checking that
> there aren't excessive merkle tree blocks and that fsverity can read
> the descriptor.  In my djwong-wtf tree, xfs_scrub gains the ability to
> read the entire file into the pagecache (and hence validate the verity
> info) via MADV_POPULATE_READ, and now it has a patch to read the entire
> merkle tree/descriptor/signature just to make sure those can actually
> be read.
> 
> Most of the things you gave feedback about in "fsverity: support
> block-based Merkle tree caching" I think I cleaned up in "fsverity: fix
> "support block-based Merkle tree caching"" and "fsverity: rely on cached
> block callers to retain verified state".  I kept those separate so that
> Andrey could see what I did, though they really ought to be merged into
> the main support patch.
> 
> Note that I greatly expanded the usage of struct fsverity_blockbuf and
> changed the verified flag handling so that the invalidation function was
> no longer necessary.
> 
> --D
> 
> > - Eric
> 

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCHSET v5.3] fs-verity support for XFS
  2024-03-19 23:21       ` Darrick J. Wong
@ 2024-03-20 10:16         ` Andrey Albershteyn
  2024-03-20 15:11           ` Darrick J. Wong
  0 siblings, 1 reply; 92+ messages in thread
From: Andrey Albershteyn @ 2024-03-20 10:16 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Eric Biggers, Allison Henderson, Christoph Hellwig, Dave Chinner,
	linux-fsdevel, fsverity, linux-xfs, mark.tinguely

On 2024-03-19 16:21:18, Darrick J. Wong wrote:
> [fix tinguely email addr]
> 
> On Tue, Mar 19, 2024 at 03:07:43PM -0700, Darrick J. Wong wrote:
> > On Mon, Mar 18, 2024 at 09:35:12AM -0700, Eric Biggers wrote:
> > > On Sun, Mar 17, 2024 at 09:22:52AM -0700, Darrick J. Wong wrote:
> > > > Hi all,
> > > > 
> > > > From Darrick J. Wong:
> > > > 
> > > > This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement
> > > > fsverity for XFS.
> > > 
> > > Is this ready for me to review, or is my feedback on v5 still being
> > > worked on?
> > 
> > It's still being worked on.  I figured it was time to push my work tree
> > back to Andrey so everyone could see the results of me attempting to
> > understand the fsverity patchset by working around in the codebase.
> > 
> > From your perspective, I suspect the most interesting patches will be 5,
> > 6, 7+10+14, 11-13, and 15-17.  For everyone on the XFS side, patches
> > 27-39 are the most interesting since they change the caching strategy
> > and slim down the ondisk format.
> > 
> > > From a quick glance, not everything from my feedback has been
> > > addressed.
> > 
> > That's correct.  I cleaned up the mechanics of passing merkle trees
> > around, but I didn't address the comments about per-sb workqueues,
> > fsverity tracepoints, or whether or not iomap should allocate biosets.
> 
> That perhaps wasn't quite clear enough -- I'm curious to see what Andrey
> has to say about that part (patches 8, 9, 18) of the patchset.

The per-sb workqueue can be used by other filesystems, which should be
doable.  (I will also rename it; the generic name came from v2, when I
thought it would be used for more than just verity.)

For tracepoints, I will add all the changes suggested by Eric; the
signature tracepoints could probably be dropped.

For bioset allocation, I will look into whether there's a good way to
allocate only for verity inodes, if it doesn't complicate things too
much.  That makes sense for systems which won't use fsverity but have
FS_VERITY=y.

-- 
- Andrey


^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCHSET v5.3] fs-verity support for XFS
  2024-03-20 10:16         ` Andrey Albershteyn
@ 2024-03-20 15:11           ` Darrick J. Wong
  0 siblings, 0 replies; 92+ messages in thread
From: Darrick J. Wong @ 2024-03-20 15:11 UTC (permalink / raw)
  To: Andrey Albershteyn
  Cc: Eric Biggers, Allison Henderson, Christoph Hellwig, Dave Chinner,
	linux-fsdevel, fsverity, linux-xfs, mark.tinguely

On Wed, Mar 20, 2024 at 11:16:01AM +0100, Andrey Albershteyn wrote:
> On 2024-03-19 16:21:18, Darrick J. Wong wrote:
> > [fix tinguely email addr]
> > 
> > On Tue, Mar 19, 2024 at 03:07:43PM -0700, Darrick J. Wong wrote:
> > > On Mon, Mar 18, 2024 at 09:35:12AM -0700, Eric Biggers wrote:
> > > > On Sun, Mar 17, 2024 at 09:22:52AM -0700, Darrick J. Wong wrote:
> > > > > Hi all,
> > > > > 
> > > > > From Darrick J. Wong:
> > > > > 
> > > > > This v5.3 patchset builds upon v5.2 of Andrey's patchset to implement
> > > > > fsverity for XFS.
> > > > 
> > > > Is this ready for me to review, or is my feedback on v5 still being
> > > > worked on?
> > > 
> > > It's still being worked on.  I figured it was time to push my work tree
> > > back to Andrey so everyone could see the results of me attempting to
> > > understand the fsverity patchset by working around in the codebase.
> > > 
> > > From your perspective, I suspect the most interesting patches will be 5,
> > > 6, 7+10+14, 11-13, and 15-17.  For everyone on the XFS side, patches
> > > 27-39 are the most interesting since they change the caching strategy
> > > and slim down the ondisk format.
> > > 
> > > > From a quick glance, not everything from my feedback has been
> > > > addressed.
> > > 
> > > That's correct.  I cleaned up the mechanics of passing merkle trees
> > > around, but I didn't address the comments about per-sb workqueues,
> > > fsverity tracepoints, or whether or not iomap should allocate biosets.
> > 
> > That perhaps wasn't quite clear enough -- I'm curious to see what Andrey
> > has to say about that part (patches 8, 9, 18) of the patchset.
> 
> The per-sb workqueue can be used by other filesystems, which should be
> doable.  (I will also rename it; the generic name came from v2, when I
> thought it would be used for more than just verity.)

<nod>

> For tracepoints, I will add all the changes suggested by Eric; the
> signature tracepoints could probably be dropped.

I hacked up a bunch of tracepoint changes which I've attached below.
Note the use of __print_hex_str so that the digest comes out like:

a0fcdf17f6d49b47

instead of

a0 fc df 17 f6 d4 9b 47

which makes it an exact match for what the fsverity tool emits.  I also
turned the _ASCEND and _DESCEND trace arguments into separate
tracepoints.
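
For anyone who wants to see the two digest formats side by side outside
the kernel, here is a small userspace sketch.  It is not the kernel's
trace helper code; format_hex() is a hypothetical function that only
reproduces the packed and space-separated layouts shown above.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace illustration only.  Format @len bytes of @buf as lowercase
 * hex into @out.  If @sep is non-NULL it is placed between bytes, which
 * gives the spaced layout; with a NULL @sep the bytes are packed
 * together.  Returns 0 on success or -1 if @out is too small.
 */
static int format_hex(char *out, size_t outsz, const unsigned char *buf,
		      size_t len, const char *sep)
{
	size_t seplen = sep ? strlen(sep) : 0;
	size_t need = len * 2 + (len ? (len - 1) * seplen : 0) + 1;
	char *p = out;
	size_t i;

	if (outsz < need)
		return -1;

	*p = '\0';
	for (i = 0; i < len; i++) {
		if (i > 0 && seplen) {
			memcpy(p, sep, seplen);
			p += seplen;
		}
		/* sprintf also writes the terminating NUL each time */
		p += sprintf(p, "%02x", buf[i]);
	}
	return 0;
}
```

Calling it with a NULL separator on the eight bytes a0 fc df 17 f6 d4 9b
47 gives the packed form, and calling it with " " gives the spaced one.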

Also, if you ever want to have a tracepoint that stores an int value but
turns that into a string in TP_printk, you should use __print_symbolic
and not open-code the logic.  For bitflags, it's __print_flags.  None of
that is documented anywhere.
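
Since that isn't written down anywhere, here is a rough userspace sketch
of what the two helpers do.  The ex_* functions and struct sym_entry are
hypothetical stand-ins, not kernel code; the fallback to a hex value for
unknown inputs is my understanding of the kernel behaviour and should be
treated as an assumption.  The comment shows roughly how the real macros
would appear inside TP_printk.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/*
 * In a TRACE_EVENT the kernel macros would be used roughly like this:
 *
 *	TP_printk("ino %lu %s", (unsigned long) __entry->ino,
 *		  __print_symbolic(__entry->direction,
 *			{ FSVERITY_TRACE_DIR_ASCEND,	"ascend" },
 *			{ FSVERITY_TRACE_DIR_DESCEND,	"descend" }))
 *
 * and __print_flags(flags, "|", { MASK, "name" }, ...) for bitflags.
 * The code below is a userspace stand-in for illustration only.
 */
struct sym_entry {
	unsigned long	val;
	const char	*name;
};

/* Exact-match lookup; unknown values are printed as hex. */
static const char *ex_print_symbolic(char *buf, size_t sz, unsigned long val,
				     const struct sym_entry *tbl, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].val == val)
			return tbl[i].name;
	snprintf(buf, sz, "0x%lx", val);
	return buf;
}

/* Append @s to the NUL-terminated @buf, preceded by @delim if non-empty. */
static void ex_append(char *buf, size_t sz, const char *delim, const char *s)
{
	size_t used = strlen(buf);

	if (used < sz)
		snprintf(buf + used, sz - used, "%s%s", used ? delim : "", s);
}

/* Name every set flag, joined by @delim; leftover bits are printed as hex. */
static const char *ex_print_flags(char *buf, size_t sz, unsigned long flags,
				  const char *delim,
				  const struct sym_entry *tbl, size_t n)
{
	char hex[2 + 2 * sizeof(unsigned long) + 1];
	size_t i;

	if (sz == 0)
		return "";
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		unsigned long mask = tbl[i].val;

		if (!mask || (flags & mask) != mask)
			continue;
		ex_append(buf, sz, delim, tbl[i].name);
		flags &= ~mask;
	}
	if (flags) {
		snprintf(hex, sizeof(hex), "0x%lx", flags);
		ex_append(buf, sz, delim, hex);
	}
	return buf;
}

/* Same values as the FSVERITY_TRACE_DIR_* defines removed by the diff. */
static const struct sym_entry ex_dirs[] = {
	{ 1UL << 0, "ascend" },
	{ 1UL << 1, "descend" },
};
```

A table lookup like this also avoids the problem in the old open-coded
test, which compared the direction against 0 even though neither define
was ever 0.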

> For bioset allocation, I will look into whether there's a good way to
> allocate only for verity inodes, if it doesn't complicate things too
> much.  That makes sense for systems which won't use fsverity but have
> FS_VERITY=y.

I'd imagine it's more or less a clone of sb_init_dio_done_wq that can be
called from iomap_read_bio_alloc when
(fsverity_active() && !sb->s_read_done_wq).
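
Roughly what I have in mind, as a userspace sketch using C11 atomics.
The fake_* types and functions are hypothetical stand-ins (there is no
real workqueue here), and the allocate-then-compare-and-swap shape is an
assumption about how such a clone would look rather than code from the
patchset.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a workqueue and a superblock. */
struct fake_wq {
	int	unused;
};

struct fake_sb {
	_Atomic(struct fake_wq *)	s_read_done_wq;
};

/*
 * Allocate the workqueue and publish it with a compare-and-swap.  If
 * another thread published one first, throw ours away and keep theirs.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int fake_sb_init_read_done_wq(struct fake_sb *sb)
{
	struct fake_wq *expected = NULL;
	struct fake_wq *wq = calloc(1, sizeof(*wq));

	if (!wq)
		return -1;
	if (!atomic_compare_exchange_strong(&sb->s_read_done_wq, &expected, wq))
		free(wq);	/* lost the race; someone else set it up */
	return 0;
}

/*
 * Only pay for the workqueue when a verity read actually shows up.
 * Returns NULL for non-verity reads or if allocation failed.
 */
static struct fake_wq *fake_read_wq(struct fake_sb *sb, int verity_active)
{
	if (!verity_active)
		return NULL;
	if (!atomic_load(&sb->s_read_done_wq) &&
	    fake_sb_init_read_done_wq(sb) != 0)
		return NULL;
	return atomic_load(&sb->s_read_done_wq);
}
```

The point of doing it lazily is that a superblock which never reads a
verity file never allocates anything, even with FS_VERITY=y.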

Something I just noticed -- shouldn't we be calling verity from
iomap_read_folio_sync as well?

--D

diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 06b769dd1bdf1..8c6fe4b72b14e 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -232,7 +232,7 @@ static int enable_verity(struct file *filp,
 	if (err)
 		goto out;
 
-	trace_fsverity_enable(inode, desc, &params);
+	trace_fsverity_enable(inode, &params);
 
 	/*
 	 * Start enabling verity on this file, serialized by the inode lock.
@@ -263,7 +263,6 @@ static int enable_verity(struct file *filp,
 		fsverity_err(inode, "Error %d building Merkle tree", err);
 		goto rollback;
 	}
-	trace_fsverity_tree_done(inode, desc, &params);
 
 	/*
 	 * Create the fsverity_info.  Don't bother trying to save work by
@@ -278,6 +277,8 @@ static int enable_verity(struct file *filp,
 		goto rollback;
 	}
 
+	trace_fsverity_tree_done(inode, vi, &params);
+
 	/*
 	 * Tell the filesystem to finish enabling verity on the file.
 	 * Serialized with ->begin_enable_verity() by the inode lock.
diff --git a/fs/verity/signature.c b/fs/verity/signature.c
index c1f08bb32ed1f..90c07573dd77b 100644
--- a/fs/verity/signature.c
+++ b/fs/verity/signature.c
@@ -53,8 +53,6 @@ int fsverity_verify_signature(const struct fsverity_info *vi,
 	struct fsverity_formatted_digest *d;
 	int err;
 
-	trace_fsverity_verify_signature(inode, signature, sig_size);
-
 	if (sig_size == 0) {
 		if (fsverity_require_signatures) {
 			fsverity_err(inode,
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index 0782e94bc818d..a6aa0d0556744 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -122,7 +122,9 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		/* Byte offset of the wanted hash relative to @addr */
 		unsigned int hoffset;
 	} hblocks[FS_VERITY_MAX_LEVELS];
-	trace_fsverity_verify_block(inode, data_pos);
+
+	trace_fsverity_verify_data_block(inode, params, data_pos);
+
 	/*
 	 * The index of the previous level's block within that level; also the
 	 * index of that block's hash within the current level.
@@ -195,8 +197,9 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		if (is_hash_block_verified(inode, block, hblock_idx)) {
 			memcpy(_want_hash, block->kaddr + hoffset, hsize);
 			want_hash = _want_hash;
-			trace_fsverity_merkle_tree_block_verified(inode,
-					block, FSVERITY_TRACE_DIR_ASCEND);
+			trace_fsverity_merkle_hit(inode, data_pos, hblock_pos,
+					level,
+					hoffset >> params->log_digestsize);
 			fsverity_drop_merkle_tree_block(inode, block);
 			goto descend;
 		}
@@ -231,8 +234,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 			SetPageChecked((struct page *)block->context);
 		memcpy(_want_hash, haddr + hoffset, hsize);
 		want_hash = _want_hash;
-		trace_fsverity_merkle_tree_block_verified(inode, block,
-				FSVERITY_TRACE_DIR_DESCEND);
+		trace_fsverity_verify_merkle_block(inode, block->offset,
+				level, hoffset >> params->log_digestsize);
 		fsverity_drop_merkle_tree_block(inode, block);
 	}
 
diff --git a/include/trace/events/fsverity.h b/include/trace/events/fsverity.h
index 1a6ee2a2c3ce2..f08d3eb3368f3 100644
--- a/include/trace/events/fsverity.h
+++ b/include/trace/events/fsverity.h
@@ -11,14 +11,10 @@ struct fsverity_descriptor;
 struct merkle_tree_params;
 struct fsverity_info;
 
-#define FSVERITY_TRACE_DIR_ASCEND	(1ul << 0)
-#define FSVERITY_TRACE_DIR_DESCEND	(1ul << 1)
-#define FSVERITY_HASH_SHOWN_LEN		20
-
 TRACE_EVENT(fsverity_enable,
-	TP_PROTO(struct inode *inode, struct fsverity_descriptor *desc,
-		struct merkle_tree_params *params),
-	TP_ARGS(inode, desc, params),
+	TP_PROTO(const struct inode *inode,
+		 const struct merkle_tree_params *params),
+	TP_ARGS(inode, params),
 	TP_STRUCT__entry(
 		__field(ino_t, ino)
 		__field(u64, data_size)
@@ -28,7 +24,7 @@ TRACE_EVENT(fsverity_enable,
 	),
 	TP_fast_assign(
 		__entry->ino = inode->i_ino;
-		__entry->data_size = desc->data_size;
+		__entry->data_size = i_size_read(inode);
 		__entry->block_size = params->block_size;
 		__entry->num_levels = params->num_levels;
 		__entry->tree_size = params->tree_size;
@@ -42,118 +38,102 @@ TRACE_EVENT(fsverity_enable,
 );
 
 TRACE_EVENT(fsverity_tree_done,
-	TP_PROTO(struct inode *inode, struct fsverity_descriptor *desc,
-		struct merkle_tree_params *params),
-	TP_ARGS(inode, desc, params),
+	TP_PROTO(const struct inode *inode, const struct fsverity_info *vi,
+		 const struct merkle_tree_params *params),
+	TP_ARGS(inode, vi, params),
 	TP_STRUCT__entry(
 		__field(ino_t, ino)
 		__field(unsigned int, levels)
-		__field(unsigned int, tree_blocks)
+		__field(unsigned int, block_size)
 		__field(u64, tree_size)
-		__array(u8, tree_hash, 64)
+		__dynamic_array(u8, root_hash, params->digest_size)
+		__dynamic_array(u8, file_digest, params->digest_size)
 	),
 	TP_fast_assign(
 		__entry->ino = inode->i_ino;
 		__entry->levels = params->num_levels;
-		__entry->tree_blocks =
-			params->tree_size >> params->log_blocksize;
+		__entry->block_size = params->block_size;
 		__entry->tree_size = params->tree_size;
-		memcpy(__entry->tree_hash, desc->root_hash, 64);
+		memcpy(__get_dynamic_array(root_hash), vi->root_hash, __get_dynamic_array_len(root_hash));
+		memcpy(__get_dynamic_array(file_digest), vi->file_digest, __get_dynamic_array_len(file_digest));
 	),
-	TP_printk("ino %lu levels %d tree_blocks %d tree_size %lld root_hash %s",
+	TP_printk("ino %lu levels %d block_size %d tree_size %lld root_hash %s digest %s",
 		(unsigned long) __entry->ino,
 		__entry->levels,
-		__entry->tree_blocks,
+		__entry->block_size,
 		__entry->tree_size,
-		__print_hex(__entry->tree_hash, 64))
+		__print_hex_str(__get_dynamic_array(root_hash), __get_dynamic_array_len(root_hash)),
+		__print_hex_str(__get_dynamic_array(file_digest), __get_dynamic_array_len(file_digest)))
 );
 
-TRACE_EVENT(fsverity_verify_block,
-	TP_PROTO(struct inode *inode, u64 offset),
-	TP_ARGS(inode, offset),
+TRACE_EVENT(fsverity_verify_data_block,
+	TP_PROTO(const struct inode *inode,
+		 const struct merkle_tree_params *params,
+		 u64 data_pos),
+	TP_ARGS(inode, params, data_pos),
 	TP_STRUCT__entry(
 		__field(ino_t, ino)
-		__field(u64, offset)
+		__field(u64, data_pos)
 		__field(unsigned int, block_size)
 	),
 	TP_fast_assign(
 		__entry->ino = inode->i_ino;
-		__entry->offset = offset;
-		__entry->block_size =
-			inode->i_verity_info->tree_params.block_size;
+		__entry->data_pos = data_pos;
+		__entry->block_size = params->block_size;
 	),
-	TP_printk("ino %lu data offset %lld data block size %u",
+	TP_printk("ino %lu pos %lld merkle_blocksize %u",
 		(unsigned long) __entry->ino,
-		__entry->offset,
+		__entry->data_pos,
 		__entry->block_size)
 );
 
-TRACE_EVENT(fsverity_merkle_tree_block_verified,
-	TP_PROTO(struct inode *inode,
-		 struct fsverity_blockbuf *block,
-		 u8 direction),
-	TP_ARGS(inode, block, direction),
+TRACE_EVENT(fsverity_merkle_hit,
+	TP_PROTO(const struct inode *inode, u64 data_pos, u64 merkle_pos,
+		 unsigned int level, unsigned int hidx),
+	TP_ARGS(inode, data_pos, merkle_pos, level, hidx),
 	TP_STRUCT__entry(
 		__field(ino_t, ino)
-		__field(u64, offset)
-		__field(u8, direction)
+		__field(u64, data_pos)
+		__field(u64, merkle_pos)
+		__field(unsigned int, level)
+		__field(unsigned int, hidx)
 	),
 	TP_fast_assign(
 		__entry->ino = inode->i_ino;
-		__entry->offset = block->offset;
-		__entry->direction = direction;
+		__entry->data_pos = data_pos;
+		__entry->merkle_pos = merkle_pos;
+		__entry->level = level;
+		__entry->hidx = hidx;
 	),
-	TP_printk("ino %lu block offset %llu %s",
+	TP_printk("ino %lu data_pos %llu merkle_pos %llu level %u hidx %u",
 		(unsigned long) __entry->ino,
-		__entry->offset,
-		__entry->direction == 0 ? "ascend" : "descend")
+		__entry->data_pos,
+		__entry->merkle_pos,
+		__entry->level,
+		__entry->hidx)
 );
 
-TRACE_EVENT(fsverity_read_merkle_tree_block,
-	TP_PROTO(struct inode *inode, u64 offset, unsigned int log_blocksize),
-	TP_ARGS(inode, offset, log_blocksize),
+TRACE_EVENT(fsverity_verify_merkle_block,
+	TP_PROTO(const struct inode *inode, u64 merkle_pos, unsigned int level,
+		unsigned int hidx),
+	TP_ARGS(inode, merkle_pos, level, hidx),
 	TP_STRUCT__entry(
 		__field(ino_t, ino)
-		__field(u64, offset)
-		__field(u64, index)
-		__field(unsigned int, block_size)
-	),
-	TP_fast_assign(
-		__entry->ino = inode->i_ino;
-		__entry->offset = offset;
-		__entry->index = offset >> log_blocksize;
-		__entry->block_size = 1 << log_blocksize;
-	),
-	TP_printk("ino %lu tree offset %llu block index %llu block hize %u",
-		(unsigned long) __entry->ino,
-		__entry->offset,
-		__entry->index,
-		__entry->block_size)
-);
-
-TRACE_EVENT(fsverity_verify_signature,
-	TP_PROTO(const struct inode *inode, const u8 *signature, size_t sig_size),
-	TP_ARGS(inode, signature, sig_size),
-	TP_STRUCT__entry(
-		__field(ino_t, ino)
-		__dynamic_array(u8, signature, sig_size)
-		__field(size_t, sig_size)
-		__field(size_t, sig_size_show)
+		__field(u64, merkle_pos)
+		__field(unsigned int, level)
+		__field(unsigned int, hidx)
 	),
 	TP_fast_assign(
 		__entry->ino = inode->i_ino;
-		memcpy(__get_dynamic_array(signature), signature, sig_size);
-		__entry->sig_size = sig_size;
-		__entry->sig_size_show = (sig_size > FSVERITY_HASH_SHOWN_LEN ?
-			FSVERITY_HASH_SHOWN_LEN : sig_size);
+		__entry->merkle_pos = merkle_pos;
+		__entry->level = level;
+		__entry->hidx = hidx;
 	),
-	TP_printk("ino %lu sig_size %zu %s%s%s",
+	TP_printk("ino %lu merkle_pos %llu level %u hidx %u",
 		(unsigned long) __entry->ino,
-		__entry->sig_size,
-		(__entry->sig_size ? "sig " : ""),
-		__print_hex(__get_dynamic_array(signature),
-			__entry->sig_size_show),
-		(__entry->sig_size ? "..." : ""))
+		__entry->merkle_pos,
+		__entry->level,
+		__entry->hidx)
 );
 
 #endif /* _TRACE_FSVERITY_H */

^ permalink raw reply related	[flat|nested] 92+ messages in thread

end of thread, other threads:[~2024-03-20 15:11 UTC | newest]

Thread overview: 92+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-17 16:19 [PATCHBOMB v5.3] fs-verity support for XFS Darrick J. Wong
2024-03-17 16:22 ` [PATCHSET " Darrick J. Wong
2024-03-17 16:23   ` [PATCH 01/40] fsverity: remove hash page spin lock Darrick J. Wong
2024-03-17 16:23   ` [PATCH 02/40] xfs: add parent pointer support to attribute code Darrick J. Wong
2024-03-17 16:24   ` [PATCH 03/40] xfs: define parent pointer ondisk extended attribute format Darrick J. Wong
2024-03-17 16:24   ` [PATCH 04/40] xfs: add parent pointer validator functions Darrick J. Wong
2024-03-17 16:24   ` [PATCH 05/40] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
2024-03-17 16:24   ` [PATCH 06/40] fsverity: pass tree_blocksize to end_enable_verity() Darrick J. Wong
2024-03-17 16:25   ` [PATCH 07/40] fsverity: support block-based Merkle tree caching Darrick J. Wong
2024-03-17 16:25   ` [PATCH 08/40] fsverity: add per-sb workqueue for post read processing Darrick J. Wong
2024-03-17 16:25   ` [PATCH 09/40] fsverity: add tracepoints Darrick J. Wong
2024-03-17 16:26   ` [PATCH 10/40] fsverity: fix "support block-based Merkle tree caching" Darrick J. Wong
2024-03-17 16:26   ` [PATCH 11/40] fsverity: send the level of the merkle tree block to ->read_merkle_tree_block Darrick J. Wong
2024-03-17 16:26   ` [PATCH 12/40] fsverity: pass the new tree size and block size to ->begin_enable_verity Darrick J. Wong
2024-03-17 16:26   ` [PATCH 13/40] fsverity: expose merkle tree geometry to callers Darrick J. Wong
2024-03-17 16:27   ` [PATCH 14/40] fsverity: rely on cached block callers to retain verified state Darrick J. Wong
2024-03-17 16:27   ` [PATCH 15/40] fsverity: box up the write_merkle_tree_block parameters too Darrick J. Wong
2024-03-17 16:27   ` [PATCH 16/40] fsverity: pass the zero-hash value to the implementation Darrick J. Wong
2024-03-18 16:38     ` Eric Biggers
2024-03-18 21:04       ` Darrick J. Wong
2024-03-17 16:27   ` [PATCH 17/40] fsverity: report validation errors back to the filesystem Darrick J. Wong
2024-03-17 16:28   ` [PATCH 18/40] iomap: integrate fs-verity verification into iomap's read path Darrick J. Wong
2024-03-17 16:28   ` [PATCH 19/40] xfs: add attribute type for fs-verity Darrick J. Wong
2024-03-17 16:28   ` [PATCH 20/40] xfs: add fs-verity ro-compat flag Darrick J. Wong
2024-03-17 16:28   ` [PATCH 21/40] xfs: add inode on-disk VERITY flag Darrick J. Wong
2024-03-17 16:29   ` [PATCH 22/40] xfs: initialize fs-verity on file open and cleanup on inode destruction Darrick J. Wong
2024-03-17 16:29   ` [PATCH 23/40] xfs: don't allow to enable DAX on fs-verity sealed inode Darrick J. Wong
2024-03-17 16:29   ` [PATCH 24/40] xfs: disable direct read path for fs-verity files Darrick J. Wong
2024-03-18 19:48     ` Andrey Albershteyn
2024-03-19 21:17       ` Darrick J. Wong
2024-03-17 16:29   ` [PATCH 25/40] xfs: widen flags argument to the xfs_iflags_* helpers Darrick J. Wong
2024-03-17 16:30   ` [PATCH 26/40] xfs: add fs-verity support Darrick J. Wong
2024-03-18  1:43     ` Christoph Hellwig
2024-03-18  4:34       ` Darrick J. Wong
2024-03-18  4:39         ` Christoph Hellwig
2024-03-18  4:56           ` Darrick J. Wong
2024-03-17 16:30   ` [PATCH 27/40] xfs: create a per-mount shrinker for verity inodes merkle tree blocks Darrick J. Wong
2024-03-17 16:30   ` [PATCH 28/40] xfs: create an icache tag for files with cached " Darrick J. Wong
2024-03-17 16:30   ` [PATCH 29/40] xfs: shrink verity blob cache Darrick J. Wong
2024-03-17 16:31   ` [PATCH 30/40] xfs: clean up stale fsverity metadata before starting Darrick J. Wong
2024-03-18 17:50     ` Andrey Albershteyn
2024-03-17 16:31   ` [PATCH 31/40] xfs: better reporting and error handling in xfs_drop_merkle_tree Darrick J. Wong
2024-03-18 17:51     ` Andrey Albershteyn
2024-03-17 16:31   ` [PATCH 32/40] xfs: make scrub aware of verity dinode flag Darrick J. Wong
2024-03-17 16:32   ` [PATCH 33/40] xfs: add fs-verity ioctls Darrick J. Wong
2024-03-17 16:32   ` [PATCH 34/40] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
2024-03-17 16:32   ` [PATCH 35/40] xfs: teach online repair to evaluate fsverity xattrs Darrick J. Wong
2024-03-18 17:34     ` Andrey Albershteyn
2024-03-19 21:27       ` Darrick J. Wong
2024-03-17 16:32   ` [PATCH 36/40] xfs: don't store trailing zeroes of merkle tree blocks Darrick J. Wong
2024-03-18 17:52     ` Andrey Albershteyn
2024-03-17 16:33   ` [PATCH 37/40] xfs: create separate name hash function for xattrs Darrick J. Wong
2024-03-18 17:53     ` Andrey Albershteyn
2024-03-17 16:33   ` [PATCH 38/40] xfs: use merkle tree offset as attr hash Darrick J. Wong
2024-03-18 17:55     ` Andrey Albershteyn
2024-03-17 16:33   ` [PATCH 39/40] xfs: don't bother storing merkle tree blocks for zeroed data blocks Darrick J. Wong
2024-03-18 17:56     ` Andrey Albershteyn
2024-03-17 16:33   ` [PATCH 40/40] xfs: enable ro-compat fs-verity flag Darrick J. Wong
2024-03-18 16:35   ` [PATCHSET v5.3] fs-verity support for XFS Eric Biggers
2024-03-19 22:07     ` Darrick J. Wong
2024-03-19 23:21       ` Darrick J. Wong
2024-03-20 10:16         ` Andrey Albershteyn
2024-03-20 15:11           ` Darrick J. Wong
2024-03-17 16:23 ` Darrick J. Wong
2024-03-17 16:34   ` [PATCH 01/20] xfsprogs: add parent pointer support to attribute code Darrick J. Wong
2024-03-17 16:34   ` [PATCH 02/20] xfsprogs: define parent pointer xattr format Darrick J. Wong
2024-03-17 16:34   ` [PATCH 03/20] xfsprogs: Add xfs_verify_pptr Darrick J. Wong
2024-03-17 16:34   ` [PATCH 04/20] fs: add FS_XFLAG_VERITY for verity files Darrick J. Wong
2024-03-17 16:35   ` [PATCH 05/20] xfs: add attribute type for fs-verity Darrick J. Wong
2024-03-17 16:35   ` [PATCH 06/20] xfs: add fs-verity ro-compat flag Darrick J. Wong
2024-03-17 16:35   ` [PATCH 07/20] xfs: add inode on-disk VERITY flag Darrick J. Wong
2024-03-17 16:35   ` [PATCH 08/20] xfs: add fs-verity support Darrick J. Wong
2024-03-17 16:36   ` [PATCH 09/20] xfs: advertise fs-verity being available on filesystem Darrick J. Wong
2024-03-17 16:36   ` [PATCH 10/20] xfs: create separate name hash function for xattrs Darrick J. Wong
2024-03-17 16:36   ` [PATCH 11/20] xfs: use merkle tree offset as attr hash Darrick J. Wong
2024-03-17 16:36   ` [PATCH 12/20] xfs: enable ro-compat fs-verity flag Darrick J. Wong
2024-03-17 16:37   ` [PATCH 13/20] libfrog: add fsverity to xfs_report_geom output Darrick J. Wong
2024-03-17 16:37   ` [PATCH 14/20] xfs_db: introduce attr_modify command Darrick J. Wong
2024-03-17 16:37   ` [PATCH 15/20] xfs_db: make attr_set/remove/modify be able to handle fs-verity attrs Darrick J. Wong
2024-03-17 16:37   ` [PATCH 16/20] man: document attr_modify command Darrick J. Wong
2024-03-17 16:38   ` [PATCH 17/20] xfs_db: dump verity features and metadata Darrick J. Wong
2024-03-17 16:38   ` [PATCH 18/20] xfs_db: dump merkle tree data Darrick J. Wong
2024-03-17 16:38   ` [PATCH 19/20] xfs_repair: junk fsverity xattrs when unnecessary Darrick J. Wong
2024-03-17 16:39   ` [PATCH 20/20] mkfs.xfs: add verity parameter Darrick J. Wong
2024-03-17 16:23 ` [PATCHSET v5.3] fstests: fs-verity support for XFS Darrick J. Wong
2024-03-17 16:39   ` [PATCH 1/3] common/verity: enable fsverity " Darrick J. Wong
2024-03-17 16:39   ` [PATCH 2/3] xfs/{021,122}: adapt to fsverity xattrs Darrick J. Wong
2024-03-19 14:59     ` Andrey Albershteyn
2024-03-19 19:25       ` Darrick J. Wong
2024-03-17 16:39   ` [PATCH 3/3] common/populate: add verity files to populate xfs images Darrick J. Wong
2024-03-18  1:39 ` [PATCHBOMB v5.3] fs-verity support for XFS Christoph Hellwig
2024-03-18  4:30   ` Darrick J. Wong
