From: Eric Biggers <ebiggers@kernel.org>
To: Boris Burkov <boris@bur.io>
Cc: fstests@vger.kernel.org, linux-fscrypt@vger.kernel.org,
linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v5 4/4] generic: test fs-verity EFBIG scenarios
Date: Thu, 16 Sep 2021 14:18:34 -0700 [thread overview]
Message-ID: <YUO0qg3bqxsCy/iT@sol.localdomain> (raw)
In-Reply-To: <b1d116cd4d0ea74b9cd86f349c672021e005a75c.1631558495.git.boris@bur.io>
On Mon, Sep 13, 2021 at 11:44:37AM -0700, Boris Burkov wrote:
> +_fsv_scratch_begin_subtest "way too big: fail on first merkle block"
> +# have to go back by 4096 from max to not hit the fsverity MAX_LEVELS check.
> +truncate -s $(($max_sz - 4095)) $fsv_file
> +_fsv_enable $fsv_file |& _filter_scratch
This is actually a kernel bug, so please don't work around it in the test :-(
It will be fixed by the kernel patch
https://lore.kernel.org/linux-fscrypt/20210916203424.113376-1-ebiggers@kernel.org
> +
> +# The goal of this second test is to make a big enough file that we trip the
> +# EFBIG codepath, but not so big that we hit it immediately as soon as we try
> +# to write a Merkle leaf. Because of the layout of the Merkle tree that
> +# fs-verity uses, this is a bit complicated to compute dynamically.
> +
> +# The layout of the Merkle tree has the leaf nodes last, but writes them first.
> +# To get an interesting overflow, we need the start of L0 to be < MAX but the
> +# end of the merkle tree (EOM) to be past MAX. Ideally, the start of L0 is only
> +# just smaller than MAX, so that we don't have to write many blocks to blow up,
> +# but we take some liberties with adding alignments rather than computing them
> +# correctly, so we under-estimate the perfectly sized file.
> +
> +# We make the following assumptions to arrive at a Merkle tree layout:
> +# The Merkle tree is stored past EOF aligned to 64k.
> +# 4K blocks and pages
> +# Merkle tree levels aligned to the block (not pictured)
> +# SHA-256 hashes (32 bytes; 128 hashes per block/page)
> +# 64 bit max file size (and thus 8 levels)
> +
> +# 0 EOF round-to-64k L7L6L5 L4 L3 L2 L1 L0 MAX EOM
> +# |-------------------------| ||-|--|---|----|-----|------|--|!!!!!|
> +
> +# Given this structure, we can compute the size of the file that yields the
> +# desired properties. (NB the diagram skips the block alignment of each level)
> +# sz + 64k + sz/128^8 + 4k + sz/128^7 + 4k + ... + sz/128^2 + 4k < MAX
> +# sz + 64k + 7(4k) + sz/128^8 + sz/128^7 + ... + sz/128^2 < MAX
> +# sz + 92k + sz/128^2 < MAX
> +# (128^8)sz + (128^8)92k + sz + (128)sz + (128^2)sz + ... + (128^6)sz < (128^8)MAX
> +# sz(128^8 + 128^6 + 128^5 + 128^4 + 128^3 + 128^2 + 128 + 1) < (128^8)(MAX - 92k)
> +# sz < (128^8/(128^8 + (128^6 + ... + 128 + 1)))(MAX - 92k)
> +#
> +# Do the actual caclulation with 'bc' and 20 digits of precision.
> +# set -f prevents the * from being expanded into the files in the cwd.
> +set -f
> +calc="scale=20; ($max_sz - 94208) * ((128^8) / (1 + 128 + 128^2 + 128^3 + 128^4 + 128^5 + 128^6 + 128^8))"
> +sz=$(echo $calc | $BC -q | cut -d. -f1)
> +set +f
It's hard to follow the above explanation. I'm still wondering whether it could
be simplified a lot. Maybe something like the following:
# The goal of this second test is to make a big enough file that we trip the
# EFBIG codepath, but not so big that we hit it immediately when writing the
# first Merkle leaf.
#
# The Merkle tree is stored with the leaf node level (L0) last, but it is
# written first. To get an interesting overflow, we need the maximum file size
# (MAX) to be in the middle of L0 -- ideally near the beginning of L0 so that we
# don't have to write many blocks before getting an error.
#
# With SHA-256 and 4K blocks, there are 128 hashes per block. Thus, ignoring
# padding, L0 is 1/128 of the file size while the other levels in total are
# 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size. So still
# ignoring padding, for L0 start exactly at MAX, the file size must be s such
# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257). Then to get a file size
# where MAX occurs *near* the start of L0 rather than *at* the start, we can
# just subtract an overestimate of the padding: 64K after the file contents,
# then 4K per level, where the consideration of 8 levels is sufficient.
sz=$(echo "scale=20; $max_sz * (16256/16257) - 65536 - 4096*8" | $BC -q | cut -d. -f1)
That gives a size only 4103 bytes different from your calculation, and IMO is
much easier to understand.
- Eric
next prev parent reply other threads:[~2021-09-16 21:18 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-13 18:44 [PATCH v5 0/4] tests for btrfs fsverity Boris Burkov
2021-09-13 18:44 ` [PATCH v5 1/4] btrfs: test btrfs specific fsverity corruption Boris Burkov
2021-09-13 18:44 ` [PATCH v5 2/4] generic/574: corrupt btrfs merkle tree data Boris Burkov
2021-09-13 18:44 ` [PATCH v5 3/4] btrfs: test verity orphans with dmlogwrites Boris Burkov
2021-09-13 18:44 ` [PATCH v5 4/4] generic: test fs-verity EFBIG scenarios Boris Burkov
2021-09-16 21:18 ` Eric Biggers [this message]
2022-02-08 2:20 ` [PATCH v5 0/4] tests for btrfs fsverity Eric Biggers
2022-02-08 22:34 ` Boris Burkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YUO0qg3bqxsCy/iT@sol.localdomain \
--to=ebiggers@kernel.org \
--cc=boris@bur.io \
--cc=fstests@vger.kernel.org \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fscrypt@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.