From: David Sterba <dsterba@suse.cz>
To: Filipe Manana <fdmanana@gmail.com>
Cc: Qu Wenruo <wqu@suse.com>, Nikolay Borisov <nborisov@suse.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>,
Filipe Manana <fdmanana@suse.com>
Subject: Re: [PATCH v4] btrfs: trim: fix underflow in trim length to prevent access beyond device boundary
Date: Wed, 12 Aug 2020 08:14:46 +0200
Message-ID: <20200812061446.GU2026@twin.jikos.cz>
In-Reply-To: <CAL3q7H7pKrurKVafXQ3+AsHtkWGEKdGa9NiyO_HBUy1MyzJFEw@mail.gmail.com>
On Tue, Aug 11, 2020 at 11:24:29AM +0100, Filipe Manana wrote:
> On Tue, Aug 11, 2020 at 9:48 AM Qu Wenruo <wqu@suse.com> wrote:
> >
> >
> >
> > > On 2020/8/11 4:41 PM, Nikolay Borisov wrote:
> > >
> > >
> > > > On 31.07.20 14:29, Qu Wenruo wrote:
> > >> [BUG]
> > > >> The following script can lead to many accesses beyond the device boundary:
> > >>
> > >> mkfs.btrfs -f $dev -b 10G
> > >> mount $dev $mnt
> > >> trimfs $mnt
> > >> btrfs filesystem resize 1:-1G $mnt
> > >> trimfs $mnt
> > >>
> > >> [CAUSE]
> > >> Since commit 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to
> > > >> find_first_clear_extent_bit"), we try to avoid trimming ranges that are
> > >> already trimmed.
> > >>
> > > >> So we check device->alloc_state by finding the first range which has
> > > >> neither CHUNK_TRIMMED nor CHUNK_ALLOCATED set.
> > > >>
> > > >> But if we shrink the device, those bits are not cleared, so we can
> > > >> easily get a range that starts beyond the shrunk device size.
> > > >>
> > > >> As a result, the returned @start and @end are both beyond the device size.
> > > >> Then we call "end = min(end, device->total_bytes - 1);", making @end
> > > >> smaller than the device size while @start stays beyond it.
> > > >>
> > > >> Finally we compute "len = end - start + 1", which underflows and
> > > >> leads to the beyond-device-boundary access.
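
For anyone following along, the wraparound described above can be reproduced
in user space. This is only a sketch of the arithmetic, not the kernel code:
trim_len_unchecked() is a made-up name, and uint64_t stands in for the
kernel's u64.

```c
#include <stdint.h>

/*
 * Hypothetical helper that mirrors the arithmetic from the commit message.
 * It is an illustration only and does not exist in fs/btrfs.
 */
uint64_t trim_len_unchecked(uint64_t start, uint64_t end, uint64_t total_bytes)
{
	/* Clamp @end to the last valid byte of the device. */
	if (end > total_bytes - 1)
		end = total_bytes - 1;

	/*
	 * If @start is beyond the device size, @end is now smaller than
	 * @start and the unsigned subtraction wraps around.
	 */
	return end - start + 1;
}
```

With a 9G device and a stale range starting at 9.5G, the clamped @end is
9G - 1, so the returned length is 2^64 - 512M instead of something small.
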
> > >>
> > >> [FIX]
> > > >> This patch fixes the problem in two ways:
> > > >> - Clear CHUNK_TRIMMED | CHUNK_ALLOCATED bits when shrinking the device
> > > >>   This is the root fix
> > > >>
> > > >> - Add an extra safety net when trimming free device extents
> > > >>   We check and warn if the returned range is already beyond the current
> > > >>   device size.
> > >>
> > >> Link: https://github.com/kdave/btrfs-progs/issues/282
> > >> Fixes: 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to find_first_clear_extent_bit")
> > >> Signed-off-by: Qu Wenruo <wqu@suse.com>
> > >> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> > >> ---
> > >> Changelog:
> > >> v2:
> > >> - Add proper fixes tag
> > >> - Add extra warning for beyond device end case
> > >> - Add graceful exit for already trimmed case
> > >> v3:
> > >> - Don't return EUCLEAN for beyond boundary access
> > >> - Rephrase the warning message for beyond boundary access
> > >> v4:
> > >> - Remove one duplicated check on exiting the trim loop
> > >> ---
> > >> fs/btrfs/extent-tree.c | 14 ++++++++++++++
> > >> fs/btrfs/volumes.c | 12 ++++++++++++
> > >> 2 files changed, 26 insertions(+)
> > >>
> > >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > >> index fa7d83051587..6b1b5dfba4b3 100644
> > >> --- a/fs/btrfs/extent-tree.c
> > >> +++ b/fs/btrfs/extent-tree.c
> > >> @@ -33,6 +33,7 @@
> > >> #include "delalloc-space.h"
> > >> #include "block-group.h"
> > >> #include "discard.h"
> > >> +#include "rcu-string.h"
> > >>
> > >> #undef SCRAMBLE_DELAYED_REFS
> > >>
> > >> @@ -5669,6 +5670,19 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
> > > >>  					    &start, &end,
> > > >>  					    CHUNK_TRIMMED | CHUNK_ALLOCATED);
> > > >>
> > > >> +		/* CHUNK_* bits not cleared properly */
> > > >> +		if (start > device->total_bytes) {
> > > >> +			WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
> > > >> +			btrfs_warn_in_rcu(fs_info,
> > > >> +"ignoring attempt to trim beyond device size: offset %llu length %llu device %s device size %llu",
> > > >> +					  start, end - start + 1,
> > > >> +					  rcu_str_deref(device->name),
> > > >> +					  device->total_bytes);
> > > >> +			mutex_unlock(&fs_info->chunk_mutex);
> > > >> +			ret = 0;
> > > >> +			break;
> > > >> +		}
> > >
> > > > Isn't this a NOOP, because the latter chunk ensures we can never cross
> > > > device->total_bytes? Since this is a purely defensive mechanism and
> > > > following this patch we *should* never have CHUNK_* bits set beyond
> > > > device->total_bytes, I'd say make this an ASSERT(). Otherwise you force
> > > > people to pay the cost of the check for every trim ...
> >
> > I'm fine with the ASSERT() idea.
> >
> > > But on the other hand, we really don't know how things can go wrong, and
> > > such a graceful exit makes it much easier to expose and fix bugs when they
> > > happen in a production system.
> > >
> > > So currently I'm 50-50 on changing it to ASSERT().
>
> Typical non-debug kernels provided by at least some distros (looking
> at Debian) don't have btrfs asserts enabled by default.
> So this type of bug can lead to losing any data a user might have
> stored beyond the new size boundary.
> And if they are enabled, it results in a crash / BUG_ON(). So I'm
> strongly for the warning and skipping trim requests beyond the fs
> size.
I agree, the check should be always enabled and just warn.
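
To make the difference from an ASSERT() concrete, here is a rough user-space
sketch of a warn-and-skip check. The name trim_range_is_sane() and the use of
stderr are assumptions for illustration; the actual patch uses
btrfs_warn_in_rcu() inside btrfs_trim_free_extents().

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the always-enabled check. It returns false when
 * the range must be skipped, so the caller can stop trimming without
 * crashing and without touching anything beyond the device.
 */
bool trim_range_is_sane(uint64_t start, uint64_t end, uint64_t total_bytes)
{
	if (start > total_bytes) {
		/* Unlike an assert, this is not compiled out in non-debug builds. */
		fprintf(stderr,
			"ignoring attempt to trim beyond device size: offset %" PRIu64
			" length %" PRIu64 " device size %" PRIu64 "\n",
			start, end - start + 1, total_bytes);
		return false;
	}
	return true;
}
```

A caller would test the return value before clamping @end and computing the
length, and leave the loop when it is false. The cost is one comparison per
range returned by the bit search.
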