linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images
@ 2020-08-19  6:35 Qu Wenruo
  2020-08-19  6:35 ` [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions Qu Wenruo
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Qu Wenruo @ 2020-08-19  6:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Jungyeon Yoon

This patch is revived after one year, as one internal report has hit one
BUG_ON() with real world fs, so I believe this patchset still makes sense.

- Enhanced eb accessors
  Not really needed for the fuzzed images, as 448de471cd4c
  ("btrfs: Check the first key and level for cached extent buffer")
  already fixed half of the reported images.
  Just add a final layer of safe net.

  Just to complain here, two experienced btrfs developer have got
  confused by @start, @len in functions like read_extent_buffer() with
  logical address.
  The best example to solve the confusion is to check the
  read_extent_buffer() call in btree_read_extent_buffer_pages().

  I'm not sure why this confusion happens or even get spread.
  My guess is the extent_buffer::start naming causing the problem.

  If so, I would definitely rename extent_buffer::start to
  extent_buffer::bytenr at any cost.
  Hopes the new commend will address the problem for now.

- BUG_ON() hunt in __btrfs_free_extent()
  Kill BUG_ON()s in __btrfs_free_extent(), replace with error reporting
  and why it shouldn't happen.

  Also add comment on what __btrfs_free_extent() is designed to do, with
  two dump-tree examples for newcomers.

- BUG_ON() hunt in __btrfs_inc_extent_ref()
  Just like __btrfs_free_extent(), but less comment as
  comment for __btrfs_free_extent() should also work for
  __btrfs_inc_extent_ref(), and __btrfs_inc_extent_ref() has a better
  structure than __btrfs_free_extent().

- Defence against unbalanced empty leaf

- Defence against bad key order across two tree blocks

The last two cases can't be rejected by tree-checker and they are all
cross-eb cases.
Thankfully we can reuse existing first_key check against unbalanced
empty leaf, but needs extra check deep into ctree.c for tree block
merging time check.

Reported-by: Jungyeon Yoon <jungyeon.yoon@gmail.com>
[ Not to mail bombarding the report, thus only RB tag in cover letter ]

Changelog:
v2:
- Remove duplicated error message in WARN() call.
  Changed to WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG))
  Also move WARN() after btrfs error message.

- Fix a comment error in __btrfs_free_extent()
  It's not adding refs to a tree block, but adding the same refs
  to an existing tree block ref.
  It's impossible a btrfs tree owning the same tree block directly twice.

- Add comment for eb accessors about @start and @len
  If anyone could tell me why such confusion between @start @len and
  logical address is here, I will definitely solve the root cause no
  matter how many codes need to be modified.

- Use bool to replace int where only two values are returned
  Also rename to follow the bool type.

- Remove one unrelated change for the error handler in
  btrfs_inc_extent_ref()

- Add Reviewed-by tag

v3:
- Rebased to latest misc-next branch
  All conflicts can be auto-merged.

v4:
- Remove one patch which is already merged
  A little surprised by the fact that git can't detecth such case.

- Add new reviewed-by tags from Josef

v5:
- Properly inline the check while make the report code into another
  function for the 1st patch

- Keep btrfs_abort_transaction() call where it is for the 2nd patch
  To make the line number correct and abort transaction asap.

- Function naming update for the 4th patch

Qu Wenruo (4):
  btrfs: extent_io: do extra check for extent buffer read write
    functions
  btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent() and do
    better comment
  btrfs: extent-tree: kill the BUG_ON() in
    insert_inline_extent_backref()
  btrfs: ctree: checking key orders before merged tree blocks

 fs/btrfs/ctree.c       |  71 +++++++++++++++++
 fs/btrfs/extent-tree.c | 170 +++++++++++++++++++++++++++++++++++++----
 fs/btrfs/extent_io.c   |  82 +++++++++++---------
 3 files changed, 272 insertions(+), 51 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-19  6:35 [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images Qu Wenruo
@ 2020-08-19  6:35 ` Qu Wenruo
  2020-08-19 17:11   ` David Sterba
  2020-08-19  6:35 ` [PATCH v5 2/4] btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent() and do better comment Qu Wenruo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2020-08-19  6:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Josef Bacik

Although we have start, len check for extent buffer reader/write (e.g.
read_extent_buffer()), those checks has its limitations:
- No overflow check
  Values like start = 1024 len = -1024 can still pass the basic
   (start + len) > eb->len check.

- Checks are not consistent
  For read_extent_buffer() we only check (start + len) against eb->len.
  While for memcmp_extent_buffer() we also check start against eb->len.

- Different error reporting mechanism
  We use WARN() in read_extent_buffer() but BUG() in
  memcpy_extent_buffer().

- Still modify memory if the request is obviously wrong
  In read_extent_buffer() even we find (start + len) > eb->len, we still
  call memset(dst, 0, len), which can eaisly cause memory access error
  if start + len overflows.

To address above problems, this patch creates a new common function to
check such access, check_eb_range().
- Add overflow check
  This function checks start, start + len against eb->len and overflow
  check.

- Unified checks

- Unified error reports
  Will call WARN() if CONFIG_BTRFS_DEBUG is configured.
  And also do btrfs_warn() message for non-debug build.

- Exit ASAP if check fails
  No more possible memory corruption.

- Add extra comment for @start @len used in those functions
  Even experienced developers sometimes get confused with the @start
  @len with logical address in those functions.
  I'm not sure what's the cause, maybe it's the extent_buffer::start
  naming.
  For now, just add some comment.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202817
[ Inspired by above report, the report itself is already addressed ]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent_io.c | 82 ++++++++++++++++++++++++--------------------
 1 file changed, 45 insertions(+), 37 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 617ea38e6fd7..4eb35a1338c1 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5620,6 +5620,34 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num)
 	return ret;
 }
 
+static void report_eb_range(const struct extent_buffer *eb, unsigned long start,
+			    unsigned long len)
+{
+	btrfs_warn(eb->fs_info,
+"btrfs: bad eb rw request, eb bytenr=%llu len=%lu rw start=%lu len=%lu\n",
+		   eb->start, eb->len, start, len);
+	WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
+}
+
+/*
+ * Check if the [start, start + len) range is valid before reading/writing
+ * the eb.
+ * NOTE: @start and @len are offset *INSIDE* the eb, *NOT* logical address.
+ *
+ * Caller should not touch the dst/src memory if this function returns error.
+ */
+static inline int check_eb_range(const struct extent_buffer *eb,
+				 unsigned long start, unsigned long len)
+{
+	/* start, start + len should not go beyond eb->len nor overflow */
+	if (unlikely(start > eb->len || start + len > eb->len ||
+		     len > eb->len)) {
+		report_eb_range(eb, start, len);
+		return -EINVAL;
+	}
+	return 0;
+}
+
 void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 			unsigned long start, unsigned long len)
 {
@@ -5630,12 +5658,8 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 	char *dst = (char *)dstv;
 	unsigned long i = start >> PAGE_SHIFT;
 
-	if (start + len > eb->len) {
-		WARN(1, KERN_ERR "btrfs bad mapping eb start %llu len %lu, wanted %lu %lu\n",
-		     eb->start, eb->len, start, len);
-		memset(dst, 0, len);
+	if (check_eb_range(eb, start, len))
 		return;
-	}
 
 	offset = offset_in_page(start);
 
@@ -5700,8 +5724,8 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 	unsigned long i = start >> PAGE_SHIFT;
 	int ret = 0;
 
-	WARN_ON(start > eb->len);
-	WARN_ON(start + len > eb->start + eb->len);
+	if (check_eb_range(eb, start, len))
+		return -EINVAL;
 
 	offset = offset_in_page(start);
 
@@ -5754,8 +5778,8 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv,
 	char *src = (char *)srcv;
 	unsigned long i = start >> PAGE_SHIFT;
 
-	WARN_ON(start > eb->len);
-	WARN_ON(start + len > eb->start + eb->len);
+	if (check_eb_range(eb, start, len))
+		return;
 
 	offset = offset_in_page(start);
 
@@ -5783,8 +5807,8 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
 	char *kaddr;
 	unsigned long i = start >> PAGE_SHIFT;
 
-	WARN_ON(start > eb->len);
-	WARN_ON(start + len > eb->start + eb->len);
+	if (check_eb_range(eb, start, len))
+		return;
 
 	offset = offset_in_page(start);
 
@@ -5828,6 +5852,10 @@ void copy_extent_buffer(const struct extent_buffer *dst,
 	char *kaddr;
 	unsigned long i = dst_offset >> PAGE_SHIFT;
 
+	if (check_eb_range(dst, dst_offset, len) ||
+	    check_eb_range(src, src_offset, len))
+		return;
+
 	WARN_ON(src->len != dst_len);
 
 	offset = offset_in_page(dst_offset);
@@ -6017,25 +6045,15 @@ void memcpy_extent_buffer(const struct extent_buffer *dst,
 			  unsigned long dst_offset, unsigned long src_offset,
 			  unsigned long len)
 {
-	struct btrfs_fs_info *fs_info = dst->fs_info;
 	size_t cur;
 	size_t dst_off_in_page;
 	size_t src_off_in_page;
 	unsigned long dst_i;
 	unsigned long src_i;
 
-	if (src_offset + len > dst->len) {
-		btrfs_err(fs_info,
-			"memmove bogus src_offset %lu move len %lu dst len %lu",
-			 src_offset, len, dst->len);
-		BUG();
-	}
-	if (dst_offset + len > dst->len) {
-		btrfs_err(fs_info,
-			"memmove bogus dst_offset %lu move len %lu dst len %lu",
-			 dst_offset, len, dst->len);
-		BUG();
-	}
+	if (check_eb_range(dst, dst_offset, len) ||
+	    check_eb_range(dst, src_offset, len))
+		return;
 
 	while (len > 0) {
 		dst_off_in_page = offset_in_page(dst_offset);
@@ -6062,7 +6080,6 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
 			   unsigned long dst_offset, unsigned long src_offset,
 			   unsigned long len)
 {
-	struct btrfs_fs_info *fs_info = dst->fs_info;
 	size_t cur;
 	size_t dst_off_in_page;
 	size_t src_off_in_page;
@@ -6071,18 +6088,9 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
 	unsigned long dst_i;
 	unsigned long src_i;
 
-	if (src_offset + len > dst->len) {
-		btrfs_err(fs_info,
-			  "memmove bogus src_offset %lu move len %lu len %lu",
-			  src_offset, len, dst->len);
-		BUG();
-	}
-	if (dst_offset + len > dst->len) {
-		btrfs_err(fs_info,
-			  "memmove bogus dst_offset %lu move len %lu len %lu",
-			  dst_offset, len, dst->len);
-		BUG();
-	}
+	if (check_eb_range(dst, dst_offset, len) ||
+	    check_eb_range(dst, src_offset, len))
+		return;
 	if (dst_offset < src_offset) {
 		memcpy_extent_buffer(dst, dst_offset, src_offset, len);
 		return;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 2/4] btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent() and do better comment
  2020-08-19  6:35 [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images Qu Wenruo
  2020-08-19  6:35 ` [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions Qu Wenruo
@ 2020-08-19  6:35 ` Qu Wenruo
  2020-08-19  6:35 ` [PATCH v5 3/4] btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref() Qu Wenruo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2020-08-19  6:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov, Josef Bacik

__btrfs_free_extent() is one of the best cases to show how optimization
could make a function hard to read.

In fact __btrfs_free_extent() is only doing two major works:
1. Reduce the refs number of an extent backref
   Either it's an inlined extent backref (inside EXTENT/METADATA item) or
   a keyed extent backref (SHARED_* item).
   We only need to locate that backref line, either reduce the number or
   remove the backref line completely.

2. Update the refs count in EXTENT/METADATA_ITEM

But in real world, we do it in a complex but somewhat efficient way.
During step 1), we will try to locate the EXTENT/METADATA_ITEM without
triggering another btrfs_search_slot() as fast path.

Only when we failed to locate that item, we will trigger another
btrfs_search_slot() to get that EXTENT/METADATA_ITEM after we
updated/deleted the backref line.

And we have a lot of restrict check on things like refs_to_drop against
extent refs and special case check for single ref extent.

All of these results:
- 7 BUG_ON()s in a single function
  Although all these BUG_ON() are doing correct check, they're super
  easy to get triggered for fuzzed images.
  It's never a good idea to piss the end user.

- Near 300 lines without much useful comments but a lot of hidden
  conditions
  I believe even the author needs several minutes to recall what the
  code is doing
  Not to mention a lot of BUG_ON() conditions needs to go back tens of
  lines to find out why.

This patch address all these problems by:
- Introduce two examples to show what __btrfs_free_extent() is doing
  One inlined backref case and one keyed case.
  Should cover most cases.

- Kill all BUG_ON()s with proper error message and optional leaf dump

- Add comment to show the overall workflow

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202819
[ The report triggers one BUG_ON() in __btrfs_free_extent() ]
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 150 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 137 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index fa7d83051587..94803e51e9d0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2930,6 +2930,53 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans)
 	return 0;
 }
 
+/*
+ * Real work happens here to drop one or more refs of @node.
+ *
+ * The work is mostly done in two parts:
+ * 1. Locate the extent refs.
+ *    It's either inlined in EXTENT/METADATA_ITEM or in keyed SHARED_* item.
+ *    Locate it then reduces the refs number or remove the ref line completely.
+ *
+ * 2. Update the refs count in EXTENT/METADATA_ITEM
+ *
+ * Due to the above two operations and possible optimizations, the function
+ * is a little hard to read, but with the following examples, the result
+ * of this function should be pretty easy to get.
+ *
+ * Example:
+ * *** Inlined backref case ***
+ * In extent tree we have:
+ * 	item 0 key (13631488 EXTENT_ITEM 1048576) itemoff 16201 itemsize 82
+ *		refs 2 gen 6 flags DATA
+ *		extent data backref root FS_TREE objectid 258 offset 0 count 1
+ *		extent data backref root FS_TREE objectid 257 offset 0 count 1
+ *
+ * This function get called with
+ *  node->bytenr = 13631488, node->num_bytes = 1048576
+ *  root_objectid = FS_TREE, owner_objectid = 257, owner_offset = 0,
+ *  refs_to_drop = 1
+ * Then we should get some like:
+ * 	item 0 key (13631488 EXTENT_ITEM 1048576) itemoff 16201 itemsize 82
+ *		refs 1 gen 6 flags DATA
+ *		extent data backref root FS_TREE objectid 258 offset 0 count 1
+ *
+ * *** Keyed backref case ***
+ * In extent tree we have:
+ *	item 0 key (13631488 EXTENT_ITEM 1048576) itemoff 3971 itemsize 24
+ *		refs 754 gen 6 flags DATA
+ *	[...]
+ *	item 2 key (13631488 EXTENT_DATA_REF <HASH>) itemoff 3915 itemsize 28
+ *		extent data backref root FS_TREE objectid 866 offset 0 count 1
+ * This function get called with
+ *  node->bytenr = 13631488, node->num_bytes = 1048576
+ *  root_objectid = FS_TREE, owner_objectid = 866, owner_offset = 0,
+ *  refs_to_drop = 1
+ * Then we should get some like:
+ *	item 0 key (13631488 EXTENT_ITEM 1048576) itemoff 3971 itemsize 24
+ *		refs 753 gen 6 flags DATA
+ * And that (13631488 EXTENT_DATA_REF <HASH>) get removed.
+ */
 static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 			       struct btrfs_delayed_ref_node *node, u64 parent,
 			       u64 root_objectid, u64 owner_objectid,
@@ -2962,7 +3009,15 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	path->leave_spinning = 1;
 
 	is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID;
-	BUG_ON(!is_data && refs_to_drop != 1);
+
+	if (unlikely(!is_data && refs_to_drop != 1)) {
+		btrfs_crit(info,
+"invalid refs_to_drop, dropping more than 1 refs for tree block %llu refs_to_drop %u",
+			   node->bytenr, refs_to_drop);
+		ret = -EINVAL;
+		btrfs_abort_transaction(trans, ret);
+		goto out;
+	}
 
 	if (is_data)
 		skinny_metadata = false;
@@ -2971,6 +3026,13 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 				    parent, root_objectid, owner_objectid,
 				    owner_offset);
 	if (ret == 0) {
+		/*
+		 * Either the inline backref or the SHARED_DATA_REF/
+		 * SHARED_BLOCK_REF is found
+		 *
+		 * Here is a quick path to locate EXTENT/METADATA_ITEM.
+		 * It's possible the EXTENT/METADATA_ITEM is near current slot.
+		 */
 		extent_slot = path->slots[0];
 		while (extent_slot >= 0) {
 			btrfs_item_key_to_cpu(path->nodes[0], &key,
@@ -2987,13 +3049,21 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 				found_extent = 1;
 				break;
 			}
+
+			/* Quick path didn't find the EXTEMT/METADATA_ITEM */
 			if (path->slots[0] - extent_slot > 5)
 				break;
 			extent_slot--;
 		}
 
 		if (!found_extent) {
-			BUG_ON(iref);
+			if (unlikely(iref)) {
+				btrfs_crit(info,
+"invalid iref, no EXTENT/METADATA_ITEM found but has inline extent ref");
+				btrfs_abort_transaction(trans, -EUCLEAN);
+				goto err_dump;
+			}
+			/* Must be SHARED_* item, remove the backref first */
 			ret = remove_extent_backref(trans, path, NULL,
 						    refs_to_drop,
 						    is_data, &last_ref);
@@ -3004,6 +3074,8 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 			btrfs_release_path(path);
 			path->leave_spinning = 1;
 
+
+			/* Slow path to locate EXTENT/METADATA_ITEM */
 			key.objectid = bytenr;
 			key.type = BTRFS_EXTENT_ITEM_KEY;
 			key.offset = num_bytes;
@@ -3078,19 +3150,26 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 	if (owner_objectid < BTRFS_FIRST_FREE_OBJECTID &&
 	    key.type == BTRFS_EXTENT_ITEM_KEY) {
 		struct btrfs_tree_block_info *bi;
-		BUG_ON(item_size < sizeof(*ei) + sizeof(*bi));
+		if (unlikely(item_size < sizeof(*ei) + sizeof(*bi))) {
+			btrfs_crit(info,
+"invalid extent item size for key (%llu, %u, %llu) owner %llu, has %u expect >= %lu",
+				   key.objectid, key.type, key.offset,
+				   owner_objectid, item_size,
+				   sizeof(*ei) + sizeof(*bi));
+			btrfs_abort_transaction(trans, -EUCLEAN);
+			goto err_dump;
+		}
 		bi = (struct btrfs_tree_block_info *)(ei + 1);
 		WARN_ON(owner_objectid != btrfs_tree_block_level(leaf, bi));
 	}
 
 	refs = btrfs_extent_refs(leaf, ei);
 	if (refs < refs_to_drop) {
-		btrfs_err(info,
-			  "trying to drop %d refs but we only have %Lu for bytenr %Lu",
+		btrfs_crit(info,
+		"trying to drop %d refs but we only have %Lu for bytenr %Lu",
 			  refs_to_drop, refs, bytenr);
-		ret = -EINVAL;
-		btrfs_abort_transaction(trans, ret);
-		goto out;
+		btrfs_abort_transaction(trans, -EUCLEAN);
+		goto err_dump;
 	}
 	refs -= refs_to_drop;
 
@@ -3102,7 +3181,12 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 		 * be updated by remove_extent_backref
 		 */
 		if (iref) {
-			BUG_ON(!found_extent);
+			if (unlikely(!found_extent)) {
+				btrfs_crit(info,
+"invalid iref, got inlined extent ref but no EXTENT/METADATA_ITEM found");
+				btrfs_abort_transaction(trans, -EUCLEAN);
+				goto err_dump;
+			}
 		} else {
 			btrfs_set_extent_refs(leaf, ei, refs);
 			btrfs_mark_buffer_dirty(leaf);
@@ -3117,13 +3201,39 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 			}
 		}
 	} else {
+		/* In this branch refs == 1 */
 		if (found_extent) {
-			BUG_ON(is_data && refs_to_drop !=
-			       extent_data_ref_count(path, iref));
+			if (is_data && refs_to_drop !=
+			    extent_data_ref_count(path, iref)) {
+				btrfs_crit(info,
+		"invalid refs_to_drop, current refs %u refs_to_drop %u",
+					   extent_data_ref_count(path, iref),
+					   refs_to_drop);
+				btrfs_abort_transaction(trans, -EUCLEAN);
+				goto err_dump;
+			}
 			if (iref) {
-				BUG_ON(path->slots[0] != extent_slot);
+				if (path->slots[0] != extent_slot) {
+					btrfs_crit(info,
+"invalid iref, extent item key (%llu %u %llu) doesn't has wanted iref",
+						   key.objectid, key.type,
+						   key.offset);
+					btrfs_abort_transaction(trans, -EUCLEAN);
+					goto err_dump;
+				}
 			} else {
-				BUG_ON(path->slots[0] != extent_slot + 1);
+				/*
+				 * No inline ref, we must at SHARED_* item,
+				 * And it's single ref, it must be:
+				 * |	extent_slot	  ||extent_slot + 1|
+				 * [ EXTENT/METADATA_ITEM ][ SHARED_* ITEM ]
+				 */
+				if (path->slots[0] != extent_slot + 1) {
+					btrfs_crit(info,
+	"invalid SHARED_* item, previous item is not EXTENT/METADATA_ITEM");
+					btrfs_abort_transaction(trans, -EUCLEAN);
+					goto err_dump;
+				}
 				path->slots[0] = extent_slot;
 				num_to_del = 2;
 			}
@@ -3164,6 +3274,20 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 out:
 	btrfs_free_path(path);
 	return ret;
+err_dump:
+	/*
+	 * Leaf dump can take up a lot of dmesg buffer since default nodesize
+	 * is already 16K.
+	 * So we only do full leaf dump for debug build.
+	 */
+	if (IS_ENABLED(CONFIG_BTRFS_DEBUG)) {
+		btrfs_crit(info, "path->slots[0]=%d extent_slot=%d",
+			   path->slots[0], extent_slot);
+		btrfs_print_leaf(path->nodes[0]);
+	}
+
+	btrfs_free_path(path);
+	return -EUCLEAN;
 }
 
 /*
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 3/4] btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref()
  2020-08-19  6:35 [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images Qu Wenruo
  2020-08-19  6:35 ` [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions Qu Wenruo
  2020-08-19  6:35 ` [PATCH v5 2/4] btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent() and do better comment Qu Wenruo
@ 2020-08-19  6:35 ` Qu Wenruo
  2020-08-19  6:35 ` [PATCH v5 4/4] btrfs: ctree: checking key orders before merged tree blocks Qu Wenruo
  2020-08-27 14:47 ` [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images David Sterba
  4 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2020-08-19  6:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov, Josef Bacik

[BUG]
With crafted image, btrfs can panic at insert_inline_extent_backref():
  kernel BUG at fs/btrfs/extent-tree.c:1857!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 0 PID: 1117 Comm: btrfs-transacti Not tainted 5.0.0-rc8+ #9
  RIP: 0010:insert_inline_extent_backref+0xcc/0xe0
  Code: 45 20 49 8b 7e 50 49 89 d8 4c 8b 4d 10 48 8b 55 c8 4c 89 e1 41 57 4c 89 ee 50 ff 75 18 e8 cc bf ff ff 31 c0 48 83 c4 18 eb b2 <0f> 0b e8 9d df bd ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 66 66
  RSP: 0018:ffffac4dc1287be8 EFLAGS: 00010293
  RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000000001
  RDX: 0000000000001000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffffac4dc1287c28 R08: ffffac4dc1287ab8 R09: ffffac4dc1287ac0
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  R13: ffff8febef88a540 R14: ffff8febeaa7bc30 R15: 0000000000000000
  FS: 0000000000000000(0000) GS:ffff8febf7a00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f663ace94c0 CR3: 0000000235698006 CR4: 00000000000206f0
  Call Trace:
  ? _cond_resched+0x1a/0x50
  __btrfs_inc_extent_ref.isra.64+0x7e/0x240
  ? btrfs_merge_delayed_refs+0xa5/0x330
  __btrfs_run_delayed_refs+0x653/0x1120
  btrfs_run_delayed_refs+0xdb/0x1b0
  btrfs_commit_transaction+0x52/0x950
  ? start_transaction+0x94/0x450
  transaction_kthread+0x163/0x190
  kthread+0x105/0x140
  ? btrfs_cleanup_transaction+0x560/0x560
  ? kthread_destroy_worker+0x50/0x50
  ret_from_fork+0x35/0x40
  Modules linked in:
  ---[ end trace 2ad8b3de903cf825 ]---

[CAUSE]
Due to extent tree corruption (still valid by itself, but bad cross ref),
we can allocate an extent which is still in extent tree.
The offending tree block of that case is from csum tree.
The newly allocated tree block is also for csum tree.

Then we will try to insert an tree block ref for the existing tree block
ref.

For btrfs tree extent item, a tree block can never be shared directly by
the same tree twice.
We have such BUG_ON() to prevent such problem, but BUG_ON() is
definitely not good enough.

[FIX]
Replace that BUG_ON() with proper error message and leaf dump for debug
build.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202829
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 94803e51e9d0..078b57fb3d43 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1177,7 +1177,22 @@ int insert_inline_extent_backref(struct btrfs_trans_handle *trans,
 					   num_bytes, parent, root_objectid,
 					   owner, offset, 1);
 	if (ret == 0) {
-		BUG_ON(owner < BTRFS_FIRST_FREE_OBJECTID);
+		/*
+		 * We're adding refs to an tree block we already own, this
+		 * should not happen at all.
+		 */
+		if (owner < BTRFS_FIRST_FREE_OBJECTID) {
+			btrfs_crit(trans->fs_info,
+"invalid operation, adding refs to an existing tree ref, bytenr=%llu num_bytes=%llu root_objectid=%llu",
+				   bytenr, num_bytes, root_objectid);
+			if (IS_ENABLED(CONFIG_BTRFS_DEBUG)) {
+				WARN_ON(1);
+				btrfs_crit(trans->fs_info,
+			"path->slots[0]=%d path->nodes[0]:", path->slots[0]);
+				btrfs_print_leaf(path->nodes[0]);
+			}
+			return -EUCLEAN;
+		}
 		update_inline_extent_backref(path, iref, refs_to_add,
 					     extent_op, NULL);
 	} else if (ret == -ENOENT) {
@@ -1397,6 +1412,9 @@ int btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 /*
  * __btrfs_inc_extent_ref - insert backreference for a given extent
  *
+ * The work is opposite as __btrfs_free_extent().
+ * For more info about how it works or examples, refer to __btrfs_free_extent().
+ *
  * @trans:	    Handle of transaction
  *
  * @node:	    The delayed ref node used to get the bytenr/length for
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 4/4] btrfs: ctree: checking key orders before merged tree blocks
  2020-08-19  6:35 [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images Qu Wenruo
                   ` (2 preceding siblings ...)
  2020-08-19  6:35 ` [PATCH v5 3/4] btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref() Qu Wenruo
@ 2020-08-19  6:35 ` Qu Wenruo
  2020-08-27 14:47 ` [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images David Sterba
  4 siblings, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2020-08-19  6:35 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov, Josef Bacik

[BUG]
With crafted image, btrfs can panic at btrfs_del_csums().
  kernel BUG at fs/btrfs/ctree.c:3188!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 0 PID: 1156 Comm: btrfs-transacti Not tainted 5.0.0-rc8+ #9
  RIP: 0010:btrfs_set_item_key_safe+0x16c/0x180
  Code: b7 48 8d 7d bf 4c 89 fe 48 89 45 c8 0f b6 45 b6 88 45 c7 48 8b 45 ae 48 89 45 bf e8 ce f2 ff ff 85 c0 0f 8f 48 ff ff ff 0f 0b <0f> 0b e8 dd 8d be ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 66 66
  RSP: 0018:ffff976141257ab8 EFLAGS: 00010202
  RAX: 0000000000000001 RBX: ffff898a6b890930 RCX: 0000000004b70000
  RDX: 0000000000000000 RSI: ffff976141257bae RDI: ffff976141257acf
  RBP: ffff976141257b10 R08: 0000000000001000 R09: ffff9761412579a8
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff976141257abe
  R13: 0000000000000003 R14: ffff898a6a8be578 R15: ffff976141257bae
  FS: 0000000000000000(0000) GS:ffff898a77a00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f779d9cd624 CR3: 000000022b2b4006 CR4: 00000000000206f0
  Call Trace:
  truncate_one_csum+0xac/0xf0
  btrfs_del_csums+0x24f/0x3a0
  __btrfs_free_extent.isra.72+0x5a7/0xbe0
  __btrfs_run_delayed_refs+0x539/0x1120
  btrfs_run_delayed_refs+0xdb/0x1b0
  btrfs_commit_transaction+0x52/0x950
  ? start_transaction+0x94/0x450
  transaction_kthread+0x163/0x190
  kthread+0x105/0x140
  ? btrfs_cleanup_transaction+0x560/0x560
  ? kthread_destroy_worker+0x50/0x50
  ret_from_fork+0x35/0x40
  Modules linked in:
  ---[ end trace 93bf9db00e6c374e ]---

[CAUSE]
This crafted image has a very tricky key order corruption:

  checksum tree key (CSUM_TREE ROOT_ITEM 0)
  node 29741056 level 1 items 14 free 107 generation 19 owner CSUM_TREE
          ...
          key (EXTENT_CSUM EXTENT_CSUM 73785344) block 29757440 gen 19
          key (EXTENT_CSUM EXTENT_CSUM 77594624) block 29753344 gen 19
          ...

  leaf 29757440 items 5 free space 150 generation 19 owner CSUM_TREE
          item 0 key (EXTENT_CSUM EXTENT_CSUM 73785344) itemoff 2323 itemsize 1672
                  range start 73785344 end 75497472 length 1712128
          item 1 key (EXTENT_CSUM EXTENT_CSUM 75497472) itemoff 2319 itemsize 4
                  range start 75497472 end 75501568 length 4096
          item 2 key (EXTENT_CSUM EXTENT_CSUM 75501568) itemoff 579 itemsize 1740
                  range start 75501568 end 77283328 length 1781760
          item 3 key (EXTENT_CSUM EXTENT_CSUM 77283328) itemoff 575 itemsize 4
                  range start 77283328 end 77287424 length 4096
          item 4 key (EXTENT_CSUM EXTENT_CSUM 4120596480) itemoff 275 itemsize 300 <<<
                  range start 4120596480 end 4120903680 length 307200
  leaf 29753344 items 3 free space 1936 generation 19 owner CSUM_TREE
          item 0 key (18446744073457893366 EXTENT_CSUM 77594624) itemoff 2323 itemsize 1672
                  range start 77594624 end 79306752 length 1712128
          ...

Note the item 4 key of leaf 29757440, which is obviously too large, and
even larger than the first key of the next leaf.

However it still follows the key order in that tree block, thus tree
checker is unable to detect it at read time, since tree checker can only
work inside a leaf, thus such complex corruption can't be rejected in
advance.

[FIX]
The next timing to detect such problem is at tree block merge time,
which is in push_node_left(), balance_node_right(), push_leaf_left() and
push_leaf_right().

Now we check if the key order of the right most key of the left node is
larger than the left most key of the right node.

By this we don't need to call the full tree-checker, while still keeps the
key order correct as key order in each node is already checked by tree
checker thus we only need to check the above two slots.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202833
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 70e49d8d4f6c..fb241a06daa7 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -3159,6 +3159,55 @@ void btrfs_set_item_key_safe(struct btrfs_fs_info *fs_info,
 		fixup_low_keys(path, &disk_key, 1);
 }
 
+/*
+ * Check key orders of two sibling extent buffers.
+ *
+ * Return true if something is wrong.
+ * Return false if everything is fine.
+ *
+ * Tree-checker only works inside one tree block, thus the following
+ * corruption can not be rejected by tree-checker:
+ * Leaf @left			| Leaf @right
+ * --------------------------------------------------------------
+ * | 1 | 2 | 3 | 4 | 5 | f6 |   | 7 | 8 |
+ *
+ * Key f6 in leaf @left itself is valid, but not valid when the next
+ * key in leaf @right is 7.
+ * This can only be checked at tree block merge time.
+ * And since tree checker has ensured all key order in each tree block
+ * is correct, we only need to bother the last key of @left and the first
+ * key of @right.
+ */
+static bool check_sibling_keys(struct extent_buffer *left,
+			       struct extent_buffer *right)
+{
+	struct btrfs_key left_last;
+	struct btrfs_key right_first;
+	int level = btrfs_header_level(left);
+	int nr_left = btrfs_header_nritems(left);
+	int nr_right = btrfs_header_nritems(right);
+
+	/* No key to check in one of the tree blocks */
+	if (!nr_left || !nr_right)
+		return false;
+	if (level) {
+		btrfs_node_key_to_cpu(left, &left_last, nr_left - 1);
+		btrfs_node_key_to_cpu(right, &right_first, 0);
+	} else {
+		btrfs_item_key_to_cpu(left, &left_last, nr_left - 1);
+		btrfs_item_key_to_cpu(right, &right_first, 0);
+	}
+	if (btrfs_comp_cpu_keys(&left_last, &right_first) >= 0) {
+		btrfs_crit(left->fs_info,
+"bad key order cross tree blocks, left last (%llu %u %llu) right first (%llu %u %llu",
+			   left_last.objectid, left_last.type,
+			   left_last.offset, right_first.objectid,
+			   right_first.type, right_first.offset);
+		return true;
+	}
+	return false;
+}
+
 /*
  * try to push data from one node into the next node left in the
  * tree.
@@ -3203,6 +3252,12 @@ static int push_node_left(struct btrfs_trans_handle *trans,
 	} else
 		push_items = min(src_nritems - 8, push_items);
 
+	/* dst is the left eb, src is the middle eb */
+	if (check_sibling_keys(dst, src)) {
+		ret = -EUCLEAN;
+		btrfs_abort_transaction(trans, ret);
+		return ret;
+	}
 	ret = tree_mod_log_eb_copy(dst, src, dst_nritems, 0, push_items);
 	if (ret) {
 		btrfs_abort_transaction(trans, ret);
@@ -3271,6 +3326,12 @@ static int balance_node_right(struct btrfs_trans_handle *trans,
 	if (max_push < push_items)
 		push_items = max_push;
 
+	/* dst is the right eb, src is the middle eb */
+	if (check_sibling_keys(src, dst)) {
+		ret = -EUCLEAN;
+		btrfs_abort_transaction(trans, ret);
+		return ret;
+	}
 	ret = tree_mod_log_insert_move(dst, push_items, 0, dst_nritems);
 	BUG_ON(ret < 0);
 	memmove_extent_buffer(dst, btrfs_node_key_ptr_offset(push_items),
@@ -3747,6 +3808,12 @@ static int push_leaf_right(struct btrfs_trans_handle *trans, struct btrfs_root
 	if (left_nritems == 0)
 		goto out_unlock;
 
+	if (check_sibling_keys(left, right)) {
+		ret = -EUCLEAN;
+		btrfs_tree_unlock(right);
+		free_extent_buffer(right);
+		return ret;
+	}
 	if (path->slots[0] == left_nritems && !empty) {
 		/* Key greater than all keys in the leaf, right neighbor has
 		 * enough room for it and we're not emptying our leaf to delete
@@ -3984,6 +4051,10 @@ static int push_leaf_left(struct btrfs_trans_handle *trans, struct btrfs_root
 		goto out;
 	}
 
+	if (check_sibling_keys(left, right)) {
+		ret = -EUCLEAN;
+		goto out;
+	}
 	return __push_leaf_left(path, min_data_size,
 			       empty, left, free_space, right_nritems,
 			       max_slot);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-19  6:35 ` [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions Qu Wenruo
@ 2020-08-19 17:11   ` David Sterba
  2020-08-19 23:14     ` Qu Wenruo
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2020-08-19 17:11 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Josef Bacik

On Wed, Aug 19, 2020 at 02:35:47PM +0800, Qu Wenruo wrote:
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -5620,6 +5620,34 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num)
>  	return ret;
>  }
>  
> +static void report_eb_range(const struct extent_buffer *eb, unsigned long start,
> +			    unsigned long len)
> +{
> +	btrfs_warn(eb->fs_info,
> +"btrfs: bad eb rw request, eb bytenr=%llu len=%lu rw start=%lu len=%lu\n",

No "btrfs:" prefix needed, no "\n" at the end of the string. The format
now uses the 'key=value' style, while we have the 'key value' in many
other places, this should be consistent.

> +		   eb->start, eb->len, start, len);
> +	WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
> +}
> +
> +/*
> + * Check if the [start, start + len) range is valid before reading/writing
> + * the eb.
> + * NOTE: @start and @len are offset *INSIDE* the eb, *NOT* logical address.
> + *
> + * Caller should not touch the dst/src memory if this function returns error.
> + */
> +static inline int check_eb_range(const struct extent_buffer *eb,
> +				 unsigned long start, unsigned long len)
> +{
> +	/* start, start + len should not go beyond eb->len nor overflow */
> +	if (unlikely(start > eb->len || start + len > eb->len ||
> +		     len > eb->len)) {

Can the number of condition be reduced? If 'start + len' overflows, then
we don't need to check 'start > eb->len', and for the case where
start = 1024 and len = -1024 the 'len > eb-len' would be enough.

> +		report_eb_range(eb, start, len);
> +		return -EINVAL;

This could be simply

		return report_eb_range(...);

It's not a big difference though, compiler produces the same code.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-19 17:11   ` David Sterba
@ 2020-08-19 23:14     ` Qu Wenruo
  2020-08-20  9:50       ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2020-08-19 23:14 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Josef Bacik



On 2020/8/20 上午1:11, David Sterba wrote:
> On Wed, Aug 19, 2020 at 02:35:47PM +0800, Qu Wenruo wrote:
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -5620,6 +5620,34 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num)
>>  	return ret;
>>  }
>>  
>> +static void report_eb_range(const struct extent_buffer *eb, unsigned long start,
>> +			    unsigned long len)
>> +{
>> +	btrfs_warn(eb->fs_info,
>> +"btrfs: bad eb rw request, eb bytenr=%llu len=%lu rw start=%lu len=%lu\n",
> 
> No "btrfs:" prefix needed, no "\n" at the end of the string.

Oh sorry, should re-check the message.

> The format
> now uses the 'key=value' style, while we have the 'key value' in many
> other places, this should be consistent.

Indeed, I just checked tree-checker, which does use 'key value'.

> 
>> +		   eb->start, eb->len, start, len);
>> +	WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> +}
>> +
>> +/*
>> + * Check if the [start, start + len) range is valid before reading/writing
>> + * the eb.
>> + * NOTE: @start and @len are offset *INSIDE* the eb, *NOT* logical address.
>> + *
>> + * Caller should not touch the dst/src memory if this function returns error.
>> + */
>> +static inline int check_eb_range(const struct extent_buffer *eb,
>> +				 unsigned long start, unsigned long len)
>> +{
>> +	/* start, start + len should not go beyond eb->len nor overflow */
>> +	if (unlikely(start > eb->len || start + len > eb->len ||
>> +		     len > eb->len)) {
> 
> Can the number of condition be reduced? If 'start + len' overflows, then
> we don't need to check 'start > eb->len', and for the case where
> start = 1024 and len = -1024 the 'len > eb-len' would be enough.

I'm afraid not.
Although 'start > eb->len || len > eb->len' is enough to detect overflow
case, it no longer detects cases like 'start = 2k, len = 3k' while
eb->len == 4K case.

So we still need all 3 checks.

Thanks,
Qu

> 
>> +		report_eb_range(eb, start, len);
>> +		return -EINVAL;
> 
> This could be simply
> 
> 		return report_eb_range(...);
> 
> It's not a big difference though, compiler produces the same code.
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-19 23:14     ` Qu Wenruo
@ 2020-08-20  9:50       ` David Sterba
  2020-08-20  9:58         ` Qu Wenruo
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2020-08-20  9:50 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs, Josef Bacik

On Thu, Aug 20, 2020 at 07:14:13AM +0800, Qu Wenruo wrote:
> >> +static inline int check_eb_range(const struct extent_buffer *eb,
> >> +				 unsigned long start, unsigned long len)
> >> +{
> >> +	/* start, start + len should not go beyond eb->len nor overflow */
> >> +	if (unlikely(start > eb->len || start + len > eb->len ||
> >> +		     len > eb->len)) {
> > 
> > Can the number of condition be reduced? If 'start + len' overflows, then
> > we don't need to check 'start > eb->len', and for the case where
> > start = 1024 and len = -1024 the 'len > eb-len' would be enough.
> 
> I'm afraid not.
> Although 'start > eb->len || len > eb->len' is enough to detect overflow
> case, it no longer detects cases like 'start = 2k, len = 3k' while
> eb->len == 4K case.
> 
> So we still need all 3 checks.

I was suggesting 'start + len > eb->len', not 'start > eb-len'.

"start > eb->len" is implied by "start + len > eb->len".

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-20  9:50       ` David Sterba
@ 2020-08-20  9:58         ` Qu Wenruo
  2020-08-20 14:46           ` David Sterba
  0 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2020-08-20  9:58 UTC (permalink / raw)
  To: dsterba, linux-btrfs, Josef Bacik



On 2020/8/20 下午5:50, David Sterba wrote:
> On Thu, Aug 20, 2020 at 07:14:13AM +0800, Qu Wenruo wrote:
>>>> +static inline int check_eb_range(const struct extent_buffer *eb,
>>>> +				 unsigned long start, unsigned long len)
>>>> +{
>>>> +	/* start, start + len should not go beyond eb->len nor overflow */
>>>> +	if (unlikely(start > eb->len || start + len > eb->len ||
>>>> +		     len > eb->len)) {
>>>
>>> Can the number of condition be reduced? If 'start + len' overflows, then
>>> we don't need to check 'start > eb->len', and for the case where
>>> start = 1024 and len = -1024 the 'len > eb-len' would be enough.
>>
>> I'm afraid not.
>> Although 'start > eb->len || len > eb->len' is enough to detect overflow
>> case, it no longer detects cases like 'start = 2k, len = 3k' while
>> eb->len == 4K case.
>>
>> So we still need all 3 checks.
> 
> I was suggesting 'start + len > eb->len', not 'start > eb-len'.
> 
> "start > eb->len" is implied by "start + len > eb->len".
> 

start > eb->len is not implied if (start + len) over flows.

E.g. start = -2K, len = 2k, eb->len = 4K. We can still pass !(start +
len > eb->len || len > eb->len).

In short, if we want overflow check along with each one checked, we
really need 3 checks.

Thanks,
Qu


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-20  9:58         ` Qu Wenruo
@ 2020-08-20 14:46           ` David Sterba
  2020-08-20 15:18             ` David Sterba
  2020-08-20 23:39             ` Qu Wenruo
  0 siblings, 2 replies; 13+ messages in thread
From: David Sterba @ 2020-08-20 14:46 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, linux-btrfs, Josef Bacik

On Thu, Aug 20, 2020 at 05:58:53PM +0800, Qu Wenruo wrote:
> 
> 
> On 2020/8/20 下午5:50, David Sterba wrote:
> > On Thu, Aug 20, 2020 at 07:14:13AM +0800, Qu Wenruo wrote:
> >>>> +static inline int check_eb_range(const struct extent_buffer *eb,
> >>>> +				 unsigned long start, unsigned long len)
> >>>> +{
> >>>> +	/* start, start + len should not go beyond eb->len nor overflow */
> >>>> +	if (unlikely(start > eb->len || start + len > eb->len ||
> >>>> +		     len > eb->len)) {
> >>>
> >>> Can the number of condition be reduced? If 'start + len' overflows, then
> >>> we don't need to check 'start > eb->len', and for the case where
> >>> start = 1024 and len = -1024 the 'len > eb-len' would be enough.
> >>
> >> I'm afraid not.
> >> Although 'start > eb->len || len > eb->len' is enough to detect overflow
> >> case, it no longer detects cases like 'start = 2k, len = 3k' while
> >> eb->len == 4K case.
> >>
> >> So we still need all 3 checks.
> > 
> > I was suggesting 'start + len > eb->len', not 'start > eb-len'.
> > 
> > "start > eb->len" is implied by "start + len > eb->len".
> 
> start > eb->len is not implied if (start + len) over flows.
> 
> E.g. start = -2K, len = 2k, eb->len = 4K. We can still pass !(start +
> len > eb->len || len > eb->len).
> 
> In short, if we want overflow check along with each one checked, we
> really need 3 checks.

So what if we add overflow check, that would catch negative start or
negative length, and then do start + len > eb->len?

The check_setget_bounds is different becasue the len is known at compile
time so the overflows can't happen in the same way as for the eb range,
so this this confused me first.

	check_add_overflow(start, len) || start + len > eb->len

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-20 14:46           ` David Sterba
@ 2020-08-20 15:18             ` David Sterba
  2020-08-20 23:39             ` Qu Wenruo
  1 sibling, 0 replies; 13+ messages in thread
From: David Sterba @ 2020-08-20 15:18 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs, Josef Bacik

On Thu, Aug 20, 2020 at 04:46:47PM +0200, David Sterba wrote:
> On Thu, Aug 20, 2020 at 05:58:53PM +0800, Qu Wenruo wrote:
> 	check_add_overflow(start, len) || start + len > eb->len

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -13,6 +13,7 @@
 #include <linux/pagevec.h>
 #include <linux/prefetch.h>
 #include <linux/cleancache.h>
+#include <linux/overflow.h>
 #include "extent_io.h"
 #include "extent-io-tree.h"
 #include "extent_map.h"
@@ -5641,9 +5642,10 @@ static void report_eb_range(const struct extent_buffer *eb, unsigned long start,
 static inline int check_eb_range(const struct extent_buffer *eb,
                                 unsigned long start, unsigned long len)
 {
+       unsigned long offset;
+
        /* start, start + len should not go beyond eb->len nor overflow */
-       if (unlikely(start > eb->len || start + len > eb->len ||
-                    len > eb->len)) {
+       if (unlikely(check_add_overflow(start, len, &offset) || offset > eb->len)) {
                report_eb_range(eb, start, len);
                return -EINVAL;
        }
---

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions
  2020-08-20 14:46           ` David Sterba
  2020-08-20 15:18             ` David Sterba
@ 2020-08-20 23:39             ` Qu Wenruo
  1 sibling, 0 replies; 13+ messages in thread
From: Qu Wenruo @ 2020-08-20 23:39 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs, Josef Bacik


[-- Attachment #1.1: Type: text/plain, Size: 2221 bytes --]



On 2020/8/20 下午10:46, David Sterba wrote:
> On Thu, Aug 20, 2020 at 05:58:53PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2020/8/20 下午5:50, David Sterba wrote:
>>> On Thu, Aug 20, 2020 at 07:14:13AM +0800, Qu Wenruo wrote:
>>>>>> +static inline int check_eb_range(const struct extent_buffer *eb,
>>>>>> +				 unsigned long start, unsigned long len)
>>>>>> +{
>>>>>> +	/* start, start + len should not go beyond eb->len nor overflow */
>>>>>> +	if (unlikely(start > eb->len || start + len > eb->len ||
>>>>>> +		     len > eb->len)) {
>>>>>
>>>>> Can the number of condition be reduced? If 'start + len' overflows, then
>>>>> we don't need to check 'start > eb->len', and for the case where
>>>>> start = 1024 and len = -1024 the 'len > eb-len' would be enough.
>>>>
>>>> I'm afraid not.
>>>> Although 'start > eb->len || len > eb->len' is enough to detect overflow
>>>> case, it no longer detects cases like 'start = 2k, len = 3k' while
>>>> eb->len == 4K case.
>>>>
>>>> So we still need all 3 checks.
>>>
>>> I was suggesting 'start + len > eb->len', not 'start > eb-len'.
>>>
>>> "start > eb->len" is implied by "start + len > eb->len".
>>
>> start > eb->len is not implied if (start + len) over flows.
>>
>> E.g. start = -2K, len = 2k, eb->len = 4K. We can still pass !(start +
>> len > eb->len || len > eb->len).
>>
>> In short, if we want overflow check along with each one checked, we
>> really need 3 checks.
> 
> So what if we add overflow check, that would catch negative start or
> negative length, and then do start + len > eb->len?
> 
> The check_setget_bounds is different becasue the len is known at compile
> time so the overflows can't happen in the same way as for the eb range,
> so this this confused me first.
> 
> 	check_add_overflow(start, len) || start + len > eb->len
> 
Then it should be more or less the same as the existing 3 checks.

In fact, the 3 checks are just the overflow safe check for (start + len
> eb->len).

The difference between check_setget_bounds() and check_eb_range() is,
the size for check_setget_bounds() are fixed size, thus it will never be
negative, thus it can skip the (size > eb->len) check.

Thanks,
Qu


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images
  2020-08-19  6:35 [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images Qu Wenruo
                   ` (3 preceding siblings ...)
  2020-08-19  6:35 ` [PATCH v5 4/4] btrfs: ctree: checking key orders before merged tree blocks Qu Wenruo
@ 2020-08-27 14:47 ` David Sterba
  4 siblings, 0 replies; 13+ messages in thread
From: David Sterba @ 2020-08-27 14:47 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Jungyeon Yoon

On Wed, Aug 19, 2020 at 02:35:46PM +0800, Qu Wenruo wrote:
> v5:
> - Properly inline the check while make the report code into another
>   function for the 1st patch
> 
> - Keep btrfs_abort_transaction() call where it is for the 2nd patch
>   To make the line number correct and abort transaction asap.
> 
> - Function naming update for the 4th patch
> 
> Qu Wenruo (4):
>   btrfs: extent_io: do extra check for extent buffer read write
>     functions
>   btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent() and do
>     better comment
>   btrfs: extent-tree: kill the BUG_ON() in
>     insert_inline_extent_backref()
>   btrfs: ctree: checking key orders before merged tree blocks

With several fixups and updates added to misc-next. Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-08-27 14:49 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-19  6:35 [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images Qu Wenruo
2020-08-19  6:35 ` [PATCH v5 1/4] btrfs: extent_io: do extra check for extent buffer read write functions Qu Wenruo
2020-08-19 17:11   ` David Sterba
2020-08-19 23:14     ` Qu Wenruo
2020-08-20  9:50       ` David Sterba
2020-08-20  9:58         ` Qu Wenruo
2020-08-20 14:46           ` David Sterba
2020-08-20 15:18             ` David Sterba
2020-08-20 23:39             ` Qu Wenruo
2020-08-19  6:35 ` [PATCH v5 2/4] btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent() and do better comment Qu Wenruo
2020-08-19  6:35 ` [PATCH v5 3/4] btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref() Qu Wenruo
2020-08-19  6:35 ` [PATCH v5 4/4] btrfs: ctree: checking key orders before merged tree blocks Qu Wenruo
2020-08-27 14:47 ` [PATCH v5 0/4] btrfs: Enhanced runtime defence against fuzzed images David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).