All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: grub-devel@gnu.org
Cc: linux-btrfs@vger.kernel.org
Subject: [PATCH] fs/btrfs: Make extent item iteration to handle gaps
Date: Sat, 16 Oct 2021 09:40:49 +0800	[thread overview]
Message-ID: <20211016014049.201556-1-wqu@suse.com> (raw)

[BUG]
Grub btrfs implementation can't handle two very basic btrfs file
layouts:

1. Mixed inline/regualr extents
   # mkfs.btrfs -f test.img
   # mount test.img /mnt/btrfs
   # xfs_io -f -c "pwrite 0 1k" -c "sync" -c "falloc 0 4k" \
	       -c "pwrite 4k 4k" /mnt/btrfs/file
   # umount /mnt/btrfs
   # ./grub-fstest ./grub-fstest --debug=btrfs ~/test.img hex "/file"

   Such mixed inline/regular extents case is not recommended layout,
   but all existing tools and kernel can handle it without problem

2. NO_HOLES feature
   # mkfs.btrfs -f test.img -O no_holes
   # mount test.img /mnt/btrfs
   # xfs_io -f -c "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/btrfs/file
   # umount /mnt/btrfs
   # ./grub-fstest ./grub-fstest --debug=btrfs ~/test.img hex "/file"

   NO_HOLES feature is going to be the default mkfs feature in the incoming
   v5.15 release, and kernel has support for it since v4.0.

[CAUSE]
The way GRUB btrfs code iterates through file extents relies on no gap
between extents.

If any gap is hit, then grub btrfs will error out, without any proper
reason to help debug the bug.

This is a bad assumption, since a long long time ago btrfs has a new
feature called NO_HOLES to allow btrfs to skip the padding hole extent
to reduce metadata usage.

The NO_HOLES feature is already stable since kernel v4.0 and is going to
be the default mkfs feature in the incoming v5.15 btrfs-progs release.

[FIX]
When there is a extent gap, instead of error out, just try next item.

This is still not ideal, as kernel/progs/U-boot all do the iteration
item by item, not relying on the file offset continuity.

But it will be way more time consuming to correct the whole behavior
than starting from scratch to build a proper designed btrfs module for GRUB.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 grub-core/fs/btrfs.c | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 63203034dfc6..4fbcbec7524a 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -1443,6 +1443,7 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
       grub_size_t csize;
       grub_err_t err;
       grub_off_t extoff;
+      struct grub_btrfs_leaf_descriptor desc;
       if (!data->extent || data->extstart > pos || data->extino != ino
 	  || data->exttree != tree || data->extend <= pos)
 	{
@@ -1455,7 +1456,7 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
 	  key_in.type = GRUB_BTRFS_ITEM_TYPE_EXTENT_ITEM;
 	  key_in.offset = grub_cpu_to_le64 (pos);
 	  err = lower_bound (data, &key_in, &key_out, tree,
-			     &elemaddr, &elemsize, NULL, 0);
+			     &elemaddr, &elemsize, &desc, 0);
 	  if (err)
 	    return -1;
 	  if (key_out.object_id != ino
@@ -1494,10 +1495,36 @@ grub_btrfs_extent_read (struct grub_btrfs_data *data,
 			PRIxGRUB_UINT64_T "\n",
 			grub_le_to_cpu64 (key_out.offset),
 			grub_le_to_cpu64 (data->extent->size));
+	  /*
+	   * The way of extent item iteration is pretty bad, it completely
+	   * requires all extents are contiguous, which is not ensured.
+	   *
+	   * Features like NO_HOLE and mixed inline/regular extents can cause
+	   * gaps between file extent items.
+	   *
+	   * The correct way is to follow kernel/U-boot to iterate item by
+	   * item, without any assumption on the file offset continuity.
+	   *
+	   * Here we just manually skip to next item and re-do the verification.
+	   *
+	   * TODO: Rework the whole extent item iteration code, if not the
+	   * whole btrfs implementation.
+	   */
 	  if (data->extend <= pos)
 	    {
-	      grub_error (GRUB_ERR_BAD_FS, "extent not found");
-	      return -1;
+	      err = next(data, &desc, &elemaddr, &elemsize, &key_out);
+	      if (err < 0)
+		return -1;
+	      /* No next item for the inode, we hit the end */
+	      if (err == 0 || key_out.object_id != ino ||
+		  key_out.type != GRUB_BTRFS_ITEM_TYPE_EXTENT_ITEM)
+		      return pos - pos0;
+
+	      csize = grub_le_to_cpu64(key_out.offset) - pos;
+	      buf += csize;
+	      pos += csize;
+	      len -= csize;
+	      continue;
 	    }
 	}
       csize = data->extend - pos;
-- 
2.33.0


             reply	other threads:[~2021-10-16  1:41 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-16  1:40 Qu Wenruo [this message]
2021-10-28  7:36 ` [PATCH] fs/btrfs: Make extent item iteration to handle gaps Qu Wenruo
2021-10-28  9:27   ` Michael Chang
2021-10-28  9:40     ` Qu Wenruo
2021-10-29 12:52   ` David Sterba
2021-11-02 14:53     ` Daniel Kiper
     [not found] <20211018124903.661dca99@suse.de>
2021-10-18 10:51 ` Qu Wenruo
2021-10-18 10:51 David Disseldorp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211016014049.201556-1-wqu@suse.com \
    --to=wqu@suse.com \
    --cc=grub-devel@gnu.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.