All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] bcachefs: writeback and byte-order misc fixes
@ 2023-11-03 13:09 Brian Foster
  2023-11-03 13:09 ` [PATCH 1/3] bcachefs: allow writeback to fill bio completely Brian Foster
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
  To: linux-bcachefs

Hi all,

This series is a repost of a handful of patches that happened to have
fallen through the cracks. Patch 1 relates back to the discussion[1] on
increasing the writeback bio size limit. We can't safely do that until
the bounce path can be enhanced to handle larger sizes, but we can at
least fix the writeback code to completely fill the bio. Patches 2-3 are
a couple more byte order fixes that were uncovered by swapping an
on-disk filesystem between big and little endian machines. As of these
two, I've not encountered any further byte-order issues via this sort of
test.

All three of these patches have been soaking in my test branch [2] for
quite some time. Thoughts, reviews, flames appreciated.

Brian

[1] https://lore.kernel.org/linux-bcachefs/20230927112338.262207-3-bfoster@redhat.com/
[2] https://evilpiepirate.org/~testdashboard/ci?branch=bfoster

Brian Foster (3):
  bcachefs: allow writeback to fill bio completely
  bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field
  bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield

 fs/bcachefs/alloc_background.c | 10 +---------
 fs/bcachefs/backpointers.c     |  2 +-
 fs/bcachefs/backpointers.h     |  9 +++++++++
 fs/bcachefs/fs-io-buffered.c   | 19 ++++++++++++++++---
 4 files changed, 27 insertions(+), 13 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] bcachefs: allow writeback to fill bio completely
  2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
@ 2023-11-03 13:09 ` Brian Foster
  2023-11-03 13:09 ` [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field Brian Foster
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
  To: linux-bcachefs

The bcachefs folio writeback code includes a bio full check as well
as a fixed size check to determine when to split off and submit
writeback I/O. The inclusive check of the latter against the limit
means that writeback can submit slightly prematurely. This is not a
functional problem, but results in unnecessarily split I/Os and
extent merging.

This can be observed with a buffered write sized exactly to the
current maximum value (1MB) and with key_merging_disabled=1. The
latter prevents the merge from the second write such that a
subsequent check of the extent list shows a 1020k extent followed by
a contiguous 4k extent.

The purpose for the fixed size check is also undocumented and
somewhat obscure. Lift this check into a new helper that wraps the
bio check, fix the comparison logic, and add a comment to document
the purpose and how we might improve on this in the future.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/bcachefs/fs-io-buffered.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/bcachefs/fs-io-buffered.c b/fs/bcachefs/fs-io-buffered.c
index 58ccc7b91ac7..52f0e7acda3d 100644
--- a/fs/bcachefs/fs-io-buffered.c
+++ b/fs/bcachefs/fs-io-buffered.c
@@ -389,6 +389,21 @@ static inline struct bch_writepage_state bch_writepage_state_init(struct bch_fs
 	return ret;
 }
 
+/*
+ * Determine when a writepage io is full. We have to limit writepage bios to a
+ * single page per bvec (i.e. 1MB with 4k pages) because that is the limit to
+ * what the bounce path in bch2_write_extent() can handle. In theory we could
+ * loosen this restriction for non-bounce I/O, but we don't have that context
+ * here. Ideally, we can up this limit and make it configurable in the future
+ * when the bounce path can be enhanced to accommodate larger source bios.
+ */
+static inline bool bch_io_full(struct bch_writepage_io *io, unsigned len)
+{
+	struct bio *bio = &io->op.wbio.bio;
+	return bio_full(bio, len) ||
+		(bio->bi_iter.bi_size + len > BIO_MAX_VECS * PAGE_SIZE);
+}
+
 static void bch2_writepage_io_done(struct bch_write_op *op)
 {
 	struct bch_writepage_io *io =
@@ -606,9 +621,7 @@ static int __bch2_writepage(struct folio *folio,
 
 		if (w->io &&
 		    (w->io->op.res.nr_replicas != nr_replicas_this_write ||
-		     bio_full(&w->io->op.wbio.bio, sectors << 9) ||
-		     w->io->op.wbio.bio.bi_iter.bi_size + (sectors << 9) >=
-		     (BIO_MAX_VECS * PAGE_SIZE) ||
+		     bch_io_full(w->io, sectors << 9) ||
 		     bio_end_sector(&w->io->op.wbio.bio) != sector))
 			bch2_writepage_do_io(w);
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field
  2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
  2023-11-03 13:09 ` [PATCH 1/3] bcachefs: allow writeback to fill bio completely Brian Foster
@ 2023-11-03 13:09 ` Brian Foster
  2023-11-03 13:09 ` [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield Brian Foster
  2023-11-03 15:18 ` [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Kent Overstreet
  3 siblings, 0 replies; 5+ messages in thread
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
  To: linux-bcachefs

A simple test to populate a filesystem on one CPU architecture and
fsck on an arch of the opposite byte order produces errors related
to the fragmentation LRU. This occurs because the 64-bit
fragmentation_lru field is not byte-order swapped when reads detect
that the on-disk/bset key values were written in opposite byte-order
of the current CPU.

Update the bch2_alloc_v4 swab callback to handle fragmentation_lru
as is done for other multi-byte fields. This doesn't affect existing
filesystems when accessed by CPUs of the same endianness because the
->swab() callback is only called when the bset flags indicate an
endianness mismatch between the CPU and on-disk data.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/bcachefs/alloc_background.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/bcachefs/alloc_background.c b/fs/bcachefs/alloc_background.c
index bcfae91667af..ad256a88cb5c 100644
--- a/fs/bcachefs/alloc_background.c
+++ b/fs/bcachefs/alloc_background.c
@@ -319,6 +319,7 @@ void bch2_alloc_v4_swab(struct bkey_s k)
 	a->io_time[1]		= swab64(a->io_time[1]);
 	a->stripe		= swab32(a->stripe);
 	a->nr_external_backpointers = swab32(a->nr_external_backpointers);
+	a->fragmentation_lru	= swab64(a->fragmentation_lru);
 
 	bps = alloc_v4_backpointers(a);
 	for (bp = bps; bp < bps + BCH_ALLOC_V4_NR_BACKPOINTERS(a); bp++) {
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield
  2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
  2023-11-03 13:09 ` [PATCH 1/3] bcachefs: allow writeback to fill bio completely Brian Foster
  2023-11-03 13:09 ` [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field Brian Foster
@ 2023-11-03 13:09 ` Brian Foster
  2023-11-03 15:18 ` [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Kent Overstreet
  3 siblings, 0 replies; 5+ messages in thread
From: Brian Foster @ 2023-11-03 13:09 UTC (permalink / raw)
  To: linux-bcachefs

The bucket_offset field of bch_backpointer is a 40-bit bitfield, but the
bch2_backpointer_swab() helper uses swab32. This leads to inconsistency
when an on-disk fs is accessed from an opposite endian machine.

As it turns out, we already have an internal swab40() helper that is
used from the bch_alloc_v4 swab callback. Lift it into the backpointers
header file and use it consistently in both places.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/bcachefs/alloc_background.c | 9 ---------
 fs/bcachefs/backpointers.c     | 2 +-
 fs/bcachefs/backpointers.h     | 9 +++++++++
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/bcachefs/alloc_background.c b/fs/bcachefs/alloc_background.c
index ad256a88cb5c..1fec0e67891f 100644
--- a/fs/bcachefs/alloc_background.c
+++ b/fs/bcachefs/alloc_background.c
@@ -297,15 +297,6 @@ int bch2_alloc_v4_invalid(struct bch_fs *c, struct bkey_s_c k,
 	return ret;
 }
 
-static inline u64 swab40(u64 x)
-{
-	return (((x & 0x00000000ffULL) << 32)|
-		((x & 0x000000ff00ULL) << 16)|
-		((x & 0x0000ff0000ULL) >>  0)|
-		((x & 0x00ff000000ULL) >> 16)|
-		((x & 0xff00000000ULL) >> 32));
-}
-
 void bch2_alloc_v4_swab(struct bkey_s k)
 {
 	struct bch_alloc_v4 *a = bkey_s_to_alloc_v4(k).v;
diff --git a/fs/bcachefs/backpointers.c b/fs/bcachefs/backpointers.c
index 3b79bde1ce2f..5ed96dddae08 100644
--- a/fs/bcachefs/backpointers.c
+++ b/fs/bcachefs/backpointers.c
@@ -77,7 +77,7 @@ void bch2_backpointer_swab(struct bkey_s k)
 {
 	struct bkey_s_backpointer bp = bkey_s_to_backpointer(k);
 
-	bp.v->bucket_offset	= swab32(bp.v->bucket_offset);
+	bp.v->bucket_offset	= swab40(bp.v->bucket_offset);
 	bp.v->bucket_len	= swab32(bp.v->bucket_len);
 	bch2_bpos_swab(&bp.v->pos);
 }
diff --git a/fs/bcachefs/backpointers.h b/fs/bcachefs/backpointers.h
index 4ab9f3562912..ab866feeaf66 100644
--- a/fs/bcachefs/backpointers.h
+++ b/fs/bcachefs/backpointers.h
@@ -7,6 +7,15 @@
 #include "buckets.h"
 #include "super.h"
 
+static inline u64 swab40(u64 x)
+{
+	return (((x & 0x00000000ffULL) << 32)|
+		((x & 0x000000ff00ULL) << 16)|
+		((x & 0x0000ff0000ULL) >>  0)|
+		((x & 0x00ff000000ULL) >> 16)|
+		((x & 0xff00000000ULL) >> 32));
+}
+
 int bch2_backpointer_invalid(struct bch_fs *, struct bkey_s_c k,
 			     enum bkey_invalid_flags, struct printbuf *);
 void bch2_backpointer_to_text(struct printbuf *, const struct bch_backpointer *);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] bcachefs: writeback and byte-order misc fixes
  2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
                   ` (2 preceding siblings ...)
  2023-11-03 13:09 ` [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield Brian Foster
@ 2023-11-03 15:18 ` Kent Overstreet
  3 siblings, 0 replies; 5+ messages in thread
From: Kent Overstreet @ 2023-11-03 15:18 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-bcachefs

On Fri, Nov 03, 2023 at 09:09:35AM -0400, Brian Foster wrote:
> Hi all,
> 
> This series is a repost of a handful of patches that happened to have
> fallen through the cracks. Patch 1 relates back to the discussion[1] on
> increasing the writeback bio size limit. We can't safely do that until
> the bounce path can be enhanced to handle larger sizes, but we can at
> least fix the writeback code to completely fill the bio. Patches 2-3 are
> a couple more byte order fixes that were uncovered by swapping an
> on-disk filesystem between big and little endian machines. As of these
> two, I've not encountered any further byte-order issues via this sort of
> test.
> 
> All three of these patches have been soaking in my test branch [2] for
> quite some time. Thoughts, reviews, flames appreciated.
> 
> Brian
> 
> [1] https://lore.kernel.org/linux-bcachefs/20230927112338.262207-3-bfoster@redhat.com/
> [2] https://evilpiepirate.org/~testdashboard/ci?branch=bfoster
> 
> Brian Foster (3):
>   bcachefs: allow writeback to fill bio completely
>   bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field
>   bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield

These will all go out in the next pull request.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-03 15:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-03 13:09 [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Brian Foster
2023-11-03 13:09 ` [PATCH 1/3] bcachefs: allow writeback to fill bio completely Brian Foster
2023-11-03 13:09 ` [PATCH 2/3] bcachefs: byte order swap bch_alloc_v4.fragmentation_lru field Brian Foster
2023-11-03 13:09 ` [PATCH 3/3] bcachefs: use swab40 for bch_backpointer.bucket_offset bitfield Brian Foster
2023-11-03 15:18 ` [PATCH 0/3] bcachefs: writeback and byte-order misc fixes Kent Overstreet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.