* [PATCH 0/2] btrfs: zoned: fixes for data relocation
@ 2022-06-06 15:59 Naohiro Aota
  2022-06-06 15:59 ` [PATCH 1/2] btrfs: zoned: prevent allocation from previous data relocation BG Naohiro Aota
  2022-06-06 15:59 ` [PATCH 2/2] btrfs: zoned: fix critical section of relocation inode writeback Naohiro Aota
  0 siblings, 2 replies; 5+ messages in thread
From: Naohiro Aota @ 2022-06-06 15:59 UTC
  To: linux-btrfs; +Cc: Naohiro Aota

There are two long-standing potential bugs in the data relocation path of
zoned btrfs. They were recently exposed by commit 5f0addf7b890 ("btrfs:
zoned: use dedicated lock for data relocation"). The first is that WRITE
(for relocation extents) and ZONE APPEND (for regular extents) can be
issued at the same time, which confuses the write pointer. The second is a
critical section that is too short, which can cause IOs to be issued out
of order.
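
To make the first failure mode concrete, here is a minimal toy model of a
sequential-write-required zone. It is only a sketch and not btrfs code: the
names Zone, UnalignedWriteError and mixed_io_demo are hypothetical, and the
model assumes that a WRITE carries an offset planned in advance while a ZONE
APPEND lands wherever the write pointer currently is.

```python
class UnalignedWriteError(Exception):
    """Raised when a WRITE does not start at the zone's write pointer."""


class Zone:
    """Toy model of a sequential-write-required zone (not btrfs code)."""

    def __init__(self):
        self.write_pointer = 0

    def write(self, offset, length):
        # A regular WRITE carries its own offset, which must match the
        # current write pointer exactly.
        if offset != self.write_pointer:
            raise UnalignedWriteError(
                f"offset {offset} != write pointer {self.write_pointer}"
            )
        self.write_pointer += length
        return offset

    def zone_append(self, length):
        # ZONE APPEND carries no offset: the data lands at the write pointer.
        offset = self.write_pointer
        self.write_pointer += length
        return offset


def mixed_io_demo():
    zone = Zone()
    reloc_offset = zone.write_pointer   # relocation extent planned at offset 0
    zone.zone_append(8)                 # a regular extent is appended first
    try:
        zone.write(reloc_offset, 16)    # stale offset: the pointer is now 8
    except UnalignedWriteError as err:
        return f"unaligned write: {err}"
    return "ok"
```

In this model the planned WRITE fails once any ZONE APPEND has moved the
write pointer, which is the same class of error as the "Unaligned write
command" sense data in the log further down.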

These bugs are easily reproducible with a smaller zone size (e.g.,
128 MB) with fstests btrfs/232. For example, an IO failure occurs like this:

  [99909.031820][T4038707] WARNING: CPU: 3 PID: 4038707 at fs/btrfs/extent-tree.c:2381 btrfs_cross_ref_exist+0xfc/0x120 [btrfs]
  <snip>
  [99909.268769][T4038707] Call Trace:
  [99909.272105][T4038707]  <TASK>
  [99909.275093][T4038707]  run_delalloc_nocow+0x7f1/0x11a0 [btrfs]
  [99909.280996][T4038707]  ? test_range_bit+0x174/0x320 [btrfs]
  [99909.286622][T4038707]  ? fallback_to_cow+0x980/0x980 [btrfs]
  [99909.292333][T4038707]  ? find_lock_delalloc_range+0x33e/0x3e0 [btrfs]
  [99909.298825][T4038707]  btrfs_run_delalloc_range+0x445/0x1320 [btrfs]
  [99909.305222][T4038707]  ? test_range_bit+0x320/0x320 [btrfs]
  [99909.310844][T4038707]  ? lock_downgrade+0x6a0/0x6a0
  [99909.315732][T4038707]  ? orc_find.part.0+0x1ed/0x300
  [99909.320705][T4038707]  ? __module_address.part.0+0x25/0x300
  [99909.326280][T4038707]  writepage_delalloc+0x159/0x310 [btrfs]
  <snip>
  [99909.883814][    C3] sd 10:0:1:0: [sde] tag#2620 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
  [99909.893855][    C3] sd 10:0:1:0: [sde] tag#2620 Sense Key : Illegal Request [current]
  [99909.901819][    C3] sd 10:0:1:0: [sde] tag#2620 Add. Sense: Unaligned write command
  [99909.909525][    C3] sd 10:0:1:0: [sde] tag#2620 CDB: Write(16) 8a 00 00 00 00 00 02 f3 63 87 00 00 00 2c 00 00
  [99909.919544][    C3] critical target error, dev sde, sector 396041272 op 0x1:(WRITE) flags 0x800 phys_seg 3 prio class 0
  [99909.930329][    C3] BTRFS error (device dm-1): bdev /dev/mapper/dml_102_2 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0

Or, an assertion failure occurs like this:

  [   12.527832] assertion failed: start >= found_start && end <= found_end, in fs/btrfs/free-space-tree.c:737
  <snip>
  [   12.533391] Call Trace:
  [   12.533391]  <TASK>
  [   12.533391]  __remove_from_free_space_tree.cold+0x11/0x22 [btrfs]
  [   12.542073]  ? setup_items_for_insert.isra.0+0x2bf/0x3f0 [btrfs]
  [   12.542073]  remove_from_free_space_tree+0x80/0x110 [btrfs]
  [   12.542073]  alloc_reserved_file_extent+0x1b4/0x240 [btrfs]
  [   12.542073]  __btrfs_run_delayed_refs+0x692/0xf30 [btrfs]
  [   12.542073]  ? btrfs_btree_balance_dirty+0x2f/0x50 [btrfs]
  [   12.542073]  btrfs_run_delayed_refs+0x81/0x1e0 [btrfs]
  [   12.542073]  btrfs_commit_transaction+0x54/0xaf0 [btrfs]
  [   12.542073]  ? start_transaction+0xc2/0x5b0 [btrfs]
  [   12.542073]  ? _raw_read_lock_irqsave+0x20/0x40
  [   12.542073]  relocate_block_group+0x320/0x550 [btrfs]
  [   12.542073]  btrfs_relocate_block_group+0x1f9/0x3a0 [btrfs]
  [   12.542073]  btrfs_relocate_chunk+0x36/0xf0 [btrfs]
  [   12.542073]  btrfs_reclaim_bgs_work.cold+0x4f/0x74 [btrfs]
  [   12.542073]  process_one_work+0x1b0/0x310
  [   12.542073]  worker_thread+0x48/0x3d0
  [   12.542073]  ? rescuer_thread+0x3a0/0x3a0
  [   12.542073]  kthread+0xed/0x120
  [   12.550506]  ? kthread_complete_and_exit+0x20/0x20
  [   12.550506]  ret_from_fork+0x22/0x30
  [   12.550506]  </TASK>

This series fixes the two issues. The first one is fixed by introducing a
new btrfs_block_group bit to disallow extent allocation but still allow
nocow writes to start.
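
The intended semantics of such a bit can be sketched with a small toy model.
This is not the patch itself: the field name reloc_ongoing and the helper
names are hypothetical, and the model additionally assumes that a read-only
block group refuses both allocation and nocow writes, which would explain
why a separate bit is wanted.

```python
from dataclasses import dataclass


@dataclass
class BlockGroup:
    """Toy stand-in for a block group (not the kernel structure)."""

    name: str
    read_only: bool = False      # assumed: blocks allocation and nocow writes
    reloc_ongoing: bool = False  # hypothetical name for the new bit

    def can_allocate(self):
        # New extents must not come from a block group carrying the bit.
        return not (self.read_only or self.reloc_ongoing)

    def can_start_nocow_write(self):
        # Nocow writes to existing extents are still allowed with the bit set.
        return not self.read_only


def pick_block_group(groups):
    """Return the first block group that accepts a new extent, or None."""
    for bg in groups:
        if bg.can_allocate():
            return bg
    return None
```

With a list such as [BlockGroup("old-reloc", reloc_ongoing=True),
BlockGroup("fresh")], pick_block_group skips "old-reloc" and returns "fresh",
while "old-reloc" still reports that a nocow write may start.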

The second one is simply fixed by extending the critical section.
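
Why the length of the critical section matters can be shown with a
deterministic toy model. It is a sketch under assumptions, not the btrfs
writeback path: it assumes each writer first allocates an offset and then
submits a WRITE, and the function names and the fixed extent length are
hypothetical.

```python
EXTENT_LEN = 4  # hypothetical fixed extent length for the toy model


def submitted_in_order(offsets):
    """True if WRITEs at these offsets each hit the write pointer in turn."""
    write_pointer = 0
    for offset in offsets:
        if offset != write_pointer:
            return False
        write_pointer += EXTENT_LEN
    return True


def short_critical_section():
    # The lock covers allocation only: A gets offset 0, B gets offset 4.
    allocated = {"A": 0, "B": EXTENT_LEN}
    # Submission happens after the lock is dropped, so B may go first.
    submission_order = ["B", "A"]
    return [allocated[writer] for writer in submission_order]


def extended_critical_section():
    # The lock covers allocation and submission: whoever takes the lock
    # allocates and submits before the next writer can allocate.
    offsets = []
    next_free = 0
    for _writer in ["B", "A"]:
        offsets.append(next_free)
        next_free += EXTENT_LEN
    return offsets
```

In this model the short section yields the submission order [4, 0], which
misses the write pointer, while the extended section yields [0, 4] no matter
which writer takes the lock first.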

Naohiro Aota (2):
  btrfs: zoned: prevent allocation from previous data relocation BG
  btrfs: zoned: fix critical section of relocation inode writeback

 fs/btrfs/block-group.h |  1 +
 fs/btrfs/extent-tree.c | 20 ++++++++++++++++++--
 fs/btrfs/extent_io.c   |  3 ++-
 fs/btrfs/inode.c       |  2 ++
 fs/btrfs/zoned.c       | 27 +++++++++++++++++++++++++++
 fs/btrfs/zoned.h       |  5 +++++
 6 files changed, 55 insertions(+), 3 deletions(-)

-- 
2.35.1

