* error writing primary super block on zoned btrfs
@ 2022-07-18  5:49 Christoph Hellwig
  2022-07-18 12:28 ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-07-18  5:49 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: Matthew Wilcox, linux-btrfs

Hi Naohiro, (and willy for insights on the pagecache, see below),

when running plain fsx on zoned btrfs on a null_blk device set up as below:

dev="/sys/kernel/config/nullb/nullb1"
size=12800 # MB

mkdir ${dev}
echo 2 > "${dev}"/submit_queues
echo 2 > "${dev}"/queue_mode
echo 2 > "${dev}"/irqmode
echo "${size}" > "${dev}"/size
echo 1 > "${dev}"/zoned
echo 0 > "${dev}"/zone_nr_conv
echo 128 > "${dev}"/zone_size
echo 96 > "${dev}"/zone_capacity
echo 14 > "${dev}"/zone_max_active
echo 1 > "${dev}"/memory_backed
echo 1000000 > "${dev}"/completion_nsec
echo 1 > "${dev}"/power
mkfs.btrfs -m single /dev/nullb1
mount /dev/nullb1 /mnt/test/
~/xfstests-dev/ltp/fsx /mnt/test/foobar

fsx will eventually fail after ~10 minutes, with the following left
in dmesg:

[ 1185.480200] BTRFS error (device nullb1): error writing primary super block to device 1
[ 1185.480988] BTRFS: error (device nullb1) in write_all_supers:4488: errno=-5 IO failure (1 errors while writing supers)
[ 1185.481971] BTRFS info (device nullb1: state E): forced readonly
[ 1185.482521] BTRFS: error (device nullb1: state EA) in btrfs_sync_log:3341: errno=-5 IO failure

I tracked this down to the find_get_page call in wait_dev_supers
returning NULL, and digging further it seems to come from
xa_is_value() in __filemap_get_folio returning true.  I'm not sure
why we'd see a value entry here in the block device mapping, or why
that only happens in zoned mode (the same config on a regular device
ran for 10 hours last night).
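
Roughly, the relevant part of the lookup looks like this (a simplified
sketch, not the actual mm/filemap.c code; get_folio_sketch is a made-up
name used only for illustration):

#include <linux/pagemap.h>	/* FGP_ENTRY, struct address_space */
#include <linux/xarray.h>	/* xa_load(), xa_is_value() */

/*
 * Simplified sketch of why find_get_page() can return NULL here: the
 * xarray slot for the index holds a shadow ("value") entry instead of
 * a real folio, and without FGP_ENTRY that is reported as "not cached".
 */
static struct folio *get_folio_sketch(struct address_space *mapping,
				      pgoff_t index, unsigned int fgp_flags)
{
	struct folio *folio = xa_load(&mapping->i_pages, index);

	if (xa_is_value(folio)) {		/* shadow entry left by reclaim */
		if (fgp_flags & FGP_ENTRY)	/* caller asked for raw entries */
			return folio;
		folio = NULL;			/* find_get_page() sees NULL */
	}
	return folio;
}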



* Re: error writing primary super block on zoned btrfs
  2022-07-18  5:49 error writing primary super block on zoned btrfs Christoph Hellwig
@ 2022-07-18 12:28 ` Matthew Wilcox
  2022-07-18 12:33   ` Christoph Hellwig
  2022-07-19  7:53   ` Johannes Thumshirn
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2022-07-18 12:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Naohiro Aota, linux-btrfs

On Mon, Jul 18, 2022 at 07:49:44AM +0200, Christoph Hellwig wrote:
> Hi Naohiro, (and willy for insights on the pagecache, see below),
> 
> when running plain fsx on zoned btrfs on a null_blk device set up as below:
> 
> dev="/sys/kernel/config/nullb/nullb1"
> size=12800 # MB
> 
> mkdir ${dev}
> echo 2 > "${dev}"/submit_queues
> echo 2 > "${dev}"/queue_mode
> echo 2 > "${dev}"/irqmode
> echo "${size}" > "${dev}"/size
> echo 1 > "${dev}"/zoned
> echo 0 > "${dev}"/zone_nr_conv
> echo 128 > "${dev}"/zone_size
> echo 96 > "${dev}"/zone_capacity
> echo 14 > "${dev}"/zone_max_active
> echo 1 > "${dev}"/memory_backed
> echo 1000000 > "${dev}"/completion_nsec
> echo 1 > "${dev}"/power
> mkfs.btrfs -m single /dev/nullb1
> mount /dev/nullb1 /mnt/test/
> ~/xfstests-dev/ltp/fsx /mnt/test/foobar
> 
> fsx will eventually fail after ~10 minutes, with the following left
> in dmesg:
> 
> [ 1185.480200] BTRFS error (device nullb1): error writing primary super block to device 1
> [ 1185.480988] BTRFS: error (device nullb1) in write_all_supers:4488: errno=-5 IO failure (1 errors while writing supers)
> [ 1185.481971] BTRFS info (device nullb1: state E): forced readonly
> [ 1185.482521] BTRFS: error (device nullb1: state EA) in btrfs_sync_log:3341: errno=-5 IO failure
> 
> I tracked this down to the find_get_page call in wait_dev_supers
> returning NULL, and digging further it seems to come from
> xa_is_value() in __filemap_get_folio returning true.  I'm not sure
> why we'd see a value entry here in the block device mapping, or why
> that only happens in zoned mode (the same config on a regular device
> ran for 10 hours last night).

A "value" entry in the block device's i_pages will be a shadow entry --
that is, the page has reached the end of the LRU list and been discarded,
so we made a note that we would have liked to keep it in the LRU list,
but we didn't have enough memory in the system to do so.  That helps
us put it back in the right position in the LRU list when it gets
brought back in from disc.
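
For reference, a value entry is just a tagged pointer; the following is
a paraphrase of the helpers in include/linux/xarray.h (the _sketch
suffix marks these as illustrative copies, not the real names):

#include <linux/types.h>

/*
 * Sketch of the xarray value-entry encoding: a value entry has the low
 * bit set, so it can never be confused with a pointer to a folio.  The
 * workingset code packs the eviction information into exactly such an
 * entry when a page is reclaimed; that is what a shadow entry is.
 */
static inline void *xa_mk_value_sketch(unsigned long v)
{
	return (void *)((v << 1) | 1);
}

static inline bool xa_is_value_sketch(const void *entry)
{
	return (unsigned long)entry & 1;
}

static inline unsigned long xa_to_value_sketch(const void *entry)
{
	return (unsigned long)entry >> 1;
}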

I'd suggest something else has gone wrong; maybe the refcount should
have been kept elevated to prevent the superblock from being paged out.
I find it hard to believe that we can be so low on memory that we need
to page out a superblock to make room for some other memory allocation.

(Although maybe if you have millions of unused filesystems mounted ...?)


* Re: error writing primary super block on zoned btrfs
  2022-07-18 12:28 ` Matthew Wilcox
@ 2022-07-18 12:33   ` Christoph Hellwig
  2022-07-19  7:53   ` Johannes Thumshirn
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2022-07-18 12:33 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Christoph Hellwig, Naohiro Aota, linux-btrfs

On Mon, Jul 18, 2022 at 01:28:52PM +0100, Matthew Wilcox wrote:
> A "value" entry in the block device's i_pages will be a shadow entry --
> that is, the page has reached the end of the LRU list and been discarded,
> so we made a note that we would have liked to keep it in the LRU list,
> but we didn't have enough memory in the system to do so.  That helps
> us put it back in the right position in the LRU list when it gets
> brought back in from disc.
> 
> I'd suggest something else has gone wrong; maybe the refcount should
> have been kept elevated to prevent the superblock from being paged out.
> I find it hard to believe that we can be so low on memory that we need
> to page out a superblock to make room for some other memory allocation.
> 
> (Although maybe if you have millions of unused filesystems mounted ...?)

This is a freshly booted VM running the test on the only non-root
disk file system.  So yeah, there must be a logic error somewhere
in the use of the page cache.


* Re: error writing primary super block on zoned btrfs
  2022-07-18 12:28 ` Matthew Wilcox
  2022-07-18 12:33   ` Christoph Hellwig
@ 2022-07-19  7:53   ` Johannes Thumshirn
  2022-07-19 15:13     ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Johannes Thumshirn @ 2022-07-19  7:53 UTC (permalink / raw)
  To: Matthew Wilcox, Christoph Hellwig; +Cc: Naohiro Aota, linux-btrfs

On 18.07.22 14:29, Matthew Wilcox wrote:
> On Mon, Jul 18, 2022 at 07:49:44AM +0200, Christoph Hellwig wrote:
>> Hi Naohiro, (and willy for insights on the pagecache, see below),
>>
>> when running plain fsx on zoned btrfs on a null_blk device set up as below:
>>
>> dev="/sys/kernel/config/nullb/nullb1"
>> size=12800 # MB
>>
>> mkdir ${dev}
>> echo 2 > "${dev}"/submit_queues
>> echo 2 > "${dev}"/queue_mode
>> echo 2 > "${dev}"/irqmode
>> echo "${size}" > "${dev}"/size
>> echo 1 > "${dev}"/zoned
>> echo 0 > "${dev}"/zone_nr_conv
>> echo 128 > "${dev}"/zone_size
>> echo 96 > "${dev}"/zone_capacity
>> echo 14 > "${dev}"/zone_max_active
>> echo 1 > "${dev}"/memory_backed
>> echo 1000000 > "${dev}"/completion_nsec
>> echo 1 > "${dev}"/power
>> mkfs.btrfs -m single /dev/nullb1
>> mount /dev/nullb1 /mnt/test/
>> ~/xfstests-dev/ltp/fsx /mnt/test/foobar
>>
>> fsx will eventually fail after ~10 minutes, with the following left
>> in dmesg:
>>
>> [ 1185.480200] BTRFS error (device nullb1): error writing primary super block to device 1
>> [ 1185.480988] BTRFS: error (device nullb1) in write_all_supers:4488: errno=-5 IO failure (1 errors while writing supers)
>> [ 1185.481971] BTRFS info (device nullb1: state E): forced readonly
>> [ 1185.482521] BTRFS: error (device nullb1: state EA) in btrfs_sync_log:3341: errno=-5 IO failure
>>
>> I tracked this down to the find_get_page call in wait_dev_supers
>> returning NULL, and digging further it seems to come from
>> xa_is_value() in __filemap_get_folio returning true.  I'm not sure
>> why we'd see a value entry here in the block device mapping, or why
>> that only happens in zoned mode (the same config on a regular device
>> ran for 10 hours last night).
> 
> A "value" entry in the block device's i_pages will be a shadow entry --
> that is, the page has reached the end of the LRU list and been discarded,
> so we made a note that we would have liked to keep it in the LRU list,
> but we didn't have enough memory in the system to do so.  That helps
> us put it back in the right position in the LRU list when it gets
> brought back in from disc.
> 
> I'd suggest something else has gone wrong; maybe the refcount should
> have been kept elevated to prevent the superblock from being paged out.
> I find it hard to believe that we can be so low on memory that we need
> to page out a superblock to make room for some other memory allocation.
> 
> (Although maybe if you have millions of unused filesystems mounted ...?)
> 

Ha, but zoned btrfs uses two zones as a ring buffer for its super block.
Could it be that we're accumulating too many page references somewhere,
and it then behaves like having millions of filesystems mounted?


* Re: error writing primary super block on zoned btrfs
  2022-07-19  7:53   ` Johannes Thumshirn
@ 2022-07-19 15:13     ` Christoph Hellwig
  2022-07-19 21:32       ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2022-07-19 15:13 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Matthew Wilcox, Christoph Hellwig, Naohiro Aota, linux-btrfs

On Tue, Jul 19, 2022 at 07:53:45AM +0000, Johannes Thumshirn wrote:
> Ha, but zoned btrfs uses two zones as a ring buffer for its super block.
> Could it be that we're accumulating too many page references somewhere,
> and it then behaves like having millions of filesystems mounted?

The fact that the superblock moves for zoned devices probably has
something to do with it.  But the whole code leaves me really puzzled.

Why does wait_dev_supers even do a find_get_page vs just stashing
three page pointers away in the btrfs_device structure?

Why does this abuse wait_on_page_locked vs using a completion?

Why does the code count errors while only an error on the primary
superblock has any consequences?

What is the point of the secondary superblocks if they aren't written
on fsync?

How does just setting the whole page uptodate work on file systems
with a block size smaller than the page size, where we don't know
what is in the rest of the page?


* Re: error writing primary super block on zoned btrfs
  2022-07-19 15:13     ` Christoph Hellwig
@ 2022-07-19 21:32       ` David Sterba
  2022-07-25 17:36         ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2022-07-19 21:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Johannes Thumshirn, Matthew Wilcox, Naohiro Aota, linux-btrfs

On Tue, Jul 19, 2022 at 05:13:45PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 19, 2022 at 07:53:45AM +0000, Johannes Thumshirn wrote:
> > Ha, but zoned btrfs uses two zones as a ring buffer for its super block.
> > Could it be that we're accumulating too many page references somewhere,
> > and it then behaves like having millions of filesystems mounted?
> 
> The fact that the superblock moves for zoned devices probably has
> something to do with it.  But the whole code leaves me really puzzled.
> 
> Why does wait_dev_supers even do a find_get_page vs just stashing
> three page pointers away in the btrfs_device structure?

The superblock used to be written using buffer heads; the current code
is a direct transformation of the buffer head API to bios, so it's
still using the page cache.

I've sent a patchset to write it with separate pages, but this breaks
userspace, as its reads go through the page cache.  This should be done
by direct io.  I'll also need more time to test it properly; the
kernel/userspace interactions were missed initially.
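
For illustration only, a self-contained sketch of how userspace
typically reads the primary copy today, i.e. a plain buffered pread of
the device that goes through the page cache (not actual btrfs-progs
code; the offsets and the magic location follow the on-disk format):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BTRFS_SUPER_INFO_OFFSET	(64 * 1024)	/* primary superblock */
#define BTRFS_SUPER_INFO_SIZE	4096

int main(int argc, char **argv)
{
	char buf[BTRFS_SUPER_INFO_SIZE];
	int fd;

	if (argc < 2)
		return 1;
	fd = open(argv[1], O_RDONLY);		/* buffered, no O_DIRECT */
	if (fd < 0 || pread(fd, buf, sizeof(buf),
			    BTRFS_SUPER_INFO_OFFSET) != (ssize_t)sizeof(buf)) {
		perror("read super");
		return 1;
	}
	/* the magic "_BHRfS_M" sits at offset 64 in the superblock */
	printf("magic: %.8s\n", buf + 64);
	close(fd);
	return 0;
}

If the kernel wrote the superblock with its own pages and bios, a
buffered read like this could see stale data unless it switched to
direct io or the kernel invalidated the cached range, which is the
breakage mentioned above.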

> Why does this abuse wait_on_page_locked vs using a completion?

This is, I think, still from the buffer head times: the page lock
waiting was available and hasn't been converted to a completion.
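
The conversion would presumably follow the usual pattern, roughly like
the sketch below (a generic completion-based wait, not the actual WIP
patch; struct super_write and the function names are made up):

#include <linux/bio.h>		/* struct bio, submit_bio() */
#include <linux/completion.h>	/* completions */

struct super_write {
	struct completion done;
	int error;
};

static void super_write_end_io(struct bio *bio)
{
	struct super_write *sw = bio->bi_private;

	sw->error = blk_status_to_errno(bio->bi_status);
	complete(&sw->done);
	bio_put(bio);
}

/* submit the already built superblock bio and wait for it to finish */
static int write_and_wait_super(struct bio *bio)
{
	struct super_write sw = { .error = 0 };

	init_completion(&sw.done);
	bio->bi_private = &sw;
	bio->bi_end_io = super_write_end_io;
	submit_bio(bio);

	wait_for_completion_io(&sw.done);	/* no page lock involved */
	return sw.error;
}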

> Why does the code count errors while only an error on the primary
> superblock has any consequences?

Because the primary superblock is the important one; an error on a
secondary should not bring down the filesystem if the primary can be
written.

> What is the point of the secondary superblocks if they aren't written
> on fsync?

Writing the superblock is an IO hit, which used to be noticeable on
rotational devices, and fsync is meant to be fast, so this is, I
believe, a performance optimization.  The secondary superblocks are
rarely used, so they don't get the same treatment as the primary.

> How does just setting the whole page uptodate work on file systems
> with a block size smaller than the page size, where we don't know
> what is in the rest of the page?

This was pointed out by Matthew some time ago; the part of the page
after the superblock will be uninitialized.
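
Something along these lines would be needed before the page can
legitimately be marked uptodate (only a sketch of the missing step,
assuming an in-page offset of zero and the generic memcpy_to_page()/
memzero_page() helpers; this is not what the current code does):

#include <linux/highmem.h>	/* memcpy_to_page(), memzero_page() */
#include <linux/page-flags.h>	/* SetPageUptodate() */
/* struct btrfs_super_block and BTRFS_SUPER_INFO_SIZE from the btrfs headers */

/*
 * Sketch only, not a real fix: fill the superblock region and zero the
 * remainder so that SetPageUptodate() does not expose uninitialized
 * memory when PAGE_SIZE > BTRFS_SUPER_INFO_SIZE.  Note that on the
 * block device mapping the rest of the page covers other on-disk
 * blocks, so a proper fix would read that part from disk (or avoid
 * marking the whole page uptodate at all).
 */
static void fill_super_page(struct page *page,
			    const struct btrfs_super_block *sb)
{
	memcpy_to_page(page, 0, (const char *)sb, BTRFS_SUPER_INFO_SIZE);
	memzero_page(page, BTRFS_SUPER_INFO_SIZE,
		     PAGE_SIZE - BTRFS_SUPER_INFO_SIZE);
	SetPageUptodate(page);
}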


* Re: error writing primary super block on zoned btrfs
  2022-07-19 21:32       ` David Sterba
@ 2022-07-25 17:36         ` David Sterba
  0 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2022-07-25 17:36 UTC (permalink / raw)
  To: David Sterba
  Cc: Christoph Hellwig, Johannes Thumshirn, Matthew Wilcox,
	Naohiro Aota, linux-btrfs

On Tue, Jul 19, 2022 at 11:32:41PM +0200, David Sterba wrote:
> On Tue, Jul 19, 2022 at 05:13:45PM +0200, Christoph Hellwig wrote:
> > On Tue, Jul 19, 2022 at 07:53:45AM +0000, Johannes Thumshirn wrote:
> > > Ha, but zoned btrfs uses two zones as a ring buffer for its super block.
> > > Could it be that we're accumulating too many page references somewhere,
> > > and it then behaves like having millions of filesystems mounted?
> > 
> > The fact that the superblock moves for zoned devices probably has
> > something to do with it.  But the whole code leaves me really puzzled.
> > 
> > Why does wait_dev_supers even do a find_get_page vs just stashing
> > three page pointers away in the btrfs_device structure?
> 
> The superblock used to be written using buffer heads; the current code
> is a direct transformation of the buffer head API to bios, so it's
> still using the page cache.
> 
> I've sent a patchset to write it with separate pages, but this breaks
> userspace, as its reads go through the page cache.  This should be done
> by direct io.  I'll also need more time to test it properly; the
> kernel/userspace interactions were missed initially.

I have a WIP (tests pass) that does its own page write, waiting using a
completion and its own bio, i.e. avoiding the page cache.  The
superblock read side, however, uses the page cache in several places.
When a device is not part of a mounted filesystem there's no difference,
as there are no concurrent writers and readers, but for zoned mode it is
a problem in one case: when both zones are full and the older one needs
to be determined by reading the superblock.
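
Roughly, that decision boils down to a generation comparison between
the newest superblock found in each of the two zones, along the lines
of the sketch below (not the actual fs/btrfs/zoned.c code, and
pick_older_sb_zone is a made-up name):

/*
 * Sketch only: given the newest superblock copy read from each of the
 * two superblock zones, the zone whose copy has the smaller generation
 * holds the older data.  Uses the btrfs_super_generation() accessor
 * from the btrfs headers.
 */
static int pick_older_sb_zone(const struct btrfs_super_block *sb_zone0,
			      const struct btrfs_super_block *sb_zone1)
{
	return btrfs_super_generation(sb_zone0) <
	       btrfs_super_generation(sb_zone1) ? 0 : 1;
}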

This happens on a mounted filesystem and involves both reads and writes,
so the read side needs to be converted to its own page reads too, and I
can't merge the write part as-is.  Maybe there's a middle ground, but
otherwise the page cache based read requires restructuring, as it's done
across several functions.

