[RFC PATCH 0/6] xfs: sort out the AGFL size mess

From: Dave Chinner @ 2016-09-02  2:27 UTC
  To: linux-xfs; +Cc: xfs

Hi folks,

This patchset attempts to address the overall problem with the AGFL
size in the v5 format. The underlying problem is that I screwed up
when defining the AGFL header by not padding it correctly for 32/64
bit system sanity, and so it changes size depending on compiler
padding. This then changes the number of entries in the AGFL, and
that can lead to problems when moving a filesystem between different
platforms.
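
As a rough illustration of how a few bytes of header padding change the
entry count, the sketch below does the arithmetic for a 512 byte sector.
The 36 and 40 byte header sizes (unpadded and padded) and the 4 byte
entry size are assumptions made for this example, and agfl_entries is a
made-up helper name, not something from the patches.

```shell
#!/bin/bash

# Hypothetical helper: number of 4 byte AGFL entries that fit in one
# sector after a header of the given size. The sizes passed in below
# are assumptions for illustration, not values read from a filesystem.
agfl_entries()
{
	local sectsize=$1
	local hdrsize=$2

	echo $(( (sectsize - hdrsize) / 4 ))
}

echo "unpadded 36 byte header: $(agfl_entries 512 36) entries"
echo "padded 40 byte header:   $(agfl_entries 512 40) entries"
```

Under those assumptions the two layouts differ by exactly one entry,
which is the index the rest of this series has to deal with.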

What this patchset does is fix the size of the AGFL to be consistent
across all platforms and architectures, and then detect the
off-by-one condition that occurs when a filesystem has a size
mismatch on a wrapped AGFL. It then automatically corrects the
off-by-one so the user does not even need to know that this problem
existed when upgrading their kernel. If there is more than an
off-by-one error, the kernel will flag a corruption in the usual way
(i.e. shutdown) and that is left to xfs_repair to fix up.
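
A rough sketch of why the mismatch shows up as an off-by-one once the
list wraps: the active count implied by flfirst and fllast depends on
the AGFL size. The helper below is hypothetical and is not the kernel's
detection code. The sizes 118 and 119 are assumed for 512 byte sectors,
and the flfirst/fllast/flcount values are taken from the test script
later in this mail.

```shell
#!/bin/bash

# Hypothetical helper: count of active AGFL entries implied by
# flfirst/fllast for a given AGFL size, allowing for the list wrapping
# past the last index back to index 0.
agfl_active_count()
{
	local flfirst=$1
	local fllast=$2
	local size=$3

	if [ "$fllast" -ge "$flfirst" ]; then
		echo $(( fllast - flfirst + 1 ))
	else
		echo $(( size - flfirst + fllast + 1 ))
	fi
}

# flfirst=117 fllast=3 wraps. With flcount=6 recorded on disk:
echo "size 119: $(agfl_active_count 117 3 119)"	# matches flcount=6
echo "size 118: $(agfl_active_count 117 3 118)"	# one short of flcount=6
```

When the list does not wrap, the size does not enter the count at all,
so the count mismatch only appears once the list has wrapped.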

As the userspace tools always rebuild the AGFL when required, they
do not expose the problematic wrapping condition to the kernel.
Hence we only really need this set of automatic fixups for kernel
upgrade situations. And because we always use the smaller, valid
AGFL size, we can remove the growfs hack we put in place to prevent
initialising the new AGFL indexes to an invalid index.

I think I've caught all the conditions we need to here. I've been
testing with the script attached below (requires an xfs_db patch I
posted a couple of days ago) to exercise the "detect and correct
oversize AGFL indexes" case. If I run this on an unmodified kernel,
it crashes and burns. With this patch set, it either corrects the
problem automatically or flags corruption. I'll need to turn this
into an xfstest but it suffices for the moment.

Thoughts, comments and testing on random platforms welcome!

-Dave.

--
#!/bin/bash

do_write()
{
	mount /dev/ram0 /mnt/test 
	echo > /mnt/test/foo
	sync
	umount /mnt/test 
	#xfs_db -x -c "agf 0" -c "p" /dev/ram0
	xfs_repair -n /dev/ram0 > /dev/null 2>&1
	xfs_repair /dev/ram0 > /dev/null 2>&1
	mount /dev/ram0 /mnt/test 
	echo > /mnt/test/bar
	umount /mnt/test 
	xfs_repair /dev/ram0 > /dev/null 2>&1
}

agfl_copy()
{
	source=$1
	dest=$2

	agbno=`xfs_db -x -c "agfl 0" -c "p bno[$source]" /dev/ram0 | \
		cut -d "=" -f 2`
	if [ "$agbno" == " null" ]; then
		agbno="0xffffffff"
	fi
	echo agbno "$agbno"
	xfs_db -x -c "agfl 0" -c "write -d bno[$source] 0xffffffff" /dev/ram0 > /dev/null
	xfs_db -x -c "agfl 0" -c "write -d bno[$dest] $agbno" /dev/ram0 > /dev/null
}

run_test()
{
	flfirst=$1
	fllast=$2
	flcount=$3
	urk=$4

	echo "Testing flfirst=$flfirst fllast=$fllast flcount=$flcount...."
	mkfs.xfs -f -s size=512 /dev/ram0 > /dev/null
	xfs_db -x -c "agf 0"                    \
		-c "write -d flfirst $flfirst"  \
		-c "write -d fllast $fllast"    \
		-c "write -d flcount $flcount"  \
		/dev/ram0

	# we need to write a bunch of block numbers into the new part
	# of the AGFL. So we just copy 0 -> flfirst and so on.
	let i=0
	while (($flcount - $i > 0)) ; do
		dst=$((flfirst + i))
		if [ $dst -ge 118 ]; then
			dst=$((dst - 118))
		fi
		agfl_copy $i $dst
		i=$((i + 1))
	done

	do_write
}

# run_test flfirst fllast flcount
#
# mkfs default on 512 byte sectors is "0 3 4" w/ size 118
# hence 118 should be the first invalid index, and it is the index
# that filesystems with the agfl header packing bug use.
#
# We want to test corrections for:
#       flfirst being oversize w/ matching flcount
run_test 118 3 5
#       fllast being oversize w/ matching flcount
run_test 114 118 5
#       flfirst/fllast being in range w/ oversize flcount
run_test 117 3 6

#
# We want to test corruption detection for:
# where "non-matching flcount" exercises both too small and too large
#       flfirst being oversize w/ non-matching flcount
run_test 118 3 4
run_test 118 3 6
#       fllast being oversize w/ non-matching flcount
run_test 114 118 4
run_test 114 118 6
#       flfirst/fllast being in range w/ non-matching flcount
run_test 117 3 5
run_test 117 3 7

--
