On Sat, Mar 2, 2019 at 3:38 AM Cesare Leonardi <celeonar@gmail.com> wrote:
Hello Ingo, I've run several tests but was unable to trigger any
filesystem corruption. Maybe the trouble you encountered is specific to
encrypted devices?

Yesterday and today I've used:
Debian unstable
kernel 4.19.20
lvm2 2.03.02
e2fsprogs 1.44.5

On 01/03/19 09:05, Ingo Franzki wrote:
> Hmm, maybe the size of the volume plays a role, as Bernd has pointed out. ext4 may use -b 4K by default on larger devices.
> Once the FS uses 4K blocks anyway, you won't see the problem.
>
> Use tune2fs -l <device> after you create the file system and check if it is using 4K blocks on your 512/512 device. If so, then you won't see the problem when moved to a 4K block size device.

I confirm that tune2fs reports a 4096 block size for the 1 GB ext4
filesystem I used.
I've also verified what Bernd said: mkfs.ext4 still uses a 4096 block
size for a +512M partition, but uses 1024 for a +500M one.
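
If you wanted to take the size-dependent default out of the equation,
mkfs.ext4 can force the block size with -b; something like this should
give 4096-byte blocks even on a small device (untested here, and the
device path is just a placeholder):

# mkfs.ext4 -b 4096 /dev/sdXN
# tune2fs -l /dev/sdXN | grep 'Block size'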

As suggested by Stuart, I also ran a test using a 4k loop device and
pvmoving the LV onto it. As you expected, no data corruption.
To do that I recreated the same setup as yesterday:
/dev/mapper/vgtest-lvol0 on /dev/sdb4, a 512/512 disk, with some data on
it. Then:
# fallocate -l 10G testdisk.img
# losetup -f -L -P -b 4096 testdisk.img
# pvcreate /dev/loop0
# vgextend vgtest /dev/loop0
# pvmove /dev/sdb4 /dev/loop0
# fsck.ext4 -f /dev/mapper/vgtest-lvol0

While I was there, out of curiosity, I created an ext4 filesystem on
a <500MB LV (block size = 1024) and tried pvmoving the data from the
512/512 disk to the 512/4096 disk, then to the 4096/4096 loop device.
New partitions and a new VG were used for that.

The setup:
/dev/sdb5: 512/512
/dev/sdc2: 512/4096
/dev/loop0: 4096/4096

# blockdev -v --getss --getpbsz --getbsz /dev/sdb
get logical block (sector) size: 512
get physical block (sector) size: 512
get blocksize: 4096

# blockdev -v --getss --getpbsz --getbsz /dev/sdc
get logical block (sector) size: 512
get physical block (sector) size: 4096
get blocksize: 4096

# blockdev -v --getss --getpbsz --getbsz /dev/loop0
get logical block (sector) size: 4096
get physical block (sector) size: 4096
get blocksize: 4096

# pvcreate /dev/sdb5
# vgcreate vgtest2 /dev/sdb5
# lvcreate -L 400M vgtest2 /dev/sdb5
# mkfs.ext4 /dev/mapper/vgtest2-lvol0

# tune2fs -l /dev/mapper/vgtest2-lvol0
[...]
Block size:               1024
[...]

# mount /dev/mapper/vgtest2-lvol0 /media/test
# cp -a SOMEDATA /media/test/
# umount /media/test
# fsck.ext4 -f /dev/mapper/vgtest2-lvol0

Now I've moved data from the 512/512 to the 512/4096 disk:
# pvcreate /dev/sdc2
# vgextend vgtest2 /dev/sdc2
# pvmove /dev/sdb5 /dev/sdc2
# fsck.ext4 -f /dev/mapper/vgtest2-lvol0

No error reported.

Did you try to mount the LV after the pvmove?
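
I ask because I believe fsck reads the device from user space and so
never hits the kernel's block-size check; in my tests the failure only
shows up at mount time. A check along these lines (paths as in your
setup) should tell:

# mount /dev/mapper/vgtest2-lvol0 /media/test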
 
Now I've moved data to the 4096/4096 loop device:
# pvcreate /dev/loop0
# vgextend vgtest2 /dev/loop0
# pvmove /dev/sdc2 /dev/loop0
# fsck.ext4 -f /dev/mapper/vgtest2-lvol0

Still no data corruption.

I can reproduce this without moving data, just by extending the VG with
a 4k device and then extending the LV to use both devices.

Here is what I tested:

# truncate -s 500m disk1
# truncate -s 500m disk2
# losetup -f disk1 --sector-size 512 --show
/dev/loop2
# losetup -f disk2 --sector-size 4096 --show
/dev/loop3

# pvcreate /dev/loop2
  Physical volume "/dev/loop2" successfully created.
# pvcreate /dev/loop3
  Physical volume "/dev/loop3" successfully created.
# vgcreate test /dev/loop2
  Volume group "test" successfully created
# lvcreate -L400m -n lv1 test
  Logical volume "lv1" created.

# mkfs.xfs /dev/test/lv1 
meta-data=/dev/test/lv1          isize=512    agcount=4, agsize=25600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
data     =                       bsize=4096   blocks=102400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=855, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# mkdir /tmp/mnt
# mount /dev/test/lv1 /tmp/mnt
# vgextend test /dev/loop3
  Volume group "test" successfully extended
# lvextend -L+400m test/lv1
  Size of logical volume test/lv1 changed from 400.00 MiB (100 extents) to 800.00 MiB (200 extents).
  Logical volume test/lv1 successfully resized.
# umount /tmp/mnt

# mount /dev/test/lv1 /tmp/mnt
mount: /tmp/mnt: mount(2) system call failed: Function not implemented.

From journalctl:
Mar 02 21:52:53 lean.local kernel: XFS (dm-7): Unmounting Filesystem
Mar 02 21:53:01 lean.local kernel: XFS (dm-7): device supports 4096 byte sectors (not 512)
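
This matches the sectsz=512 in the mkfs.xfs output above: the filesystem
was created for 512-byte sectors, and the kernel now refuses it on a
4096-byte device. If one wanted to sidestep that, mkfs.xfs can be told
to use 4096-byte sectors up front; a sketch, not something I tested here:

# mkfs.xfs -s size=4096 /dev/test/lv1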


I also tried the same with ext4:

(same disks/VG/LV setup as above)

# mkfs.ext4 /dev/test/lv1 
mke2fs 1.44.2 (14-May-2018)
Discarding device blocks: done                            
Creating filesystem with 409600 1k blocks and 102400 inodes
Filesystem UUID: 9283880e-ee89-4d79-9c29-41f4af98f894
Superblock backups stored on blocks: 
8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done 

# vgextend test /dev/loop3
  Volume group "test" successfully extended
# lvextend -L+400 test/lv1
  Size of logical volume test/lv1 changed from 400.00 MiB (100 extents) to 800.00 MiB (200 extents).
  Logical volume test/lv1 successfully resized.

# mount /dev/test/lv1 /tmp/mnt
mount: /tmp/mnt: wrong fs type, bad option, bad superblock on /dev/mapper/test-lv1, missing codepage or helper program, or other error.

From journalctl:
Mar 02 22:06:09 lean.local kernel: EXT4-fs (dm-7): bad block size 1024
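
Same pattern as with XFS: the filesystem was created with 1024-byte
blocks, and the kernel refuses a block size smaller than the new
4096-byte logical sector size. The analogous sketch (again untested
here) would be to force 4K blocks at creation time:

# mkfs.ext4 -b 4096 /dev/test/lv1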


Now the same with pvmove:

(same setup as above, using XFS)

# mount /dev/test/lv1 /tmp/mnt
# dd if=/dev/urandom bs=8M count=1 of=/tmp/mnt/data
# vgextend test /dev/loop3
  Physical volume "/dev/loop3" successfully created.
  Volume group "test" successfully extended

# pvmove -v /dev/loop2 /dev/loop3
    Cluster mirror log daemon is not running.
    Wiping internal VG cache
    Wiping cache of LVM-capable devices
    Archiving volume group "test" metadata (seqno 3).
    Creating logical volume pvmove0
    Moving 100 extents of logical volume test/lv1.
    activation/volume_list configuration setting not defined: Checking only host tags for test/lv1.
    Creating test-pvmove0
    Loading table for test-pvmove0 (253:8).
    Loading table for test-lv1 (253:7).
    Suspending test-lv1 (253:7) with device flush
    Resuming test-pvmove0 (253:8).
    Resuming test-lv1 (253:7).
    Creating volume group backup "/etc/lvm/backup/test" (seqno 4).
    activation/volume_list configuration setting not defined: Checking only host tags for test/pvmove0.
    Checking progress before waiting every 15 seconds.
  /dev/loop2: Moved: 15.00%
  /dev/loop2: Moved: 100.00%
    Polling finished successfully.
# umount /tmp/mnt
# mount /dev/test/lv1 /tmp/mnt
mount: /tmp/mnt: mount(2) system call failed: Function not implemented.

From journalctl:
Mar 02 22:20:36 lean.local kernel: XFS (dm-7): device supports 4096 byte sectors (not 512)
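
Since pvmove itself reports success, one possible sanity check before
moving (just a suggestion, reusing the blockdev call from earlier in the
thread) is to compare the logical sector sizes of the source and
destination PVs:

# blockdev --getss /dev/loop2
512
# blockdev --getss /dev/loop3
4096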


Tested on Fedora 28 with:
kernel-4.20.5-100.fc28.x86_64
lvm2-2.02.177-5.fc28.x86_64

Nir