From: Nir Soffer <nsoffer@redhat.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Cc: Ingo Franzki <ifranzki@linux.ibm.com>
Subject: Re: [linux-lvm] Filesystem corruption with LVM's pvmove onto a PV with a larger physical block size
Date: Sat, 2 Mar 2019 22:25:04 +0200
Message-ID: <CAMRbyyuVek7CnYhD3S6JcYLTfxP73CScpErCX3D2eapbfJgd4g@mail.gmail.com>
In-Reply-To: <30346b34-c1e1-f7ba-be4e-a37d8ce8cf03@gmail.com>

On Sat, Mar 2, 2019 at 3:38 AM Cesare Leonardi <celeonar@gmail.com> wrote:

> Hello Ingo, I've made several tests but I was unable to trigger any
> filesystem corruption. Maybe the trouble you encountered is specific to
> encrypted devices?
>
> Yesterday and today I've used:
> Debian unstable
> kernel 4.19.20
> lvm2 2.03.02
> e2fsprogs 1.44.5
>
> On 01/03/19 09:05, Ingo Franzki wrote:
> > Hmm, maybe the size of the volume plays a role, as Bernd has pointed
> > out. ext4 may use -b 4K by default on larger devices.
> > Once the FS uses 4K blocks anyway you won't see the problem.
> >
> > Use tune2fs -l <device> after you created the file system and check if
> > it is using 4K blocks on your 512/512 device. If so, then you won't see
> > the problem when moved to a 4K block size device.
>
> I confirm that tune2fs reports a 4096 block size for the 1 GB ext4
> filesystem I've used.
> I've also verified what Bernd said: mkfs.ext4 still uses a 4096 block
> size for a +512M partition, but uses 1024 for a +500M one.
>
> As suggested by Stuart, I also made a test using a 4k loop device and
> pvmoving the LV onto it. As you expected, no data corruption.
> To do it I've recreated the same setup as yesterday:
> /dev/mapper/vgtest-lvol0 on /dev/sdb4, a 512/512 disk, with some data on
> it. Then:
> # fallocate -l 10G testdisk.img
> # losetup -f -L -P -b 4096 testdisk.img
> # pvcreate /dev/loop0
> # vgextend vgtest /dev/loop0
> # pvmove /dev/sdb4 /dev/loop0
> # fsck.ext4 -f /dev/mapper/vgtest-lvol0
>
> While I was there, out of curiosity, I've created an ext4 filesystem on
> a <500MB LV (block size = 1024) and tried pvmoving data from the 512/512
> disk to the 512/4096 one, then to the 4096/4096 loop device.
> New partitions and a new VG were used for that.
>
> The setup:
> /dev/sdb5: 512/512
> /dev/sdc2: 512/4096
> /dev/loop0: 4096/4096
>
> # blockdev -v --getss --getpbsz --getbsz /dev/sdb
> get logical block (sector) size: 512
> get physical block (sector) size: 512
> get blocksize: 4096
>
> # blockdev -v --getss --getpbsz --getbsz /dev/sdc
> get logical block (sector) size: 512
> get physical block (sector) size: 4096
> get blocksize: 4096
>
> # blockdev -v --getss --getpbsz --getbsz /dev/loop0
> get logical block (sector) size: 4096
> get physical block (sector) size: 4096
> get blocksize: 4096
>
> # pvcreate /dev/sdb5
> # vgcreate vgtest2 /dev/sdb5
> # lvcreate -L 400M vgtest2 /dev/sdb5
> # mkfs.ext4 /dev/mapper/vgtest2-lvol0
>
> # tune2fs -l /dev/mapper/vgtest2-lvol0
> [...]
> Block size:               1024
> [...]
>
> # mount /dev/mapper/vgtest2-lvol0 /media/test
> # cp -a SOMEDATA /media/test/
> # umount /media/test
> # fsck.ext4 -f /dev/mapper/vgtest2-lvol0
>
> Now I've moved data from the 512/512 to the 512/4096 disk:
> # pvcreate /dev/sdc2
> # vgextend vgtest2 /dev/sdc2
> # pvmove /dev/sdb5 /dev/sdc2
> # fsck.ext4 -f /dev/mapper/vgtest2-lvol0
>
> No error reported.
>

Did you try to mount the LV after the pvmove?
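
For example, with the device and mount point from your setup above
(something along these lines, adjust the names as needed):

# mount /dev/mapper/vgtest2-lvol0 /media/test

I think fsck.ext4 reads the device with buffered I/O from user space, so
it may pass even in cases where the kernel refuses to mount because the
filesystem block size is smaller than the device's logical block size.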


> Now I've moved data to the 4096/4096 loop device:
> # pvcreate /dev/loop0
> # vgextend vgtest2 /dev/loop0
> # pvmove /dev/sdc2 /dev/loop0
> # fsck.ext4 -f /dev/mapper/vgtest2-lvol0
>
> Still no data corruption.
>

I can reproduce this without moving data, just by extending the VG with a
4k device and then extending the LV to use both devices.
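
I think this happens because device-mapper exposes the largest logical
block size of the underlying PVs once the LV maps onto both of them, so
the LV suddenly presents 4096-byte sectors to a filesystem created for
512-byte ones. This should be visible with blockdev before and after the
lvextend below (I did not capture this output, so take the values as
expected rather than measured):

# blockdev --getss /dev/test/lv1
512
... lvextend onto the 4k PV ...
# blockdev --getss /dev/test/lv1
4096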

Here is what I tested:

# truncate -s 500m disk1
# truncate -s 500m disk2
# losetup -f disk1 --sector-size 512 --show
/dev/loop2
# losetup -f disk2 --sector-size 4096 --show
/dev/loop3

# pvcreate /dev/loop2
  Physical volume "/dev/loop2" successfully created.
# pvcreate /dev/loop3
  Physical volume "/dev/loop3" successfully created.
# vgcreate test /dev/loop2
  Volume group "test" successfully created
# lvcreate -L400m -n lv1 test
  Logical volume "lv1" created.

# mkfs.xfs /dev/test/lv1
meta-data=/dev/test/lv1          isize=512    agcount=4, agsize=25600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
data     =                       bsize=4096   blocks=102400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=855, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# mkdir /tmp/mnt
# mount /dev/test/lv1 /tmp/mnt
# vgextend test /dev/loop3
  Volume group "test" successfully extended
# lvextend -L+400m test/lv1
  Size of logical volume test/lv1 changed from 400.00 MiB (100 extents) to 800.00 MiB (200 extents).
  Logical volume test/lv1 successfully resized.
# umount /tmp/mnt

# mount /dev/test/lv1 /tmp/mnt
mount: /tmp/mnt: mount(2) system call failed: Function not implemented.

From journalctl:
Mar 02 21:52:53 lean.local kernel: XFS (dm-7): Unmounting Filesystem
Mar 02 21:53:01 lean.local kernel: XFS (dm-7): device supports 4096 byte sectors (not 512)
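
The sectsz=512 in the mkfs.xfs output above is the on-disk sector size the
filesystem was created with, while the LV now reports a larger logical
block size. A quick comparison (a sketch, assuming xfs_db can still read
the superblock through buffered I/O on this device):

# xfs_db -r -c 'sb 0' -c 'p sectsize' /dev/test/lv1
sectsize = 512
# blockdev --getss /dev/test/lv1
4096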


I also tried the same with ext4:

(same disks/vg/lv setup as above)

# mkfs.ext4 /dev/test/lv1
mke2fs 1.44.2 (14-May-2018)
Discarding device blocks: done
Creating filesystem with 409600 1k blocks and 102400 inodes
Filesystem UUID: 9283880e-ee89-4d79-9c29-41f4af98f894
Superblock backups stored on blocks:
8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

# vgextend test /dev/loop3
  Volume group "test" successfully extended
# lvextend -L+400 test/lv1
  Size of logical volume test/lv1 changed from 400.00 MiB (100 extents) to 800.00 MiB (200 extents).
  Logical volume test/lv1 successfully resized.

# mount /dev/test/lv1 /tmp/mnt
mount: /tmp/mnt: wrong fs type, bad option, bad superblock on /dev/mapper/test-lv1, missing codepage or helper program, or other error.

From journalctl:
Mar 02 22:06:09 lean.local kernel: EXT4-fs (dm-7): bad block size 1024
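
Same story as with xfs: the ext4 superblock still records the 1k block
size it was created with, which is now smaller than the logical block size
the LV reports. Roughly (a sketch, not captured output):

# tune2fs -l /dev/test/lv1 | grep 'Block size'
Block size:               1024
# blockdev --getss /dev/test/lv1
4096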


Now same with pvmove:

(same setup as above, using xfs)

# mount /dev/test/lv1 /tmp/mnt
# dd if=/dev/urandom bs=8M count=1 of=/tmp/mnt/data
# vgextend test /dev/loop3
  Physical volume "/dev/loop3" successfully created.
  Volume group "test" successfully extended

# pvmove -v /dev/loop2 /dev/loop3
    Cluster mirror log daemon is not running.
    Wiping internal VG cache
    Wiping cache of LVM-capable devices
    Archiving volume group "test" metadata (seqno 3).
    Creating logical volume pvmove0
    Moving 100 extents of logical volume test/lv1.
    activation/volume_list configuration setting not defined: Checking only host tags for test/lv1.
    Creating test-pvmove0
    Loading table for test-pvmove0 (253:8).
    Loading table for test-lv1 (253:7).
    Suspending test-lv1 (253:7) with device flush
    Resuming test-pvmove0 (253:8).
    Resuming test-lv1 (253:7).
    Creating volume group backup "/etc/lvm/backup/test" (seqno 4).
    activation/volume_list configuration setting not defined: Checking only host tags for test/pvmove0.
    Checking progress before waiting every 15 seconds.
  /dev/loop2: Moved: 15.00%
  /dev/loop2: Moved: 100.00%
    Polling finished successfully.
# umount /tmp/mnt
# mount /dev/test/lv1 /tmp/mnt
mount: /tmp/mnt: mount(2) system call failed: Function not implemented.

From journalctl:
Mar 02 22:20:36 lean.local kernel: XFS (dm-7): device supports 4096 byte sectors (not 512)
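
If the filesystem may ever end up on 4k-sector storage, creating it with a
4096-byte sector/block size up front should avoid this, since such a
filesystem should work on both 512 and 4096 logical block size devices (a
sketch, not something I tested here):

# mkfs.xfs -s size=4096 /dev/test/lv1
# mkfs.ext4 -b 4096 /dev/test/lv1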


Tested on Fedora 28 with:
kernel-4.20.5-100.fc28.x86_64
lvm2-2.02.177-5.fc28.x86_64

Nir

