Hi,

On 02/05/2018 03:42 PM, Liwei wrote:
> Hi Eric,
>     Thanks for answering! Here are the details:
>
> # lvm version
>   LVM version:     2.02.176(2) (2017-11-03)
>   Library version: 1.02.145 (2017-11-03)
>   Driver version:  4.37.0
>   Configuration:   ./configure --build=x86_64-linux-gnu --prefix=/usr
> --includedir=${prefix}/include --mandir=${prefix}/share/man
> --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var
> --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu
> --libexecdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run
> --disable-maintainer-mode --disable-dependency-tracking --exec-prefix=
> --bindir=/bin --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin
> --with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2
> --with-cache=internal --with-clvmd=corosync --with-cluster=internal
> --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660
> --with-default-pid-dir=/run --with-default-run-dir=/run/lvm
> --with-default-locking-dir=/run/lock/lvm --with-thin=internal
> --with-thin-check=/usr/sbin/thin_check
> --with-thin-dump=/usr/sbin/thin_dump
> --with-thin-repair=/usr/sbin/thin_repair --enable-applib
> --enable-blkid_wiping --enable-cmdlib --enable-cmirrord
> --enable-dmeventd --enable-dbus-service --enable-lvmetad
> --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld
> --enable-notify-dbus --enable-pkgconfig --enable-readline
> --enable-udev_rules --enable-udev_sync
>
> # uname -a
> Linux dataserv 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14)
> x86_64 GNU/Linux

Sorry, I'm not sure whether this is the root cause of your issue without
testing it myself. If you'd like to give it a try, you can revert
cd15fb64ee56192760ad5c1e2ad97a65e735b18b (Revert "dm mirror: use all
available legs on multiple failures") and try my patch at
https://patchwork.kernel.org/patch/9808897/

The "reverting" fix for the crash issue is already in the 4.14.0 kernel:

"""
╭─eric@ws ~/workspace/linux  ‹master›
╰─$ git log --grep "Revert \"dm mirror: use all available legs on multiple failures\""
commit cd15fb64ee56192760ad5c1e2ad97a65e735b18b
Author: Mike Snitzer
Date:   Thu Jun 15 08:39:15 2017 -0400

    Revert "dm mirror: use all available legs on multiple failures"

    This reverts commit 12a7cf5ba6c776a2621d8972c7d42e8d3d959d20.

╭─eric@ws ~/workspace/linux  ‹master›
╰─$ git describe cd15fb64ee56192760ad5c1e2ad97a65e735b18b
v4.12-rc5-2-gcd15fb64ee56
"""

Eric

> Warm regards,
> Liwei
>
> On 5 Feb 2018 15:27, "Eric Ren" wrote:
>
> Hi,
>
> Your LVM version and kernel version please?
>
> like:
> """
> # lvm version
>   LVM version:     2.02.177(2) (2017-12-18)
>   Library version: 1.03.01 (2017-12-18)
>   Driver version:  4.35.0
>
> # uname -a
> Linux sle15-c1-n1 4.12.14-9.1-default #1 SMP Fri Jan 19 09:13:51
> UTC 2018 (849a2fe) x86_64 x86_64 x86_64 GNU/Linux
> """
>
> Eric
>
> On 02/03/2018 05:43 PM, Liwei wrote:
>
> Hi list,
>      I had a LV that I was converting from linear to mirrored (not
> raid1) whose source device failed partway through the initial sync.
>
>      I've since recovered the source device, but it seems like the
> mirror is still acting as if some blocks are not readable? I'm getting
> this in my logs, and the FS is full of errors:
>
> [  +1.613126] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.000278] device-mapper: raid1: Primary mirror (253:25) failed while out-of-sync: Reads may fail.
> [  +0.085916] device-mapper: raid1: Mirror read failed.
> [  +0.196562] device-mapper: raid1: Mirror read failed.
> [  +0.000237] Buffer I/O error on dev dm-27, logical block 5371800560, async page read
> [  +0.592135] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.082882] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.246945] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.107374] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.083344] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.114949] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.085056] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.203929] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.157953] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +3.065247] recovery_complete: 23 callbacks suppressed
> [  +0.000001] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.128064] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.103100] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.107827] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.140871] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.132844] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.124698] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.138502] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.117827] device-mapper: raid1: Unable to read primary mirror during recovery
> [  +0.125705] device-mapper: raid1: Unable to read primary mirror during recovery
> [Feb 3 17:09] device-mapper: raid1: Mirror read failed.
> [  +0.167553] device-mapper: raid1: Mirror read failed.
> [  +0.000268] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
> [  +0.135138] device-mapper: raid1: Mirror read failed.
> [  +0.000238] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
> [  +0.000365] device-mapper: raid1: Mirror read failed.
> [  +0.000315] device-mapper: raid1: Mirror read failed.
> [  +0.000213] Buffer I/O error on dev dm-27, logical block 5367896888, async page read
> [  +0.000276] device-mapper: raid1: Mirror read failed.
> [  +0.000199] Buffer I/O error on dev dm-27, logical block 5367765816, async page read
>
>      However, if I take down the destination device and restart the LV
> with the --activate option partial, I can read my data and everything
> checks out.
>
>      My theory (and what I observed) is that lvm continued the initial
> sync even after the source drive stopped responding, and has now mapped
> the blocks that it 'synced' as dead. How can I make lvm retry those
> blocks again?
>
>      In fact, I don't trust the mirror anymore. Is there a way I can
> conduct a scrub of the mirror after the initial sync is done? I read
> about --syncaction check, but it seems like it only notes the number of
> inconsistencies. Can I have lvm re-mirror the inconsistencies from the
> source to the destination device? I trust the source device because we
> ran a btrfs scrub on it and it reported that all checksums are valid.
>
>      It took months for the mirror sync to get to this stage (actually,
> why does it take months to mirror 20TB?), and I don't want to start it
> all over again.
>
> Warm regards,
> Liwei
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
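
[Editor's note] For anyone wanting to follow Eric's kernel-side suggestion above, the revert-plus-patch step might look roughly like the following in a kernel source tree. This is only a sketch: the working directory and the patchwork mbox download path are assumptions, not details taken from the thread.

# sketch only; paths and the /mbox/ URL form are assumed
cd ~/workspace/linux

# undo the "reverting" fix that is already present in 4.14 (revert the revert)
git revert cd15fb64ee56192760ad5c1e2ad97a65e735b18b

# fetch and apply the proposed fix from the patchwork link in the thread
curl -sL https://patchwork.kernel.org/patch/9808897/mbox/ | git am

# then rebuild and boot the patched kernel before retrying the mirror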
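[Editor's note] On the scrubbing question in Liwei's message: a minimal sketch of the relevant LVM commands, assuming a hypothetical vg0/mirrorlv and that the LV uses the old "mirror" segment type as stated in the thread. --syncaction scrubbing applies only to raid-type LVs, so for a plain mirror the closest option is forcing a full resynchronization.

# force the mirror to re-copy all blocks from the primary leg
# (the LV generally needs to be inactive for --resync to proceed)
lvchange -an vg0/mirrorlv
lvchange --resync vg0/mirrorlv
lvchange -ay vg0/mirrorlv

# for raid1-type LVs, a scrub can be requested and its results inspected instead
lvchange --syncaction check vg0/mirrorlv
lvs -o lv_name,raid_sync_action,raid_mismatch_count vg0/mirrorlv
lvchange --syncaction repair vg0/mirrorlv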