linux-lvm.redhat.com archive mirror
* [linux-lvm] Unsync-ed LVM Mirror
@ 2018-02-03  9:43 Liwei
  2018-02-05  3:21 ` Liwei
  2018-02-05  7:27 ` Eric Ren
  0 siblings, 2 replies; 7+ messages in thread
From: Liwei @ 2018-02-03  9:43 UTC (permalink / raw)
  To: linux-lvm

Hi list,
    I had an LV that I was converting from linear to mirrored (not
raid1) whose source device failed partway through the initial
sync.

    I've since recovered the source device, but it seems like the
mirror is still acting as if some blocks are not readable? I'm getting
this in my logs, and the FS is full of errors:

[  +1.613126] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.000278] device-mapper: raid1: Primary mirror (253:25) failed
while out-of-sync: Reads may fail.
[  +0.085916] device-mapper: raid1: Mirror read failed.
[  +0.196562] device-mapper: raid1: Mirror read failed.
[  +0.000237] Buffer I/O error on dev dm-27, logical block 5371800560,
async page read
[  +0.592135] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.082882] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.246945] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.107374] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.083344] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.114949] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.085056] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.203929] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.157953] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +3.065247] recovery_complete: 23 callbacks suppressed
[  +0.000001] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.128064] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.103100] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.107827] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.140871] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.132844] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.124698] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.138502] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.117827] device-mapper: raid1: Unable to read primary mirror
during recovery
[  +0.125705] device-mapper: raid1: Unable to read primary mirror
during recovery
[Feb 3 17:09] device-mapper: raid1: Mirror read failed.
[  +0.167553] device-mapper: raid1: Mirror read failed.
[  +0.000268] Buffer I/O error on dev dm-27, logical block 5367765816,
async page read
[  +0.135138] device-mapper: raid1: Mirror read failed.
[  +0.000238] Buffer I/O error on dev dm-27, logical block 5367765816,
async page read
[  +0.000365] device-mapper: raid1: Mirror read failed.
[  +0.000315] device-mapper: raid1: Mirror read failed.
[  +0.000213] Buffer I/O error on dev dm-27, logical block 5367896888,
async page read
[  +0.000276] device-mapper: raid1: Mirror read failed.
[  +0.000199] Buffer I/O error on dev dm-27, logical block 5367765816,
async page read
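
To get a feel for where these failures sit on the device, the logical
block numbers in the "Buffer I/O error" lines can be turned into byte
offsets. This is only a minimal sketch: the 4096-byte block size is an
assumption (the log does not state it), and block_to_bytes is a
hypothetical helper name.

```shell
#!/bin/sh
# Convert a logical block number from a "Buffer I/O error" line into a
# byte offset. The block size is an assumption; 4096 is only a default.
block_to_bytes() {
    block=$1
    block_size=${2:-4096}
    echo $(( block * block_size ))
}

# First failing block reported in the log above.
block_to_bytes 5371800560 4096
```

Under that assumption, the first failing block lies about 22 TB (roughly
20 TiB) into dm-27, which would be close to the end of the device if the
LV is around 20 TiB in size.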

    However, if I take down the destination device and restart the LV
with --activationmode partial, I can read my data and everything
checks out.
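
The partial activation described above could look roughly like the
following. This is a sketch under assumptions: vg0/data is a hypothetical
VG/LV name, the option meant in the message is assumed to be lvchange's
--activationmode partial, and the helper only prints the commands rather
than running them, since they need root and a real volume group.

```shell
#!/bin/sh
# Print (not run) the commands for deactivating an LV and reactivating it
# in partial mode. The VG/LV name passed in is a placeholder.
partial_activation_cmds() {
    vg_lv=$1
    echo "lvchange -an $vg_lv"
    echo "lvchange -ay --activationmode partial $vg_lv"
}

partial_activation_cmds vg0/data
```

Printing the commands first makes it possible to review them before
touching an LV that is already in a fragile state.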

    My theory (and what I observed) is that lvm continued the initial
sync even after the source drive stopped responding, and has now
mapped the blocks that it 'synced' as dead. How can I make lvm retry
those blocks again?

    In fact, I don't trust the mirror anymore. Is there a way I can
scrub the mirror after the initial sync is done? I read
about --syncaction check, but it seems to only report the number of
inconsistencies. Can I have lvm re-mirror the inconsistencies from the
source to the destination device? I trust the source device because we ran
a btrfs scrub on it and it reported that all checksums are valid.

    It took months for the mirror sync to get to this stage (actually,
why does it take months to mirror 20TB?), and I don't want to start
all over again.
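
As a rough sanity check on "months for 20TB", the implied average copy
rate can be worked out with integer arithmetic. A sketch only: the
60-day duration is an assumption (the message just says "months"), 20TB
is taken as 20 * 10^12 bytes, and avg_kib_per_s is a hypothetical helper.

```shell
#!/bin/sh
# Average throughput in KiB/s implied by copying `bytes` in `days` days.
# Integer division is used, so the result is rounded down.
avg_kib_per_s() {
    bytes=$1
    days=$2
    echo $(( bytes / (days * 86400) / 1024 ))
}

# 20 TB (decimal) over an assumed 60 days.
avg_kib_per_s 20000000000000 60
```

With those assumed numbers the average comes out below 4 MiB/s, which
hints that something other than raw disk bandwidth was limiting the sync.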

Warm regards,
Liwei


* Re: [linux-lvm] Unsync-ed LVM Mirror
  2018-02-03  9:43 [linux-lvm] Unsync-ed LVM Mirror Liwei
@ 2018-02-05  3:21 ` Liwei
  2018-02-05  7:27 ` Eric Ren
  1 sibling, 0 replies; 7+ messages in thread
From: Liwei @ 2018-02-05  3:21 UTC (permalink / raw)
  To: linux-lvm

Okay, the sync managed to reach 99.99%, but now there's no drive
activity; it is just stuck there. What should I do? Theoretically, if
I can take a look at the contents of mlog and manipulate it, I can
manually sync the failed segments and clear lvm's opinion of
them being missing.

I'm looking through the lvm2 source for the format but if someone can
point out the way (or a better way), I'll be very appreciative!

Also, is there a way I can access the mlog and mimage* subvolumes directly?
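
One way to look at the hidden sub-LVs and at the kernel's view of the
mirror is `lvs -a -o name,copy_percent,devices` together with
`dmsetup status`. The sketch below parses a single mirror status line to
show how far the sync got. The sample line is made up for illustration
(it is not from this thread), mirror_sync_summary is a hypothetical
helper, and the field layout is assumed to follow the dm mirror target's
status format: leg count, device list, in-sync/total regions, then one
health character per leg.

```shell
#!/bin/sh
# Parse a `dmsetup status` line for a mirror target and print sync progress.
# Assumed layout:
#   <start> <length> mirror <n> <dev1>..<devn> <insync>/<total> 1 <health> ...
mirror_sync_summary() {
    # Split the line into positional parameters.
    set -- $1
    [ "$3" = "mirror" ] || { echo "not a mirror target" >&2; return 1; }
    n=$4
    # Skip start, length, target name, leg count and the n device names.
    shift $(( 4 + n ))
    regions=$1      # e.g. 40959/40960
    health=$3       # one character per leg, e.g. AA
    insync=${regions%/*}
    total=${regions#*/}
    echo "regions=$insync/$total remaining=$(( total - insync )) health=$health"
}

# Hypothetical sample line, not taken from the thread.
mirror_sync_summary "0 41943040 mirror 2 253:25 253:26 40959/40960 1 AA 3 disk 253:24 A"
```

A small number of regions that never leave the "remaining" count would
match a sync that sits at 99.99% with no drive activity.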

Warm regards,
Liwei


* Re: [linux-lvm] Unsync-ed LVM Mirror
  2018-02-03  9:43 [linux-lvm] Unsync-ed LVM Mirror Liwei
  2018-02-05  3:21 ` Liwei
@ 2018-02-05  7:27 ` Eric Ren
  2018-02-05  7:42   ` Liwei
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Ren @ 2018-02-05  7:27 UTC (permalink / raw)
  To: LVM general discussion and development, Liwei

Hi,

Your LVM version and kernel version please?

like:
"""
# lvm version
   LVM version:     2.02.177(2) (2017-12-18)
   Library version: 1.03.01 (2017-12-18)
   Driver version:  4.35.0

# uname -a
Linux sle15-c1-n1 4.12.14-9.1-default #1 SMP Fri Jan 19 09:13:51 UTC 
2018 (849a2fe) x86_64 x86_64 x86_64 GNU/Linux
"""

Eric


* Re: [linux-lvm] Unsync-ed LVM Mirror
  2018-02-05  7:27 ` Eric Ren
@ 2018-02-05  7:42   ` Liwei
  2018-02-05  8:43     ` Eric Ren
  2018-02-05 10:07     ` Eric Ren
  0 siblings, 2 replies; 7+ messages in thread
From: Liwei @ 2018-02-05  7:42 UTC (permalink / raw)
  To: Eric Ren; +Cc: LVM general discussion and development

Hi Eric,
    Thanks for answering! Here are the details:

# lvm version
  LVM version:     2.02.176(2) (2017-11-03)
  Library version: 1.02.145 (2017-11-03)
  Driver version:  4.37.0
  Configuration:   ./configure --build=x86_64-linux-gnu --prefix=/usr
--includedir=${prefix}/include --mandir=${prefix}/share/man
--infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var
--disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu
--libexecdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run
--disable-maintainer-mode --disable-dependency-tracking --exec-prefix=
--bindir=/bin --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin
--with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2
--with-cache=internal --with-clvmd=corosync --with-cluster=internal
--with-device-uid=0 --with-device-gid=6 --with-device-mode=0660
--with-default-pid-dir=/run --with-default-run-dir=/run/lvm
--with-default-locking-dir=/run/lock/lvm --with-thin=internal
--with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump
--with-thin-repair=/usr/sbin/thin_repair --enable-applib
--enable-blkid_wiping --enable-cmdlib --enable-cmirrord --enable-dmeventd
--enable-dbus-service --enable-lvmetad --enable-lvmlockd-dlm
--enable-lvmlockd-sanlock --enable-lvmpolld --enable-notify-dbus
--enable-pkgconfig --enable-readline --enable-udev_rules --enable-udev_sync

# uname -a
Linux dataserv 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64
GNU/Linux

Warm regards,
Liwei


* Re: [linux-lvm] Unsync-ed LVM Mirror
  2018-02-05  7:42   ` Liwei
@ 2018-02-05  8:43     ` Eric Ren
  2018-02-05  9:26       ` Liwei
  2018-02-05 10:07     ` Eric Ren
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Ren @ 2018-02-05  8:43 UTC (permalink / raw)
  To: Liwei; +Cc: LVM general discussion and development

Months ago, I worked on a NULL pointer dereference crash in the dm mirror
target. I worked out two patches
to fix the crash, but when I was submitting them, I found that
upstream had "fixed" the crash by
reverting. You can find the discussion here:

    - https://patchwork.kernel.org/patch/9808897/


Zdenek did voice his doubts, but nobody responded:
"""

>> Which kernel version is this ?
>>
>> I'd thought we've already fixed this BZ for old mirrors:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1382382
>>
>> There similar BZ for md-raid based mirrors (--type raid1)
>> https://bugzilla.redhat.com/show_bug.cgi?id=1416099
> My base kernel version is 4.4.68, but with this 2 latest fixes applied:
> 
> """
> Revert "dm mirror: use all available legs on multiple failures"

Ohh  - I've -rc6 - while this  'revert' patch went to 4.12-rc7.

I'm now starting to wonder why?

It's been a real fix for a real issue - and 'revert' message states
there is no such problem ??

I'm confused....

Mike  - have you tried the sequence from BZ  ?

Zdenek

"""

I wrongly accepted the following:

1. the crash does disappear;
2. the "revert" approach to fixing it is likely wrong, but I did not follow
up on it further because people now mainly use raid1 instead of mirror.
It was my mistake to think that way.

Also, I felt it would be hard to persuade the maintainer to revert
the "reverting fix" and try my fix.

Anyway, why are you using mirror? Why not raid1?

Eric



* Re: [linux-lvm] Unsync-ed LVM Mirror
  2018-02-05  8:43     ` Eric Ren
@ 2018-02-05  9:26       ` Liwei
  0 siblings, 0 replies; 7+ messages in thread
From: Liwei @ 2018-02-05  9:26 UTC (permalink / raw)
  To: Eric Ren; +Cc: LVM general discussion and development

This is a long story. I hit precisely that bug (?) because one half
of the mirror (the source) encountered bad sectors. So I was forced
to upgrade the kernel in order to salvage the situation (the bug was
so persistent that I could not get to a shell even in single-user mode;
I had to boot a version of Ubuntu with the patch applied, chroot, and
install the update).

I'm not versed enough with the inner workings of LVM to understand the
differences with what was accepted upstream and the 2 patches you
worked on (what is a bio?), so it didn't occur to me that I'd meet
with problems down the road.

I was using mirror because I was following this guide (which, in
retrospect, was seriously outdated),
https://utcc.utoronto.ca/~cks/space/blog/linux/LVMCautiousMigration ,
and some reading made me think that it was not possible to cleanly
undo a mirror created with raid1.

A little background: We were trying to (safely?) migrate our VG from
an antiquated 13x2TB raid6 array to a shiny new 6x6TB array while
minimizing downtime. So the plan was to turn on mirroring, wait for it
to sync, then split the mirror. Of course as with all poorly-planned
plans go, 2 of the source drives dropped out and a third drive
developed bad sectors, all within a few days. And here I am with this
problem.

Anyways, I'm still not sure what's happening, but is there an easy (in
terms of, do-not-need-to-resync-everything) way to proceed?

Warm regards,
Liwei
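
For a sense of scale on the resync concern above: the original post says
the sync of roughly 20TB took "months". The exact duration is not given
anywhere in the thread, so the 60 and 90 days below are assumed
placeholders, and the helper name average_rate_mb_per_s is made up for
this sketch. It only converts a size and a duration into an average
throughput.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_mb_per_s(size_tb: float, days: float) -> float:
    """Average throughput in MB/s (decimal units) for copying size_tb in days."""
    size_bytes = size_tb * 1e12
    return size_bytes / (days * SECONDS_PER_DAY) / 1e6


# 20 TB is the size mentioned in the thread; the durations are assumptions.
for assumed_days in (60, 90):
    rate = average_rate_mb_per_s(20, assumed_days)
    print(f"{assumed_days} days -> {rate:.2f} MB/s")
```

Under either assumption the average is a few MB/s, which is low for
sequential copying between arrays. That would fit a sync slowed by the
read errors on the source rather than normal mirror speed, but this is
an inference and not something the thread confirms.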




On 5 February 2018 at 16:43, Eric Ren <zren@suse.com> wrote:
> Months ago, I worked on a NULL pointer dereference crash in the dm mirror
> target. I worked out two patches to fix the crash, but when I was
> submitting them, I found that upstream had "fixed" the crash by
> reverting; you can find the discussion here:
>
>    - https://patchwork.kernel.org/patch/9808897/
>
>
> Zdenek did raise his doubts, but nobody responded:
> """
>
>>> Which kernel version is this ?
>>>
>>> I'd thought we've already fixed this BZ for old mirrors:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1382382
>>>
>>> There's a similar BZ for md-raid based mirrors (--type raid1)
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1416099
>> My base kernel version is 4.4.68, but with these 2 latest fixes applied:
>>
>> """
>> Revert "dm mirror: use all available legs on multiple failures"
>
> Ohh  - I've -rc6 - while this  'revert' patch went to 4.12-rc7.
>
> I'm now starting to wonder why?
>
> It's been a real fix for a real issue - and 'revert' message states
> there is no such problem ??
>
> I'm confused....
>
> Mike  - have you tried the sequence from BZ  ?
>
> Zdenek
>
> """
>
> I wrongly accepted the facts:
>
> 1. the crash issue does disappear;
> 2. the "reverting" way of fixing it is likely wrong, but I did not follow
> it up further because people now mainly use raid1 instead of mirror - my
> fault for thinking that way.
>
> But I just felt it would be hard to persuade the maintainer to revert the
> "reverting fixes" and try my fix.
>
> Anyway, why are you using mirror? Why not raid1?
>
> Eric

* Re: [linux-lvm] Unsync-ed LVM Mirror
  2018-02-05  7:42   ` Liwei
  2018-02-05  8:43     ` Eric Ren
@ 2018-02-05 10:07     ` Eric Ren
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Ren @ 2018-02-05 10:07 UTC (permalink / raw)
  To: Liwei; +Cc: LVM general discussion and development

Hi,

On 02/05/2018 03:42 PM, Liwei wrote:
> Hi Eric,
>     Thanks for answering! Here are the details:
>
> # lvm version
>   LVM version:     2.02.176(2) (2017-11-03)
>   Library version: 1.02.145 (2017-11-03)
>   Driver version:  4.37.0
>   Configuration:   ./configure --build=x86_64-linux-gnu --prefix=/usr 
> --includedir=${prefix}/include --mandir=${prefix}/share/man 
> --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var 
> --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu 
> --libexecdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run 
> --disable-maintainer-mode --disable-dependency-tracking --exec-prefix= 
> --bindir=/bin --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin 
> --with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2 
> --with-cache=internal --with-clvmd=corosync --with-cluster=internal 
> --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 
> --with-default-pid-dir=/run --with-default-run-dir=/run/lvm 
> --with-default-locking-dir=/run/lock/lvm --with-thin=internal 
> --with-thin-check=/usr/sbin/thin_check 
> --with-thin-dump=/usr/sbin/thin_dump 
> --with-thin-repair=/usr/sbin/thin_repair --enable-applib 
> --enable-blkid_wiping --enable-cmdlib --enable-cmirrord 
> --enable-dmeventd --enable-dbus-service --enable-lvmetad 
> --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld 
> --enable-notify-dbus --enable-pkgconfig --enable-readline 
> --enable-udev_rules --enable-udev_sync
>
> # uname -a
> Linux dataserv 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) 
> x86_64 GNU/Linux

Sorry, I can't be sure this is the root cause of your issue without
testing it myself. If you are interested in giving it a try, you can revert

cd15fb64ee56192760ad5c1e2ad97a65e735b18b
(Revert "dm mirror: use all available legs on multiple failures")

and try my patch at https://patchwork.kernel.org/patch/9808897/


The "reverting" fix for the crash issue is already in the 4.14.0 kernel.
"""
╭─eric@ws ~/workspace/linux  ‹master›
╰─$ git log --grep "Revert \"dm mirror: use all available legs on multiple failures\""

commit cd15fb64ee56192760ad5c1e2ad97a65e735b18b
Author: Mike Snitzer <snitzer@redhat.com>
Date:   Thu Jun 15 08:39:15 2017 -0400

     Revert "dm mirror: use all available legs on multiple failures"

     This reverts commit 12a7cf5ba6c776a2621d8972c7d42e8d3d959d20.

╭─eric@ws ~/workspace/linux  ‹master›
╰─$ git describe cd15fb64ee56192760ad5c1e2ad97a65e735b18b
v4.12-rc5-2-gcd15fb64ee56
"""

Eric
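
Reading the git describe output above: its usual form is
<tag>-<commits since tag>-g<abbreviated hash>, so
v4.12-rc5-2-gcd15fb64ee56 places the revert two commits on top of the
v4.12-rc5 tag. Since it is on master, later releases such as 4.14 would
carry it, which matches the statement above. A minimal Python sketch of
that decoding follows; the helper name parse_git_describe is made up for
illustration and is not part of git.

```python
import re
from typing import NamedTuple


class Described(NamedTuple):
    tag: str            # nearest tag reachable from the commit
    commits_after: int  # number of commits on top of that tag
    short_hash: str     # abbreviated commit hash (without the "g" prefix)


def parse_git_describe(text: str) -> Described:
    """Split '<tag>-<n>-g<hash>' as printed by `git describe`.

    Hypothetical helper for illustration only. It does not handle the
    bare-tag form that git prints when the commit is itself tagged.
    """
    match = re.fullmatch(r"(.+)-(\d+)-g([0-9a-f]+)", text.strip())
    if match is None:
        raise ValueError(f"not in <tag>-<n>-g<hash> form: {text!r}")
    tag, count, short_hash = match.groups()
    return Described(tag, int(count), short_hash)


info = parse_git_describe("v4.12-rc5-2-gcd15fb64ee56")
print(info.tag, info.commits_after, info.short_hash)
```

The abbreviated hash should be a prefix of the full commit id quoted in
the git log output, which is a quick way to confirm both refer to the
same commit.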



Thread overview: 7+ messages
2018-02-03  9:43 [linux-lvm] Unsync-ed LVM Mirror Liwei
2018-02-05  3:21 ` Liwei
2018-02-05  7:27 ` Eric Ren
2018-02-05  7:42   ` Liwei
2018-02-05  8:43     ` Eric Ren
2018-02-05  9:26       ` Liwei
2018-02-05 10:07     ` Eric Ren
