* [linux-lvm] new multipath device mistakenly replaced another PV in existing volume group
@ 2017-04-19 21:53 Neutron Sharc
2017-04-20 16:04 ` David Teigland
0 siblings, 1 reply; 5+ messages in thread
From: Neutron Sharc @ 2017-04-19 21:53 UTC (permalink / raw)
To: linux-lvm
I'm seeing a strange problem (iscsi LUNs + multipath + lvm) that I
will walk through with an example.
I have an iscsi target machine that exposes many iscsi LUNs. The iscsi
initiator logs in to 4 iscsi LUNs (vol1_[0-3]), creates a multipath
device for each LUN (/dev/mapper/vol1_[0-3]), and combines the 4
multipath devices into a volume group and LV (vol1/vol1_lv).
Then I log in to another 4 iscsi LUNs (vol3_[0-3]) and create a multipath
device for each new LUN (/dev/mapper/vol3_[0-3]). Now there is a
strange thing:
some new multipath devices (/dev/mapper/vol3_0, /dev/mapper/vol3_2)
replaced existing PVs in vol1. As a result, these new multipath
devices have open-count > 0, so I cannot pvcreate on them:
$ sudo dmsetup ls --tree
vol1-vol1_lv (252:4)
 ├─vol3_0 (252:9) <== fresh multipath device, should NOT be in vol1
 │  └─ (65:128)
 ├─vol3_2 (252:10) <== fresh multipath device, should NOT be in vol1
 │  └─ (65:144)
 ├─vol1_1 (252:1)
 │  └─ (65:48)
 └─vol1_0 (252:0)
    └─ (65:32)
vol1_3 (252:3) <== was in vol1-vol1_lv, but was knocked out
 └─ (65:16)
vol1_2 (252:2) <== was in vol1-vol1_lv, but was knocked out
 └─ (65:64)
vol3_3 (252:11)
 └─ (65:160)
vol3_1 (252:12)
 └─ (65:176)
Please note that all these vol3_[0-3] are fresh, without any LVM
metadata on them, as shown by pvscan::
sudo pvscan --cache /dev/mapper/vol3_0
Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
$ sudo pvcreate /dev/mapper/vol3_0 <== this multipath device was mistakenly included into vol1
Found duplicate PV 9F6vU9NVBfEq1w3e04T5UreO6fDVPJNy: using
/dev/mapper/vol1_3 not /dev/mapper/vol3_0
Using duplicate PV /dev/mapper/vol1_3 without holders, replacing
/dev/mapper/vol3_0
Can't open /dev/mapper/vol3_0 exclusively. Mounted filesystem?
========== Configs I used:
BTW, I have enabled lvmetad, and my lvm.conf has this:
filter = [ "a|/dev/mapper/.*|", "r|.*|" ]
global_filter = [ "a|/dev/mapper/.*|", "r|.*|" ]
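For readers unfamiliar with these filter lines: each entry is an accept ("a")
or reject ("r") regex, the first entry that matches a device path decides, and
a path matching no entry is accepted. The sketch below only illustrates that
ordering with Python's re module. It is not LVM code: parse_filter and
device_accepted are hypothetical names, it looks at a single path per device
(LVM considers every name a device has), and Python regex syntax is assumed to
be close enough for these simple patterns.

```python
import re


def parse_filter(entries):
    """Turn lvm.conf-style filter entries into (accept, compiled_regex) pairs."""
    rules = []
    for entry in entries:
        action, delim = entry[0], entry[1]
        if action not in ("a", "r") or entry[-1] != delim:
            raise ValueError(f"malformed filter entry: {entry!r}")
        # The pattern sits between the two delimiter characters.
        rules.append((action == "a", re.compile(entry[2:-1])))
    return rules


def device_accepted(path, rules):
    """First matching rule decides; a path matching no rule is accepted."""
    for accept, regex in rules:
        if regex.search(path):
            return accept
    return True


# The filter from this post: accept everything under /dev/mapper, reject the rest.
rules = parse_filter(["a|/dev/mapper/.*|", "r|.*|"])
for dev in ("/dev/mapper/vol1_3", "/dev/mapper/vol3_0", "/dev/sdb"):
    print(dev, "accepted" if device_accepted(dev, rules) else "rejected")
```

With this filter every multipath device under /dev/mapper is scanned, including
the freshly created vol3_* devices, while the underlying sd* paths are skipped.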
My /etc/multipath.conf is:
defaults {
    user_friendly_names yes
    path_grouping_policy failover
    polling_interval 10
    path_selector "round-robin 0"
    find_multipaths yes
    features "1 queue_if_no_path"
}
blacklist {
    devnode "^sda[1-9]"
}
multipaths {
    multipath {
        wwid 360000000758757de9fb289cbde12abab
        alias vol1_0
    }
    # more devices
}
The iscsi initiator is on centos 6.5, with these package versions:
lvm2-2.02.143-12.el6.x86_64
device-mapper-multipath-0.4.9-100.el6.x86_64
iscsi target is tgtd on another Ubuntu machine.
Comments are appreciated.
-Shawn
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [linux-lvm] new multipath device mistakenly replaced another PV in existing volume group
2017-04-19 21:53 [linux-lvm] new multipath device mistakenly replaced another PV in existing volume group Neutron Sharc
@ 2017-04-20 16:04 ` David Teigland
2017-04-20 16:24 ` David Teigland
0 siblings, 1 reply; 5+ messages in thread
From: David Teigland @ 2017-04-20 16:04 UTC (permalink / raw)
To: Neutron Sharc; +Cc: linux-lvm
On Wed, Apr 19, 2017 at 02:53:19PM -0700, Neutron Sharc wrote:
> Please note that all these vol3_[0-3] are fresh, without any LVM
> metadata on them, as shown by pvscan::
> sudo pvscan --cache /dev/mapper/vol3_0
> Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
> Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
> Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
That error indicates there is LVM metadata on them.
> Found duplicate PV 9F6vU9NVBfEq1w3e04T5UreO6fDVPJNy: using
> /dev/mapper/vol1_3 not /dev/mapper/vol3_0
That error indicates it's the same LVM metadata on them.
Perhaps you're exporting the same data source on the back end via separate
devices, or have copied the data sources, or are using snapshots of them.
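One way to check this independently of the lvm tools is to read the PV label
straight off each device and compare the UUIDs. The sketch below is only an
illustration under assumptions: it relies on the LVM2 on-disk label layout as
I understand it (a "LABELONE" header in one of the first four 512-byte
sectors, type "LVM2 001", followed by a 32-character PV UUID at the offset
given in the header), it does not verify the label CRC, and read_pv_uuid,
find_duplicates and make_image are hypothetical helper names. The images here
are synthetic, built in memory.

```python
import struct

SECTOR = 512
LABEL_SCAN_SECTORS = 4  # the label is expected within the first four sectors


def read_pv_uuid(data):
    """Return the PV UUID from an LVM2 label found in `data`, or None."""
    for sector in range(LABEL_SCAN_SECTORS):
        base = sector * SECTOR
        header = data[base:base + 32]
        if len(header) < 32 or header[:8] != b"LABELONE":
            continue
        # Header fields after the id: sector number (u64), crc (u32), offset (u32).
        _sector_no, _crc, offset = struct.unpack_from("<QII", header, 8)
        if header[24:32] != b"LVM2 001":
            continue
        uuid = data[base + offset:base + offset + 32]
        if len(uuid) == 32:
            return uuid.decode("ascii", errors="replace")
    return None


def find_duplicates(images):
    """Map each PV UUID seen on more than one device to the device names."""
    by_uuid = {}
    for name, data in images.items():
        uuid = read_pv_uuid(data)
        if uuid is not None:
            by_uuid.setdefault(uuid, []).append(name)
    return {u: names for u, names in by_uuid.items() if len(names) > 1}


def make_image(uuid):
    """Build a synthetic 4-sector image with a label in sector 1 (CRC left 0)."""
    img = bytearray(SECTOR * LABEL_SCAN_SECTORS)
    label = (b"LABELONE" + struct.pack("<QII", 1, 0, 32)
             + b"LVM2 001" + uuid.encode("ascii"))
    img[SECTOR:SECTOR + len(label)] = label
    return bytes(img)


uuid = "9F6vU9NVBfEq1w3e04T5UreO6fDVPJNy"
devices = {
    "vol1_3": make_image(uuid),
    "vol3_0": make_image(uuid),   # same label bytes returned for a "fresh" LUN
    "vol3_1": bytes(SECTOR * LABEL_SCAN_SECTORS),  # truly blank device
}
print(find_duplicates(devices))
```

On a real system the first 2048 bytes of each /dev/mapper device could be read
(as root) and passed in instead of the synthetic images.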
Dave
* Re: [linux-lvm] new multipath device mistakenly replaced another PV in existing volume group
2017-04-20 16:04 ` David Teigland
@ 2017-04-20 16:24 ` David Teigland
2017-04-22 2:13 ` Neutron Sharc
0 siblings, 1 reply; 5+ messages in thread
From: David Teigland @ 2017-04-20 16:24 UTC (permalink / raw)
To: Neutron Sharc; +Cc: linux-lvm
On Thu, Apr 20, 2017 at 11:04:32AM -0500, David Teigland wrote:
> On Wed, Apr 19, 2017 at 02:53:19PM -0700, Neutron Sharc wrote:
> > Please note that all these vol3_[0-3] are fresh, without any LVM
> > metadata on them, as shown by pvscan::
> > sudo pvscan --cache /dev/mapper/vol3_0
> > Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
> > Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
> > Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
>
> That error indicates there is LVM metadata on them.
>
> > Found duplicate PV 9F6vU9NVBfEq1w3e04T5UreO6fDVPJNy: using
> > /dev/mapper/vol1_3 not /dev/mapper/vol3_0
>
> That error indicates it's the same LVM metadata on them.
>
> Perhaps you're exporting the same data source on the back end via separate
> devices, or have copied the data sources, or are using snapshots of them.
Also, if you upgrade to a new version of lvm, there is better checking and
reporting for these conditions.
* Re: [linux-lvm] new multipath device mistakenly replaced another PV in existing volume group
2017-04-20 16:24 ` David Teigland
@ 2017-04-22 2:13 ` Neutron Sharc
0 siblings, 0 replies; 5+ messages in thread
From: Neutron Sharc @ 2017-04-22 2:13 UTC (permalink / raw)
To: David Teigland; +Cc: linux-lvm
Thank you David for replying.
The problem turns out to be caused by a stale read buffer at the backend.
When lvm reads a fresh iscsi LUN (lun2) at the backend, the read misses, so
the read buffer isn't updated and still contains previous data, which may
be the metadata of another LUN (lun1). LVM receives this data, thinks lun2
has the same metadata as lun1, and gets confused.
I made a fix to zero out the read buffer on a read miss at the backend. Now
the problem is gone.
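To make the failure mode concrete, here is a toy model of it. This is not the
actual backend code: SparseBackend, read_into and zero_on_miss are
hypothetical names, and the only assumption is a store where never-written
blocks are "misses" and reads fill a buffer that the caller reuses.

```python
BLOCK_SIZE = 512


class SparseBackend:
    """Toy block store: only blocks that were written exist in self.blocks."""

    def __init__(self, zero_on_miss=True):
        self.blocks = {}
        self.zero_on_miss = zero_on_miss

    def write(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.blocks[lba] = bytes(data)

    def read_into(self, lba, buf):
        """Fill the caller's (possibly reused) buffer with block `lba`."""
        block = self.blocks.get(lba)
        if block is not None:
            buf[:] = block
        elif self.zero_on_miss:
            # The fix: a never-written block must read back as zeros.
            buf[:] = bytes(BLOCK_SIZE)
        # Otherwise (the bug) the buffer keeps whatever the last read left in it.


buf = bytearray(BLOCK_SIZE)  # one buffer shared across reads

lun1 = SparseBackend()
lun1.write(1, b"LABELONE".ljust(BLOCK_SIZE, b"\0"))
lun1.read_into(1, buf)  # buffer now holds lun1's label sector

buggy_lun2 = SparseBackend(zero_on_miss=False)
buggy_lun2.read_into(1, buf)  # miss: stale lun1 data is returned for lun2
print("buggy:", bytes(buf[:8]))

fixed_lun2 = SparseBackend(zero_on_miss=True)
fixed_lun2.read_into(1, buf)  # miss: buffer is zeroed
print("fixed:", bytes(buf[:8]))
```

In the buggy case the fresh LUN appears to carry the other LUN's label, which
is what LVM then reports as a duplicate PV.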
btw, I'm running centos 6.5 with lvm 2.02.143. The latest lvm is
2.02.169 (or later?). How do I upgrade to the latest lvm2 on centos
6.5?
On Thu, Apr 20, 2017 at 9:24 AM, David Teigland <teigland@redhat.com> wrote:
> On Thu, Apr 20, 2017 at 11:04:32AM -0500, David Teigland wrote:
>> On Wed, Apr 19, 2017 at 02:53:19PM -0700, Neutron Sharc wrote:
>> > Please note that all these vol3_[0-3] are fresh, without any LVM
>> > metadata on them, as shown by pvscan::
>> > sudo pvscan --cache /dev/mapper/vol3_0
>> > Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
>> > Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
>> > Incorrect metadata area header checksum on /dev/mapper/vol3_0 at offset 4096
>>
>> That error indicates there is LVM metadata on them.
>>
>> > Found duplicate PV 9F6vU9NVBfEq1w3e04T5UreO6fDVPJNy: using
>> > /dev/mapper/vol1_3 not /dev/mapper/vol3_0
>>
>> That error indicates it's the same LVM metadata on them.
>>
>> Perhaps you're exporting the same data source on the back end via separate
>> devices, or have copied the data sources, or are using snapshots of them.
>
> Also, if you upgrade to a new version of lvm, there is better checking and
> reporting for these conditions.
>
* Re: [linux-lvm] new multipath device mistakenly replaced another PV in existing volume group
[not found] <971194068.8176464.1492969500597.ref@mail.yahoo.com>
@ 2017-04-23 17:45 ` matthew patton
0 siblings, 0 replies; 5+ messages in thread
From: matthew patton @ 2017-04-23 17:45 UTC (permalink / raw)
To: David Teigland, LVM general discussion and development
> I made a fix to zero out read buffer on a read miss at backend.
How many more land mines like this exist? One *ALWAYS* zeroes buffers before use.