Hi Dave,

Your reply was very insightful; I'm sorry I mixed up the LV lock and the VGLK. Unfortunately the lvmlockd log has been lost because I had misconfigured the related settings, but I learned a great deal from your answer.

Let me give some more background. On May 14th, my customer mapped some new LUNs to all servers in this "24-node sanlock cluster". Around the same time we found that "multipath -ll" hung forever on many hosts, with these errors in /var/log/messages:

May 14 17:52:50 sdc-c-3 kernel: blk_update_request: I/O error, dev sdcd, sector 0
May 14 17:52:50 sdc-c-3 kernel: Buffer I/O error on dev sdcd, logical block 0, async page read
May 14 17:52:50 sdc-c-3 kernel: sd 1:0:0:10: [sdcd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 14 17:52:50 sdc-c-3 kernel: sd 1:0:0:10: [sdcd] tag#0 Sense Key : Illegal Request [current]
May 14 17:52:50 sdc-c-3 kernel: sd 1:0:0:10: [sdcd] tag#0 ASC=0x25 ASCQ=0x1
May 14 17:52:50 sdc-c-3 kernel: sd 1:0:0:10: [sdcd] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00

This was probably caused by a misconfiguration on the storage side, so my customer unmapped these LUNs again. The default multipath configuration for vendor "DGC" contains 'features "1 queue_if_no_path"', so even though the LUNs had been unmapped, some I/O was presumably still queued and multipathd still did not work. Running "dmsetup message xx 0 'fail_if_no_path'" hung as well... Eventually we rebooted some of these servers to get multipathd working again (which was painful -- I'm wondering whether there is a better way; I've put a rough sketch of what I have in mind below the first quoted paragraph), and planned to reboot the rest the next day.

On May 15th, the customer told us that some of his workloads were still not working, and that is when I found the VGLK stuck behind the two tricky shared locks we discussed in the previous posts.

Since the critical logs are lost, I tried to reproduce the environment with this setup: https://pastebin.com/St2NP4c6
In short, I use LIO to simulate two multipath devices: the backing store of LUN0 is an LVM LV, LUN1 is a virtio volume, and "queue_if_no_path" is set on both multipath maps. Then I run "dmsetup suspend LUN0" so that it stops responding to any I/O, simulating the customer's broken storage. "multipath -ll" hung, as I expected; after a while it recovered, but both LUN0 and LUN1 showed as failed. Although some of the symptoms are not exactly the same, I believe "queue_if_no_path" may not work as expected: once some paths fail, other paths may stop working properly as well.

Damon

On Wed, May 20, 2020 at 12:09 AM David Teigland wrote:
> On Tue, May 19, 2020 at 09:31:16PM +0800, Damon Wang wrote:
> > Hi,
> >
> > I got a weird case with a 24-node sanlock cluster recently.
> > One sanlock process is stuck acquiring the VGLK because it thinks there
> > are two live hosts using the VGLK shared lock; here is the sanlock log
> > from this host (host_id 19): https://pastebin.com/zJubYLRv
> >
> > I found these important messages:
>
> Hi, thanks for isolating the relevant portions of the debug log. 
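[An aside on the multipath side, about the "better way than rebooting" I mentioned above. This is only a rough sketch of what I have in mind, not something I have verified on the affected hosts; <map> stands for the multipath map name, and it assumes a reasonably recent multipath-tools:]

  # Tell multipathd to stop queueing when all paths are down, so stuck I/O
  # fails back to the caller instead of hanging every "multipath -ll":
  multipathd disablequeueing map <map>
  # or, for every map at once:
  multipathd disablequeueing maps

  # The equivalent device-mapper message (the one that hung for us):
  dmsetup message <map> 0 "fail_if_no_path"

  # And a permanent override of the DGC built-in in /etc/multipath.conf:
  devices {
      device {
          vendor        "DGC"
          product       ".*"
          no_path_retry fail    # or a small retry count, instead of queueing forever
      }
  }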
> > > "2020-05-14 21:28:51 2018540 [14765]: r857 clear_dead_shared > > host_id 61 gen 1 alive" > > "2020-05-14 21:28:51 2018540 [14765]: r857 clear_dead_shared > > host_id 94 gen 1 alive" > > "2020-05-14 21:28:52 2018541 [14764]: r858 clear_dead_shared > > host_id 61 gen 1 alive" > > "2020-05-14 21:28:52 2018541 [14764]: r858 clear_dead_shared > > host_id 94 gen 1 alive" > > "2020-05-14 21:28:53 2018542 [14765]: r859 clear_dead_shared > > host_id 61 gen 1 alive" > > "2020-05-14 21:28:53 2018542 [14765]: r859 clear_dead_shared > > host_id 94 gen 1 alive" > > "2020-05-14 21:28:54 2018543 [14764]: r860 clear_dead_shared > > host_id 61 gen 1 alive" > > "2020-05-14 21:28:54 2018543 [14764]: r860 clear_dead_shared > > host_id 94 gen 1 alive" > > "2020-05-14 21:28:55 2018544 [14765]: r861 clear_dead_shared > > host_id 61 gen 1 alive" > > "2020-05-14 21:28:55 2018544 [14765]: r861 clear_dead_shared > > host_id 94 gen 1 alive" > > So far this looks like sanlock is working correctly. At this point it > would have been interesting to look for any lvm commands on 61 & 94 that > might be holding this VGLK, or evidence of lvm commands that might have > failed to release the VGLK. lvmlockctl -i / -d would help with that. > > > So I try to find out what happend to host_id 61 and host_id 94, it > > seems both two hosts tried to acquire VGLK and then released, though > > with long time. > > > > full log of host_id 61: https://pastebin.com/zDWQAtHE > > 2020-05-14 20:23:03 2545197 [33739]: cmd_acquire 6,16,33743 ci_in 6 > fd 16 count 1 flags 9 > > 2020-05-14 20:23:03 2545197 [33739]: s5:r1044 resource > lvm_fc261665e94c4d0f8341e254398e7ed5:VGLK:/dev/mapper/fc261665e94c4d0f8341e254398e7ed5-lvmlock:69206016:SH > for 6,16,33743 > > 2020-05-14 20:23:03 2545197 [33739]: r1044 paxos_acquire begin e 0 0 > > 2020-05-14 20:23:03 2545197 [33739]: r1044 leader 6067 owner 36 2 0 > dblocks 35:12132036:12132036:36:2:2099201:6067:1, > > 2020-05-14 20:23:03 2545197 [33739]: r1044 paxos_acquire leader 6067 > owner 36 2 0 max mbal[35] 12132036 our_dblock 12096061 12096061 61 1 > 2537794 6049 > > 2020-05-14 20:23:03 2545197 [33739]: r1044 paxos_acquire leader 6067 > free > > 2020-05-14 20:23:03 2545198 [33739]: r1044 acquire_disk rv 1 lver > 6068 at 2545198 > > 2020-05-14 20:23:03 2545198 [33739]: r1044 write_host_block host_id > 61 flags 1 gen 1 dblock 12134061:12134061:61:1:2545198:6068:RELEASED. > > 2020-05-14 20:23:03 2545198 [33739]: r1044 paxos_release leader 6068 > owner 61 1 2545198 > > 2020-05-14 20:23:03 2545198 [33739]: cmd_acquire 6,16,33743 result 0 > pid_dead 0 > > 2020-05-14 20:23:03 2545198 [33740]: cmd_get_lvb ci 8 fd 20 result 0 > res lvm_fc261665e94c4d0f8341e254398e7ed5:VGLK > > That log section shows the shared VGLK being acquired for host_id 61. > > > 2020-05-14 20:23:03 2545198 [33739]: cmd_acquire 6,16,33743 ci_in 6 > fd 16 count 1 flags 8 > > 2020-05-14 20:23:03 2545198 [33739]: s5:r1045 resource > lvm_fc261665e94c4d0f8341e254398e7ed5:lL0kHh-CZ1i-jtMi-1qHO-ykfR-6nmu-Gk0rES:/dev/mapper/fc261665e94c4d0f8341e254398e7ed5-lvmlock:104857600 > for 6,16,33743 > > 2020-05-14 20:23:03 2545198 [33739]: r1045 paxos_acquire begin a 0 0 > > 2020-05-14 20:55:05 2547119 [33740]: r1045 paxos_release leader 10 > owner 61 1 2545198 > > 2020-05-14 20:55:05 2547119 [33740]: r1045 release_token done > r_flags 8 > > 2020-05-14 20:55:05 2547119 [33740]: cmd_release 6,16,33743 result 0 > pid_dead 0 count 1 > > That log section shows an LV lock (r1045) being released. 
> > I'm not seeing a log of the VGLK (r1044) being released, so perhaps the > VGLK is still held by lvmlockd (either because it failed to release the > VGLK or because there's some long running lvm command holding it.) > > > useful log, it's similar to host_id 60, but shorter time to release: > > > > 2020-05-14 20:01:20 8414771 [11203]: cmd_acquire 6,16,11206 ci_in 6 > fd 16 count 1 flags 9 > > 2020-05-14 20:01:20 8414771 [11203]: s5:r3021 resource > lvm_fc261665e94c4d0f8341e254398e7ed5:VGLK:/dev/mapper/fc261665e94c4d0f8341e254398e7ed5-lvmlock:69206016:SH > for 6,16,11206 > > 2020-05-14 20:01:20 8414771 [11203]: r3021 paxos_acquire begin e 0 0 > > 2020-05-14 20:01:20 8414771 [11203]: r3021 leader 6063 owner 36 2 0 > dblocks 35:12124036:12124036:36:2:2097966:6063:1, > > 2020-05-14 20:01:20 8414771 [11203]: r3021 paxos_acquire leader 6063 > owner 36 2 0 max mbal[35] 12124036 our_dblock 11840094 11840094 94 1 > 8325309 5921 > > 2020-05-14 20:01:20 8414771 [11203]: r3021 paxos_acquire leader 6063 > free > > 2020-05-14 20:01:20 8414771 [11203]: r3021 acquire_disk rv 1 lver > 6064 at 8414771 > > 2020-05-14 20:01:20 8414771 [11203]: r3021 write_host_block host_id > 94 flags 1 gen 1 dblock 12126094:12126094:94:1:8414771:6064:RELEASED. > > 2020-05-14 20:01:20 8414771 [11203]: r3021 paxos_release leader 6064 > owner 94 1 8414771 > > 2020-05-14 20:01:20 8414771 [11203]: cmd_acquire 6,16,11206 result 0 > pid_dead 0 > > 2020-05-14 20:01:20 8414771 [11202]: cmd_get_lvb ci 8 fd 20 result 0 > res lvm_fc261665e94c4d0f8341e254398e7ed5:VGLK > > 2020-05-14 20:01:20 8414771 [11203]: cmd_acquire 6,16,11206 ci_in 6 > fd 16 count 1 flags 8 > > That log section shows the shared VGLK being acquired for host_id 94. > Again, there's no logging showing the VGLK being released. > > > 2020-05-14 20:01:20 8414771 [11203]: s5:r3022 resource > lvm_fc261665e94c4d0f8341e254398e7ed5:lL0kHh-CZ1i-jtMi-1qHO-ykfR-6nmu-Gk0rES:/dev/mapper/fc261665e94c4d0f8341e254398e7ed5-lvmlock:104857600:SH > for 6,16,11206 > > 2020-05-14 20:01:20 8414771 [11203]: r3022 paxos_acquire begin e 0 0 > > 2020-05-14 20:01:20 8414771 [11203]: r3022 leader 8 owner 36 2 0 > dblocks 35:14036:14036:36:2:1566014:8:1, > > 2020-05-14 20:01:20 8414771 [11203]: r3022 paxos_acquire leader 8 > owner 36 2 0 max mbal[35] 14036 our_dblock 0 0 0 0 0 0 > > 2020-05-14 20:01:20 8414771 [11203]: r3022 paxos_acquire leader 8 > free > > 2020-05-14 20:01:20 8414771 [11203]: r3022 acquire_disk rv 1 lver 9 > at 8414771 > > 2020-05-14 20:01:20 8414771 [11203]: r3022 write_host_block host_id > 94 flags 1 gen 1 dblock 16094:16094:94:1:8414771:9:RELEASED. > > 2020-05-14 20:01:20 8414771 [11203]: r3022 paxos_release leader 9 > owner 94 1 8414771 > > 2020-05-14 20:01:20 8414771 [11203]: cmd_acquire 6,16,11206 result 0 > pid_dead 0 > > ... ... > > 2020-05-14 20:11:20 8415371 [11202]: cmd_release 6,16,11206 ci_in 6 > fd 16 count 1 flags 0 > > 2020-05-14 20:11:20 8415371 [11202]: r3022 release_token r_flags 1 > lver 9 > > 2020-05-14 20:11:20 8415371 [11202]: r3022 write_host_block host_id > 94 flags 0 gen 0 dblock 16094:16094:94:1:8414771:9:RELEASED. > > 2020-05-14 20:11:20 8415371 [11202]: r3022 release_token done > r_flags 1 > > 2020-05-14 20:11:20 8415371 [11202]: cmd_release 6,16,11206 result 0 > pid_dead 0 count 1 > > That log section shows an LV lock being acquired and released. > > > So, I got several questions: > > > > 1. Why it takes so long to release lock, does it should blame to storage > I/O ? 
>
> It doesn't appear the VG lock is being released, so we'd want to look at
> lvmlockd state/logs and for any lvm commands that might be holding it.
>
> > 2. Why can another host (host_id 19) still read the previous shared
> > locks (host_id 60 & 94) an hour later?
>
> I'm not seeing an issue like that.
>
> > 3. What's the "right" way to solve this problem?
>
> Reinitializing the lock on disk was insightful and effective, and since
> the lock was shared it is probably quite harmless. If there were lvm
> commands holding the VGLK for a long time (which we'd want to fix in lvm),
> it would probably have been better to kill those commands so their locks
> would be released. If lvmlockd failed to release VG locks that were no
> longer being used (which we'd also want to fix in lvm), then you could
> have stopped and restarted the VG to clear things.
>
> Dave
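Thanks Dave. For the record, my understanding of the "stop and restart the VG" option is roughly the following; <vgname> is a placeholder, and it assumes no LVs in that VG are still active or in use on the host:

  # Stop the VG's lockspace on this host; lvmlockd gives up its locks for
  # the VG. All LVs in the VG must be deactivated on this host first.
  vgchange --lock-stop <vgname>

  # Start it again so the host rejoins the sanlock lockspace and locking
  # resumes cleanly:
  vgchange --lock-start <vgname>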